[jira] [Assigned] (IMPALA-11345) Query failed when creating equal conjunction map for Parquet bloom filter

2022-07-22 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11345:
--

Assignee: Daniel Becker

> Query failed when creating equal conjunction map for Parquet bloom filter
> -
>
> Key: IMPALA-11345
> URL: https://issues.apache.org/jira/browse/IMPALA-11345
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend, Distributed Exec
>Affects Versions: Impala 4.1.0
> Environment: CentOS-7, Impala-4.1
>Reporter: Yuchen Fan
>Assignee: Daniel Becker
>Priority: Critical
>
> When querying a Hive table to which columns were added without using 'cascade', 
> Impala encounters an error like "Unable to find SchemaNode for path 
> 'db.table.column' in the schema of file 
> 'hdfs://xxx/path/to/parquet_file_before_add_column'." I checked the Parquet file 
> named in the error log and found that its schema is not compatible with the 
> table metadata. The call stack is attached below. Path and table names are masked: 
> {code:java}
> I0609 18:04:25.970052 115413 status.cc:129] c94d0ab3fdf8f943:320300610002] Unable to find SchemaNode for path 'xxx_db.xxx_table.xxx_column' in the schema of file 'hdfs://xxx_nn/xxx_table_path/00_0'.
>     @           0xea543b  impala::Status::Status()
>     @          0x1e3225c  impala::HdfsParquetScanner::CreateColIdx2EqConjunctMap()
>     @          0x1e363ea  impala::HdfsParquetScanner::Open()
>     @          0x19b40d0  impala::HdfsScanNodeBase::CreateAndOpenScannerHelper()
>     @          0x1b5cbae  impala::HdfsScanNode::ProcessSplit()
>     @          0x1b5e12a  impala::HdfsScanNode::ScannerThread()
>     @          0x1b5e9c6  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x18eafa9  impala::Thread::SuperviseThread()
>     @          0x18ee11a  boost::detail::thread_data<>::run()
>     @          0x2385510  thread_proxy
>     @     0x7fb5b0745162  start_thread
>     @     0x7fb5ad21df6c  __clone
> {code}
> The error may be related to 
> [IMPALA-10640|https://issues.apache.org/jira/browse/IMPALA-10640]. The Bloom 
> filter requires that the right-hand values of equality conjuncts match the 
> current file schema. The filter is unavailable if the column does not exist in 
> all Parquet files scanned. I think we can disable the Parquet Bloom filter for 
> this single query or scan node when such a situation is detected.
> How to reproduce (using impala-shell):
>  # create table parquet_test (id INT) stored as parquet;
>  # insert into parquet_test values (1),(2),(3);
>  # alter table parquet_test add columns (name STRING);
>  # insert into parquet_test values (4, "James");
>  # select * from parquet_test where name in ("Lily");
>  # An error occurs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-11345) Query failed when creating equal conjunction map for Parquet bloom filter

2022-07-25 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11345 started by Daniel Becker.
--



[jira] [Commented] (IMPALA-11345) Query failed when creating equal conjunction map for Parquet bloom filter

2022-07-25 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570939#comment-17570939
 ] 

Daniel Becker commented on IMPALA-11345:


We don't have to disable Bloom filtering for the query; we can simply disregard 
the conjuncts that involve the column that is missing from the file.
Maybe we could even skip the whole file in cases where an EQ conjunct refers to 
a missing column, because all the values should be NULL then. But I'm not 100% 
sure about the semantics of NULLs in this case, and I guess it's not a very 
important use case, so I'd choose to ignore these conjuncts: this is simpler and 
correctness is always preserved.
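
The idea of ignoring only the conjuncts whose column is absent from a given file can be sketched roughly as follows. This is an illustrative Python model, not Impala's C++ code: `EqConjunct`, `build_col_idx_to_eq_conjuncts` and the dictionary-based file schema are hypothetical names made up for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EqConjunct:
    """Hypothetical stand-in for an equality predicate such as name = 'Lily'."""
    column: str
    value: object


def build_col_idx_to_eq_conjuncts(file_schema, conjuncts):
    """Map file column index -> EQ conjuncts usable for Bloom filtering.

    file_schema is an assumed, simplified view of one Parquet file:
    a dict of column name -> column index. Conjuncts on columns that
    the file does not contain are collected in `skipped` instead of
    causing an error; they simply take no part in Bloom filtering.
    """
    usable = defaultdict(list)
    skipped = []
    for conjunct in conjuncts:
        col_idx = file_schema.get(conjunct.column)
        if col_idx is None:
            skipped.append(conjunct)  # column missing in this file
            continue
        usable[col_idx].append(conjunct)
    return dict(usable), skipped


# File written before ALTER TABLE ... ADD COLUMNS, and one written after.
old_file = {"id": 0}
new_file = {"id": 0, "name": 1}
conjuncts = [EqConjunct("name", "Lily")]

old_map, old_skipped = build_col_idx_to_eq_conjuncts(old_file, conjuncts)
new_map, new_skipped = build_col_idx_to_eq_conjuncts(new_file, conjuncts)
print(old_map, old_skipped)
print(new_map, new_skipped)
```

In this model the old file yields an empty map (no Bloom filtering for `name`), while the new file still gets its conjunct, so filtering stays enabled wherever the column exists.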




[jira] [Commented] (IMPALA-11345) Query failed when creating equal conjunction map for Parquet bloom filter

2022-07-25 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570944#comment-17570944
 ] 

Daniel Becker commented on IMPALA-11345:


https://gerrit.cloudera.org/#/c/18779/
Please review.




[jira] [Resolved] (IMPALA-11345) Query failed when creating equal conjunction map for Parquet bloom filter

2022-08-01 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11345.

Resolution: Fixed




[jira] [Assigned] (IMPALA-10800) Tidy up the be/src/exec directory

2022-08-01 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10800:
--

Assignee: Daniel Becker

> Tidy up the be/src/exec directory
> -
>
> Key: IMPALA-10800
> URL: https://issues.apache.org/jira/browse/IMPALA-10800
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Assignee: Daniel Becker
>Priority: Major
>  Labels: newbie, ramp-up
>
> Parquet-related code is already moved to exec/parquet.
> The same should be done with ORC, Kudu, HBase, etc.






[jira] [Assigned] (IMPALA-10753) Incorrect length when multiple CHAR(N) values are inserted

2022-08-01 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10753:
--

Assignee: Daniel Becker

> Incorrect length when multiple CHAR(N) values are inserted
> --
>
> Key: IMPALA-10753
> URL: https://issues.apache.org/jira/browse/IMPALA-10753
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: correctness, ramp-up
>
> To reproduce:
> {code}
> CREATE TABLE impala_char_insert (s STRING);
> -- All values are CHAR(N) with different N, but all will use the biggest N.
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1))), (CAST("12" AS CHAR(2))), (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> -- results:
> -- 3
> -- 3
> -- 3
> -- Inserting the same values in separate INSERTs works correctly.
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1)));
> INSERT INTO impala_char_insert VALUES (CAST("12" AS CHAR(2)));
> INSERT INTO impala_char_insert VALUES (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> -- results:
> -- 1
> -- 2
> -- 3
> -- If one value is not CHAR(N), then the lengths are correct.
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1))), (CAST("12" AS VARCHAR(2))), (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> -- results:
> -- 1
> -- 2
> -- 3
> {code}
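
The reported lengths are consistent with every row being cast to the widest CHAR(N) in the VALUES list and padded with spaces. A small Python model of that assumed behaviour is given below; the function names are hypothetical and this is not how Impala implements the cast.

```python
def cast_to_char(value: str, n: int) -> str:
    """Model CHAR(n): truncate to n characters, then pad with spaces."""
    return value[:n].ljust(n)


def insert_values_unified(values_with_widths):
    """Assumed behaviour: all rows are cast to the widest CHAR(N) present."""
    widest = max(n for _, n in values_with_widths)
    return [cast_to_char(v, widest) for v, _ in values_with_widths]


def insert_values_separately(values_with_widths):
    """Each row keeps its own CHAR(N), as with one INSERT per value."""
    return [cast_to_char(v, n) for v, n in values_with_widths]


rows = [("1", 1), ("12", 2), ("123", 3)]
unified_lengths = [len(s) for s in insert_values_unified(rows)]
separate_lengths = [len(s) for s in insert_values_separately(rows)]
print(unified_lengths)   # [3, 3, 3]
print(separate_lengths)  # [1, 2, 3]
```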






[jira] [Assigned] (IMPALA-10800) Tidy up the be/src/exec directory

2022-08-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10800:
--

Assignee: (was: Daniel Becker)

> Tidy up the be/src/exec directory
> -
>
> Key: IMPALA-10800
> URL: https://issues.apache.org/jira/browse/IMPALA-10800
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>  Labels: newbie, ramp-up
>
> Parquet-related code is already moved to exec/parquet.
> The same should be done with ORC, Kudu, HBase, etc.






[jira] [Assigned] (IMPALA-10800) Tidy up the be/src/exec directory

2022-08-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10800:
--

Assignee: Daniel Becker




[jira] [Assigned] (IMPALA-10800) Tidy up the be/src/exec directory

2022-08-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10800:
--

Assignee: Peter Rozsa  (was: Daniel Becker)

> Tidy up the be/src/exec directory
> -
>
> Key: IMPALA-10800
> URL: https://issues.apache.org/jira/browse/IMPALA-10800
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Zoltán Borók-Nagy
>Assignee: Peter Rozsa
>Priority: Major
>  Labels: newbie, ramp-up
>
> Parquet-related code is already moved to exec/parquet.
> The same should be done with ORC, Kudu, HBase, etc.






[jira] [Assigned] (IMPALA-9499) Display support for all complex types in a SELECT * query

2022-08-03 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-9499:
-

Assignee: Daniel Becker

> Display support for all complex types in a SELECT * query
> -
>
> Key: IMPALA-9499
> URL: https://issues.apache.org/jira/browse/IMPALA-9499
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Daniel Becker
>Priority: Major
>  Labels: complextype, ramp-up
>
> Covers all complex types (Struct, Array, Map) for both Parquet and ORC file 
> formats.






[jira] [Assigned] (IMPALA-10356) Analyzed query in explain plan is not quite right for insert with values clause

2022-09-07 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-10356:
--

Assignee: Daniel Becker

> Analyzed query in explain plan is not quite right for insert with values 
> clause
> ---
>
> Key: IMPALA-10356
> URL: https://issues.apache.org/jira/browse/IMPALA-10356
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Assignee: Daniel Becker
>Priority: Major
>  Labels: newbie, ramp-up
>
> In impala-shell:
> {noformat}
> create table double_tbl (d double) stored as textfile;
> set explain_level=2;
> explain insert into double_tbl values (-0.43149576573887316);
> {noformat}
> {noformat}
> +----------------------------------------------------------------------------------+
> | Explain String                                                                   |
> +----------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=0B Threads=1                           |
> | Per-Host Resource Estimates: Memory=10MB                                         |
> | Codegen disabled by planner                                                      |
> | Analyzed query: SELECT CAST(-0.43149576573887316 AS DECIMAL(17,17)) UNION SELECT |
> | CAST(-0.43149576573887316 AS DECIMAL(17,17))                                     |
> |                                                                                  |
> | F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                            |
> | |  Per-Host Resources: mem-estimate=8B mem-reservation=0B thread-reservation=1   |
> | WRITE TO HDFS [default.double_tbl, OVERWRITE=false]                              |
> | |  partitions=1                                                                  |
> | |  output exprs: CAST(-0.43149576573887316 AS DOUBLE)                            |
> | |  mem-estimate=8B mem-reservation=0B thread-reservation=0                       |
> | |                                                                                |
> | 00:UNION                                                                         |
> |    constant-operands=1                                                           |
> |    mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
> |    tuple-ids=0 row-size=8B cardinality=1                                         |
> |    in pipelines:                                                                 |
> +----------------------------------------------------------------------------------+
> {noformat}
> The analyzed query does not make sense. We should investigate and fix it.






[jira] [Resolved] (IMPALA-10918) Allow map type in SELECT list

2022-09-08 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-10918.

Resolution: Implemented

> Allow map type in SELECT list
> -
>
> Key: IMPALA-10918
> URL: https://issues.apache.org/jira/browse/IMPALA-10918
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Gabor Kaszab
>Assignee: Daniel Becker
>Priority: Major
>  Labels: complextype
>
> This covers collections: Map
> Expected printout format:
> Map:   {"k1":2,"k2":null}
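
The expected printout resembles compact JSON: no spaces after separators, and NULL values rendered as `null`. Purely as an illustration of that target format (this is not Impala's serializer, and `format_map` is a hypothetical helper), the same string can be produced in Python like this:

```python
import json


def format_map(m):
    """Render a map in compact JSON form, with None printed as null."""
    return json.dumps(m, separators=(",", ":"))


printed = format_map({"k1": 2, "k2": None})
print(printed)  # {"k1":2,"k2":null}
```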






[jira] [Resolved] (IMPALA-11427) TestOrcStats.test_orc_stats fails

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11427.

Resolution: Done

> TestOrcStats.test_orc_stats fails
> -
>
> Key: IMPALA-11427
> URL: https://issues.apache.org/jira/browse/IMPALA-11427
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Blocker
>  Labels: broken-build
>
> In one of the builds, query_test.test_orc_stats.TestOrcStats.test_orc_stats 
> fails:
> {code:java}
> query_test/test_orc_stats.py:40: in test_orc_stats
> self.run_test_case('QueryTest/orc-stats', vector, use_db=unique_database)
> common/impala_test_suite.py:820: in run_test_case
> update_section=pytest.config.option.update_results)
> common/test_result_verifier.py:665: in verify_runtime_profile
> % (function, field, expected_value, actual_value, op, actual))
> E   AssertionError: Aggregation of SUM over RowsRead did not match expected 
> results.
> E   EXPECTED VALUE:
> E   5
> E   
> E   
> E   ACTUAL VALUE:
> E   0
> {code}






[jira] [Reopened] (IMPALA-10753) Incorrect length when multiple CHAR(N) values are inserted

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reopened IMPALA-10753:


Closed by mistake.




[jira] [Resolved] (IMPALA-10753) Incorrect length when multiple CHAR(N) values are inserted

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-10753.

Resolution: Cannot Reproduce




[jira] [Resolved] (IMPALA-11432) TestRanger.test_grant_revoke_with_role fails with impalad stuck at startup

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11432.

Resolution: Cannot Reproduce

> TestRanger.test_grant_revoke_with_role fails with impalad stuck at startup
> --
>
> Key: IMPALA-11432
> URL: https://issues.apache.org/jira/browse/IMPALA-11432
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Blocker
>  Labels: broken-build
>
> In one of the exhaustive builds 
> authorization.test_ranger.TestRanger.test_grant_revoke_with_role failed with 
> one of the impalads stuck during startup:
> Stacktrace
> {code:java}
> common/custom_cluster_test_suite.py:181: in setup_method
> self._start_impala_cluster(cluster_args, **kwargs)
> common/custom_cluster_test_suite.py:285: in _start_impala_cluster
> check_call(cmd + options, close_fds=True)
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/Impala-Toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/subprocess.py:190:
>  in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 
> '['/data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/bin/start-impala-cluster.py',
>  '--state_store_args=--statestore_update_frequency_ms=50 
> --statestore_priority_update_frequency_ms=50 
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3', 
> '--num_coordinators=3', 
> '--log_dir=/data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests',
>  '--log_level=1', '--impalad_args=--server-name=server1 
> --ranger_service_type=hive --ranger_app_id=impala 
> --authorization_provider=ranger 
> --use_customized_user_groups_mapper_for_ranger ', '--state_store_args=None ', 
> '--catalogd_args=--server-name=server1 --ranger_service_type=hive 
> --ranger_app_id=impala --authorization_provider=ranger 
> --use_customized_user_groups_mapper_for_ranger ', 
> '--impalad_args=--default_query_options=']' returned non-zero exit status 1
> {code}
> Standard Error
> {code:java}
> -- 2022-07-14 01:07:04,943 INFO MainThread: Starting cluster with 
> command: 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/bin/start-impala-cluster.py
>  '--state_store_args=--statestore_update_frequency_ms=50 
> --statestore_priority_update_frequency_ms=50 
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 
> --log_dir=/data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests
>  --log_level=1 '--impalad_args=--server-name=server1 
> --ranger_service_type=hive --ranger_app_id=impala 
> --authorization_provider=ranger 
> --use_customized_user_groups_mapper_for_ranger ' '--state_store_args=None ' 
> '--catalogd_args=--server-name=server1 --ranger_service_type=hive 
> --ranger_app_id=impala --authorization_provider=ranger 
> --use_customized_user_groups_mapper_for_ranger ' 
> --impalad_args=--default_query_options=
> 01:07:05 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 01:07:05 MainThread: Starting State Store logging to 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 01:07:05 MainThread: Starting Catalog Service logging to 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 01:07:05 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 01:07:05 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 01:07:05 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive-release/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 01:07:08 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:09 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:10 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:11 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:12 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:13 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:14 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:15 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:16 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 01:07:17 MainThread: Found 2 impalad/1 statestored/1 catalogd process(e

[jira] [Resolved] (IMPALA-11431) TestComputeStatsWithNestedTypes.test_compute_stats_with_structs fails in an exhaustive build

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11431.

Resolution: Cannot Reproduce

> TestComputeStatsWithNestedTypes.test_compute_stats_with_structs fails in an 
> exhaustive build
> 
>
> Key: IMPALA-11431
> URL: https://issues.apache.org/jira/browse/IMPALA-11431
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Blocker
>  Labels: broken-build
>
> In one of the exhaustive builds, 
> query_test.test_nested_types.TestComputeStatsWithNestedTypes.test_compute_stats_with_structs
>  fails:
> {code:java}
> query_test/test_nested_types.py:252: in test_compute_stats_with_structs
> self.run_test_case('QueryTest/compute-stats-with-structs', vector)
> common/impala_test_suite.py:778: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:588: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:469: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E 'alltypes','STRUCT',-1,-1,-1,-1.0,-1,-1 == 'alltypes','STRUCT',-1,-1,-1,-1,-1,-1
> E 'id','INT',6,0,4,4.0,-1,-1 != 'id','INT',-1,-1,4,4,-1,-1
> E 'small_struct','STRUCT',-1,-1,-1,-1.0,-1,-1 == 'small_struct','STRUCT',-1,-1,-1,-1,-1,-1
> E 'str','STRING',6,0,11,10.330154,-1,-1 != 'str','STRING',-1,-1,-1,-1,-1,-1
> E 'tiny_struct','STRUCT',-1,-1,-1,-1.0,-1,-1 == 'tiny_struct','STRUCT',-1,-1,-1,-1,-1,-1
> {code}
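
The mismatching rows are easier to read field by field. The sketch below is only a reading aid, not part of the Impala test framework: the column names are an assumption based on the usual SHOW COLUMN STATS layout, and the helper names are hypothetical. Numeric fields are compared as numbers, so 4.0 and 4 count as equal.

```python
# Column order is an assumption (SHOW COLUMN STATS layout); the report does not state it.
COLUMNS = ["Column", "Type", "#Distinct Values", "#Nulls",
           "Max Size", "Avg Size", "#Trues", "#Falses"]


def parse_row(row):
    """Split a verifier row into fields, converting unquoted fields to float."""
    fields = []
    for raw in row.split(","):
        raw = raw.strip()
        if raw.startswith("'"):
            fields.append(raw.strip("'"))
        else:
            fields.append(float(raw))
    return fields


def diff_rows(expected, actual):
    """Return {column name: (expected, actual)} for the fields that differ."""
    pairs = zip(COLUMNS, parse_row(expected), parse_row(actual))
    return {name: (e, a) for name, e, a in pairs if e != a}


print(diff_rows("'id','INT',6,0,4,4.0,-1,-1", "'id','INT',-1,-1,4,4,-1,-1"))
print(diff_rows("'str','STRING',6,0,11,10.330154,-1,-1",
                "'str','STRING',-1,-1,-1,-1,-1,-1"))
```

Under that column assumption, the 'id' row differs only in the distinct-value and null counts, while the 'str' row also differs in the two size columns. In every differing field the actual value is -1.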






[jira] [Resolved] (IMPALA-11425) Python TypeError: super() takes at least 1 argument (0 given)

2022-09-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11425.

Resolution: Duplicate

> Python TypeError: super() takes at least 1 argument (0 given)
> -
>
> Key: IMPALA-11425
> URL: https://issues.apache.org/jira/browse/IMPALA-11425
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Blocker
>  Labels: broken-build
>
> The following error happens in various builds during tarball creation:
> {code:java}
> Traceback (most recent call last):
>   File "setup.py", line 167, in <module>
>     'Topic :: Database :: Front-Ends'
>   File "/usr/lib64/python2.7/distutils/core.py", line 152, in setup
>     dist.run_commands()
>   File "/usr/lib64/python2.7/distutils/dist.py", line 953, in run_commands
>     self.run_command(cmd)
>   File "/usr/lib64/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File "/tmp/impala-venv-huM4f/lib/python2.7/site-packages/setuptools/command/sdist.py", line 153, in run
>     self.run_command(cmd_name)
>   File "/usr/lib64/python2.7/distutils/cmd.py", line 326, in run_command
>     self.distribution.run_command(command)
>   File "/usr/lib64/python2.7/distutils/dist.py", line 970, in run_command
>     cmd_obj = self.get_command_obj(command)
>   File "/usr/lib64/python2.7/distutils/dist.py", line 845, in get_command_obj
>     klass = self.get_command_class(command)
>   File "/tmp/impala-venv-huM4f/lib/python2.7/site-packages/setuptools/dist.py", line 410, in get_command_class
>     return _Distribution.get_command_class(self, command)
>   File "/usr/lib64/python2.7/distutils/dist.py", line 815, in get_command_class
>     __import__ (module_name)
>   File "/usr/lib64/python2.7/distutils/command/check.py", line 13, in <module>
>     from docutils.utils import Reporter
>   File "/tmp/impala-venv-huM4f/lib/python2.7/site-packages/docutils/__init__.py", line 123, in <module>
>     release=True  # True for official releases and pre-releases
>   File "/tmp/impala-venv-huM4f/lib/python2.7/site-packages/docutils/__init__.py", line 93, in __new__
>     return super().__new__(cls, major, minor, micro,
> TypeError: super() takes at least 1 argument (0 given)
> {code}
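
The last frame fails because the zero-argument form of super() exists only in Python 3, while the paths in the traceback show a Python 2.7 interpreter. A minimal sketch of the difference follows. VersionInfo here is a hypothetical stand-in for a namedtuple subclass that overrides __new__; it is not the docutils class itself.

```python
from collections import namedtuple


class VersionInfo(namedtuple("VersionInfo", "major minor micro")):
    """Hypothetical namedtuple subclass that overrides __new__."""

    def __new__(cls, major=0, minor=0, micro=0):
        # Python 3 only:  super().__new__(cls, major, minor, micro)
        # The explicit two-argument form below works on Python 2.7 and Python 3.
        return super(VersionInfo, cls).__new__(cls, major, minor, micro)


info = VersionInfo(0, 19)
print("%d.%d.%d" % info)  # 0.19.0
```

Code written with the zero-argument form cannot be imported under Python 2.7 without hitting this TypeError when the method runs.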






[jira] [Work started] (IMPALA-10753) Incorrect length when multiple CHAR(N) values are inserted

2022-09-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10753 started by Daniel Becker.
--
> Incorrect length when multiple CHAR(N) values are inserted
> --
>
> Key: IMPALA-10753
> URL: https://issues.apache.org/jira/browse/IMPALA-10753
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Csaba Ringhofer
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: correctness, ramp-up
>
> To reproduce:
> {code}
> CREATE TABLE impala_char_insert (s STRING);
> -- all values are CHAR(N) with different N, but all will use the biggest N
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1))), (CAST("12" 
> AS CHAR(2))), (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> results:
> 3
> 3
> 3
> -- inserting the same values in separate INSERTs works correctly
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1)));
> INSERT INTO impala_char_insert VALUES (CAST("12" AS CHAR(2)));
> INSERT INTO impala_char_insert VALUES (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> results:
> 1
> 2
> 3
> -- if one value is not CHAR(N), then the lengths are correct
> INSERT OVERWRITE impala_char_insert VALUES (CAST("1" AS CHAR(1))), (CAST("12" 
> AS VARCHAR(2))), (CAST("123" AS CHAR(3)));
> SELECT length(s) FROM impala_char_insert;
> results:
> 1
> 2
> 3
> {code}
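
The three result sets above are consistent with every CHAR(N) operand of a single VALUES clause being cast to the widest N before the rows are written. The sketch below models only that observable behaviour; it is not Impala's implementation, and cast_char and values_clause are hypothetical names.

```python
def cast_char(value, n):
    """Model CAST(value AS CHAR(n)): truncate or right-pad with spaces to n."""
    return value[:n].ljust(n)


def values_clause(rows):
    """Model one VALUES clause whose CHAR(N) operands are unified to the widest N.

    rows is a list of (string, n) pairs, one per operand.
    """
    widest = max(n for _, n in rows)
    return [cast_char(value, widest) for value, _ in rows]


rows = [("1", 1), ("12", 2), ("123", 3)]
together = [len(s) for s in values_clause(rows)]
separate = [len(s) for row in rows for s in values_clause([row])]
print(together)  # [3, 3, 3]
print(separate)  # [1, 2, 3]
```

Inserting the rows together pads each value to three characters, while separate INSERT statements keep the original lengths, matching the first two cases of the reproduction.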






[jira] [Commented] (IMPALA-10356) Analyzed query in explain plan is not quite right for insert with values clause

2022-09-26 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609449#comment-17609449
 ] 

Daniel Becker commented on IMPALA-10356:


The problem seems to be with how we print the analysed statement, not with the 
analysed statement itself. {{SetOperationStmt.toSql()}} implicitly assumes that 
there are at least two operands: it prints the first operand separately, then 
the second through the second-to-last in a loop, and finally the last operand, 
again separately. When there is only one operand, the first and the last are the 
same operand, and because no check is performed it is printed twice. See 
[https://github.com/apache/impala/blob/296e94411d3344e2969d4b083036ff238e80ad19/fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java#L540]
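
The printing order described above can be modelled in a few lines. This is a Python sketch of the described logic, not a transcription of the Java source; the function name and the default operator are assumptions.

```python
def to_sql(operands, operator="UNION"):
    """Model of the printing order described above.

    operands is a list of SQL strings, one per set operand.
    """
    parts = [operands[0]]                         # first operand, printed separately
    for i in range(1, len(operands) - 1):         # second .. one before the last
        parts.append(operator + " " + operands[i])
    parts.append(operator + " " + operands[-1])   # last operand, no size check
    return " ".join(parts)


print(to_sql(["SELECT 1", "SELECT 2", "SELECT 3"]))  # SELECT 1 UNION SELECT 2 UNION SELECT 3
print(to_sql(["SELECT 1"]))                          # SELECT 1 UNION SELECT 1
```

With a single operand, the loop body never runs and operands[-1] is the same element as operands[0], which produces the duplicated SELECT seen in the explain output.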

A {{SetOperationStmt}} with only one operand is only possible in a 
{{ValuesStmt}}, which is a specialised {{UnionStmt}}; otherwise it is 
syntactically impossible. The question is how we should print {{ValuesStmt}}s 
with a single operand:
 * print only the operand
 ** {*}pro{*}: reflects that there is no set operation in the original SQL 
statement
 ** {*}con{*}: doesn't reflect how we actually represent the analysed query in 
Impala
 * print a union with 2 identical operands (this is what we do now)
 ** {*}pro{*}: reflects the representation of the analysed query in that there 
is a UNION statement
 ** {*}con{*}: adds a second operand that is not present in either the original 
SQL or the analysed query
 * invent some syntax to print a union with a single operand
 ** {*}pro{*}: reflects how we represent the analysed query
 ** {*}con{*}: prints invalid SQL
 * don't convert single-operand VALUES clauses to a UnionStmt
 ** {*}pro{*}: we can correctly represent the analysed query; simplify the 
statement tree - one less level
 ** {*}con{*}: different handling of single-operand VALUES clauses than other 
VALUES clauses
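
For illustration, the first option (print only the operand) amounts to a length check before the set operation is printed. The sketch below is a Python model with hypothetical names. It shows that one option only and does not say which option was eventually chosen.

```python
def to_sql_single_aware(operands, operator="UNION ALL"):
    """Print a lone operand as-is; otherwise join the operands with the operator."""
    if len(operands) == 1:
        return operands[0]
    return (" " + operator + " ").join(operands)


print(to_sql_single_aware(["SELECT 1"]))              # SELECT 1
print(to_sql_single_aware(["SELECT 1", "SELECT 2"]))  # SELECT 1 UNION ALL SELECT 2
```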

> Analyzed query in explain plan is not quite right for insert with values 
> clause
> ---
>
> Key: IMPALA-10356
> URL: https://issues.apache.org/jira/browse/IMPALA-10356
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0
>Reporter: Tim Armstrong
>Assignee: Daniel Becker
>Priority: Major
>  Labels: newbie, ramp-up
>
> In impala-shell:
> {noformat}
> create table double_tbl (d double) stored as textfile;
> set explain_level=2;
> explain insert into double_tbl values (-0.43149576573887316);
> {noformat}
> {noformat}
> +----------------------------------------------------------------------------------+
> | Explain String                                                                   |
> +----------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=0B Threads=1                           |
> | Per-Host Resource Estimates: Memory=10MB                                         |
> | Codegen disabled by planner                                                      |
> | Analyzed query: SELECT CAST(-0.43149576573887316 AS DECIMAL(17,17)) UNION SELECT |
> | CAST(-0.43149576573887316 AS DECIMAL(17,17))                                     |
> |                                                                                  |
> | F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                            |
> | |  Per-Host Resources: mem-estimate=8B mem-reservation=0B thread-reservation=1   |
> | WRITE TO HDFS [default.double_tbl, OVERWRITE=false]                              |
> | |  partitions=1                                                                  |
> | |  output exprs: CAST(-0.43149576573887316 AS DOUBLE)                            |
> | |  mem-estimate=8B mem-reservation=0B thread-reservation=0                       |
> | |                                                                                |
> | 00:UNION                                                                         |
> |    constant-operands=1                                                           |
> |    mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
> |    tuple-ids=0 row-size=8B cardinality=1                                         |
> |    in pipelines:                                                                 |
> {noformat}
> The analyzed query does not make sense. We should investigate and fix it.




[jira] [Commented] (IMPALA-10356) Analyzed query in explain plan is not quite right for insert with values clause

2022-09-27 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609919#comment-17609919
 ] 

Daniel Becker commented on IMPALA-10356:


Thanks [~csringhofer] . If there are multiple operands the operation is UNION 
ALL.




[jira] [Work started] (IMPALA-10356) Analyzed query in explain plan is not quite right for insert with values clause

2022-09-28 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10356 started by Daniel Becker.
--



[jira] [Assigned] (IMPALA-11623) Put *-ir.cc files into their own libraries to avoid extra recompilation

2022-09-29 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11623:
--

Assignee: Daniel Becker

> Put *-ir.cc files into their own libraries to avoid extra recompilation
> ---
>
> Key: IMPALA-11623
> URL: https://issues.apache.org/jira/browse/IMPALA-11623
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Assignee: Daniel Becker
>Priority: Major
>
> It is desirable to be able to iterate quickly by running "make -j impalad" 
> while modifying a file. Currently, modifying most files incurs a rebuild of 
> the LLVM IR, which is a slow serial step. For example:
>  
> {noformat}
> $ touch be/src/runtime/coordinator.cc
> $ make -j impalad
> ...
> [ 98%] Generating ../../../llvm-ir/impala.bc
> [ 98%] Generating ../../../llvm-ir/impala-legacy-avx.bc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir.cc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir-legacy-avx.cc
> ...{noformat}
> This can add several seconds to an incremental build. The step runs even for
> files that do not affect the LLVM IR, so it can be avoided in those cases.
> The LLVM IR is rebuilt because it depends on Exec, Exprs, Runtime, Udf, Util,
> and other libraries:
>  
> {noformat}
> add_custom_command(
>   OUTPUT ${IR_OUTPUT_FILE}
>   COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} ${PLATFORM_SPECIFIC_FLAGS}
>           ${CLANG_INCLUDE_FLAGS} ${IR_INPUT_FILES} -o ${IR_TMP_OUTPUT_FILE}
>   COMMAND ${LLVM_OPT_EXECUTABLE} ${LLVM_OPT_IR_FLAGS} < ${IR_TMP_OUTPUT_FILE} > ${IR_OUTPUT_FILE}
>   COMMAND rm ${IR_TMP_OUTPUT_FILE}
>   DEPENDS Exec ExecAvro ExecKudu Exprs Runtime Udf Util ${IR_INPUT_FILES}
> ){noformat}
> From a correctness perspective, the LLVM IR only cares about things that 
> impact the content of the *-ir.cc files, because impala-ir.cc includes every 
> *-ir.cc file. That list of libraries is a superset of what is needed.
> If the *-ir.cc files were split off into their own libraries (e.g. ExecIr
> rather than Exec), then this target would depend only on ExecIr rather than
> on the larger Exec. Fewer files would then trigger a rebuild of the LLVM IR,
> which should reduce the runtime of an incremental "make -j impalad" for
> quite a few C++ files.
>  
>  
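
The effect of the proposed split can be modelled as a lookup: the IR step reruns whenever the touched file belongs to a library named in its DEPENDS list. The sketch below is a toy model of that rule, not of CMake itself. The source lists are heavily abbreviated, and RuntimeIr and the *-ir.cc file name are hypothetical, chosen by analogy with the ExecIr example in the description.

```python
def ir_rebuild_needed(touched_file, ir_depends, library_sources):
    """Return True if touched_file belongs to a library the IR step depends on."""
    return any(touched_file in library_sources[lib] for lib in ir_depends)


# Hypothetical, abbreviated source lists; the real libraries hold many more files.
current = {
    "Runtime": {"be/src/runtime/coordinator.cc", "be/src/runtime/example-ir.cc"},
}
split = {
    "Runtime": {"be/src/runtime/coordinator.cc"},
    "RuntimeIr": {"be/src/runtime/example-ir.cc"},
}

touched = "be/src/runtime/coordinator.cc"
print(ir_rebuild_needed(touched, ["Runtime"], current))   # True
print(ir_rebuild_needed(touched, ["RuntimeIr"], split))   # False
print(ir_rebuild_needed("be/src/runtime/example-ir.cc", ["RuntimeIr"], split))  # True
```

In the split layout, touching coordinator.cc no longer invalidates the IR, while touching an *-ir.cc file still does.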






[jira] [Work started] (IMPALA-11623) Put *-ir.cc files into their own libraries to avoid extra recompilation

2022-09-29 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11623 started by Daniel Becker.
--



[jira] [Commented] (IMPALA-11623) Put *-ir.cc files into their own libraries to avoid extra recompilation

2022-09-30 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611521#comment-17611521
 ] 

Daniel Becker commented on IMPALA-11623:


I tried using a CMake INTERFACE library, but it does not seem to work. I may 
be doing something wrong, but even if I touch an *-ir.cc file the LLVM IR is 
not recompiled.

Even if it worked, we would still have to keep the *-ir.cc files in the 
normal libraries as well, because INTERFACE libraries don't actually compile 
code. We would then have to add each new *-ir.cc file in two places. A normal 
library seems to be the better choice.




[jira] [Created] (IMPALA-11643) Implement ColumnType::ToIR() for non-scalar types

2022-10-07 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11643:
--

 Summary: Implement ColumnType::ToIR() for non-scalar types
 Key: IMPALA-11643
 URL: https://issues.apache.org/jira/browse/IMPALA-11643
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Daniel Becker
Assignee: Daniel Becker


Currently ColumnType::ToIR() is only implemented for scalar types. It should be 
extended to support all types.






[jira] [Commented] (IMPALA-11643) Implement ColumnType::ToIR() for non-scalar types

2022-10-07 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614117#comment-17614117
 ] 

Daniel Becker commented on IMPALA-11643:


The reason why structs are not supported is that information about the children 
is stored in std::vectors, which are difficult to deal with in LLVM code. As 
the layout of std::vector depends on the compiler and can change (and is also 
quite complicated), we shouldn't touch it from hand-crafted LLVM code directly, 
only through IR functions (functions compiled from C++ to LLVM).

A possible solution is to add an IR function that takes a ColumnType* and 
inserts an element into its vectors. We could construct the non-vector parts of 
ColumnType in LLVM as we do now, then call this function from hand-crafted LLVM 
code repeatedly to insert the necessary elements into the vectors.

However, the resulting value is no longer an llvm::ConstantStruct*, which is 
the return type of ColumnType::ToIR(). Some callers depend on the result being 
a constant, so we can't change the return type in these cases.

A solution is to have two functions: one that returns a constant and is only 
valid for non-struct types, and a general function that supports all types and 
returns llvm::Value*. Creating a function that supports all types and returns a 
constant is not possible in my opinion without meddling with the internals of 
std::vector from LLVM code.

This is in a way just moving the problem a bit further as we still end up with 
a function that only supports scalar types. On the other hand, we would have 
full support for converting ColumnType objects to LLVM in the cases where 
constantness is not needed.
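
The two-function design can be illustrated with a toy model. Nothing below uses LLVM: the "IR" is a list of tuples, ColumnType is a stand-in for the C++ class, and AddChild is a hypothetical name for the proposed IR helper that inserts an element into the vectors.

```python
class ColumnType:
    """Toy stand-in for the C++ ColumnType: a type name plus child types."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def to_ir_constant(col_type):
    """Constant-only variant: valid only for types without children."""
    if col_type.children:
        raise ValueError("no constant form for non-scalar type " + col_type.name)
    return ("const", col_type.name)


def to_ir_value(col_type, emitted):
    """General variant: appends pseudo-instructions and returns a value name.

    Children are attached through calls to the helper, so the result is a
    value built at run time rather than a constant.
    """
    value = "%t" + str(len(emitted))
    emitted.append(("alloc", value, col_type.name))
    for child in col_type.children:
        child_value = to_ir_value(child, emitted)
        emitted.append(("call", "AddChild", value, child_value))
    return value


struct = ColumnType("STRUCT", [ColumnType("INT"), ColumnType("STRING")])
program = []
root = to_ir_value(struct, program)
for instruction in program:
    print(instruction)
print(to_ir_constant(ColumnType("INT")))
```

Callers that need a constant keep using the restricted function, which rejects struct types. Callers that can work with a run-time value use the general one.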

> Implement ColumnType::ToIR() for non-scalar types
> -
>
> Key: IMPALA-11643
> URL: https://issues.apache.org/jira/browse/IMPALA-11643
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: codegen
>
> Currently ColumnType::ToIR() is only implemented for scalar types. It should 
> be extended to support all types.






[jira] [Commented] (IMPALA-11643) Implement ColumnType::ToIR() for non-scalar types

2022-10-07 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614118#comment-17614118
 ] 

Daniel Becker commented on IMPALA-11643:


[~csringhofer] what is your opinion on this?




[jira] [Assigned] (IMPALA-11640) Build fails on Ubuntu 18/20 when using shared libraries

2022-10-10 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11640:
--

Assignee: Daniel Becker

> Build fails on Ubuntu 18/20 when using shared libraries
> ---
>
> Key: IMPALA-11640
> URL: https://issues.apache.org/jira/browse/IMPALA-11640
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.2.0
>Reporter: Joe McDonnell
>Assignee: Daniel Becker
>Priority: Major
>
> When building on Ubuntu 18 or Ubuntu 20 with shared libraries (-so), the 
> build fails because the unifiedbetests binary fails to run as part of the 
> validate-unified-backend-test-filters.py invocation:
> {noformat}
> 16:39:22 F1005 23:39:22.237543 88570 unwind_safeness.cc:76] Check failed: 
> !error failed to find symbol dlopen: 
> /home/ubuntu/Impala/be/build/release/kudu_util/libkudu_util.so: undefined 
> symbol: dlopen
> 16:39:22 FAILED: Unified backend test executable returned an error when trying
> 16:39:22 to list tests.
> 16:39:22 Command: /home/ubuntu/Impala/bin/run-binary.sh 
> /home/ubuntu/Impala/be/build/release//service/unifiedbetests 
> --gtest_list_tests
> 16:39:22 Return Code: -6
> 16:39:22 stdout:
> 16:39:22 
> 16:39:22 stderr:
> 16:39:22 None{noformat}
> When building locally, other binaries also fail to execute with the same 
> message.
> One theory is that the code in unwind_safeness.cc has never worked, but 
> Ubuntu 16.04 is affected by a glibc bug that prevents the error from being 
> set ([https://sourceware.org/bugzilla/show_bug.cgi?id=19509]). Newer Ubuntu 
> releases report the error correctly, which leads to the failure.
> One option is to change unwind_safeness.cc to tolerate missing 
> dlopen/dlclose. Impala doesn't ship using shared libraries. If the 
> unwind_safeness.cc variables that contain dlopen/dlclose are actually used 
> after a failure to resolve dlopen/dlclose, then it would result in a SIGSEGV 
> and it would be very obvious.
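
The "tolerate missing dlopen/dlclose" option above could look roughly like the sketch below. It is not the code in unwind_safeness.cc: ResolvedSymbol, Lookup and ResolveOrTolerate are hypothetical names, and the lookup is passed in as a callback so the sketch does not assume how the real code calls dlsym().

```cpp
#include <cassert>
#include <functional>
#include <string>

// Outcome of resolving one symbol; addr stays null when resolution failed.
struct ResolvedSymbol {
  void* addr;
  std::string error;  // empty on success
};

// Hypothetical lookup callback: returns the symbol address, or null plus an
// error message. Real code would wrap its dlsym()/dlerror() calls here.
using Lookup = std::function<void*(const char* name, std::string* error)>;

// Records a failed resolution instead of aborting, so a binary that never
// calls the symbol can still start.
ResolvedSymbol ResolveOrTolerate(const Lookup& lookup, const char* name) {
  assert(name != nullptr);
  ResolvedSymbol result{nullptr, ""};
  result.addr = lookup(name, &result.error);
  if (!result.error.empty()) result.addr = nullptr;  // never keep a bad address
  return result;
}
```

A caller would check addr before using it. Calling through a null pointer would still crash visibly, which matches the SIGSEGV behaviour the description expects.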






[jira] [Work started] (IMPALA-11640) Build fails on Ubuntu 18/20 when using shared libraries

2022-10-10 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11640 started by Daniel Becker.
--



[jira] [Commented] (IMPALA-11643) Implement ColumnType::ToIR() for non-scalar types

2022-10-10 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615075#comment-17615075
 ] 

Daniel Becker commented on IMPALA-11643:


https://gerrit.cloudera.org/#/c/19101/




[jira] [Commented] (IMPALA-11640) Build fails on Ubuntu 18/20 when using shared libraries

2022-10-10 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615080#comment-17615080
 ] 

Daniel Becker commented on IMPALA-11640:


[~joemcdonnell] I see you've already uploaded a workaround at 
[https://gerrit.cloudera.org/#/c/19104/]

I abandoned my similar gerrit patch.




[jira] [Assigned] (IMPALA-11640) Build fails on Ubuntu 18/20 when using shared libraries

2022-10-10 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11640:
--

Assignee: (was: Daniel Becker)




[jira] [Commented] (IMPALA-11641) When building with shared libraries, Boost should use shared libraries

2022-10-10 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615103#comment-17615103
 ] 

Daniel Becker commented on IMPALA-11641:


[https://gerrit.cloudera.org/#/c/19104/] handles this as well, right?

> When building with shared libraries, Boost should use shared libraries
> --
>
> Key: IMPALA-11641
> URL: https://issues.apache.org/jira/browse/IMPALA-11641
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.2.0
> Reporter: Joe McDonnell
> Priority: Major
>
> When building with shared libraries, Boost libraries are still statically 
> linked. 
> {noformat}
> $ ./buildall.sh -so -skiptests -cmake_only
> ...
> -- Boost libraries: 
> /opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_thread.a-lpthread/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_regex.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_filesystem.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_system.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_date_time.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_random.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_locale.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_chrono.a/opt/Impala-Toolchain/toolchain-packages-gcc10.4.0/boost-1.74.0-p1/lib/libboost_atomic.a
> ...{noformat}
> Binaries fail to start up because they cannot find Boost symbols. The 
> linkage is governed by this logic in CMake:
> {noformat}
> set(Boost_USE_STATIC_LIBS NOT ${BUILD_SHARED_LIBS}){noformat}
> That doesn't seem to work. This should be changed to something like this:
> {noformat}
> if(BUILD_SHARED_LIBS)
>   set(Boost_USE_STATIC_LIBS OFF)
> else()
>   set(Boost_USE_STATIC_LIBS ON)
> endif(){noformat}






[jira] [Created] (IMPALA-11645) Remove PrintThriftEnum functions in debug-utils.cc

2022-10-10 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11645:
--

Summary: Remove PrintThriftEnum functions in debug-utils.cc
Key: IMPALA-11645
URL: https://issues.apache.org/jira/browse/IMPALA-11645
Project: IMPALA
Issue Type: Improvement
Reporter: Daniel Becker
Assignee: Daniel Becker


Before IMPALA-5690 we implemented operator<< for Thrift enums in Impala code. 
These functions printed the names of the enums.

Then we upgraded to Thrift 0.9.3. That release included THRIFT-2067, which 
implemented operator<< for Thrift enums but printed the numeric values of the 
enums instead of their names. To preserve the old behaviour in Impala, we 
renamed our own implementations of operator<< to PrintThriftEnum, a function 
that we defined for each Thrift enum we used and which returned a string with 
the name, not the number, of the enum value.

After upgrading Thrift to a version that included THRIFT-3921 (any version 
starting from 0.11.0), these PrintThriftEnum functions are no longer necessary 
as the operator<< provided by Thrift now prints the names of enums, which is 
the behaviour we want.
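
As a rough illustration of the behaviour described above, here is a minimal sketch of a name-printing operator<< for an enum. TExampleState, ExampleStateNames and ToString are hypothetical stand-ins written by hand for this example; they are not Thrift-generated code or Impala's actual helpers.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical enum standing in for a Thrift-generated one.
enum class TExampleState { CREATED = 0, RUNNING = 1, FINISHED = 2 };

// Value-to-name table; assumed here and written by hand for the sketch.
const std::map<int, std::string>& ExampleStateNames() {
  static const std::map<int, std::string> names = {
      {0, "CREATED"}, {1, "RUNNING"}, {2, "FINISHED"}};
  return names;
}

// Name-printing operator<<; unknown values fall back to the number.
std::ostream& operator<<(std::ostream& out, TExampleState value) {
  const int raw = static_cast<int>(value);
  const auto it = ExampleStateNames().find(raw);
  if (it == ExampleStateNames().end()) return out << raw;
  return out << it->second;
}

// What a PrintThriftEnum-style helper reduces to once operator<< prints names.
std::string ToString(TExampleState value) {
  std::ostringstream stream;
  stream << value;
  return stream.str();
}
```

Once the stream operator itself prints names, a per-enum printing function adds nothing beyond streaming the value into a string.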






[jira] [Work started] (IMPALA-11645) Remove PrintThriftEnum functions in debug-utils.cc

2022-10-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11645 started by Daniel Becker.
--
> Remove PrintThriftEnum functions in debug-utils.cc
> --
>
> Key: IMPALA-11645
> URL: https://issues.apache.org/jira/browse/IMPALA-11645
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Assigned] (IMPALA-11462) shiftleft problem

2022-10-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11462:
--

Assignee: Daniel Becker

> shiftleft problem
> -
>
> Key: IMPALA-11462
> URL: https://issues.apache.org/jira/browse/IMPALA-11462
> Project: IMPALA
> Issue Type: Bug
> Components: Clients
> Affects Versions: Impala 3.4.1
> Reporter: jack sun
> Assignee: Daniel Becker
> Priority: Minor
> Attachments: screenshot-1.png
>
>
> If the second parameter of the 'shiftleft' function is changed to a dynamic 
> (non-literal) value, the first parameter is treated as a tinyint.
>  !screenshot-1.png! 






[jira] [Commented] (IMPALA-11462) shiftleft problem

2022-10-11 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615736#comment-17615736
 ] 

Daniel Becker commented on IMPALA-11462:


Steps to reproduce:

With a literal second argument, the query works as expected:
{code:sql}
select shiftleft(cast(1 as bigint), 7);
+---------------------------------+
| shiftleft(cast(1 as bigint), 7) |
+---------------------------------+
| 128                             |
+---------------------------------+{code}
With a non-literal second argument, overflow occurs:
{code:sql}
select shiftleft(cast(1 as bigint), z) c from (select 7 z ) x;
+------+
| c    |
+------+
| -128 |
+------+{code}
However, if we disable expression rewriting, it works with a non-literal second 
argument, too:
{code:sql}
set ENABLE_EXPR_REWRITES=0;
select shiftleft(cast(1 as bigint), z) c from (select 7 z ) x;
+-----+
| c   |
+-----+
| 128 |
+-----+{code}
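
The -128 result is what a shift by 7 produces when the value is held in 8 bits rather than 64. The sketch below only demonstrates that arithmetic; ShiftLeftAs is a hypothetical helper and is not how Impala implements shiftleft.

```cpp
#include <cassert>
#include <cstdint>

// Shifts in 64 bits, then narrows to T, mimicking a shift whose result is
// stored at the width of T. The helper itself is an assumption of this sketch.
template <typename T>
T ShiftLeftAs(int64_t value, int shift) {
  assert(shift >= 0 && shift < 64);
  // Shift as unsigned to avoid signed-overflow undefined behaviour.
  const uint64_t shifted = static_cast<uint64_t>(value) << shift;
  // Narrowing keeps the low bits (two's complement; guaranteed since C++20).
  return static_cast<T>(shifted);
}
```

With these definitions, ShiftLeftAs<int8_t>(1, 7) wraps to the smallest 8-bit value, while ShiftLeftAs<int64_t>(1, 7) gives the expected 128.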




[jira] [Commented] (IMPALA-11462) shiftleft problem

2022-10-11 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615743#comment-17615743
 ] 

Daniel Becker commented on IMPALA-11462:


If both operands are literals, expression rewrite will execute the shift 
operation and replace the expression with literal 128.

If the second operand is not a literal, the expression cannot be evaluated by 
the rewriter. On the other hand, the cast to bigint will be folded into the 
literal during expression rewrite. This modifies the expression, so re-analysis 
is needed. Re-analysis resets the literal expression (a NumericLiteral), which 
loses its type and becomes TINYINT again.

The problem is that when FoldConstantsRule folds the cast into the literal, it 
doesn't set its explicit type, and only its implicit type becomes BIGINT (for 
the various types of NumericLiterals, see 
[https://github.com/apache/impala/blob/11157a87016fc2408a5ae649aed7cdfb8a0e5d3b/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java#L41]).

In the case where expression rewrite is disabled, no re-analysis takes place so 
the literal and its type are not reset.

The solution is to set the explicit type of NumericLiteral when folding the 
cast into it.
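
The mechanism can be modelled with a toy literal that has a current type and an "explicit type" flag. This is a simplified sketch in C++, not the Java code in NumericLiteral or FoldConstantsRule; Literal, FoldCast and Reanalyze are hypothetical names, and only two types are modelled.

```cpp
#include <cassert>

enum class Type { TINYINT, BIGINT };

// Toy literal; the fields are hypothetical, not NumericLiteral's.
struct Literal {
  long long value;
  Type type;               // type currently in effect
  bool has_explicit_type;  // true if the type must survive a reset
};

// Smallest modelled type that fits the value.
Type NaturalType(long long value) {
  return (value >= -128 && value <= 127) ? Type::TINYINT : Type::BIGINT;
}

Literal MakeLiteral(long long value) {
  return Literal{value, NaturalType(value), false};
}

// Folds CAST(lit AS target) into the literal. With set_explicit == false only
// the implicit type changes, which is the buggy behaviour described above.
Literal FoldCast(Literal lit, Type target, bool set_explicit) {
  lit.type = target;
  if (set_explicit) lit.has_explicit_type = true;
  return lit;
}

// Re-analysis resets any literal that has no explicit type.
void Reanalyze(Literal* lit) {
  if (!lit->has_explicit_type) lit->type = NaturalType(lit->value);
}
```

In this model a literal 1 folded to BIGINT without the flag falls back to TINYINT after Reanalyze, while one folded with the flag keeps BIGINT.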




[jira] [Work started] (IMPALA-11462) shiftleft problem

2022-10-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11462 started by Daniel Becker.
--



[jira] [Commented] (IMPALA-11462) shiftleft problem

2022-10-11 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17615863#comment-17615863
 ] 

Daniel Becker commented on IMPALA-11462:


https://gerrit.cloudera.org/#/c/19124/




[jira] [Assigned] (IMPALA-11581) ALTER TABLE RENAME TO doesn't update transient_lastDdlTime

2022-10-12 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11581:
--

Assignee: Daniel Becker

> ALTER TABLE RENAME TO doesn't update transient_lastDdlTime
> --
>
> Key: IMPALA-11581
> URL: https://issues.apache.org/jira/browse/IMPALA-11581
> Project: IMPALA
> Issue Type: Bug
> Reporter: Zoltán Borók-Nagy
> Assignee: Daniel Becker
> Priority: Major
> Labels: ramp-up
>
> ALTER TABLE RENAME TO doesn't update transient_lastDdlTime.
> The following statements behave differently when executed via Hive or Impala:
> {noformat}
> CREATE TABLE rename_from (i int);
> ALTER TABLE rename_from RENAME TO rename_to;
> {noformat}
> During ALTER TABLE ... RENAME TO ... Hive updates transient_lastDdlTime while 
> Impala leaves it unchanged.
> Impala should follow Hive's behavior.
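
A minimal sketch of the requested behaviour, using a toy in-memory table rather than Impala's catalog or the Hive Metastore API. Table and RenameTable are hypothetical, and storing the timestamp as epoch seconds in a decimal string is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy table; the name and parameter map are hypothetical stand-ins.
struct Table {
  std::string name;
  std::map<std::string, std::string> params;
};

// Renames the table and refreshes transient_lastDdlTime in the same step.
// The caller supplies the current time so the function stays deterministic.
void RenameTable(Table* table, const std::string& new_name,
                 long long now_epoch_seconds) {
  table->name = new_name;
  table->params["transient_lastDdlTime"] = std::to_string(now_epoch_seconds);
}
```

The point of the sketch is only that the rename and the timestamp update happen together, so a rename can never leave the old value behind.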






[jira] [Resolved] (IMPALA-10356) Analyzed query in explain plan is not quite right for insert with values clause

2022-10-12 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-10356.

Resolution: Fixed

> Analyzed query in explain plan is not quite right for insert with values 
> clause
> ---
>
> Key: IMPALA-10356
> URL: https://issues.apache.org/jira/browse/IMPALA-10356
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.0.0
> Reporter: Tim Armstrong
> Assignee: Daniel Becker
> Priority: Major
> Labels: newbie, ramp-up
>
> In impala-shell:
> {noformat}
> create table double_tbl (d double) stored as textfile;
> set explain_level=2;
> explain insert into double_tbl values (-0.43149576573887316);
> {noformat}
> {noformat}
> +----------------------------------------------------------------------------------+
> | Explain String                                                                   |
> +----------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=0B Threads=1                           |
> | Per-Host Resource Estimates: Memory=10MB                                         |
> | Codegen disabled by planner                                                      |
> | Analyzed query: SELECT CAST(-0.43149576573887316 AS DECIMAL(17,17)) UNION SELECT |
> | CAST(-0.43149576573887316 AS DECIMAL(17,17))                                     |
> |                                                                                  |
> | F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                            |
> | |  Per-Host Resources: mem-estimate=8B mem-reservation=0B thread-reservation=1   |
> | WRITE TO HDFS [default.double_tbl, OVERWRITE=false]                              |
> | |  partitions=1                                                                  |
> | |  output exprs: CAST(-0.43149576573887316 AS DOUBLE)                            |
> | |  mem-estimate=8B mem-reservation=0B thread-reservation=0                       |
> | |                                                                                |
> | 00:UNION                                                                         |
> |    constant-operands=1                                                           |
> |    mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
> |    tuple-ids=0 row-size=8B cardinality=1                                         |
> |    in pipelines:                                                                 |
> +----------------------------------------------------------------------------------+
> {noformat}
> The analyzed query does not make sense. We should investigate and fix it.






[jira] [Commented] (IMPALA-11581) ALTER TABLE RENAME TO doesn't update transient_lastDdlTime

2022-10-13 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17617088#comment-17617088
 ] 

Daniel Becker commented on IMPALA-11581:


https://gerrit.cloudera.org/#/c/19137/




[jira] [Resolved] (IMPALA-11623) Put *-ir.cc files into their own libraries to avoid extra recompilation

2022-10-18 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11623.

Resolution: Implemented

> Put *-ir.cc files into their own libraries to avoid extra recompilation
> ---
>
> Key: IMPALA-11623
> URL: https://issues.apache.org/jira/browse/IMPALA-11623
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 4.2.0
> Reporter: Joe McDonnell
> Assignee: Daniel Becker
> Priority: Major
>
> It is desirable to be able to iterate quickly by running "make -j impalad" 
> while modifying a file. Currently, modifying most files incurs a rebuild of 
> the LLVM IR, which is a slow serial step. For example:
>  
> {noformat}
> $ touch be/src/runtime/coordinator.cc
> $ make -j impalad
> ...
> [ 98%] Generating ../../../llvm-ir/impala.bc
> [ 98%] Generating ../../../llvm-ir/impala-legacy-avx.bc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir.cc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir-legacy-avx.cc
> ...{noformat}
> This can add several seconds to an incremental build. This step happens for 
> files that do not actually impact the LLVM IR, so there are ways to avoid 
> this.
> The LLVM IR is rebuilt because it depends on Exec, Exprs, Runtime, Udf, 
> Util, and other libraries:
>  
> {noformat}
> add_custom_command(
>   OUTPUT ${IR_OUTPUT_FILE}
>   COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} 
> ${PLATFORM_SPECIFIC_FLAGS}
>           ${CLANG_INCLUDE_FLAGS} ${IR_INPUT_FILES} -o ${IR_TMP_OUTPUT_FILE}
>   COMMAND ${LLVM_OPT_EXECUTABLE} ${LLVM_OPT_IR_FLAGS} < ${IR_TMP_OUTPUT_FILE} 
> > ${IR_OUTPUT_FILE}
>   COMMAND rm ${IR_TMP_OUTPUT_FILE}
>   DEPENDS Exec ExecAvro ExecKudu Exprs Runtime Udf Util ${IR_INPUT_FILES}
> ){noformat}
> From a correctness perspective, the LLVM IR only cares about things that 
> impact the content of the *-ir.cc files, because impala-ir.cc includes every 
> *-ir.cc file. That list of libraries is a superset of what is needed.
> If the *-ir.cc files were split off into their own libraries (e.g. ExecIr 
> rather than Exec), then this target would depend only on ExecIr rather than 
> the larger Exec. This would reduce the number of files whose modification 
> causes the LLVM IR to be rebuilt, which should shorten an incremental "make 
> -j impalad" for quite a few C++ files.
>  
>  






[jira] [Resolved] (IMPALA-11581) ALTER TABLE RENAME TO doesn't update transient_lastDdlTime

2022-10-19 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11581.

Resolution: Fixed




[jira] [Resolved] (IMPALA-11645) Remove PrintThriftEnum functions in debug-utils.cc

2022-10-20 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11645.

Resolution: Implemented




[jira] [Commented] (IMPALA-11645) Remove PrintThriftEnum functions in debug-utils.cc

2022-10-20 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621055#comment-17621055
 ] 

Daniel Becker commented on IMPALA-11645:


https://gerrit.cloudera.org/#/c/19118/

> Remove PrintThriftEnum functions in debug-utils.cc
> --
>
> Key: IMPALA-11645
> URL: https://issues.apache.org/jira/browse/IMPALA-11645
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> Before IMPALA-5690 we implemented operator<< for Thrift enums in Impala code. 
> These functions printed the names of the enums.
> Then we upgraded to Thrift 0.9.3. That release included THRIFT-2067, which 
> implemented operator<< for Thrift enums but printed the numeric value of 
> enums instead of their names. To preserve the old behaviour in Impala, we 
> renamed our own implementations of operator<< to PrintThriftEnum, a function 
> that we defined for each Thrift enum we used and which returned a string 
> with the name, not the number, of the enum.
> After upgrading Thrift to a version that included THRIFT-3921 (any version 
> starting from 0.11.0), these PrintThriftEnum functions are no longer 
> necessary as the operator<< provided by Thrift now prints the names of enums, 
> which is the behaviour we want.






[jira] [Resolved] (IMPALA-11462) shiftleft problem

2022-10-24 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11462.

Resolution: Fixed

> shiftleft problem
> -
>
> Key: IMPALA-11462
> URL: https://issues.apache.org/jira/browse/IMPALA-11462
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.4.1
>Reporter: jack sun
>Assignee: Daniel Becker
>Priority: Minor
> Attachments: screenshot-1.png
>
>
> If the second parameter of the function 'shiftleft' is a dynamic value, the 
> first parameter is converted to TINYINT.
>  !screenshot-1.png! 






[jira] [Created] (IMPALA-11685) Slot memory sharing between struct and field not working if the field is also a struct

2022-10-25 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11685:
--

 Summary: Slot memory sharing between struct and field not working 
if the field is also a struct
 Key: IMPALA-11685
 URL: https://issues.apache.org/jira/browse/IMPALA-11685
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Daniel Becker
Assignee: Daniel Becker


Since IMPALA-10838, if a struct and one of its fields are both present in the 
select list, no extra slot is generated in the row for the struct field; 
instead, the memory of the struct is reused, i.e. the row size is the same as 
when only the struct is queried. This works when the struct field is a 
primitive type:
{code:java}
explain select id, outer_struct from 
functional_orc_def.complextypes_nested_structs;
row-size=64B{code}
{code:java}
explain select id, outer_struct, outer_struct.str from 
functional_orc_def.complextypes_nested_structs;
row-size=64B{code}
However, it does not if the child is itself a struct:
{code:java}
explain select id, outer_struct, outer_struct.inner_struct3 from 
functional_orc_def.complextypes_nested_structs;
row-size=80B{code}
This is because struct slot descriptors are registered before others so that it 
is easier to reuse the slot memory of the struct fields, but struct slot 
descriptors among themselves are sorted in the wrong order (see 
[https://github.com/apache/impala/blob/c12ac6c27b2df1eae693b44c157d65499f491d21/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L340]).
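
Why registration order matters can be illustrated with a small model. This is a
sketch in Python, not the frontend's Java code: the function row_size, the
tuple representation of paths and the byte sizes are all assumptions, with the
sizes chosen only so that the totals match the row sizes reported above. A slot
reuses memory only if an enclosing struct has already been registered.

```python
def row_size(select_paths, sizes, order_key):
    """Register slots in the order given by order_key and sum their memory.

    A slot whose enclosing struct is already registered reuses that
    struct's memory and therefore adds no bytes to the row.
    """
    registered = []
    total = 0
    for path in sorted(select_paths, key=order_key):
        shares = any(
            len(parent) < len(path) and path[:len(parent)] == parent
            for parent in registered
        )
        if not shares:
            total += sizes[path]
        registered.append(path)
    return total


# Illustrative sizes only; not Impala's real tuple layout.
SIZES = {
    ("id",): 8,
    ("outer_struct",): 56,
    ("outer_struct", "inner_struct3"): 16,
}
PATHS = list(SIZES)

# Enclosing structs first: the nested struct finds its parent and shares.
parents_first = row_size(PATHS, SIZES, order_key=len)
# Nested structs first: the parent is not registered yet, so nothing is shared.
children_first = row_size(PATHS, SIZES, order_key=lambda p: -len(p))
print(parents_first, children_first)
```

In this model, sorting enclosing structs before their nested structs yields the
smaller row, while the opposite order allocates the nested struct separately.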






[jira] [Work started] (IMPALA-11685) Slot memory sharing between struct and field not working if the field is also a struct

2022-10-25 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11685 started by Daniel Becker.
--
> Slot memory sharing between struct and field not working if the field is also 
> a struct
> --
>
> Key: IMPALA-11685
> URL: https://issues.apache.org/jira/browse/IMPALA-11685
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> Since IMPALA-10838, if a struct and one of its fields are both present in the 
> select list, no extra slot is generated in the row for the struct field; 
> instead, the memory of the struct is reused, i.e. the row size is the same as 
> when only the struct is queried. This works when the struct field is a 
> primitive type:
> {code:java}
> explain select id, outer_struct from 
> functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select id, outer_struct, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> However, it does not if the child is itself a struct:
> {code:java}
> explain select id, outer_struct, outer_struct.inner_struct3 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> This is because struct slot descriptors are registered before others so that 
> it is easier to reuse the slot memory of the struct fields, but struct slot 
> descriptors among themselves are sorted in the wrong order (see 
> [https://github.com/apache/impala/blob/c12ac6c27b2df1eae693b44c157d65499f491d21/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L340]).






[jira] [Created] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-10-27 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11687:
--

 Summary: Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex 
types fails
 Key: IMPALA-11687
 URL: https://issues.apache.org/jira/browse/IMPALA-11687
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Daniel Becker
Assignee: Daniel Becker


If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
expressions and explicitly given complex columns fail:
{code:java}
select outer_struct, * from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=1 
sid=1{code}
{code:java}
select *, outer_struct.str from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: null{code}
Having two stars in a table with complex columns also fails.
{code:java}
select *, * from functional_orc_def.complextypes_nested_structs;{code}






[jira] [Updated] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-10-27 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11687:
---
Description: 
If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
expressions and explicitly given complex columns fail:
{code:java}
select outer_struct, * from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=1 
sid=1{code}
{code:java}
select *, outer_struct.str from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: null{code}
Having two stars in a table with complex columns also fails.
{code:java}
select *, * from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=6 
sid=13{code}

  was:
If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
expressions and explicitly given complex columns fail:
{code:java}
select outer_struct, * from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=1 
sid=1{code}
{code:java}
select *, outer_struct.str from functional_orc_def.complextypes_nested_structs;
ERROR: IllegalStateException: null{code}
Having two stars in a table with complex columns also fails.
{code:java}
select *, * from functional_orc_def.complextypes_nested_structs;{code}


> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Resolved] (IMPALA-11685) Slot memory sharing between struct and field not working if the field is also a struct

2022-10-27 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11685.

Resolution: Fixed

> Slot memory sharing between struct and field not working if the field is also 
> a struct
> --
>
> Key: IMPALA-11685
> URL: https://issues.apache.org/jira/browse/IMPALA-11685
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> Since IMPALA-10838, if a struct and one of its fields are both present in the 
> select list, no extra slot is generated in the row for the struct field; 
> instead, the memory of the struct is reused, i.e. the row size is the same as 
> when only the struct is queried. This works when the struct field is a 
> primitive type:
> {code:java}
> explain select id, outer_struct from 
> functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select id, outer_struct, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> However, it does not if the child is itself a struct:
> {code:java}
> explain select id, outer_struct, outer_struct.inner_struct3 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> This is because struct slot descriptors are registered before others so that 
> it is easier to reuse the slot memory of the struct fields, but struct slot 
> descriptors among themselves are sorted in the wrong order (see 
> [https://github.com/apache/impala/blob/c12ac6c27b2df1eae693b44c157d65499f491d21/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L340]).






[jira] [Created] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-10-28 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11692:
--

 Summary: Struct slot memory sharing involving select * not working 
properly 
 Key: IMPALA-11692
 URL: https://issues.apache.org/jira/browse/IMPALA-11692
 Project: IMPALA
  Issue Type: Bug
Reporter: Daniel Becker
Assignee: Daniel Becker


With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
expansion and members of the structs are also given explicitly, slot memory 
sharing does not work in some cases:
{code:java}
explain select * from functional_orc_def.complextypes_nested_structs;
row-size=64B{code}
{code:java}
explain select *, outer_struct.inner_struct1 from 
functional_orc_def.complextypes_nested_structs;
row-size=80B{code}
The row size should be the same in both cases as outer_struct.inner_struct1 is 
part of outer_struct which is included in the star.
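
The expected behaviour can be modelled as follows. This is a Python sketch
under stated assumptions, not Impala's implementation: the helpers expand_star
and shared_row_size, the column list of the table and the byte sizes are all
hypothetical and serve only to show why both queries should report the same
row size.

```python
def expand_star(select_list, table_columns):
    """Replace each '*' with the table's top-level columns, as path tuples."""
    paths = []
    for item in select_list:
        if item == "*":
            paths.extend((column,) for column in table_columns)
        else:
            paths.append(tuple(item.split(".")))
    return paths


def shared_row_size(paths, sizes):
    """Sum slot sizes, skipping any path that lies inside a selected struct."""
    selected = set(paths)
    total = 0
    for path in selected:
        inside_selected_struct = any(
            path[:n] in selected for n in range(1, len(path))
        )
        if not inside_selected_struct:
            total += sizes[path]
    return total


# Assumed table layout and illustrative sizes only.
COLUMNS = ["id", "outer_struct"]
SIZES = {
    ("id",): 8,
    ("outer_struct",): 56,
    ("outer_struct", "inner_struct1"): 16,
}

star_only = shared_row_size(expand_star(["*"], COLUMNS), SIZES)
star_plus_member = shared_row_size(
    expand_star(["*", "outer_struct.inner_struct1"], COLUMNS), SIZES
)
print(star_only, star_plus_member)
```

Because outer_struct.inner_struct1 lies inside outer_struct, which the star
already selects, the model adds no bytes for it.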






[jira] [Updated] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11687:
---
Fix Version/s: Impala 4.2.0

> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Updated] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11692:
---
Fix Version/s: Impala 4.2.0

> Struct slot memory sharing involving select * not working properly 
> ---
>
> Key: IMPALA-11692
> URL: https://issues.apache.org/jira/browse/IMPALA-11692
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
> expansion and members of the structs are also given explicitly, slot memory 
> sharing does not work in some cases:
> {code:java}
> explain select * from functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select *, outer_struct.inner_struct1 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> The row size should be the same in both cases as outer_struct.inner_struct1 
> is part of outer_struct which is included in the star.






[jira] [Work started] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11692 started by Daniel Becker.
--
> Struct slot memory sharing involving select * not working properly 
> ---
>
> Key: IMPALA-11692
> URL: https://issues.apache.org/jira/browse/IMPALA-11692
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
> Fix For: Impala 4.2.0
>
>
> With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
> expansion and members of the structs are also given explicitly, slot memory 
> sharing does not work in some cases:
> {code:java}
> explain select * from functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select *, outer_struct.inner_struct1 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> The row size should be the same in both cases as outer_struct.inner_struct1 
> is part of outer_struct which is included in the star.






[jira] [Updated] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11692:
---
 Fix Version/s: (was: Impala 4.2.0)
Target Version: Impala 4.2.0

> Struct slot memory sharing involving select * not working properly 
> ---
>
> Key: IMPALA-11692
> URL: https://issues.apache.org/jira/browse/IMPALA-11692
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
> expansion and members of the structs are also given explicitly, slot memory 
> sharing does not work in some cases:
> {code:java}
> explain select * from functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select *, outer_struct.inner_struct1 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> The row size should be the same in both cases as outer_struct.inner_struct1 
> is part of outer_struct which is included in the star.






[jira] [Updated] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11687:
---
 Fix Version/s: (was: Impala 4.2.0)
Target Version: Impala 4.2.0

> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Assigned] (IMPALA-9551) Allow embedding complex types into other complex types

2022-11-02 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-9551:
-

Assignee: Daniel Becker

> Allow embedding complex types into other complex types
> --
>
> Key: IMPALA-9551
> URL: https://issues.apache.org/jira/browse/IMPALA-9551
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Gabor Kaszab
>Assignee: Daniel Becker
>Priority: Major
>  Labels: complextype
>
> For some examples please check functional.complextypestbl. Any of the columns 
> in that table should be given in the select list.






[jira] [Work started] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-11-03 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11687 started by Daniel Becker.
--
> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Updated] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-11-03 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11687:
---
Epic Link: IMPALA-9494

> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Updated] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-11-03 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11692:
---
Epic Link: IMPALA-9494

> Struct slot memory sharing involving select * not working properly 
> ---
>
> Key: IMPALA-11692
> URL: https://issues.apache.org/jira/browse/IMPALA-11692
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
> expansion and members of the structs are also given explicitly, slot memory 
> sharing does not work in some cases:
> {code:java}
> explain select * from functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select *, outer_struct.inner_struct1 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> The row size should be the same in both cases as outer_struct.inner_struct1 
> is part of outer_struct which is included in the star.






[jira] [Updated] (IMPALA-11643) Implement ColumnType::ToIR() for non-scalar types

2022-11-03 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11643:
---
Epic Link: IMPALA-9494

> Implement ColumnType::ToIR() for non-scalar types
> -
>
> Key: IMPALA-11643
> URL: https://issues.apache.org/jira/browse/IMPALA-11643
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: codegen
>
> Currently ColumnType::ToIR() is only implemented for scalar types. It should 
> be extended to support all types.






[jira] [Resolved] (IMPALA-11687) Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails

2022-11-04 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11687.

Resolution: Fixed

> Select * with EXPAND_COMPLEX_TYPES=1 and explicit complex types fails
> -
>
> Key: IMPALA-11687
> URL: https://issues.apache.org/jira/browse/IMPALA-11687
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If EXPAND_COMPLEX_TYPES is set to true, some queries that combine star 
> expressions and explicitly given complex columns fail:
> {code:java}
> select outer_struct, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=1 sid=1{code}
> {code:java}
> select *, outer_struct.str from 
> functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: null{code}
> Having two stars in a table with complex columns also fails.
> {code:java}
> select *, * from functional_orc_def.complextypes_nested_structs;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: 
> tid=6 sid=13{code}






[jira] [Created] (IMPALA-11712) Sort out column masking with complex types

2022-11-08 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11712:
--

 Summary: Sort out column masking with complex types
 Key: IMPALA-11712
 URL: https://issues.apache.org/jira/browse/IMPALA-11712
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Reporter: Daniel Becker


We determine whether a SlotDescriptor created from a star expanded path should 
be registered for column masking based on the path of the star item:

??Empty matched types means this is expanded from star of a catalog table.??
??For star of complex types, e.g. my_struct.*, my_array.*, my_map.*, the 
matched??
??types will have the complex type so it's not empty.??

[https://github.com/apache/impala/blob/b28da054f3595bb92873433211438306fc22fbc7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L659]

However, this comment may be wrong because in the query
{code:java}
select a.* from mix_struct_array t, t.struct_in_arr a;{code}
{{getMatchedTypes()}} returns an empty list for the star path even though it is 
not from a catalog table.

We should also find out whether we can determine from the expanded path alone 
(and not the path of the star item) whether we need to register it for column 
masking, for example by checking if it is within a complex type.                
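
The alternative suggested in the last paragraph could look roughly like the
sketch below. It is written in Python rather than the frontend's Java, and the
representation of a path as a list of type kinds as well as the function name
is_within_complex_type are assumptions made for illustration; they do not
correspond to Impala's Path API.

```python
COMPLEX_KINDS = {"struct", "array", "map"}


def is_within_complex_type(path_type_kinds):
    """Return True if the last path element is nested inside a complex type.

    path_type_kinds lists the type kind of every element of the resolved
    path, starting with the top-level column and ending with the element
    the expanded slot refers to.
    """
    enclosing = path_type_kinds[:-1]
    return any(kind in COMPLEX_KINDS for kind in enclosing)


# A top-level scalar or struct column is not nested in anything.
print(is_within_complex_type(["int"]))
print(is_within_complex_type(["struct"]))
# A field of a struct that is itself an array item is nested.
print(is_within_complex_type(["array", "struct", "int"]))
```

With such a check the decision would depend only on the expanded path itself,
not on whether the star item's matched types happen to be empty.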
                                                 

 






[jira] [Resolved] (IMPALA-11692) Struct slot memory sharing involving select * not working properly

2022-11-09 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11692.

Resolution: Fixed

> Struct slot memory sharing involving select * not working properly 
> ---
>
> Key: IMPALA-11692
> URL: https://issues.apache.org/jira/browse/IMPALA-11692
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> With EXPAND_COMPLEX_TYPES=1, if there are structs coming from the star 
> expansion and members of the structs are also given explicitly, slot memory 
> sharing does not work in some cases:
> {code:java}
> explain select * from functional_orc_def.complextypes_nested_structs;
> row-size=64B{code}
> {code:java}
> explain select *, outer_struct.inner_struct1 from 
> functional_orc_def.complextypes_nested_structs;
> row-size=80B{code}
> The row size should be the same in both cases as outer_struct.inner_struct1 
> is part of outer_struct which is included in the star.






[jira] [Created] (IMPALA-11717) Use rapidjson for printing collections

2022-11-10 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11717:
--

 Summary: Use rapidjson for printing collections
 Key: IMPALA-11717
 URL: https://issues.apache.org/jira/browse/IMPALA-11717
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Daniel Becker
Assignee: Daniel Becker


We use rapidjson to print structs but don't use it to print collections (arrays 
and maps). We should switch to rapidjson also for collections to have a uniform 
approach.

This is also needed if we want to support embedding structs and collections in 
each other, see [IMPALA-9551|https://issues.apache.org/jira/browse/IMPALA-9551].
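
The benefit of a single serializer for every nesting combination can be
sketched with Python's json module standing in for rapidjson. This is an
illustration only: the function print_value and the sample values are
hypothetical, and the exact output format Impala produces may differ.

```python
import json


def print_value(value):
    """Serialize arrays, maps and structs through one code path.

    Lists model arrays; dicts model both maps and structs.
    """
    return json.dumps(value, separators=(",", ":"))


int_array = [1, 2]
int_map = {"k1": 1}
# A struct containing an array of structs: the kind of nesting that
# separate struct and collection printers would have to coordinate on.
nested = {"arr": [{"f": 3}]}

for value in (int_array, int_map, nested):
    print(print_value(value))
```

Because the serializer recurses over whatever it is given, embedding structs in
collections or collections in structs needs no additional printing logic in
this model.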






[jira] [Commented] (IMPALA-11717) Use rapidjson for printing collections

2022-11-10 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631610#comment-17631610
 ] 

Daniel Becker commented on IMPALA-11717:


See 
https://github.com/apache/impala/blob/f617e3648734ffaff655382f911d256424bcda7b/be/src/service/hs2-util.cc#L441.

> Use rapidjson for printing collections
> --
>
> Key: IMPALA-11717
> URL: https://issues.apache.org/jira/browse/IMPALA-11717
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Created] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-10 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11719:
--

 Summary: Inconsistency in printing NULL values
 Key: IMPALA-11719
 URL: https://issues.apache.org/jira/browse/IMPALA-11719
 Project: IMPALA
 Issue Type: Bug
 Components: Backend
 Reporter: Daniel Becker


Null values are printed as "NULL" when they are at the top level or inside 
collections:
{code:java}
select int_array from functional_parquet.complextypestbl;
+------------------------+
| int_array              |
+------------------------+
| [-1]                   |
| [1,2,3]                |
| [NULL,1,2,NULL,3,NULL] |
| []                     |
| NULL                   |
| NULL                   |
| NULL                   |
| NULL                   |
+------------------------+{code}
When they are inside a struct, they are printed as "null":
{code:java}
select small_struct from functional_parquet.complextypes_structs;
+------------------------------------+
| small_struct                       |
+------------------------------------+
| NULL                               |
| {"i":19191,"s":"small_struct_str"} |
| {"i":98765,"s":null}               |
| {"i":null,"s":"str"}               |
| {"i":98765,"s":"abcde f"}          |
| {"i":null,"s":null}                |
+------------------------------------+{code}
In Hive the situation is a bit different: "NULL" is used only for top-level 
values, and "null" is printed in both collections and structs.
{code:java}
select int_array from functional_parquet.complextypestbl;
+-------------------------+
|        int_array        |
+-------------------------+
| [-1]                    |
| [1,2,3]                 |
| [null,1,2,null,3,null]  |
| []                      |
| NULL                    |
| NULL                    |
| NULL                    |
| NULL                    |
+-------------------------+{code}
{code:java}
select small_struct from functional_parquet.complextypes_structs;
+-------------------------------------+
|            small_struct             |
+-------------------------------------+
| NULL                                |
| {"i":19191,"s":"small_struct_str"}  |
| {"i":98765,"s":null}                |
| {"i":null,"s":"str"}                |
| {"i":98765,"s":"abcde f"}           |
| {"i":null,"s":null}                 |
+-------------------------------------+{code}
In JSON the relevant keyword is "null".

We should decide how we handle this situation.
 # Have a uniform NULL representation everywhere: top level, collections and 
structs
 ** either "NULL" or "null" everywhere
 # Have "NULL" on the top level and "null" in collections and structs, like Hive
 # Leave everything as it is now: "NULL" at the top level and in collections, 
"null" in structs.

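The three options can be compared with a small model. This is a sketch under stated assumptions, not Impala code: the `render` function and the `AS_IS`, `HIVE_LIKE` and `UNIFORM` settings are hypothetical names, Python lists stand in for arrays and dicts for structs, and maps are left out for brevity.

```python
import json


def render(value, top="NULL", in_collection="null", in_struct="null"):
    """Render a column value, choosing the NULL spelling by nesting context."""
    def inner(v, null_text):
        if v is None:
            return null_text
        if isinstance(v, dict):  # struct: children use the struct spelling
            fields = (f"{json.dumps(k)}:{inner(x, in_struct)}" for k, x in v.items())
            return "{" + ",".join(fields) + "}"
        if isinstance(v, list):  # array: children use the collection spelling
            return "[" + ",".join(inner(x, in_collection) for x in v) + "]"
        return json.dumps(v)
    return inner(value, top)


# Option 3: current behaviour.
AS_IS = dict(top="NULL", in_collection="NULL", in_struct="null")
# Option 2: like Hive.
HIVE_LIKE = dict(top="NULL", in_collection="null", in_struct="null")
# Option 1: uniform spelling (lowercase variant shown here).
UNIFORM = dict(top="null", in_collection="null", in_struct="null")

for name, opts in (("as-is", AS_IS), ("hive-like", HIVE_LIKE), ("uniform", UNIFORM)):
    print(name, render(None, **opts), render([None, 1], **opts),
          render({"i": 98765, "s": None}, **opts))
```

Only the last two settings produce nested output that a JSON parser accepts, since "NULL" is not a JSON literal.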





[jira] [Commented] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-10 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17631753#comment-17631753
 ] 

Daniel Becker commented on IMPALA-11719:


[~csringhofer], [~gaborkaszab], [~prozsa] what do you think? 

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Priority: Major



[jira] [Updated] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-10 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11719:
---
Description: 
Null values are printed as "NULL" when they are at the top level or inside 
collections:
{code:java}
select int_array from functional_parquet.complextypestbl;
+------------------------+
| int_array              |
+------------------------+
| [-1]                   |
| [1,2,3]                |
| [NULL,1,2,NULL,3,NULL] |
| []                     |
| NULL                   |
| NULL                   |
| NULL                   |
| NULL                   |
+------------------------+{code}
When they are inside a struct, they are printed as "null":
{code:java}
select small_struct from functional_parquet.complextypes_structs;
+------------------------------------+
| small_struct                       |
+------------------------------------+
| NULL                               |
| {"i":19191,"s":"small_struct_str"} |
| {"i":98765,"s":null}               |
| {"i":null,"s":"str"}               |
| {"i":98765,"s":"abcde f"}          |
| {"i":null,"s":null}                |
+------------------------------------+{code}
In Hive the situation is a bit different: "NULL" is used only for top-level 
values, and "null" is printed in both collections and structs.
{code:java}
select int_array from functional_parquet.complextypestbl;
+-------------------------+
|        int_array        |
+-------------------------+
| [-1]                    |
| [1,2,3]                 |
| [null,1,2,null,3,null]  |
| []                      |
| NULL                    |
| NULL                    |
| NULL                    |
| NULL                    |
+-------------------------+{code}
{code:java}
select small_struct from functional_parquet.complextypes_structs;
+-------------------------------------+
|            small_struct             |
+-------------------------------------+
| NULL                                |
| {"i":19191,"s":"small_struct_str"}  |
| {"i":98765,"s":null}                |
| {"i":null,"s":"str"}                |
| {"i":98765,"s":"abcde f"}           |
| {"i":null,"s":null}                 |
+-------------------------------------+{code}
Officially we print collections and structs in JSON form. In JSON the relevant 
keyword is "null".

We should decide how we handle this situation.
 # Have a uniform NULL representation everywhere: top level, collections and 
structs
 ** either "NULL" or "null" everywhere
 # Have "NULL" on the top level and "null" in collections and structs, like Hive
 # Leave everything as it is now: "NULL" at the top level and in collections, 
"null" in structs.


[jira] [Work started] (IMPALA-11717) Use rapidjson for printing collections

2022-11-10 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11717 started by Daniel Becker.
--
> Use rapidjson for printing collections
> --
>
> Key: IMPALA-11717
> URL: https://issues.apache.org/jira/browse/IMPALA-11717
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Commented] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632289#comment-17632289
 ] 

Daniel Becker commented on IMPALA-11719:


[~gaborkaszab] [~csringhofer] I agree that option 2 is a good solution.

The question is whether changing this would break any of our customers' use cases.
Should we introduce a query option?
If so, what should the default be?

I would say the default should be the new behaviour, which produces valid JSON.

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Priority: Major



[jira] [Assigned] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11719:
--

Assignee: Daniel Becker

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Commented] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17632547#comment-17632547
 ] 

Daniel Becker commented on IMPALA-11719:


https://gerrit.cloudera.org/#/c/19236/

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Work started] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-11719 started by Daniel Becker.
--
> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Updated] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11719:
---
Target Version: Impala 4.2.0

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Updated] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11719:
---
Epic Link: IMPALA-9494

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Daniel Becker
> Assignee: Daniel Becker
> Priority: Major



[jira] [Created] (IMPALA-11722) Wrong error message when unsupported complex type comes from * expression

2022-11-11 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11722:
--

 Summary: Wrong error message when unsupported complex type comes from * expression
 Key: IMPALA-11722
 URL: https://issues.apache.org/jira/browse/IMPALA-11722
 Project: IMPALA
 Issue Type: Bug
 Components: Frontend
 Reporter: Daniel Becker


The following query fails with a NullPointerException:
{code:java}
select * from functional_orc_def.complextypestbl;
ERROR: NullPointerException: null
{code}
The table contains a struct, {{nested_struct}}, which is not supported yet 
because it contains collections. If the columns are listed explicitly, the 
correct error message is returned:
{code:java}
select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
from functional_orc_def.complextypestbl;
ERROR: AnalysisException: Struct containing a collection type is not allowed in the select list.{code}
The same error message should be returned in the select * case.






[jira] [Updated] (IMPALA-11722) Wrong error message when unsupported complex type comes from * expression

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11722:
---
Description: 
The following query fails with a NullPointerException:
{code:java}
select * from functional_orc_def.complextypestbl;
ERROR: NullPointerException: null
{code}
The table contains a struct, {{nested_struct}}, which is not supported yet 
because it contains collections. If the columns are listed explicitly, the 
error message is the correct one:
{code:java}
select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
from functional_orc_def.complextypestbl;
ERROR: AnalysisException: Struct containing a collection type is not allowed in 
the select list.{code}
The same error message should be returned in the select * case.

  was:
The following query fails with a NullPointerException:

 
{code:java}
select * from functional_orc_def.complextypestbl;
ERROR: NullPointerException: null
{code}
 

The table contains a struct, {{nested_struct}}, which is not supported yet 
because it contains collections. If the columns are listed explicitly, the 
error message is the correct one:

{code:java}
select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
from functional_orc_def.complextypestbl;
ERROR: AnalysisException: Struct containing a collection type is not allowed in 
the select list.{code}
The same error message should be returned in the select * case.


> Wrong error message when unsupported complex type comes from * expression
> -
>
> Key: IMPALA-11722
> URL: https://issues.apache.org/jira/browse/IMPALA-11722
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Priority: Major
>
> The following query fails with a NullPointerException:
> {code:java}
> select * from functional_orc_def.complextypestbl;
> ERROR: NullPointerException: null
> {code}
> The table contains a struct, {{nested_struct}}, which is not supported 
> yet because it contains collections. If the columns are listed explicitly, 
> the error message is the correct one:
> {code:java}
> select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
> from functional_orc_def.complextypestbl;
> ERROR: AnalysisException: Struct containing a collection type is not allowed 
> in the select list.{code}
> The same error message should be returned in the select * case.






[jira] [Assigned] (IMPALA-11722) Wrong error message when unsupported complex type comes from * expression

2022-11-11 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11722:
--

Assignee: Daniel Becker

> Wrong error message when unsupported complex type comes from * expression
> -
>
> Key: IMPALA-11722
> URL: https://issues.apache.org/jira/browse/IMPALA-11722
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> The following query fails with a NullPointerException:
> {code:java}
> select * from functional_orc_def.complextypestbl;
> ERROR: NullPointerException: null
> {code}
> The table contains a struct, {{nested_struct}}, which is not supported 
> yet because it contains collections. If the columns are listed explicitly, 
> the error message is the correct one:
> {code:java}
> select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
> from functional_orc_def.complextypestbl;
> ERROR: AnalysisException: Struct containing a collection type is not allowed 
> in the select list.{code}
> The same error message should be returned in the select * case.






[jira] [Assigned] (IMPALA-11722) Wrong error message when unsupported complex type comes from * expression

2022-11-13 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-11722:
--

Assignee: Peter Rozsa  (was: Daniel Becker)

> Wrong error message when unsupported complex type comes from * expression
> -
>
> Key: IMPALA-11722
> URL: https://issues.apache.org/jira/browse/IMPALA-11722
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Daniel Becker
>Assignee: Peter Rozsa
>Priority: Major
>
> The following query fails with a NullPointerException:
> {code:java}
> select * from functional_orc_def.complextypestbl;
> ERROR: NullPointerException: null
> {code}
> The table contains a struct, {{nested_struct}}, which is not supported 
> yet because it contains collections. If the columns are listed explicitly, 
> the error message is the correct one:
> {code:java}
> select id, int_array, int_array_array, int_map, int_map_array, nested_struct 
> from functional_orc_def.complextypestbl;
> ERROR: AnalysisException: Struct containing a collection type is not allowed 
> in the select list.{code}
> The same error message should be returned in the select * case.






[jira] [Resolved] (IMPALA-11719) Inconsistency in printing NULL values

2022-11-15 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11719.

Resolution: Fixed

> Inconsistency in printing NULL values
> -
>
> Key: IMPALA-11719
> URL: https://issues.apache.org/jira/browse/IMPALA-11719
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If they are top level or in collections, null values are printed as "NULL":
> {code:java}
> select int_array from functional_parquet.complextypestbl;
> +------------------------+
> | int_array              |
> +------------------------+
> | [-1]                   |
> | [1,2,3]                |
> | [NULL,1,2,NULL,3,NULL] |
> | []                     |
> | NULL                   |
> | NULL                   |
> | NULL                   |
> | NULL                   |
> +------------------------+{code}
> If they are in a struct, they are printed as "null":
> {code:java}
> select small_struct from functional_parquet.complextypes_structs;
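> +------------------------------------+
> | small_struct                       |
> +------------------------------------+
> | NULL                               |
> | {"i":19191,"s":"small_struct_str"} |
> | {"i":98765,"s":null}               |
> | {"i":null,"s":"str"}               |
> | {"i":98765,"s":"abcde f"}          |
> | {"i":null,"s":null}                |
> +------------------------------------+{code}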
> In Hive the situation is a bit different: "NULL" is used only for top level 
> values and "null" is printed in both collections and structs.
> {code:java}
> select int_array from functional_parquet.complextypestbl;
> +-------------------------+
> |        int_array        |
> +-------------------------+
> | [-1]                    |
> | [1,2,3]                 |
> | [null,1,2,null,3,null]  |
> | []                      |
> | NULL                    |
> | NULL                    |
> | NULL                    |
> | NULL                    |
> +-------------------------+{code}
> {code:java}
> select small_struct from functional_parquet.complextypes_structs;
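> +-------------------------------------+
> |            small_struct             |
> +-------------------------------------+
> | NULL                                |
> | {"i":19191,"s":"small_struct_str"}  |
> | {"i":98765,"s":null}                |
> | {"i":null,"s":"str"}                |
> | {"i":98765,"s":"abcde f"}           |
> | {"i":null,"s":null}                 |
> +-------------------------------------+{code}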
> Officially we print collections and structs in JSON form. In JSON the 
> relevant keyword is "null".
> We should decide how we handle this situation.
>  # Have a uniform NULL representation everywhere: top level, collections and 
> structs
>  ** either "NULL" or "null" everywhere
>  # Have "NULL" on the top level and "null" in collections and structs, like 
> Hive
>  # Leave everything as it is now: "NULL" at the top level and in collections, 
> "null" in structs.






[jira] [Created] (IMPALA-11734) TestIcebergTable.test_compute_stats fails in RELEASE builds

2022-11-21 Thread Daniel Becker (Jira)
Daniel Becker created IMPALA-11734:
--

 Summary: TestIcebergTable.test_compute_stats fails in RELEASE 
builds
 Key: IMPALA-11734
 URL: https://issues.apache.org/jira/browse/IMPALA-11734
 Project: IMPALA
  Issue Type: Improvement
Reporter: Daniel Becker
Assignee: Daniel Becker


If the Impala version is set to a release build as described in point 8 in the 
"How to Release" document 
([https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate]),
 TestIcebergTable.test_compute_stats fails:
h3. Stacktrace
{code:java}
query_test/test_iceberg.py:852: in test_compute_stats
    self.run_test_case('QueryTest/iceberg-compute-stats', vector, unique_database)
common/impala_test_suite.py:742: in run_test_case
    self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:578: in __verify_results_and_errors
    replace_filenames_with_placeholder)
common/test_result_verifier.py:469: in verify_raw_results
    VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
    assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E     2,1,'2.33KB','NOT CACHED','NOT CACHED','PARQUET','false','hdfs://localhost:20500/test-warehouse/test_compute_stats_74dbc105.db/ice_alltypes' != 2,1,'2.32KB','NOT CACHED','NOT CACHED','PARQUET','false','hdfs://localhost:20500/test-warehouse/test_compute_stats_74dbc105.db/ice_alltypes'{code}
The problem is the file size which is 2.32KB instead of 2.33KB. This is because 
the version is written into the file, and "x.y.z-RELEASE" is one byte shorter 
than "x.y.z-SNAPSHOT". The size of the file in this test is on the boundary 
between 2.32KB and 2.33KB, so this one byte can change the value.

We could use a row_regex to accept both values so it works for both snapshot 
and release versions.
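The one-byte effect and the proposed regular expression can be illustrated with a small Python sketch. The byte counts 2381 and 2380 are hypothetical values chosen to sit on the rounding boundary (the real file size is not given here), and pretty_kb assumes the size is printed in units of 1024 bytes rounded to two decimals.

```python
import re


def pretty_kb(num_bytes):
    """Format a byte count as KB with two decimals (assumed rounding rule)."""
    return "{:.2f}KB".format(num_bytes / 1024)


# "SNAPSHOT" is one character longer than "RELEASE".
assert len("SNAPSHOT") - len("RELEASE") == 1

snapshot_size = 2381              # hypothetical size with "x.y.z-SNAPSHOT"
release_size = snapshot_size - 1  # one byte shorter with "x.y.z-RELEASE"

print(pretty_kb(snapshot_size))   # 2.33KB
print(pretty_kb(release_size))    # 2.32KB

# A pattern that accepts both printed sizes, in the spirit of the row_regex idea.
SIZE_PATTERN = re.compile(r"2\.3[23]KB")
assert SIZE_PATTERN.fullmatch(pretty_kb(snapshot_size))
assert SIZE_PATTERN.fullmatch(pretty_kb(release_size))
```

In the test file the same character class would go into the expected result row, so the remaining columns can still be compared exactly.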






[jira] [Resolved] (IMPALA-11734) TestIcebergTable.test_compute_stats fails in RELEASE builds

2022-11-22 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11734.

Resolution: Implemented

> TestIcebergTable.test_compute_stats fails in RELEASE builds
> ---
>
> Key: IMPALA-11734
> URL: https://issues.apache.org/jira/browse/IMPALA-11734
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> If the Impala version is set to a release build as described in point 8 in 
> the "How to Release" document 
> ([https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release#HowtoRelease-HowtoVoteonaReleaseCandidate]),
>  TestIcebergTable.test_compute_stats fails:
> h3. Stacktrace
> {code:java}
> query_test/test_iceberg.py:852: in test_compute_stats
>     self.run_test_case('QueryTest/iceberg-compute-stats', vector, unique_database)
> common/impala_test_suite.py:742: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:578: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:469: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:278: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     2,1,'2.33KB','NOT CACHED','NOT CACHED','PARQUET','false','hdfs://localhost:20500/test-warehouse/test_compute_stats_74dbc105.db/ice_alltypes' != 2,1,'2.32KB','NOT CACHED','NOT CACHED','PARQUET','false','hdfs://localhost:20500/test-warehouse/test_compute_stats_74dbc105.db/ice_alltypes'{code}
> The problem is the file size which is 2.32KB instead of 2.33KB. This is 
> because the version is written into the file, and "x.y.z-RELEASE" is one byte 
> shorter than "x.y.z-SNAPSHOT". The size of the file in this test is on the 
> boundary between 2.32KB and 2.33KB, so this one byte can change the value.
> We could use a row_regex to accept both values so it works for both snapshot 
> and release versions.






[jira] [Updated] (IMPALA-11400) Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet scans

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11400:
---
Fix Version/s: Impala 4.3.0
   (was: Impala 4.2.0)

> Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet 
> scans
> --
>
> Key: IMPALA-11400
> URL: https://issues.apache.org/jira/browse/IMPALA-11400
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.1.0
>Reporter: Sameera Wijerathne
>Priority: Major
>  Labels: performance
> Fix For: Impala 4.3.0
>
> Attachments: 0.JPG, 1.JPG, 2-1.jpeg, 2.JPG, 2.jpeg, 3.JPG, 4.JPG, 
> 5.JPG, Impala_1.png, Impala_2.png, Kudu_1.png, Kudu_2.png, WhatsApp Image 
> 2022-06-07 at 10.39.27 PM.jpeg
>
>
> This issue was observed when Impala queries large datasets that reside in Kudu. 
> Even when a single ImpalaD scans multiple Kudu tablets, data retrieval is slow 
> although the ImpalaD issues parallel scans. The reason is that the ImpalaD uses 
> only a single Kudu client for multiple scans, but KuduScanner::NextBatch runs 
> on a single thread. Its RPC reactor thread therefore uses at most a single core 
> and bottlenecks all parallel scans. 
> As a result, Impala clusters that scan Kudu cannot be scaled vertically to the 
> full performance/core count of a node.
> Please refer to the screenshots from the Kudu Slack channel for more information.
>  
> !2-1.jpeg|width=717,height=961!






[jira] [Updated] (IMPALA-10508) Add metrics for reading from remote scratch paths

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-10508:
---
Fix Version/s: Impala 4.3.0

> Add metrics for reading from remote scratch paths
> -
>
> Key: IMPALA-10508
> URL: https://issues.apache.org/jira/browse/IMPALA-10508
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Yida Wu
>Assignee: Yida Wu
>Priority: Minor
> Fix For: Impala 4.3.0
>
>
> When reading data from a remote scratch path, the data can be fetched from the 
> local buffer if the file hasn't been uploaded yet, or from the remote 
> filesystem.
> The metrics help to identify how much data is read from the local buffer and 
> how much from the remote filesystem.






[jira] [Reopened] (IMPALA-10508) Add metrics for reading from remote scratch paths

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reopened IMPALA-10508:


> Add metrics for reading from remote scratch paths
> -
>
> Key: IMPALA-10508
> URL: https://issues.apache.org/jira/browse/IMPALA-10508
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Yida Wu
>Assignee: Yida Wu
>Priority: Minor
>
> When reading data from a remote scratch path, the data can be fetched from the 
> local buffer if the file hasn't been uploaded yet, or from the remote 
> filesystem.
> The metrics help to identify how much data is read from the local buffer and 
> how much from the remote filesystem.






[jira] [Resolved] (IMPALA-10508) Add metrics for reading from remote scratch paths

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-10508.

Resolution: Fixed

> Add metrics for reading from remote scratch paths
> -
>
> Key: IMPALA-10508
> URL: https://issues.apache.org/jira/browse/IMPALA-10508
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Yida Wu
>Assignee: Yida Wu
>Priority: Minor
> Fix For: Impala 4.3.0
>
>
> When reading data from a remote scratch path, the data can be fetched from the 
> local buffer if the file hasn't been uploaded yet, or from the remote 
> filesystem.
> The metrics help to identify how much data is read from the local buffer and 
> how much from the remote filesystem.






[jira] [Reopened] (IMPALA-9496) Allow Struct type in SELECT list for Parquet tables

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reopened IMPALA-9496:
---

> Allow Struct type in SELECT list for Parquet tables
> ---
>
> Key: IMPALA-9496
> URL: https://issues.apache.org/jira/browse/IMPALA-9496
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: complextype
>







[jira] [Resolved] (IMPALA-9496) Allow Struct type in SELECT list for Parquet tables

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-9496.
---
Fix Version/s: Impala 4.2.0
   Resolution: Fixed

> Allow Struct type in SELECT list for Parquet tables
> ---
>
> Key: IMPALA-9496
> URL: https://issues.apache.org/jira/browse/IMPALA-9496
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: complextype
> Fix For: Impala 4.2.0
>
>







[jira] [Resolved] (IMPALA-11193) Assertion fails in ClientCacheTest.MemLeak

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11193.

Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Assertion fails in ClientCacheTest.MemLeak
> --
>
> Key: IMPALA-11193
> URL: https://issues.apache.org/jira/browse/IMPALA-11193
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Yida Wu
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 4.1.0
>
>
> The test *ClientCacheTest.MemLeak*, introduced in IMPALA-11176, fails in 
> several internal builds.
> h3. Error Message
> {code:java}
> Expected: (mem_before) > (0), actual: 0 vs 0{code}
> h3. Stacktrace
> {code:java}
> /data/jenkins/workspace/impala-cdw-master-staging-core-tsan/repos/Impala/be/src/runtime/client-cache-test.cc:100
> Expected: (mem_before) > (0), actual: 0 vs 0{code}
> Interestingly it is not the main assert that fails but a "precondition", 
> namely EXPECT_GT(mem_before, 0).






[jira] [Reopened] (IMPALA-11193) Assertion fails in ClientCacheTest.MemLeak

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reopened IMPALA-11193:


> Assertion fails in ClientCacheTest.MemLeak
> --
>
> Key: IMPALA-11193
> URL: https://issues.apache.org/jira/browse/IMPALA-11193
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Daniel Becker
>Assignee: Yida Wu
>Priority: Blocker
>  Labels: broken-build
>
> The test *ClientCacheTest.MemLeak*, introduced in IMPALA-11176, fails in 
> several internal builds.
> h3. Error Message
> {code:java}
> Expected: (mem_before) > (0), actual: 0 vs 0{code}
> h3. Stacktrace
> {code:java}
> /data/jenkins/workspace/impala-cdw-master-staging-core-tsan/repos/Impala/be/src/runtime/client-cache-test.cc:100
> Expected: (mem_before) > (0), actual: 0 vs 0{code}
> Interestingly it is not the main assert that fails but a "precondition", 
> namely EXPECT_GT(mem_before, 0).






[jira] [Reopened] (IMPALA-11196) Assertion failure in ClientCacheTest.MemLeak ASAN build

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reopened IMPALA-11196:


> Assertion failure in ClientCacheTest.MemLeak ASAN build
> ---
>
> Key: IMPALA-11196
> URL: https://issues.apache.org/jira/browse/IMPALA-11196
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Yida Wu
>Assignee: Yida Wu
>Priority: Blocker
>  Labels: broken-build
>
> The test ClientCacheTest.MemLeak, introduced in IMPALA-11176, fails in ASAN 
> and TSAN builds.
> h3. Error Message
> Value of: mem_after Actual: 22012933906432 Expected: mem_before Which is: 
> 22012768583680
> h3. Stacktrace
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/client-cache-test.cc:112
>  Value of: mem_after Actual: 22012933906432 Expected: mem_before Which is: 
> 22012768583680






[jira] [Resolved] (IMPALA-11196) Assertion failure in ClientCacheTest.MemLeak ASAN build

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker resolved IMPALA-11196.

Fix Version/s: Impala 4.1.0
   Resolution: Fixed

> Assertion failure in ClientCacheTest.MemLeak ASAN build
> ---
>
> Key: IMPALA-11196
> URL: https://issues.apache.org/jira/browse/IMPALA-11196
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Yida Wu
>Assignee: Yida Wu
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 4.1.0
>
>
> The test ClientCacheTest.MemLeak, introduced in IMPALA-11176, fails in ASAN 
> and TSAN builds.
> h3. Error Message
> Value of: mem_after Actual: 22012933906432 Expected: mem_before Which is: 
> 22012768583680
> h3. Stacktrace
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/client-cache-test.cc:112
>  Value of: mem_after Actual: 22012933906432 Expected: mem_before Which is: 
> 22012768583680






[jira] [Updated] (IMPALA-11536) Invalid push down predicates in outer join simplification

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11536:
---
Fix Version/s: (was: Impala 4.2.0)

> Invalid push down predicates in outer join simplification
> -
>
> Key: IMPALA-11536
> URL: https://issues.apache.org/jira/browse/IMPALA-11536
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0, Impala 4.1.0
>Reporter: Xianqing He
>Assignee: Xianqing He
>Priority: Major
> Attachments: image-2022-08-25-14-47-51-966.png
>
>
> When ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION is set to true, outer join 
> simplification may incorrectly push down a predicate that is not 
> null-rejecting.
> e.g.
> SELECT COALESCE(jointbl.test_id, testtbl.id, dimtbl.id) AS id, 
> test_zip,testtbl.zip
> FROM functional.jointbl
> FULL OUTER JOIN
> functional.testtbl
> ON jointbl.test_id = testtbl.id
> FULL OUTER JOIN
> functional.dimtbl
> ON coalesce(jointbl.test_id, testtbl.id) = dimtbl.id
> WHERE
> `jointbl`.`test_zip` = 94611 and coalesce(`testtbl`.`zip`, 0) = 0;
>  
> !image-2022-08-25-14-47-51-966.png!  
> We can't push down the predicate 'coalesce(testtbl.zip, 0) = 0' to the ScanNode 
> since it is not null-rejecting.
>  
>  
>  
>  
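Why the coalesce predicate is not null-rejecting can be shown with a small Python model of SQL three-valued logic. This is a sketch for illustration only, not Impala planner code; None stands for SQL NULL, and the helper names coalesce, sql_eq and is_null_rejecting are hypothetical.

```python
def coalesce(*args):
    """Return the first non-NULL argument, or NULL (None) if all are NULL."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def sql_eq(left, right):
    """SQL equality: the result is NULL (None) if either side is NULL."""
    if left is None or right is None:
        return None
    return left == right


def is_null_rejecting(predicate):
    """A predicate rejects NULLs if it is not TRUE when its column is NULL."""
    return predicate(None) is not True


def plain_predicate(zip_code):
    # testtbl.zip = 0: evaluates to NULL for a NULL zip, so the row is filtered.
    return sql_eq(zip_code, 0)


def coalesce_predicate(zip_code):
    # coalesce(testtbl.zip, 0) = 0: evaluates to TRUE for a NULL zip.
    return sql_eq(coalesce(zip_code, 0), 0)


print(is_null_rejecting(plain_predicate))     # True
print(is_null_rejecting(coalesce_predicate))  # False
```

Because the coalesce predicate accepts the NULL-extended rows that a FULL OUTER JOIN produces, evaluating it only at the scan of testtbl would lose those rows.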






[jira] [Updated] (IMPALA-11536) Invalid push down predicates in outer join simplification

2022-11-23 Thread Daniel Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-11536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker updated IMPALA-11536:
---
Target Version: Impala 4.3.0

> Invalid push down predicates in outer join simplification
> -
>
> Key: IMPALA-11536
> URL: https://issues.apache.org/jira/browse/IMPALA-11536
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0.0, Impala 4.1.0
>Reporter: Xianqing He
>Assignee: Xianqing He
>Priority: Major
> Attachments: image-2022-08-25-14-47-51-966.png
>
>
> When ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION is set to true, outer join 
> simplification may incorrectly push down a predicate that is not 
> null-rejecting.
> e.g.
> SELECT COALESCE(jointbl.test_id, testtbl.id, dimtbl.id) AS id, 
> test_zip,testtbl.zip
> FROM functional.jointbl
> FULL OUTER JOIN
> functional.testtbl
> ON jointbl.test_id = testtbl.id
> FULL OUTER JOIN
> functional.dimtbl
> ON coalesce(jointbl.test_id, testtbl.id) = dimtbl.id
> WHERE
> `jointbl`.`test_zip` = 94611 and coalesce(`testtbl`.`zip`, 0) = 0;
>  
> !image-2022-08-25-14-47-51-966.png!  
> We can't push down the predicate 'coalesce(testtbl.zip, 0) = 0' to the ScanNode 
> since it is not null-rejecting.
>  
>  
>  
>  





