[jira] [Created] (IMPALA-9554) Expose complex type interface through UDFs

2020-03-26 Thread Gabor Kaszab (Jira)
Gabor Kaszab created IMPALA-9554:


 Summary: Expose complex type interface through UDFs
 Key: IMPALA-9554
 URL: https://issues.apache.org/jira/browse/IMPALA-9554
 Project: IMPALA
  Issue Type: New Feature
  Components: Backend, Frontend
Reporter: Gabor Kaszab


Once there is a better understanding of how complex types could be added to 
builtin functions, we can expose complex type support through UDFs by 
allowing complex types as parameters or as return values.
The reason this should come as a second step is that once we have exposed these 
UDF changes, we have to keep backward compatibility in subsequent releases, so 
there won’t be much room to adjust.
What brings some complexity is that CollectionVal uses Impala’s internal tuple 
representation, which is not trivial to expose through UDFs. There are two 
possible approaches:
 # Expose helper functions to extract tuples, fields, etc.
 # For UDFs use a different representation than tuples and translate them to 
the internal representation when a UDF is called.
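
The second approach can be pictured with a small sketch. Everything in it is hypothetical: the packed layout (an int32 followed by a float64 per tuple), `pack_collection`, `to_udf_values` and `sum_second_field` are illustrative stand-ins, not Impala's actual CollectionVal layout or UDF API. It only shows the shape of the idea, where the engine keeps its tuple buffer and the UDF sees plain values.

```python
import struct
from typing import List, Tuple

# Hypothetical internal layout: each tuple is an int32 followed by a float64,
# little-endian, no padding. This is NOT Impala's real tuple format.
TUPLE_FORMAT = "<id"
TUPLE_SIZE = struct.calcsize(TUPLE_FORMAT)


def pack_collection(items: List[Tuple[int, float]]) -> bytes:
    """Build the engine-side buffer: tuples laid out back to back."""
    return b"".join(struct.pack(TUPLE_FORMAT, i, f) for i, f in items)


def to_udf_values(buf: bytes) -> List[Tuple[int, float]]:
    """Translate the internal buffer into plain values before calling a UDF."""
    if len(buf) % TUPLE_SIZE != 0:
        raise ValueError("buffer length is not a multiple of the tuple size")
    return [tuple(t) for t in struct.iter_unpack(TUPLE_FORMAT, buf)]


def sum_second_field(values: List[Tuple[int, float]]) -> float:
    """A stand-in UDF that only ever sees the translated representation."""
    return sum(f for _, f in values)


internal = pack_collection([(1, 0.5), (2, 1.5), (3, 2.0)])
result = sum_second_field(to_udf_values(internal))
```

In this sketch the translation costs one copy per call, but the UDF never depends on the buffer layout. The first approach would presumably avoid the copy at the price of tying helper functions to the internal format.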



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-3766) Optionally compress spilled data before writing it to disk

2020-03-26 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-3766:
--
Summary: Optionally compress spilled data before writing it to disk  (was: 
Optionally compress spilled data before writing it do disk)

> Optionally compress spilled data before writing it to disk
> --
>
> Key: IMPALA-3766
> URL: https://issues.apache.org/jira/browse/IMPALA-3766
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Mostafa Mokhtar
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: performance
>
> Evaluate compressing the buffers before writing them to disk for spilling 
> operators. 
> Applying LZ4 on row batches before sending them over the network as part of 
> exchange provides around 2x compression. 
> {code}
>  - BytesSent: 612.87 MB (642635712)
>  - NetworkThroughput(*): 1.88 GB/sec
>  - OverallThroughput: 1.21 GB/sec
>  - PeakMemoryUsage: 51.00 KB (52224)
>  - RowsReturned: 360.00K (36)
>  - SerializeBatchTime: 176.002ms
>  - TransmitDataRPCTime: 319.005ms
>  - UncompressedRowBatchSize: 1.47 GB (1573356320)
> {code}
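
The compression ratio implied by the counters above can be worked out directly. This is a rough check, assuming that BytesSent is the compressed size on the wire and that UncompressedRowBatchSize is the size of the same data before compression.

```python
# Counter values copied from the profile quoted above.
bytes_sent = 642_635_712            # BytesSent (assumed compressed size)
uncompressed_size = 1_573_356_320   # UncompressedRowBatchSize

# Ratio of original size to compressed size, and the fraction of bytes saved.
ratio = uncompressed_size / bytes_sent
saved_fraction = 1 - bytes_sent / uncompressed_size

print(f"compression ratio: {ratio:.2f}x")
print(f"bytes saved: {saved_fraction:.1%}")
```

Under those assumptions the ratio comes out slightly above 2x, which is consistent with the "around 2x" estimate in the description.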






[jira] [Commented] (IMPALA-9343) Ensure that multithreaded plans are shown correctly in exec summary, profile, etc.

2020-03-26 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067865#comment-17067865
 ] 

Tim Armstrong commented on IMPALA-9343:
---

The only glitch I'm currently aware of is the web UI plan visualisation, which 
doesn't connect across join plans.

> Ensure that multithreaded plans are shown correctly in exec summary, profile, 
> etc.
> --
>
> Key: IMPALA-9343
> URL: https://issues.apache.org/jira/browse/IMPALA-9343
> Project: IMPALA
>  Issue Type: Task
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: observability
>







[jira] [Created] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-26 Thread Attila Jeges (Jira)
Attila Jeges created IMPALA-9555:


 Summary: TestDateQueries.test_queries failing because Hive3 
switched back to the hybrid Julian Gregorian calendar
 Key: IMPALA-9555
 URL: https://issues.apache.org/jira/browse/IMPALA-9555
 Project: IMPALA
  Issue Type: Bug
Reporter: Attila Jeges


TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
following error:

{code}
query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
'batch_size': 1} | table_format: avro/snap/block] (from pytest)

Error Message

query_test/test_date_queries.py:60: in test_queries 
self.run_test_case('QueryTest/avro_date', vector) 
common/impala_test_suite.py:690: in run_test_case 
self.__verify_results_and_errors(vector, test_section, result, use_db) 
common/impala_test_suite.py:523: in __verify_results_and_errors 
replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
common/test_result_verifier.py:278: in verify_query_result_is_equal assert 
expected_results == actual_results E   assert Comparing QueryTestResults 
(expected vs actual): E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28 
E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL E 
10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 11,1399-06-27,NULL 
!= 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 != 
21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number of 
rows returned (expected vs actual): 22 != 15

Stacktrace

query_test/test_date_queries.py:60: in test_queries
self.run_test_case('QueryTest/avro_date', vector)
common/impala_test_suite.py:690: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
E 29,2017-11-27,2017-11-28 != None
E 3,0001-01-01,1399-12-31 != None
E 30,-12-31,-12-01 != None
E 31,-12-31,-12-31 != None
E 4,0001-01-01,2017-11-28 != None
E 5,0001-01-01,-12-31 != None
E 6,0001-01-01,NULL != None
E Number of rows returned (expected vs actual): 22 != 15

Standard Error

ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
24

[jira] [Updated] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-26 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-9555:
-
Description: 
TestDateQueries.test_queries is failing after upgrading the CDP GBN with the 
following error:

{code}
query_test.test_date_queries.TestDateQueries.test_queries[protocol: beeswax | 
exec_option: {'disable_codegen_rows_threshold': 0, 'disable_codegen': 'true', 
'batch_size': 1} | table_format: avro/snap/block] (from pytest)

Error Message

query_test/test_date_queries.py:60: in test_queries 
self.run_test_case('QueryTest/avro_date', vector) 
common/impala_test_suite.py:690: in run_test_case 
self.__verify_results_and_errors(vector, test_section, result, use_db) 
common/impala_test_suite.py:523: in __verify_results_and_errors 
replace_filenames_with_placeholder) common/test_result_verifier.py:456: in 
verify_raw_results VERIFIER_MAP[verifier](expected, actual) 
common/test_result_verifier.py:278: in verify_query_result_is_equal assert 
expected_results == actual_results E   assert Comparing QueryTestResults 
(expected vs actual): E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28 
E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL E 
10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31 E 11,1399-06-27,NULL 
!= 20,2017-11-27,0001-06-19 E 12,1399-06-27,2018-12-31 != 
21,2017-11-27,0001-06-20 E 2,0001-01-01,0002-01-01 != 
22,2017-11-27,0001-06-21 E 20,2017-11-27,0001-06-21 != 
23,2017-11-27,0001-06-22 E 21,2017-11-27,0001-06-22 != 
24,2017-11-27,0001-06-23 E 22,2017-11-27,0001-06-23 != 
25,2017-11-27,0001-06-24 E 23,2017-11-27,0001-06-24 != 
26,2017-11-27,0001-06-25 E 24,2017-11-27,0001-06-25 != 
27,2017-11-27,0001-06-26 E 25,2017-11-27,0001-06-26 != 
28,2017-11-27,0001-06-27 E 26,2017-11-27,0001-06-27 != 
29,2017-11-27,2017-11-28 E 27,2017-11-27,0001-06-28 != 
30,-12-31,-12-01 E 28,2017-11-27,0001-06-29 != 
31,-12-31,-12-31 E 29,2017-11-27,2017-11-28 != None E 
3,0001-01-01,1399-12-31 != None E 30,-12-31,-12-01 != None E 
31,-12-31,-12-31 != None E 4,0001-01-01,2017-11-28 != None E 
5,0001-01-01,-12-31 != None E 6,0001-01-01,NULL != None E Number of 
rows returned (expected vs actual): 22 != 15

Stacktrace

query_test/test_date_queries.py:60: in test_queries
self.run_test_case('QueryTest/avro_date', vector)
common/impala_test_suite.py:690: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
E 1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
E 10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
E 11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
E 12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
E 2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
E 20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
E 21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
E 22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
E 23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
E 24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
E 25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
E 26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
E 27,2017-11-27,0001-06-28 != 30,-12-31,-12-01
E 28,2017-11-27,0001-06-29 != 31,-12-31,-12-31
E 29,2017-11-27,2017-11-28 != None
E 3,0001-01-01,1399-12-31 != None
E 30,-12-31,-12-01 != None
E 31,-12-31,-12-31 != None
E 4,0001-01-01,2017-11-28 != None
E 5,0001-01-01,-12-31 != None
E 6,0001-01-01,NULL != None
E Number of rows returned (expected vs actual): 22 != 15

Standard Error

ERROR:test_configuration:Comparing QueryTestResults (expected vs actual):
0,0001-01-01,0001-01-01 != 10,1399-06-27,2017-11-28
1,0001-01-01,0001-12-31 != 11,1399-06-27,NULL
10,1399-06-27,2017-11-28 != 12,1399-06-27,2018-12-31
11,1399-06-27,NULL != 20,2017-11-27,0001-06-19
12,1399-06-27,2018-12-31 != 21,2017-11-27,0001-06-20
2,0001-01-01,0002-01-01 != 22,2017-11-27,0001-06-21
20,2017-11-27,0001-06-21 != 23,2017-11-27,0001-06-22
21,2017-11-27,0001-06-22 != 24,2017-11-27,0001-06-23
22,2017-11-27,0001-06-23 != 25,2017-11-27,0001-06-24
23,2017-11-27,0001-06-24 != 26,2017-11-27,0001-06-25
24,2017-11-27,0001-06-25 != 27,2017-11-27,0001-06-26
25,2017-11-27,0001-06-26 != 28,2017-11-27,0001-06-27
26,2017-11-27,0001-06-27 != 29,2017-11-27,2017-11-28
27,2017-11-27,0001-06-28 

[jira] [Commented] (IMPALA-9380) Serialize query profile asynchronously

2020-03-26 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067880#comment-17067880
 ] 

Tim Armstrong commented on IMPALA-9380:
---

I think the only feasible solution to this would be to make UnregisterQuery() 
partially asynchronous, e.g. move the archiving to a thread pool. The other 
alternative is to start the serialization *before* unregister is called then 
block in unregister. But that only works if there's a significant delay between 
the two things.

One thing I noticed that's weird is that we remove the ClientRequestState from 
the map before we archive the query, so there's a gap where the profile is 
probably not available. This glitch might be masked because clients block in 
UnregisterQuery().
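
A minimal sketch of the "move the archiving to a thread pool" idea follows. The names `active_queries`, `archived_profiles`, `serialize_profile`, `archive` and `unregister_query` are hypothetical stand-ins, not Impala's classes, and the sketch is in Python rather than the backend's C++. It also makes the gap mentioned above visible: between removal from the map and the end of archiving, the profile is in neither place.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict

# Hypothetical stand-ins for the server's query map and profile archive.
active_queries: Dict[str, dict] = {"q1": {"profile": "rows=10"}}
archived_profiles: Dict[str, str] = {}
lock = threading.Lock()
archive_pool = ThreadPoolExecutor(max_workers=2)


def serialize_profile(state: dict) -> str:
    """Stand-in for the expensive profile serialization."""
    time.sleep(0.05)
    return state["profile"].upper()


def archive(query_id: str, state: dict) -> None:
    """Serialize outside the lock, then publish the archived profile."""
    serialized = serialize_profile(state)
    with lock:
        archived_profiles[query_id] = serialized


def unregister_query(query_id: str) -> Future:
    """Remove the query synchronously, archive its profile on the pool."""
    with lock:
        state = active_queries.pop(query_id)
    # Between the pop above and the archive finishing, the profile is in
    # neither map; that is the gap described in the comment.
    return archive_pool.submit(archive, query_id, state)


pending = unregister_query("q1")
with lock:
    # Typically False here, because archiving is still in flight.
    visible_during_gap = "q1" in active_queries or "q1" in archived_profiles
pending.result()  # wait for archiving to finish
archive_pool.shutdown()
```

Returning the future lets a caller that needs the archived profile wait for it, while other callers return as soon as the query is removed from the map.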

> Serialize query profile asynchronously
> --
>
> Key: IMPALA-9380
> URL: https://issues.apache.org/jira/browse/IMPALA-9380
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>







[jira] [Resolved] (IMPALA-9536) UdfExecutorTest.HiveStringsTest fails when using newer Hive

2020-03-26 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-9536.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

> UdfExecutorTest.HiveStringsTest fails when using newer Hive
> ---
>
> Key: IMPALA-9536
> URL: https://issues.apache.org/jira/browse/IMPALA-9536
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> When using a newer CDP Hive and USE_CDP_HIVE=true, 
> org.apache.impala.hive.executor.UdfExecutorTest.HiveStringsTest fails with 
> the following error:
> {noformat}
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hive/ql/stats/estimator/StatEstimatorProvider
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.impala.hive.executor.UdfExecutorTest.HiveStringsTest(UdfExecutorTest.java:456)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 37 more{noformat}






[jira] [Resolved] (IMPALA-9548) UdfExecutorTest failures after HIVE-22893

2020-03-26 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-9548.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

> UdfExecutorTest failures after HIVE-22893
> -
>
> Key: IMPALA-9548
> URL: https://issues.apache.org/jira/browse/IMPALA-9548
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 4.0
>
>
> HIVE-22893 added a dependency on StatEstimatorProvider to certain UDFs. This 
> causes {{UdfExecutorTest}} to start failing. {{shaded-deps/pom.xml}} defines 
> a specific set of classes that are pulled in from the {{hive-exec}} jar; 
> {{StatEstimatorProvider}} just needs to be added to that list.
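
One way to confirm that a class made it into a shaded jar is to look for its entry in the archive. The sketch below is illustrative only: `class_entry`, `jar_contains_class` and `make_toy_jar` are hypothetical helpers, and the jars are built in memory as toy stand-ins rather than taken from the real build output.

```python
import io
import zipfile

# Class named in the issue; the jars built below are toy stand-ins, not the
# real shaded hive-exec artifact.
CLASS_NAME = "org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider"


def class_entry(class_name: str) -> str:
    """Map a fully qualified class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar, class_name: str) -> bool:
    """Return True if the jar (a path or file-like object) holds the class."""
    with zipfile.ZipFile(jar) as zf:
        return class_entry(class_name) in zf.namelist()


def make_toy_jar(class_names) -> io.BytesIO:
    """Build an in-memory jar with empty entries for the given classes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in class_names:
            zf.writestr(class_entry(name), b"")
    buf.seek(0)
    return buf


before_fix = make_toy_jar(["org.apache.hadoop.hive.ql.exec.UDF"])
after_fix = make_toy_jar(["org.apache.hadoop.hive.ql.exec.UDF", CLASS_NAME])
missing_before = not jar_contains_class(before_fix, CLASS_NAME)
present_after = jar_contains_class(after_fix, CLASS_NAME)
```

Passing a real jar path to `jar_contains_class` instead of the in-memory buffer would run the same check against an actual build artifact.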






[jira] [Work started] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-26 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9555 started by Attila Jeges.

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
>

[jira] [Updated] (IMPALA-9555) TestDateQueries.test_queries failing because Hive3 switched back to the hybrid Julian Gregorian calendar

2020-03-26 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-9555:
-
Affects Version/s: Impala 3.4.0

> TestDateQueries.test_queries failing because Hive3 switched back to the 
> hybrid Julian Gregorian calendar
> 
>
> Key: IMPALA-9555
> URL: https://issues.apache.org/jira/browse/IMPALA-9555
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Critical
>

[jira] [Created] (IMPALA-9556) Add tests for interactions between metadata operations and the data cache

2020-03-26 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-9556:
-

 Summary: Add tests for interactions between metadata operations 
and the data cache
 Key: IMPALA-9556
 URL: https://issues.apache.org/jira/browse/IMPALA-9556
 Project: IMPALA
  Issue Type: Test
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Joe McDonnell


Both the data cache and the file handle cache need to handle different versions 
of a file (e.g. if a file is overwritten or appended). They use the 
modification time to make distinctions between different versions of the file. 
The modification times are going to come from the metadata.

We should have tests that integrate various metadata operations and 
configurations with the data cache to verify that the data cache continues to 
have the appropriate hit rate after unrelated metadata operations. This would 
help verify that modification times don't change unnecessarily.
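
The kind of check described above can be sketched with a toy cache. `ToyDataCache` and the `metadata` dictionary are hypothetical; the key of (path, modification time, offset) is an assumption made for illustration, not a description of Impala's data cache. The point is that the hit count only stays stable if metadata operations keep returning the same modification time.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyDataCache:
    """Toy cache keyed by (path, mtime, offset); not Impala's implementation."""
    entries: Dict[Tuple[str, int, int], bytes] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def read(self, path: str, mtime: int, offset: int) -> bytes:
        key = (path, mtime, offset)
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            # Stand-in for reading the bytes from remote storage.
            self.entries[key] = f"{path}@{mtime}:{offset}".encode()
        return self.entries[key]


cache = ToyDataCache()
metadata = {"/tbl/f0.parq": 1000}  # path -> modification time from metadata

cache.read("/tbl/f0.parq", metadata["/tbl/f0.parq"], 0)  # miss: cold cache
cache.read("/tbl/f0.parq", metadata["/tbl/f0.parq"], 0)  # hit

# An unrelated metadata operation should leave the mtime untouched ...
metadata = dict(metadata)  # stand-in for a reload that returns the same mtime
cache.read("/tbl/f0.parq", metadata["/tbl/f0.parq"], 0)  # still a hit

# ... whereas an overwrite changes the mtime and must miss.
metadata["/tbl/f0.parq"] = 2000
cache.read("/tbl/f0.parq", metadata["/tbl/f0.parq"], 0)  # miss: new version
```

A real test would replace the dictionary copy with an actual metadata operation and assert that the hit count still increases afterwards.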






[jira] [Commented] (IMPALA-8989) TestAdmissionController.test_release_backend is flaky

2020-03-26 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067943#comment-17067943
 ] 

Csaba Ringhofer commented on IMPALA-8989:
-

Found test_release_backend broken again. One of the impalads crashed with 
SIGABRT. The callstack wasn't too helpful - the only thing I could understand 
was impalad!_fini, which suggests that this occurred during exit.
{code}
Crash reason:  SIGABRT
Crash address: 0x7d12437
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.17.so + 0x351f7
rax = 0x   rdx = 0x0006
rcx = 0x   rbx = 0x136b85e8
rsi = 0x2437   rdi = 0x2437
rbp = 0x07289eb0   rsp = 0x7ffc919f9d98
 r8 = 0x000ar9 = 0x7f341d8889c0
r10 = 0x0008   r11 = 0x0202
r12 = 0x14592780   r13 = 0x7ffc919fa000
r14 = 0x   r15 = 0x
rip = 0x7f3419d0a1f7
Found by: given as instruction pointer in context
 1  libc-2.17.so + 0x368e8
rsp = 0x7ffc919f9da0   rip = 0x7f3419d0b8e8
Found by: stack scanning
 2  libstdc++.so.6.0.20 + 0xbf198
rsp = 0x7ffc919f9e50   rip = 0x7f341a670198
Found by: stack scanning
 3  libc-2.17.so + 0x765fd
rsp = 0x7ffc919f9e60   rip = 0x7f3419d4b5fd
Found by: stack scanning
 4  libc-2.17.so + 0x765fd
rsp = 0x7ffc919f9e70   rip = 0x7f3419d4b5fd
Found by: stack scanning
 5  libstdc++.so.6.0.20 + 0x5fd2d
rsp = 0x7ffc919f9ed0   rip = 0x7f341a610d2d
Found by: stack scanning
 6  libstdc++.so.6.0.20 + 0x5dd86
rsp = 0x7ffc919f9f00   rip = 0x7f341a60ed86
Found by: stack scanning
 7  libstdc++.so.6.0.20 + 0x5ce79
rsp = 0x7ffc919f9f10   rip = 0x7f341a60de79
Found by: stack scanning
 8  libstdc++.so.6.0.20 + 0x5d5db
rsp = 0x7ffc919f9f20   rip = 0x7f341a60e5db
Found by: stack scanning
 9  impalad!_fini + 0x16522e4
rsp = 0x7ffc919f9f40   rip = 0x06733c54
Found by: stack scanning
10  impalad!_fini + 0x1d49238
rsp = 0x7ffc919f9f60   rip = 0x06e2aba8
Found by: stack scanning
11  impalad!_fini + 0x1d4924d
rsp = 0x7ffc919f9f68   rip = 0x06e2abbd
Found by: stack scanning
12  impalad!_fini + 0x1cb973b
rsp = 0x7ffc919f9f88   rip = 0x06d9b0ab
Found by: stack scanning
13  libgcc_s.so.1 + 0xffa3
rsp = 0x7ffc919fa000   rip = 0x7f341a0a7fa3
Found by: stack scanning
{code}

The logs around the error:
{code}
05:53:54.959 ERROR 
custom_cluster/test_admission_controller.py::TestAdmissionController::()::test_release_backends[protocol:
 beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 5000, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
text/none]
05:53:54.960  ERRORS 

05:53:54.960  ERROR at setup of 
TestAdmissionController.test_release_backends[protocol: beeswax | exec_option: 
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0} | table_format: text/none] 
05:53:54.975 common/custom_cluster_test_suite.py:190: in setup_method
05:53:54.976 self._start_impala_cluster(cluster_args, **kwargs)
05:53:54.976 common/custom_cluster_test_suite.py:307: in _start_impala_cluster
05:53:54.976 check_call(cmd + options, close_fds=True)
05:53:54.976 /usr/lib64/python2.7/subprocess.py:542: in check_call
05:53:54.976 raise CalledProcessError(retcode, cmd)
05:53:54.976 E   CalledProcessError: Command 
'['/data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/start-impala-cluster.py',
 '--state_store_args=--statestore_update_frequency_ms=50 
--statestore_priority_update_frequency_ms=50 
--statestore_heartbeat_frequency_ms=50', '--cluster_size=3', 
'--num_coordinators=1', 
'--log_dir=/data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/custom_cluster_tests',
 '--log_level=1', '--use_exclusive_coordinators', '--state_store_args=None ', 
'--impalad_args=--default_query_options=']' returned non-zero exit status 1
05:53:54.977  Captured stderr setup 
-
05:53:54.978 01:47:32 MainThread: Found 0 impalad/0 statestored/0 catalogd 
process(es)
05:53:54.978 01:47:32 MainThread: Starting State Store logging to 
/data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/custom_cluster_tests/statestored.INFO
05:53:54.978 01:47:32 MainThread: Starting Catalog Service logging to 
/data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
05:53:54.978 01:47:32 MainThread: Starting Impala Dae

[jira] [Commented] (IMPALA-9550) TestResultSpoolingFetchSize.test_fetch is flaky

2020-03-26 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067995#comment-17067995
 ] 

Fang-Yu Rao commented on IMPALA-9550:
-

We have encountered a very similar failed test at 
[https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2036/testReport/junit/query_test.test_fetch/TestFetch/test_rows_sent_counters_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/].

The error message is as follows.
{noformat}
query_test/test_fetch.py:50: in test_rows_sent_counters 
self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], 30) 
common/impala_test_suite.py:1085: in wait_for_state 
self.wait_for_any_state(handle, [expected_state], timeout, client) 
common/impala_test_suite.py:1102: in wait_for_any_state actual_state)) E   
Timeout: query 5d4207c7795df850:ff640671 did not reach one of the 
expected states [4], last known state 2
{noformat}
The stack trace is as follows.
{noformat}
query_test/test_fetch.py:50: in test_rows_sent_counters
self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], 30)
common/impala_test_suite.py:1085: in wait_for_state
self.wait_for_any_state(handle, [expected_state], timeout, client)
common/impala_test_suite.py:1102: in wait_for_any_state
actual_state))
E   Timeout: query 5d4207c7795df850:ff640671 did not reach one of the 
expected states [4], last known state 2
{noformat}
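
For readers unfamiliar with this kind of failure, here is a rough sketch of a poll-until-state helper with a timeout. It is not the code in common/impala_test_suite.py; the signature, the Timeout class, and the injectable clock and sleep parameters are assumptions made for illustration. The state numbers 2 and 4 simply mirror the values in the message above.

```python
import time


class Timeout(Exception):
    """Raised when the expected state is not reached in time (hypothetical)."""


def wait_for_any_state(get_state, expected_states, timeout_s,
                       interval_s=0.1, clock=time.time, sleep=time.sleep):
    """Poll get_state() until it returns one of expected_states or time runs out."""
    deadline = clock() + timeout_s
    state = get_state()
    while state not in expected_states:
        if clock() >= deadline:
            raise Timeout("did not reach one of the expected states %s, "
                          "last known state %s" % (list(expected_states), state))
        sleep(interval_s)
        state = get_state()
    return state


# A query that reports state 2 twice and then state 4.
states = iter([2, 2, 4])
result = wait_for_any_state(lambda: next(states), [4], timeout_s=30,
                            sleep=lambda seconds: None)
```

With a helper shaped like this, a slow query fails only because the deadline passes while it is still in state 2, which is why raising the timeout is the usual first remedy.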

> TestResultSpoolingFetchSize.test_fetch is flaky
> ---
>
> Key: IMPALA-9550
> URL: https://issues.apache.org/jira/browse/IMPALA-9550
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Looks like the timeout needs to be bumped up.
> {code}
> query_test.test_result_spooling.TestResultSpoolingFetchSize.test_fetch[fetch_size:
>  1 | protocol: beeswax | exec_option: {'batch_size': 2048, 'num_nodes': 
> 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> parquet/none | wait_for_finished: True] (from pytest)
> Error Message
> query_test/test_result_spooling.py:292: in test_fetch 
> self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], timeout) 
> common/impala_test_suite.py:1085: in wait_for_state 
> self.wait_for_any_state(handle, [expected_state], timeout, client) 
> common/impala_test_suite.py:1102: in wait_for_any_state actual_state)) E  
>  Timeout: query 424f02e1ff0912bc:37a5e6cb did not reach one of the 
> expected states [4], last known state 2
> Stacktrace
> query_test/test_result_spooling.py:292: in test_fetch
> self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], timeout)
> common/impala_test_suite.py:1085: in wait_for_state
> self.wait_for_any_state(handle, [expected_state], timeout, client)
> common/impala_test_suite.py:1102: in wait_for_any_state
> actual_state))
> E   Timeout: query 424f02e1ff0912bc:37a5e6cb did not reach one of the 
> expected states [4], last known state 2
> {code}






[jira] [Assigned] (IMPALA-9534) Kudu show create table tests fail due to case difference for external.table.purge

2020-03-26 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned IMPALA-9534:
---

Assignee: Vihang Karajgaonkar

> Kudu show create table tests fail due to case difference for 
> external.table.purge
> -
>
> Key: IMPALA-9534
> URL: https://issues.apache.org/jira/browse/IMPALA-9534
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>  Labels: broken-build
>
> When updating to the latest CDP GBN, there are test failures due to our tests 
> expecting external.table.purge=TRUE (upper case) whereas it is actually 
> external.table.purge=true (lower case):
>  
> {noformat}
> query_test/test_kudu.py:862: in test_primary_key_and_distribution
> db=cursor.conn.db_name, kudu_addr=KUDU_MASTER_HOSTS))
> query_test/test_kudu.py:836: in assert_show_create_equals
> assert "TBLPROPERTIES ('external.table.purge'='TRUE', " in output
> E   assert "TBLPROPERTIES ('external.table.purge'='TRUE', " in "CREATE 
> EXTERNAL TABLE testshowcreatetable_6928_i0obd1.jlxsrpzmcu (\n  c INT NOT NULL 
> ENCODING AUTO_ENCODING COMPRESSI...H (c) PARTITIONS 3\nSTORED AS 
> KUDU\nTBLPROPERTIES ('external.table.purge'='true', 
> 'kudu.master_addresses'='localhost')"{noformat}
> This impacts the following tests:
>  
>  
> {noformat}
> metadata.test_ddl.TestDdlStatements.test_create_alter_tbl_properties
> metadata.test_show_create_table.TestShowCreateTable.test_show_create_table
> query_test.test_kudu.TestShowCreateTable.test_primary_key_and_distribution
> query_test.test_kudu.TestShowCreateTable.test_timestamp_default_value
> query_test.test_kudu.TestShowCreateTable.test_managed_kudu_table_name_with_show_create
> org.apache.impala.catalog.local.LocalCatalogTest.testKuduTable{noformat}
> I think we can just make these case insensitive.
>  
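
One way the case-insensitive check suggested above could look, as a sketch only: the helper name has_tbl_property and the shortened SHOW CREATE TABLE output are made up for illustration and are not taken from query_test/test_kudu.py.

```python
import re


def has_tbl_property(show_create_output, key, expected_value):
    """Return True if key is set to expected_value, ignoring the value's case."""
    pattern = r"'%s'\s*=\s*'([^']*)'" % re.escape(key)
    match = re.search(pattern, show_create_output)
    return match is not None and match.group(1).lower() == expected_value.lower()


# Simplified, hypothetical output in the shape shown in the failure above.
output = ("CREATE EXTERNAL TABLE t (c INT) STORED AS KUDU\n"
          "TBLPROPERTIES ('external.table.purge'='true', "
          "'kudu.master_addresses'='localhost')")

purge_enabled = has_tbl_property(output, "external.table.purge", "TRUE")
```

Only the property value is compared case-insensitively here; the key is still matched exactly.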






[jira] [Assigned] (IMPALA-9534) Kudu show create table tests fail due to case difference for external.table.purge

2020-03-26 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned IMPALA-9534:


Assignee: Sahil Takiar  (was: Vihang Karajgaonkar)

> Kudu show create table tests fail due to case difference for 
> external.table.purge
> -
>
> Key: IMPALA-9534
> URL: https://issues.apache.org/jira/browse/IMPALA-9534
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Sahil Takiar
>Priority: Blocker
>  Labels: broken-build
>
> When updating to the latest CDP GBN, there are test failures due to our tests 
> expecting external.table.purge=TRUE (upper case) whereas it is actually 
> external.table.purge=true (lower case):
>  
> {noformat}
> query_test/test_kudu.py:862: in test_primary_key_and_distribution
> db=cursor.conn.db_name, kudu_addr=KUDU_MASTER_HOSTS))
> query_test/test_kudu.py:836: in assert_show_create_equals
> assert "TBLPROPERTIES ('external.table.purge'='TRUE', " in output
> E   assert "TBLPROPERTIES ('external.table.purge'='TRUE', " in "CREATE 
> EXTERNAL TABLE testshowcreatetable_6928_i0obd1.jlxsrpzmcu (\n  c INT NOT NULL 
> ENCODING AUTO_ENCODING COMPRESSI...H (c) PARTITIONS 3\nSTORED AS 
> KUDU\nTBLPROPERTIES ('external.table.purge'='true', 
> 'kudu.master_addresses'='localhost')"{noformat}
> This impacts the following tests:
>  
>  
> {noformat}
> metadata.test_ddl.TestDdlStatements.test_create_alter_tbl_properties
> metadata.test_show_create_table.TestShowCreateTable.test_show_create_table
> query_test.test_kudu.TestShowCreateTable.test_primary_key_and_distribution
> query_test.test_kudu.TestShowCreateTable.test_timestamp_default_value
> query_test.test_kudu.TestShowCreateTable.test_managed_kudu_table_name_with_show_create
> org.apache.impala.catalog.local.LocalCatalogTest.testKuduTable{noformat}
> I think we can just make these case insensitive.
>  






[jira] [Commented] (IMPALA-9534) Kudu show create table tests fail due to case difference for external.table.purge

2020-03-26 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068120#comment-17068120
 ] 

Sahil Takiar commented on IMPALA-9534:
--

I've discussed this with the Hive folks, and it looks like this was a mistake. 
The intention is to use 'TRUE'. Apache Hive master actually uses 'TRUE'. The 
Hive team is working on a fix for this.







[jira] [Commented] (IMPALA-9534) Kudu show create table tests fail due to case difference for external.table.purge

2020-03-26 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068126#comment-17068126
 ] 

Vihang Karajgaonkar commented on IMPALA-9534:
-

It's weird to have a case-sensitive boolean property value. I am curious to know 
why it has to be 'TRUE' rather than 'true'.







[jira] [Commented] (IMPALA-9513) query_test.test_kudu.TestKuduOperations.test_column_storage_attributes fails on exhaustive tests

2020-03-26 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068190#comment-17068190
 ] 

Joe McDonnell commented on IMPALA-9513:
---

From what I'm seeing, this test has never passed. It assumes that when 
fetching, the date will be returned as a datetime.date object. I don't see 
code to do that, and we currently return a string representation of the date.

I'm putting together a change to fix the test. If we want dates to be returned 
as a datetime.date object, then that is a separate piece of work.
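
To illustrate the mismatch, here is a small sketch using only the standard library. The helper parse_date is hypothetical, and the 'YYYY-MM-DD' format is assumed from the '2010-01-01' value in the failed assertion.

```python
import datetime


def parse_date(value):
    """Convert a 'YYYY-MM-DD' string to datetime.date; pass date objects through."""
    if isinstance(value, datetime.date):
        return value
    return datetime.datetime.strptime(value, "%Y-%m-%d").date()


fetched = "2010-01-01"                # string form, as currently returned
expected = datetime.date(2010, 1, 1)  # what the test assumed it would get

naive_equal = (fetched == expected)   # a str never compares equal to a date
normalized_equal = (parse_date(fetched) == expected)
```

So the test can either compare against the string form or normalize the fetched value before comparing.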

> query_test.test_kudu.TestKuduOperations.test_column_storage_attributes fails 
> on exhaustive tests
> 
>
> Key: IMPALA-9513
> URL: https://issues.apache.org/jira/browse/IMPALA-9513
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Norbert Luksa
>Priority: Blocker
>  Labels: build-failure
>
> Encountered the mentioned test failures in recent exhaustive tests. The 
> failed assertion is: 
> {code:java}
> query_test/test_kudu.py:436: in test_column_storage_attributes assert 
> cursor.fetchall() == \ E assert [(0, True, 0, 0, 0, 0, ...)] == [(0, True, 0, 
> 0, 0, 0, ...)] E At index 0 diff: (0, True, 0, 0, 0, 0, 0.0, 0.0, '0', 
> datetime.datetime(2009, 1, 1, 0, 0), Decimal('0'), '2010-01-01') != (0, True, 
> 0, 0, 0, 0, 0.0, 0.0, '0', datetime.datetime(2009, 1, 1, 0, 0), 0, 
> datetime.date(2010, 1, 1)) E
> {code}
> Looks like it is caused by https://gerrit.cloudera.org/#/c/14705/.






[jira] [Work started] (IMPALA-9433) Change FileHandleCache from using a multimap to an unordered_map

2020-03-26 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9433 started by Anurag Mantripragada.

> Change FileHandleCache from using a multimap to an unordered_map
> 
>
> Key: IMPALA-9433
> URL: https://issues.apache.org/jira/browse/IMPALA-9433
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Joe McDonnell
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: ramp-up
>
> The file handle cache can contain multiple file handles per filename. 
> Currently it uses a std::multimap, where the file handles for each filename 
> are a contiguous set of entries. A lookup will find the beginning of that 
> range and then iterate through it to find a free one.
> A multimap is implemented as a red-black tree with O(log(N)) lookup, so we 
> should be able to improve this by using a hashtable-based structure such as 
> unordered_map/unordered_multimap with O(1) lookup.
> Another optimization would be to add an intermediary structure for each 
> filename and hold all the file handles for that file name in a linked list. 
> Lookup would find this intermediary structure by looking up the filename, 
> then it would iterate. In the current method, the key/value pair for each 
> file handle must store a copy of the filename string as the key, even for 
> duplicates. With the intermediary structure, it would store the filename once 
> per unique filename.
> It also looks like the LRU list would benefit from being a Boost intrusive 
> list ([https://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html]). Every 
> file handle is always in the LRU list, so a std::list has a higher memory 
> overhead and requires more memory accesses. It also complicates the code, 
> because the FileHandleEntry needs to store a LruListType::iterator to its 
> location in the LRU list.
> These optimizations are low priority, but they provide good ramp-up for some 
> C++ concepts/APIs.
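
The intermediary-structure idea can be sketched in Python, with a dict standing in for the hash table. This is an illustration under assumptions, not the C++ FileHandleCache: the method names are invented, and the LRU list is not modelled.

```python
from collections import defaultdict


class FileHandleCache(object):
    """Toy model: one hash-table entry per filename, holding all of its handles."""

    def __init__(self):
        # filename -> list of [handle, in_use] entries. The filename is stored
        # once per unique file rather than once per handle.
        self._by_file = defaultdict(list)

    def add(self, filename, handle):
        self._by_file[filename].append([handle, False])

    def acquire(self, filename):
        """Return a free handle for filename, or None if none is free."""
        # Average O(1) lookup by filename, then a scan of that file's handles.
        for entry in self._by_file.get(filename, ()):
            if not entry[1]:
                entry[1] = True
                return entry[0]
        return None

    def release(self, filename, handle):
        """Mark a previously acquired handle as free again."""
        for entry in self._by_file.get(filename, ()):
            if entry[0] is handle:
                entry[1] = False
                return


cache = FileHandleCache()
cache.add("/t/part-0", "handle-a")
cache.add("/t/part-0", "handle-b")
first = cache.acquire("/t/part-0")
second = cache.acquire("/t/part-0")
third = cache.acquire("/t/part-0")   # both handles are in use at this point
cache.release("/t/part-0", first)
again = cache.acquire("/t/part-0")
```

In C++ the same shape would be an unordered_map from filename to a per-file container of handles, as the description suggests.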


