[jira] [Assigned] (IMPALA-8108) Impala query returns TIMESTAMP values in different types
[ https://issues.apache.org/jira/browse/IMPALA-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Zhang reassigned IMPALA-8108:

Assignee: (was: Robbie Zhang)

> Impala query returns TIMESTAMP values in different types
>
> Key: IMPALA-8108
> URL: https://issues.apache.org/jira/browse/IMPALA-8108
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Robbie Zhang
> Priority: Major
>
> When the fractional part of a timestamp is all zeros (.0, .00 or .000), the
> timestamp is displayed without fractional seconds. For example:
> {code:java}
> select cast(ts as timestamp) from
> (values
> ('2019-01-11 10:40:18' as ts),
> ('2019-01-11 10:40:19.0'),
> ('2019-01-11 10:40:19.00'),
> ('2019-01-11 10:40:19.000'),
> ('2019-01-11 10:40:19.'),
> ('2019-01-11 10:40:19.0'),
> ('2019-01-11 10:40:19.00'),
> ('2019-01-11 10:40:19.000'),
> ('2019-01-11 10:40:19.'),
> ('2019-01-11 10:40:19.0'),
> ('2019-01-11 10:40:19.1')
> ) t;{code}
> The output is:
> {code:java}
> +-----------------------+
> | cast(ts as timestamp) |
> +-----------------------+
> | 2019-01-11 10:40:18   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19.1 |
> +-----------------------+
> {code}
> As we can see, values of the same column are returned in two different formats.
> The inconsistency breaks some downstream use cases.
> The reason is that Impala uses the function
> boost::posix_time::to_simple_string(time_duration) to convert a timestamp to a
> string, and to_simple_string() removes the fractional seconds if they are all
> zeros. Perhaps we can append ".0" if the length of the string is 8
> (HH:MM:SS).
> For now we can work around it by using the function from_timestamp(ts,
> '-mm-dd hh:mm.ss.s') to unify the output (convert to string), or
> by using the function millisecond(ts) to get the fractional seconds.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
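The suggestion in IMPALA-8108 above (append ".0" when the formatted time is exactly HH:MM:SS) can be illustrated with a small sketch. Impala's real conversion is C++ code built on boost, so the Java class and method names below (TimeOfDayFormat, withFraction) are hypothetical and only show the idea, not the eventual fix.

```java
/** Hypothetical helper illustrating the ".0" suffix idea; this is not Impala code. */
final class TimeOfDayFormat {
    /** Length of a time-of-day string without fractional seconds, i.e. "HH:MM:SS". */
    private static final int HH_MM_SS_LENGTH = 8;

    private TimeOfDayFormat() {}

    /**
     * Returns the string unchanged if it already carries fractional seconds,
     * and appends ".0" if it is exactly HH:MM:SS long.
     */
    static String withFraction(String timeOfDay) {
        return timeOfDay.length() == HH_MM_SS_LENGTH ? timeOfDay + ".0" : timeOfDay;
    }

    public static void main(String[] args) {
        System.out.println(withFraction("10:40:19"));
        System.out.println(withFraction("10:40:19.1"));
    }
}
```

With such a rule every value in the column would carry a fractional part, so downstream consumers would see one format only.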
[jira] [Commented] (IMPALA-9375) Remove DirectMetaProvider usage from CatalogMetaProvider
[ https://issues.apache.org/jira/browse/IMPALA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315220#comment-17315220 ]

Robbie Zhang commented on IMPALA-9375:

[~vihangk1], besides DirectMetaProvider, there is one more HMS connection in each executor:
[https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/fe/src/main/java/org/apache/impala/service/Frontend.java#L348]
Should we remove this connection as well?

> Remove DirectMetaProvider usage from CatalogMetaProvider
>
> Key: IMPALA-9375
> URL: https://issues.apache.org/jira/browse/IMPALA-9375
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Affects Versions: Impala 3.4.0
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Critical
>
> I see that CatalogMetaProvider uses {{DirectMetaProvider}} here
> https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L239
> There are only a couple of places where it is used within
> CatalogMetaProvider. We should implement those remaining APIs in catalog-v2
> mode and remove the usage of DirectMetaProvider from CatalogMetaProvider.
> By default, DirectMetaProvider starts a MetastoreClientPool (with 10
> connections). This is unnecessary given that catalog already makes the
> connections to HMS at its startup. It also slows down the coordinator startup
> time if there are HMS connection issues.
[jira] [Commented] (IMPALA-10082) Concurrent invalidate metadata and create/drop table cause discrepancy in metadata
[ https://issues.apache.org/jira/browse/IMPALA-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182913#comment-17182913 ] Robbie Zhang commented on IMPALA-10082: --- Not sure how LocalCatalog is affected yet. But I believe I found a bug in CatalogServiceCatalog.java. Let's take [removeTable|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2076-L2078] for example. {code:java} public Table removeTable(String dbName, String tblName) { Db parentDb = getDb(dbName); if (parentDb == null) return null; versionLock_.writeLock().lock(); try { Table removedTable = parentDb.removeTable(tblName); {code} The method getDb is called before the lock is occupied. If the thread which is processing a global invalidate metadata occupies the lock at that time, parentDb will be stale. In other words, the table is removed from HMS but not removed from the latest database object in catalogd. Then we can't create a table with the same name until we run invalidate metadata again. The method addFunction has the same issue. > Concurrent invalidate metadata and create/drop table cause discrepancy in > metadata > -- > > Key: IMPALA-10082 > URL: https://issues.apache.org/jira/browse/IMPALA-10082 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.0 >Reporter: Robbie Zhang >Priority: Major > > The symptom is similar to IMPALA-7093 but is a different issue. 
> Here is how I reproduce it:
> 1) Ran the first script to keep running create/insert/drop queries
> {code:java}
> #!/bin/bash
> while [ 1 ]
> do
>   shell/impala-shell -q "create table if not exists test(i int); insert into test(i) values(1); drop table test;" 2>&1| tee test.output
>   # Stop as soon as the output contains an exception.
>   n=`egrep "Exception" test.output | wc -l`
>   if [ $n -gt 0 ]; then
>     rm -f /tmp/testing
>     exit
>   fi
> done
> {code}
> 2) Ran the second script to keep running global invalidate metadata
> {code:java}
> #!/bin/bash
> while [ 1 ]
> do
>   shell/impala-shell -q "invalidate metadata"
> done
> {code}
> Some time later, the first script ended with "Table default.test does not
> exist":
> {code:java}
> Starting Impala Shell with no authentication using Python 2.7.12
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
> Query: create table if not exists test(i int)
> Fetched 1 row(s) in 0.01s
> Query: insert into test(i) values(1)
> Query submitted at: 2020-08-13 22:57:51 (Coordinator: http://impala34:25000)
> ERROR: AnalysisException: Table does not exist: default.test
> Could not execute command: insert into test(i) values(1){code}
> Even after I switched to local catalog mode, the issue still occurs:
> {code:java}
> Starting Impala Shell with no authentication using Python 2.7.12
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
> Query: create table if not exists test(i int)
> Fetched 1 row(s) in 0.07s
> Query: insert into test(i) values(1)
> Query submitted at: 2020-08-13 22:10:16 (Coordinator: http://impala34:25000)
> ERROR: AnalysisException: org.apache.impala.catalog.TableLoadingException: Could not load table default.test from catalog
> CAUSED BY: TableLoadingException: Could not load table default.test from catalog
> CAUSED BY: TException: TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, error_msgs:[TableLoadingException: Table default.test no longer exists in the Hive MetaStore. Run 'invalidate metadata default.test' to update the Impala catalog.]), lookup_status:OK)
> Could not execute command: insert into test(i) values(1)
> {code}
> In local catalog mode, the newly created table was lost but remained
> visible in the coordinator. After running 'invalidate metadata default.test',
> the table disappeared entirely.
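The race described in the IMPALA-10082 comment above (getDb is called before the write lock is taken) can be reproduced deterministically in a small model. This is only a sketch under stated assumptions: StaleDbSketch, its Db class, reset() and the Runnable hook are all hypothetical stand-ins, and reset() simply copies the table names into a fresh Db object to mimic a global invalidate that replaces the database object. It is not the real CatalogServiceCatalog code.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified, hypothetical model of the catalog; not the real CatalogServiceCatalog. */
final class StaleDbSketch {
    /** Stand-in for a database object that owns a set of table names. */
    static final class Db {
        final Set<String> tables = ConcurrentHashMap.newKeySet();
    }

    private final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock(true);
    private final Map<String, Db> dbs = new ConcurrentHashMap<>();

    Db getDb(String dbName) { return dbs.get(dbName); }

    void addTable(String dbName, String tblName) {
        versionLock.writeLock().lock();
        try {
            dbs.computeIfAbsent(dbName, k -> new Db()).tables.add(tblName);
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    /** Models a global invalidate: every Db object is replaced by a fresh copy. */
    void reset() {
        versionLock.writeLock().lock();
        try {
            dbs.replaceAll((name, old) -> {
                Db fresh = new Db();
                fresh.tables.addAll(old.tables);
                return fresh;
            });
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    /** Race-prone shape: the Db reference is read before the lock is taken. */
    boolean removeTableUnsafe(String dbName, String tblName, Runnable betweenLookupAndLock) {
        Db parentDb = getDb(dbName);
        if (parentDb == null) return false;
        betweenLookupAndLock.run(); // lets the demo interleave a reset() at this point
        versionLock.writeLock().lock();
        try {
            return parentDb.tables.remove(tblName); // may act on a stale Db object
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    /** Safer shape: the Db is looked up only while the write lock is held. */
    boolean removeTable(String dbName, String tblName) {
        versionLock.writeLock().lock();
        try {
            Db parentDb = getDb(dbName);
            return parentDb != null && parentDb.tables.remove(tblName);
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    boolean hasTable(String dbName, String tblName) {
        Db db = getDb(dbName);
        return db != null && db.tables.contains(tblName);
    }

    public static void main(String[] args) {
        StaleDbSketch catalog = new StaleDbSketch();
        catalog.addTable("default", "test");
        catalog.removeTableUnsafe("default", "test", catalog::reset);
        System.out.println("visible after unsafe remove: " + catalog.hasTable("default", "test"));
        catalog.removeTable("default", "test");
        System.out.println("visible after locked remove: " + catalog.hasTable("default", "test"));
    }
}
```

In the unsafe variant the table is removed from the old Db object only, so the current Db still lists it, which matches the symptom that a table with the same name cannot be created until the next invalidate metadata. Moving the lookup inside the locked region is one possible remedy; whether that is the fix eventually chosen for Impala is not stated in the thread.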
[jira] [Created] (IMPALA-10082) Concurrent invalidate metadata and create/drop table cause discrepancy in metadata
Robbie Zhang created IMPALA-10082:

Summary: Concurrent invalidate metadata and create/drop table cause discrepancy in metadata
Key: IMPALA-10082
URL: https://issues.apache.org/jira/browse/IMPALA-10082
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Robbie Zhang

The symptom is similar to IMPALA-7093 but is a different issue. Here is how I reproduce it:

1) Ran the first script to keep running create/insert/drop queries
{code:java}
#!/bin/bash
while [ 1 ]
do
  shell/impala-shell -q "create table if not exists test(i int); insert into test(i) values(1); drop table test;" 2>&1| tee test.output
  # Stop as soon as the output contains an exception.
  n=`egrep "Exception" test.output | wc -l`
  if [ $n -gt 0 ]; then
    rm -f /tmp/testing
    exit
  fi
done
{code}
2) Ran the second script to keep running global invalidate metadata
{code:java}
#!/bin/bash
while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata"
done
{code}
Some time later, the first script ended with "Table default.test does not exist":
{code:java}
Starting Impala Shell with no authentication using Python 2.7.12
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
Query: create table if not exists test(i int)
Fetched 1 row(s) in 0.01s
Query: insert into test(i) values(1)
Query submitted at: 2020-08-13 22:57:51 (Coordinator: http://impala34:25000)
ERROR: AnalysisException: Table does not exist: default.test
Could not execute command: insert into test(i) values(1){code}
Even after I switched to local catalog mode, the issue still occurs:
{code:java}
Starting Impala Shell with no authentication using Python 2.7.12
Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
Query: create table if not exists test(i int)
Fetched 1 row(s) in 0.07s
Query: insert into test(i) values(1)
Query submitted at: 2020-08-13 22:10:16 (Coordinator: http://impala34:25000)
ERROR: AnalysisException: org.apache.impala.catalog.TableLoadingException: Could not load table default.test from catalog
CAUSED BY: TableLoadingException: Could not load table default.test from catalog
CAUSED BY: TException: TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, error_msgs:[TableLoadingException: Table default.test no longer exists in the Hive MetaStore. Run 'invalidate metadata default.test' to update the Impala catalog.]), lookup_status:OK)
Could not execute command: insert into test(i) values(1)
{code}
In local catalog mode, the newly created table was lost but remained visible in the coordinator. After running 'invalidate metadata default.test', the table disappeared entirely.
[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868173#comment-16868173 ]

Robbie Zhang commented on IMPALA-7093:

Hi [~bharathv], thanks for your comment. My fix only changes the behavior within one catalog update. The risks are:
1) The ImpaladCatalog keeps the removed table/function objects
2) The ImpaladCatalog doesn't replace stale table/function objects

For 1), I think it's all right unless for some reason the catalogd doesn't include the removed table/function objects in deleteLog_. For 2), however, I did find a scenario in which it could happen. For example, when the catalogd is restarted while the impala daemons are running, the catalog object versions are reset and might be lower than the versions of the objects in the impala daemons. That will definitely break my fix.

So I came up with an idea to improve my fix. It's not so smart but it should work. I think my fix can be improved as follows:
a) [ImpaladCatalog.addCatalogObject|https://github.com/apache/impala/blob/30c3cd95a42cacbfa2dbb0b29a4757745af942c3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L291] removes the existing table object first and then adds it back, just as we do for function objects;
b) ImpaladCatalog.addCatalogObject adds the names of all updated table/function objects to a list or map, and [ImpaladCatalog.updateCatalog|https://github.com/apache/impala/blob/30c3cd95a42cacbfa2dbb0b29a4757745af942c3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L226] removes all table/function objects which are not in the list or map.

With the above improvement, I believe the fix has no side effects as long as the performance is acceptable. What do you think?
> Tables briefly appear to not exist after INVALIDATE METADATA or catalog > restart > --- > > Key: IMPALA-7093 > URL: https://issues.apache.org/jira/browse/IMPALA-7093 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.12.0, Impala 2.13.0 >Reporter: Todd Lipcon >Priority: Major > > I'm doing some stress testing of Impala 2.13 (recent snapshot build) and hit > the following sequence: > {code} > {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3} > {"type": "response", "id": 3, "results": [["t1"]]} > {"query": "INVALIDATE METADATA", "type": "call", "id": 7} > {"type": "response", "id": 7} > {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9} > {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve > path: 'consistency_test.t1'\n"} > {code} > i.e. 'SHOW TABLES' shows that a table exists, but then shortly after an > INVALIDATE METADATA, an attempt to describe a table indicates that the table > does not exist. This is a single-threaded test case against a single impalad. > I also saw a similar behavior that issuing queries to an impalad shortly > after a catalogd restart could transiently show tables not existing that in > fact exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
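The improvement proposed in the IMPALA-7093 comment above (steps a and b) can be sketched in a few lines. This is a minimal model, not Impala code: UpdateTrackingSketch, finishUpdate and versionOf are hypothetical names, tables are reduced to a name-to-version map, and the sketch assumes that one update mentions every table that should survive in the database (for example after an invalidate or a catalogd restart). Without that assumption the pruning step would drop valid tables.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of one database's tables as seen by an impalad. */
final class UpdateTrackingSketch {
    private final Map<String, Long> tableVersions = new HashMap<>();
    private final Set<String> updatedInThisRound = new HashSet<>();

    /** Step a): replace the object regardless of its version and remember its name. */
    void addCatalogObject(String tableName, long version) {
        tableVersions.remove(tableName);
        tableVersions.put(tableName, version);
        updatedInThisRound.add(tableName);
    }

    /** Step b): at the end of the update, drop every table the update did not mention. */
    void finishUpdate() {
        tableVersions.keySet().retainAll(updatedInThisRound);
        updatedInThisRound.clear();
    }

    /** Returns the stored version, or null if the table is unknown. */
    Long versionOf(String tableName) { return tableVersions.get(tableName); }

    public static void main(String[] args) {
        UpdateTrackingSketch db = new UpdateTrackingSketch();
        db.addCatalogObject("t1", 50L);
        db.addCatalogObject("t2", 50L);
        db.finishUpdate();
        // After a catalogd restart the versions start over and t2 no longer exists.
        db.addCatalogObject("t1", 3L);
        db.addCatalogObject("t3", 4L);
        db.finishUpdate();
        System.out.println(db.versionOf("t1") + " " + db.versionOf("t2") + " " + db.versionOf("t3"));
    }
}
```

Because objects are replaced unconditionally, a lower version number after a catalogd restart no longer leaves a stale object in place, and the pruning step covers the case where a removed object never shows up in the delete log. The cost is the extra bookkeeping per update, which is the performance concern raised in the comment.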
[jira] [Comment Edited] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862718#comment-16862718 ]

Robbie Zhang edited comment on IMPALA-7093 at 6/19/19 1:48 PM:

Thank you, [~tarmstrong]!

I found that the problem is in [ImpaladCatalog.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191]. This function always adds top-level catalog objects first. When we run 'invalidate metadata', it adds the database objects and then the table/view/function objects. But at that time the new database objects are empty, with no table/view/function objects in them. After the function [ImpaladCatalog.addDb()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L391] replaces the existing database objects with the new database objects, the existing table/view/function objects are lost until ImpaladCatalog.updateCatalog() adds these objects back. If the impalad compiles a query while the table/view/function objects are missing, the query will fail with AnalysisException. The error message varies with the type of query. For example, for 'desc table' we can see 'Could not resolve path', for 'select * from table' we can see 'Could not resolve table reference', for 'insert into table' we can see 'Table does not exist', etc.

I can reproduce this issue by running two scripts. The first script keeps running 'invalidate metadata':
{code:java}
#!/bin/bash
while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata"
done
{code}
After I start the first script, I run the second script, which keeps running a query:
{code:java}
#!/bin/bash
while [ 1 ]
do
  #shell/impala-shell -q "desc test" 2>&1| tee test.output
  #shell/impala-shell -q "select * from test" 2>&1| tee test.output
  shell/impala-shell -q "insert overwrite test(i) values(1)" 2>&1| tee test.output
  n=`egrep "Fetched |Modified " test.output | wc -l`
  if [ $n -lt 1 ]; then
    exit
  fi
done{code}
The more table/view/function objects there are, the longer the objects are missing, and the more easily the second script hits the AnalysisException. I created thousands of tables on my cluster. Sometimes the second script hits the AnalysisException within a couple of minutes, while sometimes it takes nearly half an hour. Either way, it's repeatable.

I changed ImpaladCatalog.java as follows. So far, I haven't seen the AnalysisException again. The issue seems to be gone.
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
index 13cb620..23a7d68 100644
--- a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
+++ b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
@@ -20,6 +20,8 @@ package org.apache.impala.catalog;
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Set;
+import java.util.Map;
+import java.util.List;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.AtomicReference;
 
@@ -388,6 +390,19 @@ public class ImpaladCatalog extends Catalog implements FeCatalog {
         existingDb.getCatalogVersion() < catalogVersion) {
       Db newDb = Db.fromTDatabase(thriftDb);
       newDb.setCatalogVersion(catalogVersion);
+      if (existingDb != null) {
+        // Migrate all existing tables/views/functions to newDb. Otherwise they
+        // will disappear temporarily.
+        for (Table tbl: existingDb.getTables()) {
+          newDb.addTable(tbl);
+        }
+        Map<String, List<Function>> functions = existingDb.getAllFunctions();
+        for (List<Function> fns: existingDb.getAllFunctions().values()) {
+          for (Function f: fns) {
+            newDb.addFunction(f);
+          }
+        }
+      }
       addDb(newDb);
       if (existingDb != null) {
         CatalogObjectVersionSet.INSTANCE.updateVersions(
{code}
Adding a lock to Catalog is another solution, but that change would be more complex. In my change, one possible problem is that if the new database object has fewer table/view/function objects than the existing database object, the deleted objects might be left in Catalog forever. According to my test, the deleted objects should be in sequencer.getDeletedObjects() and will be removed by [ImpaladCatalog.removeCatalogObject()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L229]. So I think my change is fine. Please correct me if I'm wrong.

was (Author: robbie):
Thank you, [~tarmstrong]! I find the problem is in [Catalogd.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191]. This function
[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867278#comment-16867278 ] Robbie Zhang commented on IMPALA-7093: -- I added a parameter --running-invalidate-metadata for tests/stress/concurrent_select.py to start a thread which keeps running 'invalidate metadata'. By running concurrent_select.py with "--running-invalidate-metadata=true", I can reproduce this issue easily: {code:java} # tests/stress/concurrent_select.py --minicluster-num-impalads 3 --max-queries=1000 --startup-queries-per-second=10 --tpch-db=tpch_parquet --running-invalidate-metadata=true Cluster Impalad Version Info: localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build ab5ee0b7857c6ad19f244dc308210f2809436684) Built on Tue Jun 18 05:04:27 PDT 2019 localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build ab5ee0b7857c6ad19f244dc308210f2809436684) Built on Tue Jun 18 05:04:27 PDT 2019 localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build ab5ee0b7857c6ad19f244dc308210f2809436684) Built on Tue Jun 18 05:04:27 PDT 2019 2019-06-18 06:24:39,494 27754 Thread-8 INFO:cluster[705]:Finding impalad binary location 2019-06-18 06:24:39,494 27754 Thread-7 INFO:cluster[705]:Finding impalad binary location 2019-06-18 06:24:39,494 27754 Thread-9 INFO:cluster[705]:Finding impalad binary location 2019-06-18 06:24:39,843 27754 MainThread INFO:queries[115]:Loading tpch queries 2019-06-18 06:24:39,843 27754 MainThread INFO:test_file_parser[336]:Loading tpch queries Using 25 queries 2019-06-18 06:24:39,865 27754 MainThread INFO:concurrent_select[1508]:Number of queries in the list: 25 Done | Active | Executing | Mem Lmt Ex | AC Reject | AC Timeout | Cancel | Err | Incorrect | Next Qry Mem Lmt | Tot Qry Mem Lmt | Tracked Mem | RSS Mem 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |0 | 0 | | 5 | 43 |28 | 0 | 0 | 0 | 3 | 0 | 0 | 92 |6802 | 617 |1964 8 | 75 |39 | 0 | 0 | 0 | 5 | 0 | 0 | 155 | 12286 |2693 |2521 Process Process-49: Traceback (most recent call last): 
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap self.run() File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run self._target(*self._args, **self._kwargs) File "tests/stress/concurrent_select.py", line 663, in _start_single_runner mesg=error_msg)) Exception: Query tpch_parquet_TPCH-Q16 ID None failed: AnalysisException: Could not resolve table reference: 'part' Aborting due to 1 successive errors encountered {code} After I changed ImpaladCatalogd.java, concurrent_select.py can execute 1 queries without exception: {code:java} 9998 | 2 | 2 | 0 | 0 | 0 | 1018 | 0 | 0 | 420 | 602 | | Query runner (16866) exited with exit code 0 Query runner (15888) exited with exit code 0 1 | 0 | 0 | 0 | 0 | 0 | 1018 | 0 | 0 | 420 | 0 | | 2019-06-18 23:11:38,444 15605 MainThread INFO:concurrent_select[844]:Test Duration: 12071 seconds {code} I also started another test in which concurrent_select.py starts 10 queries on another cluster. It's still in progress. Nearly 6 queries have been executed without exception so far. > Tables briefly appear to not exist after INVALIDATE METADATA or catalog > restart > --- > > Key: IMPALA-7093 > URL: https://issues.apache.org/jira/browse/IMPALA-7093 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.12.0, Impala 2.13.0 >Reporter: Todd Lipcon >Priority: Major > > I'm doing some stress testing of Impala 2.13 (recent snapshot build) and hit > the following sequence: > {code} > {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3} > {"type": "response", "id": 3, "results": [["t1"]]} > {"query": "INVALIDATE METADATA", "type": "call", "id": 7} > {"type": "response", "id": 7} > {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9} > {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve > path: 'consistency_test.t1'\n"} > {code} > i.e. 
'SHOW TABLES' shows that a table exists, but then shortly after an > INVALIDATE METADATA, an attempt to describe a table indicates that the table > does not exist. This is a single-threaded test case against a single impalad. > I also saw a similar behavior that issuing queries to an impalad shortly > after a catalogd
[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862718#comment-16862718 ]

Robbie Zhang commented on IMPALA-7093:

Thank you, [~tarmstrong]!

I found that the problem is in [ImpaladCatalog.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191]. This function always adds top-level catalog objects first. When we run 'invalidate metadata', it adds the database objects and then the table/view/function objects. But at that time the new database objects are empty, with no table/view/function objects in them. After the function [ImpaladCatalog.addDb()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L391] replaces the existing database objects with the new database objects, the existing table/view/function objects are lost until ImpaladCatalog.updateCatalog() adds these objects back. If the impalad compiles a query while the table/view/function objects are missing, the query will fail with AnalysisException. The error message varies with the type of query. For example, for 'desc table' we can see 'Could not resolve path', for 'select * from table' we can see 'Could not resolve table reference', for 'insert into table' we can see 'Table does not exist', etc.

I can reproduce this issue by running two scripts. The first script keeps running 'invalidate metadata':
{code:java}
#!/bin/bash
while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata"
done
{code}
After I start the first script, I run the second script, which keeps running a query:
{code:java}
#!/bin/bash
while [ 1 ]
do
  #shell/impala-shell -q "desc test" 2>&1| tee test.output
  #shell/impala-shell -q "select * from test" 2>&1| tee test.output
  shell/impala-shell -q "insert overwrite test(i) values(1)" 2>&1| tee test.output
  n=`egrep "Fetched |Modified " test.output | wc -l`
  if [ $n -lt 1 ]; then
    exit
  fi
done{code}
The more table/view/function objects there are, the longer the objects are missing, and the more easily the second script hits the AnalysisException. I created thousands of tables on my cluster. Sometimes the second script hits the AnalysisException within a couple of minutes, while sometimes it takes nearly half an hour. Either way, it's repeatable.

I changed ImpaladCatalog.java as follows. So far, I haven't seen the AnalysisException again. The issue seems to be gone.
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
index 13cb620..23a7d68 100644
--- a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
+++ b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
@@ -20,6 +20,8 @@ package org.apache.impala.catalog;
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Set;
+import java.util.Map;
+import java.util.List;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.AtomicReference;
 
@@ -388,6 +390,19 @@ public class ImpaladCatalog extends Catalog implements FeCatalog {
         existingDb.getCatalogVersion() < catalogVersion) {
       Db newDb = Db.fromTDatabase(thriftDb);
       newDb.setCatalogVersion(catalogVersion);
+      if (existingDb != null) {
+        // Migrate all existing tables/views/functions to newDb. Otherwise they
+        // will disappear temporarily.
+        for (Table tbl: existingDb.getTables()) {
+          newDb.addTable(tbl);
+        }
+        Map<String, List<Function>> functions = existingDb.getAllFunctions();
+        for (List<Function> fns: existingDb.getAllFunctions().values()) {
+          for (Function f: fns) {
+            newDb.addFunction(f);
+          }
+        }
+      }
       addDb(newDb);
       if (existingDb != null) {
         CatalogObjectVersionSet.INSTANCE.updateVersions(
{code}
Adding a lock to Catalog is another solution, but that change would be more complex. In my change, one possible problem is that if the new database object has fewer table/view/function objects than the existing database object, the deleted objects might be left in Catalog forever. According to my test, the deleted objects should be in sequencer.getDeletedObjects() and will be removed by [ImpaladCatalog.removeCatalogObject()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L229]. So I think my change is fine. Please correct me if I'm wrong.

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog
> restart
>
> Key: IMPALA-7093
> URL: https://issues.apache.org/jira/browse/IMPALA-7093
> Project: IMPALA
> Issue Type:
[jira] [Commented] (IMPALA-4089) missing thrift function validation checks
[ https://issues.apache.org/jira/browse/IMPALA-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856388#comment-16856388 ]

Robbie Zhang commented on IMPALA-4089:
--
An empty catalog could cause this exception. Here is what I saw on a cluster:
{code:java}
E0604 07:59:21.678375 40945 MetaStoreUtils.java:1234] Got exception: org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve transation read-only status server
Java exception follows:
MetaException(message:Could not retrieve transation read-only status server)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_all_databases_result$get_all_databases_resultStandardScheme.read(ThriftHiveMetastore.java:18313)
  ...
  at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:108)
E0604 07:59:21.678470 40945 MetaStoreUtils.java:1235] Converting exception to MetaException
E0604 07:59:21.678840 40945 CatalogServiceCatalog.java:702] MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve transation read-only status server)
E0604 07:59:21.679153 40945 JniCatalog.java:110] Error initializing Catalog. Please run 'invalidate metadata'
Java exception follows:
org.apache.impala.catalog.CatalogException: Error initializing Catalog. Catalog may be empty.
  at org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:703)
  at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:108)
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve transation read-only status server)
  at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1236)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1055)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:101)
  at com.sun.proxy.$Proxy5.getAllDatabases(Unknown Source)
  at org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:686)
  ... 1 more
...
I0604 07:59:21.718371 41215 jni-util.cc:169] org.apache.thrift.protocol.TProtocolException: Required field 'update_fn_symbol' was not present! Struct: TAggregateFunction(intermediate_type:TColumnType(types:[TTypeNode(type:SCALAR, scalar_type:TScalarType(type:STRING))]), update_fn_symbol:null, init_fn_symbol:null, ignores_distinct:false)
  at org.apache.impala.thrift.TAggregateFunction.validate(TAggregateFunction.java:948)
  at org.apache.impala.thrift.TFunction.validate(TFunction.java:1164)
  at org.apache.impala.thrift.TCatalogObject.validate(TCatalogObject.java:1058)
  at org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1213)
  at org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1098)
  at org.apache.impala.thrift.TCatalogObject.write(TCatalogObject.java:938)
  at org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:487)
  at org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:421)
  at org.apache.impala.thrift.TGetAllCatalogObjectsResponse.write(TGetAllCatalogObjectsResponse.java:365)
  at org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
  at org.apache.impala.service.JniCatalog.getCatalogObjects(JniCatalog.java:124)
{code}
Here is the error from HMS log file:
{code:java}
2019-06-04 07:59:21,670 WARN org.apache.hadoop.hive.metastore.MetaStoreDirectSql: [pool-6-thread-49]: Database initialization failed; direct SQL is disabled
javax.jdo.JDODataStoreException: Could not retrieve transation read-only status server
  at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
  ...
java.sql.SQLException: Could not retrieve transation read-only status server
  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:998)
  ...
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 561,221 milliseconds ago. The last packet sent successfully to the server was 561,221 milliseconds ago.
  at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
  ...
[jira] [Resolved] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang resolved IMPALA-8595. -- Resolution: Fixed Fix Version/s: Impala 3.3.0 > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > Fix For: Impala 3.3.0 > > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses > PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. > The fix for IMPALA-5775 doesn't work against thrift 0.9.3. 
So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
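The point about passing the SSL version to the constructor instead of overriding an attribute afterwards can be sketched as follows. This is a toy illustration only: BaseSocket, OverrideAfterInit and PassAsArgument are hypothetical classes written for this note, and they do not reproduce the real thrift TSSLSocket signature. The assumption is simply that the base class fixes the protocol inside its constructor.

```python
class BaseSocket(object):
    """Hypothetical stand-in for a socket class that picks its protocol in __init__."""

    DEFAULT_VERSION = "TLSv1"

    def __init__(self, ssl_version=None):
        # The protocol is decided here, from the constructor argument only.
        self.ssl_version = ssl_version or self.DEFAULT_VERSION


class OverrideAfterInit(BaseSocket):
    """Old approach: set a separate attribute that the base class never reads."""

    def __init__(self):
        BaseSocket.__init__(self)
        self.SSL_VERSION = "TLSv1.2"  # has no effect on self.ssl_version


class PassAsArgument(BaseSocket):
    """New approach: hand the desired version to the base constructor."""

    def __init__(self):
        BaseSocket.__init__(self, ssl_version="TLSv1.2")


old_style = OverrideAfterInit().ssl_version
new_style = PassAsArgument().ssl_version
```

In this model the first subclass still ends up with the default protocol, which corresponds to the failed handshake against a server that requires TLSv1.2.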
[jira] [Issue Comment Deleted] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang updated IMPALA-8595: - Comment: was deleted (was: IMPALA-8595: Support TLSv1.2 with Python < 2.7.9 in shell IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses PROTOCOL_TLSv1 by default and the SSL version is passed to TSSLSocket as a parameter when calling TSSLSocket.__init__. Although TLSv1.2 is supported by Python from 2.7.9, Red Hat/CentOS support TLSv1.2 from 2.7.5 with upgraded python-libs. We need to get impala-shell support TLSv1.2 with Python 2.7.5 on Red Hat/CentOS. TESTING: impala-py.test tests/custom_cluster/test_client_ssl.py Change-Id: I3fb6510f4b556bd8c6b1e86380379aba8be4b805 Reviewed-on: http://gerrit.cloudera.org:8080/13457 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins ) > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses > PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. > The fix for IMPALA-5775 doesn't work against thrift 0.9.3. 
So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855071#comment-16855071 ] Robbie Zhang commented on IMPALA-8595: -- IMPALA-8595: Support TLSv1.2 with Python < 2.7.9 in shell IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses PROTOCOL_TLSv1 by default and the SSL version is passed to TSSLSocket as a parameter when calling TSSLSocket.__init__. Although TLSv1.2 is supported by Python from 2.7.9, Red Hat/CentOS support TLSv1.2 from 2.7.5 with upgraded python-libs. We need to get impala-shell support TLSv1.2 with Python 2.7.5 on Red Hat/CentOS. TESTING: impala-py.test tests/custom_cluster/test_client_ssl.py Change-Id: I3fb6510f4b556bd8c6b1e86380379aba8be4b805 Reviewed-on: http://gerrit.cloudera.org:8080/13457 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses > PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
> The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang updated IMPALA-8595: - Description: IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses PROTOCOL_TLSv1 by default: {code:java} # For pythoon >= 2.7.9, use latest TLS that both client and server supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1 {code} And the SSL version should be passed as an argument to TSSLSocket.__init__ instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad: {code:java} # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem SSL is enabled No handlers could be found for logger "thrift.transport.TSSLSocket" Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579) {code} was: IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses PROTOCOL_TLSv1 by default: {code:java} # For pythoon >= 2.7.9, use latest TLS that both client and server supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are unavailable. 
_default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1 {code} And the SSL version should be passed as an argument to TSSLSocket.__init__ instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad: {code:java} # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem SSL is enabled No handlers could be found for logger "thrift.transport.TSSLSocket" Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579) {code} > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses > PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. > The fix for IMPALA-5775 doesn't work against thrift 0.9.3. 
So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang updated IMPALA-8595: - Description: IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses PROTOCOL_TLSv1 by default: {code:java} # For pythoon >= 2.7.9, use latest TLS that both client and server supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1 {code} And the SSL version should be passed as an argument to TSSLSocket.__init__ instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad: {code:java} # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem SSL is enabled No handlers could be found for logger "thrift.transport.TSSLSocket" Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579) {code} was: IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses **PROTOCOL_TLSv1 by default: {code:java} # For pythoon >= 2.7.9, use latest TLS that both client and server supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are unavailable. 
_default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1 {code} And the SSL version should be passed as an argument to TSSLSocket.__init__ instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad: {code:java} # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem SSL is enabled No handlers could be found for logger "thrift.transport.TSSLSocket" Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579) {code} > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses > PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. > The fix for IMPALA-5775 doesn't work against thrift 0.9.3. 
So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
[ https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang reassigned IMPALA-8595: Assignee: Robbie Zhang > THRIFT-3505 breaks IMPALA-5775 > -- > > Key: IMPALA-8595 > URL: https://issues.apache.org/jira/browse/IMPALA-8595 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed > transport/TSSLSocket.py. > In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses > **PROTOCOL_TLSv1 by default: > {code:java} > # For pythoon >= 2.7.9, use latest TLS that both client and server supports. > # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. > # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are > unavailable. > _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else > ssl.PROTOCOL_TLSv1 > {code} > And the SSL version should be passed as an argument to TSSLSocket.__init__ > instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. > The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use > python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) > and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to > impalad: > > {code:java} > # impala-shell -i impalad01.example.com > -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem > SSL is enabled > No handlers could be found for logger "thrift.transport.TSSLSocket" > Error connecting: TTransportException, Could not connect to > impalad01.example.com:21000: EOF occurred in violation of protocol > (_ssl.c:579) > {code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775
Robbie Zhang created IMPALA-8595: Summary: THRIFT-3505 breaks IMPALA-5775 Key: IMPALA-8595 URL: https://issues.apache.org/jira/browse/IMPALA-8595 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.1.0 Reporter: Robbie Zhang IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505 changed transport/TSSLSocket.py. In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses **PROTOCOL_TLSv1 by default: {code:java} # For pythoon >= 2.7.9, use latest TLS that both client and server supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1 {code} And the SSL version should be passed as an argument to TSSLSocket.__init__ instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad: {code:java} # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem SSL is enabled No handlers could be found for logger "thrift.transport.TSSLSocket" Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8108) Impala query returns TIMESTAMP values in different types
[ https://issues.apache.org/jira/browse/IMPALA-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robbie Zhang reassigned IMPALA-8108: Assignee: Robbie Zhang > Impala query returns TIMESTAMP values in different types > > > Key: IMPALA-8108 > URL: https://issues.apache.org/jira/browse/IMPALA-8108 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Robbie Zhang >Assignee: Robbie Zhang >Priority: Major > > When a timestamp has a .000 or .00 or .0 (when fraction value is > zeros) the timestamp is displayed with no fraction of second. For example: > {code:java} > select cast(ts as timestamp) from > (values > ('2019-01-11 10:40:18' as ts), > ('2019-01-11 10:40:19.0'), > ('2019-01-11 10:40:19.00'), > ('2019-01-11 10:40:19.000'), > ('2019-01-11 10:40:19.'), > ('2019-01-11 10:40:19.0'), > ('2019-01-11 10:40:19.00'), > ('2019-01-11 10:40:19.000'), > ('2019-01-11 10:40:19.'), > ('2019-01-11 10:40:19.0'), > ('2019-01-11 10:40:19.1') > ) t;{code} > The output is: > {code:java} > +---+ > |cast(ts as timestamp)| > +---+ > |2019-01-11 10:40:18| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19| > |2019-01-11 10:40:19.1| > +---+ > {code} > As we can see, values of the same column are returned in two different types. > The inconsistency breaks some downstream use cases. > The reason is that impala uses function > boost::posix_time::to_simple_string(time_duration) to convert timestamp to a > string and to_simple_string() remove fractional seconds if they are all > zeros. Perhaps we can append ".0" if the length of the string is 8 > (HH:MM:SS). > For now we can work around it by using function from_timestamp(ts, > '-mm-dd hh:mm.ss.s') to unify the output (convert to string), or > using function millisecond(ts) to get fractional seconds. 
[jira] [Created] (IMPALA-8108) Impala query returns TIMESTAMP values in different types
Robbie Zhang created IMPALA-8108:

Summary: Impala query returns TIMESTAMP values in different types
Key: IMPALA-8108
URL: https://issues.apache.org/jira/browse/IMPALA-8108
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Robbie Zhang

When the fractional part of a timestamp is all zeros (.0, .00, .000 and so on), the timestamp is displayed without fractional seconds. For example:
{code:java}
select cast(ts as timestamp) from
(values
('2019-01-11 10:40:18' as ts),
('2019-01-11 10:40:19.0'),
('2019-01-11 10:40:19.00'),
('2019-01-11 10:40:19.000'),
('2019-01-11 10:40:19.'),
('2019-01-11 10:40:19.0'),
('2019-01-11 10:40:19.00'),
('2019-01-11 10:40:19.000'),
('2019-01-11 10:40:19.'),
('2019-01-11 10:40:19.0'),
('2019-01-11 10:40:19.1')
) t;
{code}
The output is:
{code:java}
+-----------------------+
| cast(ts as timestamp) |
+-----------------------+
| 2019-01-11 10:40:18   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19.1 |
+-----------------------+
{code}
As we can see, values in the same column are returned in two different forms, with and without fractional seconds. The inconsistency breaks some downstream use cases.

The reason is that Impala uses boost::posix_time::to_simple_string(time_duration) to convert the timestamp to a string, and to_simple_string() removes the fractional seconds if they are all zeros. Perhaps we can append ".0" if the length of the string is 8 (HH:MM:SS).

For now we can work around it by using from_timestamp(ts, '-mm-dd hh:mm.ss.s') to unify the output (converting it to a string), or by using millisecond(ts) to get the fractional seconds.
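The suggested fix (append ".0" when the time-of-day string is exactly 8 characters long) can be sketched on the client side as below. This is a minimal Python illustration and not the Impala backend change; the function names pad_fraction and normalize_timestamp are made up here, and the input is assumed to be in the "date time" layout shown in the output above.

```python
def pad_fraction(time_of_day):
    """Append ".0" when the string is exactly HH:MM:SS (8 characters)."""
    if len(time_of_day) == 8:
        return time_of_day + ".0"
    return time_of_day


def normalize_timestamp(ts):
    """Give every 'YYYY-MM-DD HH:MM:SS[.fraction]' string a fractional part."""
    date_part, sep, time_part = ts.partition(" ")
    if not sep:
        # No time-of-day component; leave the value untouched.
        return ts
    return date_part + " " + pad_fraction(time_part)


rows = ["2019-01-11 10:40:19", "2019-01-11 10:40:19.1"]
normalized = [normalize_timestamp(r) for r in rows]
```

A downstream consumer could apply such a step after fetching results until the output format is unified on the server side.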