[jira] [Assigned] (IMPALA-8108) Impala query returns TIMESTAMP values in different types

2021-09-08 Thread Robbie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang reassigned IMPALA-8108:


Assignee: (was: Robbie Zhang)

> Impala query returns TIMESTAMP values in different types
> 
>
> Key: IMPALA-8108
> URL: https://issues.apache.org/jira/browse/IMPALA-8108
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Robbie Zhang
>Priority: Major
>
> When the fractional seconds of a timestamp are all zeros (.0, .00 or .000), 
> the timestamp is displayed without any fractional part. For example:
> {code:java}
> select cast(ts as timestamp) from 
>  (values 
>  ('2019-01-11 10:40:18' as ts),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'), 
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'),
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.1')
>  ) t;{code}
> The output is:
> {code:java}
> +-----------------------+
> | cast(ts as timestamp) |
> +-----------------------+
> | 2019-01-11 10:40:18   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19.1 |
> +-----------------------+
> {code}
> As we can see, values in the same column are returned in two different 
> formats. The inconsistency breaks some downstream use cases. 
> The reason is that Impala uses the function 
> boost::posix_time::to_simple_string(time_duration) to convert a timestamp to a 
> string, and to_simple_string() omits the fractional seconds if they are all 
> zeros. Perhaps we can append ".0" if the length of the string is 8 
> (HH:MM:SS).
> For now we can work around it by using the function from_timestamp(ts, 
> '-mm-dd hh:mm.ss.s') to unify the output (convert to string), or by 
> using the function millisecond(ts) to get the fractional seconds.

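The suggestion in the description (append ".0" when the time-of-day string is exactly eight characters long) can be sketched as follows. This is only an illustration in Java: Impala's actual conversion is C++ code built on boost, and the class and method names here are hypothetical.

```java
public class TimestampFractionSketch {
    // Hypothetical helper, not Impala code: pads a time-of-day string that has
    // no fractional part ("HH:MM:SS", length 8) with ".0" so that every value
    // in a column carries a fractional component.
    static String withFraction(String timeOfDay) {
        return timeOfDay.length() == 8 ? timeOfDay + ".0" : timeOfDay;
    }

    public static void main(String[] args) {
        System.out.println(withFraction("10:40:19"));
        System.out.println(withFraction("10:40:19.1"));
    }
}
```

A length check is the cheapest test here, but it relies on the time part always being formatted as two-digit hours, minutes and seconds.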


--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9375) Remove DirectMetaProvider usage from CatalogMetaProvider

2021-04-05 Thread Robbie Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315220#comment-17315220
 ] 

Robbie Zhang commented on IMPALA-9375:
--

[~vihangk1], besides DirectMetaProvider, there is one more HMS connection in 
each executor:

[https://github.com/apache/impala/blob/1231208da7104c832c13f272d1e5b8f554d29337/fe/src/main/java/org/apache/impala/service/Frontend.java#L348]

Should we remove this connection as well?

> Remove DirectMetaProvider usage from CatalogMetaProvider
> 
>
> Key: IMPALA-9375
> URL: https://issues.apache.org/jira/browse/IMPALA-9375
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> I see that CatalogMetaProvider uses {{DirectMetaProvider}} here 
> https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L239
> There are only a couple of places where it is used within 
> CatalogMetaProvider. We should implement those remaining APIs in catalog-v2 
> mode and remove the usage of DirectMetaProvider from CatalogMetaProvider. 
> By default, DirectMetaProvider starts a MetastoreClientPool (with 10 
> connections). This is unnecessary given that the catalog already makes its 
> connections to HMS at startup. It also slows down coordinator startup 
> if there are HMS connection issues.






[jira] [Commented] (IMPALA-10082) Concurrent invalidate metadata and create/drop table cause discrepancy in metadata

2020-08-23 Thread Robbie Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17182913#comment-17182913
 ] 

Robbie Zhang commented on IMPALA-10082:
---

I am not yet sure how LocalCatalog is affected, but I believe I found a bug in 
CatalogServiceCatalog.java.

Let's take 
[removeTable|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L2076-L2078]
 for example.
{code:java}
  public Table removeTable(String dbName, String tblName) {
    Db parentDb = getDb(dbName);
    if (parentDb == null) return null;
    versionLock_.writeLock().lock();
    try {
      Table removedTable = parentDb.removeTable(tblName);
{code}
 The method getDb is called before the lock is acquired. If the thread that is 
processing a global invalidate metadata holds the lock at that time, 
parentDb will be stale by the time removeTable obtains the lock. In other 
words, the table is removed from HMS but not removed from the latest database 
object in catalogd. We then can't create a table with the same name until we 
run invalidate metadata again.

 The method addFunction has the same issue.
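
One possible way to avoid the stale reference is to resolve the database only after the write lock is held. The sketch below is a simplified, self-contained model rather than Impala code: the Db class, the dbs map and the reset method are stand-ins invented for illustration, and the fix eventually applied to CatalogServiceCatalog may look different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockedLookupSketch {
    // Simplified stand-in for a catalog database: table name -> description.
    static final class Db {
        final Map<String, String> tables = new HashMap<>();
    }

    private final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock(true);
    private Map<String, Db> dbs = new HashMap<>();

    // Simulates a global invalidate: swaps in a fresh set of Db objects.
    void reset(Map<String, Db> freshDbs) {
        versionLock.writeLock().lock();
        try {
            dbs = freshDbs;
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    // Resolves the parent Db only after the write lock is held, so a reset
    // that ran while this thread was waiting cannot leave it with a stale Db.
    String removeTable(String dbName, String tblName) {
        versionLock.writeLock().lock();
        try {
            Db parentDb = dbs.get(dbName);
            if (parentDb == null) return null;
            return parentDb.tables.remove(tblName);
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedLookupSketch catalog = new LockedLookupSketch();
        Db db = new Db();
        db.tables.put("test", "test(i int)");
        Map<String, Db> fresh = new HashMap<>();
        fresh.put("default", db);
        catalog.reset(fresh);
        System.out.println(catalog.removeTable("default", "test"));
        System.out.println(catalog.removeTable("default", "test"));
    }
}
```

The same ordering (lock first, then look up) would apply to addFunction.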

> Concurrent invalidate metadata and create/drop table cause discrepancy in 
> metadata
> --
>
> Key: IMPALA-10082
> URL: https://issues.apache.org/jira/browse/IMPALA-10082
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Robbie Zhang
>Priority: Major
>
> The symptom is similar to IMPALA-7093 but is a different issue. Here is how I 
> reproduce it:
> 1) Ran the first script to keep running create/insert/drop queries
> {code:java}
> #!/bin/bash
> while [ 1 ]
> do
>   shell/impala-shell -q "create table if not exists test(i int); insert into test(i) values(1); drop table test;" 2>&1 | tee test.output
>   n=`egrep "Exception" test.output | wc -l`
>   # Exit once the output contains at least one exception.
>   if [ $n -gt 0 ]; then
>     rm -f /tmp/testing
>     exit
>   fi
> done
> {code}
> 2) Ran the second script to keep running global invalidate metadata
> {code:java}
> #!/bin/bash
> while [ 1 ]
> do
>   shell/impala-shell -q "invalidate metadata"
> done
> {code}
> Some time later, the first script ended with "Table default.test does not 
> exist":
> {code:java}
> Starting Impala Shell with no authentication using Python 2.7.12
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
> Query: create table if not exists test(i int)
> Fetched 1 row(s) in 0.01s
> Query: insert into test(i) values(1)
> Query submitted at: 2020-08-13 22:57:51 (Coordinator: http://impala34:25000)
> ERROR: AnalysisException: Table does not exist: default.test
> Could not execute command: insert into test(i) values(1){code}
> Even after I change to local catalog mode, this issue still exists:
> {code:java}
> Starting Impala Shell with no authentication using Python 2.7.12
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
> Query: create table if not exists test(i int)
> Fetched 1 row(s) in 0.07s
> Query: insert into test(i) values(1)
> Query submitted at: 2020-08-13 22:10:16 (Coordinator: http://impala34:25000)
> ERROR: AnalysisException: org.apache.impala.catalog.TableLoadingException: 
> Could not load table default.test from catalog
> CAUSED BY: TableLoadingException: Could not load table default.test from 
> catalog
> CAUSED BY: TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[TableLoadingException: Table default.test no longer exists in the 
> Hive MetaStore. Run 'invalidate metadata default.test' to update the Impala 
> catalog.]), lookup_status:OK)
> Could not execute command: insert into test(i) values(1)
> {code}
> And in local catalog mode, the newly created table was lost, but it was still 
> visible in the coordinator. After running 'invalidate metadata default.test', 
> the table disappeared completely.
>  






[jira] [Created] (IMPALA-10082) Concurrent invalidate metadata and create/drop table cause discrepancy in metadata

2020-08-13 Thread Robbie Zhang (Jira)
Robbie Zhang created IMPALA-10082:
-

 Summary: Concurrent invalidate metadata and create/drop table 
cause discrepancy in metadata
 Key: IMPALA-10082
 URL: https://issues.apache.org/jira/browse/IMPALA-10082
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 4.0
Reporter: Robbie Zhang


The symptom is similar to IMPALA-7093 but is a different issue. Here is how I 
reproduce it:

1) Ran the first script to keep running create/insert/drop queries
{code:java}
#!/bin/bash

while [ 1 ]
do
  shell/impala-shell -q "create table if not exists test(i int); insert into test(i) values(1); drop table test;" 2>&1 | tee test.output
  n=`egrep "Exception" test.output | wc -l`
  # Exit once the output contains at least one exception.
  if [ $n -gt 0 ]; then
    rm -f /tmp/testing
    exit
  fi
done
{code}
2) Ran the second script to keep running global invalidate metadata
{code:java}
#!/bin/bash

while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata"
done
{code}
Some time later, the first script ended with "Table default.test does not exist":
{code:java}
Starting Impala Shell with no authentication using Python 2.7.12
Warning: live_progress only applies to interactive shell sessions, and is being 
skipped for now.
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
Query: create table if not exists test(i int)
Fetched 1 row(s) in 0.01s
Query: insert into test(i) values(1)
Query submitted at: 2020-08-13 22:57:51 (Coordinator: http://impala34:25000)
ERROR: AnalysisException: Table does not exist: default.test


Could not execute command: insert into test(i) values(1){code}
Even after I change to local catalog mode, this issue still exists:
{code:java}
Starting Impala Shell with no authentication using Python 2.7.12
Warning: live_progress only applies to interactive shell sessions, and is being 
skipped for now.
Opened TCP connection to localhost:21000
Connected to localhost:21000
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
f95f7940e4a290d75ee85fd78e85bc26795f0f9f)
Query: create table if not exists test(i int)
Fetched 1 row(s) in 0.07s
Query: insert into test(i) values(1)
Query submitted at: 2020-08-13 22:10:16 (Coordinator: http://impala34:25000)
ERROR: AnalysisException: org.apache.impala.catalog.TableLoadingException: 
Could not load table default.test from catalog
CAUSED BY: TableLoadingException: Could not load table default.test from catalog
CAUSED BY: TException: 
TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
error_msgs:[TableLoadingException: Table default.test no longer exists in the 
Hive MetaStore. Run 'invalidate metadata default.test' to update the Impala 
catalog.]), lookup_status:OK)


Could not execute command: insert into test(i) values(1)
{code}
And in local catalog mode, the newly created table was lost, but it was still 
visible in the coordinator. After running 'invalidate metadata default.test', 
the table disappeared completely.

 






[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart

2019-06-19 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868173#comment-16868173
 ] 

Robbie Zhang commented on IMPALA-7093:
--

Hi [~bharathv] , thanks for your comment. 

My fix only changes the behavior within one catalog update. The risks are:

  1) The ImpaladCatalog keeps the removed table/function objects

  2) The ImpaladCatalog doesn't replace stale table/function objects

For 1), I think it's all right unless for some reason the catalogd doesn't 
include the removed table/function objects in deleteLog_. But for 2), I did 
find a scenario in which it could happen. For example, when the catalogd is 
restarted while the impala daemons are running, the catalog object versions are 
reset and might be lower than the versions of the objects in the impala 
daemons. That would definitely break my fix. So I came up with an idea to 
improve it. It's not especially clever, but it should work. I think my fix can 
be improved as follows:

  a) 
[ImpaladCatalog.addCatalogObject|https://github.com/apache/impala/blob/30c3cd95a42cacbfa2dbb0b29a4757745af942c3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L291]
 removes the existing table object first and then adds it back, just as we do 
for function objects;
  b) ImpaladCatalog.addCatalogObject adds the names of all updated 
table/function objects to a list or map, and 
[ImpaladCatalog.updateCatalog|https://github.com/apache/impala/blob/30c3cd95a42cacbfa2dbb0b29a4757745af942c3/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L226]
 removes all table/function objects that are not in the list or map.

 With the above improvement, I believe the fix has no side effects as long as 
the performance is acceptable. What do you think?
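
Steps a) and b) can be sketched roughly as below. This is a toy model, not the proposed patch: the cache is a plain map from object name to catalog version, the method name applyFullUpdate is hypothetical, and the sketch assumes the update lists every object that should exist afterwards (which would not hold for an incremental update).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PruneStaleSketch {
    // Applies one update to a local cache of object name -> catalog version.
    // Assumes `update` lists every object that should exist afterwards.
    static void applyFullUpdate(Map<String, Long> cache, Map<String, Long> update) {
        Set<String> updatedNames = new HashSet<>();
        for (Map.Entry<String, Long> e : update.entrySet()) {
            // a) Replace unconditionally instead of comparing versions, so a
            //    version counter reset by a restart cannot block the update.
            cache.remove(e.getKey());
            cache.put(e.getKey(), e.getValue());
            updatedNames.add(e.getKey());
        }
        // b) Drop every cached object that the update did not mention.
        cache.keySet().retainAll(updatedNames);
    }

    public static void main(String[] args) {
        Map<String, Long> cache = new HashMap<>();
        cache.put("default.t1", 100L);
        cache.put("default.dropped", 101L);
        Map<String, Long> update = new HashMap<>();
        update.put("default.t1", 5L); // lower version, as after a catalogd restart
        applyFullUpdate(cache, update);
        System.out.println(cache);
    }
}
```

Tracking the names in a set keeps the pruning pass linear in the size of the cache, which is relevant to the performance concern raised above.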

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog 
> restart
> ---
>
> Key: IMPALA-7093
> URL: https://issues.apache.org/jira/browse/IMPALA-7093
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0, Impala 2.13.0
>Reporter: Todd Lipcon
>Priority: Major
>
> I'm doing some stress testing of Impala 2.13 (recent snapshot build) and hit 
> the following sequence:
> {code}
>  {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
> {"type": "response", "id": 3, "results": [["t1"]]}
> {"query": "INVALIDATE METADATA", "type": "call", "id": 7}
> {"type": "response", "id": 7}
> {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
> {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve 
> path: 'consistency_test.t1'\n"}
> {code}
> i.e. 'SHOW TABLES' shows that a table exists, but then shortly after an 
> INVALIDATE METADATA, an attempt to describe a table indicates that the table 
> does not exist. This is a single-threaded test case against a single impalad.
> I also saw similar behavior where queries issued to an impalad shortly 
> after a catalogd restart could transiently show tables as not existing when 
> they in fact exist.






[jira] [Comment Edited] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart

2019-06-19 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862718#comment-16862718
 ] 

Robbie Zhang edited comment on IMPALA-7093 at 6/19/19 1:48 PM:
---

Thank you, [~tarmstrong]!

I found that the problem is in 
[ImpaladCatalog.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191].
 This function always adds top-level catalog objects first. When we run 
'invalidate metadata', it adds the database objects and then the 
table/view/function objects. But at that time the new database objects are 
empty, with no table/view/function objects in them. After the function 
[ImpaladCatalog.addDb()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L391]
 replaces the existing database objects with the new database objects, the 
existing table/view/function objects are lost until 
ImpaladCatalog.updateCatalog() adds these objects back. If the impalad compiles 
a query while the table/view/function objects are missing, the query fails with 
an AnalysisException. The error message varies with the type of query. 
For example, for 'desc table' we see 'Could not resolve path', for 'select 
* from table' we see 'Could not resolve table reference', for 'insert into 
table' we see 'Table does not exist', etc.

I can reproduce this issue by running two scripts. The first script keeps 
running 'invalidate metadata':
{code:java}
#!/bin/bash

while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata" 
done

{code}
After I start the first script, I run the second script which keeps running a 
query: 
{code:java}
#!/bin/bash

while [ 1 ]
do
  #shell/impala-shell -q "desc test" 2>&1| tee test.output
  #shell/impala-shell -q "select * from test" 2>&1| tee test.output
  shell/impala-shell -q "insert overwrite test(i) values(1)" 2>&1| tee test.output
  n=`egrep "Fetched |Modified " test.output | wc -l`
  if [ $n -lt 1 ]; then
    exit
  fi
done{code}
The more table/view/function objects there are, the longer the objects are 
missing, and the more easily the second script hits an AnalysisException. I 
created thousands of tables on my cluster. Sometimes the second script hits an 
AnalysisException within a couple of minutes, while sometimes it takes nearly 
half an hour. Either way, it's repeatable.

I changed ImpaladCatalog.java as follows. So far, I haven't seen the 
AnalysisException again. It seems the issue is gone.
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
index 13cb620..23a7d68 100644
--- a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
+++ b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
@@ -20,6 +20,8 @@ package org.apache.impala.catalog;
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Set;
+import java.util.Map;
+import java.util.List;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.AtomicReference;
 
@@ -388,6 +390,19 @@ public class ImpaladCatalog extends Catalog implements FeCatalog {
         existingDb.getCatalogVersion() < catalogVersion) {
       Db newDb = Db.fromTDatabase(thriftDb);
       newDb.setCatalogVersion(catalogVersion);
+      if (existingDb != null) {
+        // Migrate all existing tables/views/functions to newDb. Otherwise they
+        // will disappear temporarily.
+        for (Table tbl: existingDb.getTables()) {
+          newDb.addTable(tbl);
+        }
+        Map<String, List<Function>> functions = existingDb.getAllFunctions();
+        for (List<Function> fns: existingDb.getAllFunctions().values()) {
+          for (Function f: fns) {
+            newDb.addFunction(f);
+          }
+        }
+      }
       addDb(newDb);
       if (existingDb != null) {
         CatalogObjectVersionSet.INSTANCE.updateVersions(
{code}
Adding a lock to Catalog is another solution, but that change would be more 
complex. One possible problem with my change is that if the new database object 
has fewer table/view/function objects than the existing database object, the 
deleted objects might be left in Catalog forever. According to my test, the 
deleted objects should be in sequencer.getDeletedObjects() and will be removed 
by 
[ImpaladCatalog.removeCatalogObject()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L229].
 So I think my change is fine. Please correct me if I'm wrong.
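
The idea behind the patch (carry the existing children over to the replacement database before swapping it in) can be shown in isolation with a small model. The classes below are simplified stand-ins written for illustration, not the Impala classes of the same names, and only tables are modelled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MigrateOnReplaceSketch {
    // Simplified stand-in for a database object that owns its tables.
    static final class Db {
        final Map<String, String> tables = new ConcurrentHashMap<>();
    }

    private final Map<String, Db> dbs = new ConcurrentHashMap<>();

    // Replaces the cached Db with newDb, first copying over the tables of the
    // Db being replaced so that lookups never observe an empty database.
    void addDb(String name, Db newDb) {
        Db existingDb = dbs.get(name);
        if (existingDb != null) {
            for (Map.Entry<String, String> e : existingDb.tables.entrySet()) {
                newDb.tables.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        dbs.put(name, newDb);
    }

    boolean tableExists(String dbName, String tblName) {
        Db db = dbs.get(dbName);
        return db != null && db.tables.containsKey(tblName);
    }

    public static void main(String[] args) {
        MigrateOnReplaceSketch catalog = new MigrateOnReplaceSketch();
        Db first = new Db();
        first.tables.put("test", "test(i int)");
        catalog.addDb("default", first);
        // An update delivers a new, still empty Db object.
        catalog.addDb("default", new Db());
        System.out.println(catalog.tableExists("default", "test"));
    }
}
```

As noted above, tables that were really dropped are also carried over in this scheme, so they still have to be removed by a separate deletion step.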

 


was (Author: robbie):
Thank you, [~tarmstrong]!

I find the problem is in 
[Catalogd.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191].
 This function 

[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart

2019-06-19 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867278#comment-16867278
 ] 

Robbie Zhang commented on IMPALA-7093:
--

I added a parameter --running-invalidate-metadata for 
tests/stress/concurrent_select.py to start a thread which keeps running 
'invalidate metadata'. By running concurrent_select.py with 
"--running-invalidate-metadata=true", I can reproduce this issue easily:
{code:java}
# tests/stress/concurrent_select.py --minicluster-num-impalads 3 --max-queries=1000 --startup-queries-per-second=10 --tpch-db=tpch_parquet --running-invalidate-metadata=true
Cluster Impalad Version Info:
localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build 
ab5ee0b7857c6ad19f244dc308210f2809436684)
Built on Tue Jun 18 05:04:27 PDT 2019
localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build 
ab5ee0b7857c6ad19f244dc308210f2809436684)
Built on Tue Jun 18 05:04:27 PDT 2019
localhost: impalad version 3.3.0-SNAPSHOT DEBUG (build 
ab5ee0b7857c6ad19f244dc308210f2809436684)
Built on Tue Jun 18 05:04:27 PDT 2019
2019-06-18 06:24:39,494 27754 Thread-8 INFO:cluster[705]:Finding impalad binary 
location
2019-06-18 06:24:39,494 27754 Thread-7 INFO:cluster[705]:Finding impalad binary 
location
2019-06-18 06:24:39,494 27754 Thread-9 INFO:cluster[705]:Finding impalad binary 
location
2019-06-18 06:24:39,843 27754 MainThread INFO:queries[115]:Loading tpch queries
2019-06-18 06:24:39,843 27754 MainThread INFO:test_file_parser[336]:Loading 
tpch queries
Using 25 queries
2019-06-18 06:24:39,865 27754 MainThread INFO:concurrent_select[1508]:Number of 
queries in the list: 25
Done | Active | Executing | Mem Lmt Ex | AC Reject | AC Timeout | Cancel | Err | Incorrect | Next Qry Mem Lmt | Tot Qry Mem Lmt | Tracked Mem | RSS Mem
   0 |      0 |         0 |          0 |         0 |          0 |      0 |   0 |         0 |                0 |               0 |             |
   5 |     43 |        28 |          0 |         0 |          0 |      3 |   0 |         0 |               92 |            6802 |         617 |    1964
   8 |     75 |        39 |          0 |         0 |          0 |      5 |   0 |         0 |              155 |           12286 |        2693 |    2521
Process Process-49:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "tests/stress/concurrent_select.py", line 663, in _start_single_runner
    mesg=error_msg))
Exception: Query tpch_parquet_TPCH-Q16 ID None failed: AnalysisException: Could not resolve table reference: 'part'

Aborting due to 1 successive errors encountered
{code}
After I changed ImpaladCatalog.java, concurrent_select.py can execute 1 
queries without exception:
{code:java}
9998 |  2 | 2 |  0 | 0 |  0 |   1018 |   0 
| 0 |  420 | 602 | |
Query runner (16866) exited with exit code 0
Query runner (15888) exited with exit code 0
1 |  0 | 0 |  0 | 0 |  0 |   1018 |   0 
| 0 |  420 |   0 | |
2019-06-18 23:11:38,444 15605 MainThread INFO:concurrent_select[844]:Test 
Duration: 12071 seconds
{code}
I also started another test in which concurrent_select.py starts 10 queries 
on another cluster. It's still in progress. Nearly 6 queries have been 
executed without exception so far.

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog 
> restart
> ---
>
> Key: IMPALA-7093
> URL: https://issues.apache.org/jira/browse/IMPALA-7093
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0, Impala 2.13.0
>Reporter: Todd Lipcon
>Priority: Major
>
> I'm doing some stress testing of Impala 2.13 (recent snapshot build) and hit 
> the following sequence:
> {code}
>  {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
> {"type": "response", "id": 3, "results": [["t1"]]}
> {"query": "INVALIDATE METADATA", "type": "call", "id": 7}
> {"type": "response", "id": 7}
> {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
> {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve 
> path: 'consistency_test.t1'\n"}
> {code}
> i.e. 'SHOW TABLES' shows that a table exists, but then shortly after an 
> INVALIDATE METADATA, an attempt to describe a table indicates that the table 
> does not exist. This is a single-threaded test case against a single impalad.
> I also saw a similar behavior that issuing queries to an impalad shortly 
> after a catalogd 

[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart

2019-06-12 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862718#comment-16862718
 ] 

Robbie Zhang commented on IMPALA-7093:
--

Thank you, [~tarmstrong]!

I found that the problem is in 
[ImpaladCatalog.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191].
 This function always adds top-level catalog objects first. When we run 
'invalidate metadata', it adds the database objects and then the 
table/view/function objects. But at that time the new database objects are 
empty, with no table/view/function objects in them. After the function 
[ImpaladCatalog.addDb()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L391]
 replaces the existing database objects with the new database objects, the 
existing table/view/function objects are lost until 
ImpaladCatalog.updateCatalog() adds these objects back. If the impalad compiles 
a query while the table/view/function objects are missing, the query fails with 
an AnalysisException. The error message varies with the type of query. 
For example, for 'desc table' we see 'Could not resolve path', for 'select 
* from table' we see 'Could not resolve table reference', for 'insert into 
table' we see 'Table does not exist', etc.
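
The window described above can be reproduced with a toy model in which a database is first replaced by an empty object and its tables are added back afterwards. Everything in this sketch (class, methods, the string used as a table definition) is made up for illustration and only mirrors the order of operations, not the real catalog code.

```java
import java.util.HashMap;
import java.util.Map;

public class EmptyWindowSketch {
    // Database name -> tables of that database; a toy model of the catalog cache.
    private final Map<String, Map<String, String>> dbs = new HashMap<>();

    // Phase 1 of an update: the database object is replaced by a new, empty one.
    void replaceDb(String dbName) {
        dbs.put(dbName, new HashMap<>());
    }

    // Phase 2: tables are added back one by one.
    void addTable(String dbName, String tblName, String definition) {
        dbs.get(dbName).put(tblName, definition);
    }

    boolean canResolve(String dbName, String tblName) {
        Map<String, String> tables = dbs.get(dbName);
        return tables != null && tables.containsKey(tblName);
    }

    public static void main(String[] args) {
        EmptyWindowSketch catalog = new EmptyWindowSketch();
        catalog.replaceDb("default");
        catalog.addTable("default", "test", "test(i int)");
        System.out.println(catalog.canResolve("default", "test")); // true

        catalog.replaceDb("default"); // an 'invalidate metadata' update arrives
        System.out.println(catalog.canResolve("default", "test")); // false: the window
        catalog.addTable("default", "test", "test(i int)");
        System.out.println(catalog.canResolve("default", "test")); // true again
    }
}
```

A query compiled between the two phases sees the middle state, which corresponds to the AnalysisException variants listed above.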

I can reproduce this issue by running two scripts. The first script keeps 
running 'invalidate metadata':
{code:java}
#!/bin/bash

while [ 1 ]
do
  shell/impala-shell -q "invalidate metadata" 
done

{code}
After I start the first script, I run the second script which keeps running a 
query: 
{code:java}
#!/bin/bash

while [ 1 ]
do
  #shell/impala-shell -q "desc test" 2>&1| tee test.output
  #shell/impala-shell -q "select * from test" 2>&1| tee test.output
  shell/impala-shell -q "insert overwrite test(i) values(1)" 2>&1| tee test.output
  n=`egrep "Fetched |Modified " test.output | wc -l`
  if [ $n -lt 1 ]; then
    exit
  fi
done{code}
The more table/view/function objects there are, the longer the objects are 
missing, and the more easily the second script hits an AnalysisException. I 
created thousands of tables on my cluster. Sometimes the second script hits an 
AnalysisException within a couple of minutes, while sometimes it takes nearly 
half an hour. Either way, it's repeatable.

I changed ImpaladCatalog.java as follows. So far, I haven't seen the 
AnalysisException again. It seems the issue is gone.
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
index 13cb620..23a7d68 100644
--- a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
+++ b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
@@ -20,6 +20,8 @@ package org.apache.impala.catalog;
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Set;
+import java.util.Map;
+import java.util.List;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.AtomicReference;
 
@@ -388,6 +390,19 @@ public class ImpaladCatalog extends Catalog implements FeCatalog {
         existingDb.getCatalogVersion() < catalogVersion) {
       Db newDb = Db.fromTDatabase(thriftDb);
       newDb.setCatalogVersion(catalogVersion);
+      if (existingDb != null) {
+        // Migrate all existing tables/views/functions to newDb. Otherwise they
+        // will disappear temporarily.
+        for (Table tbl: existingDb.getTables()) {
+          newDb.addTable(tbl);
+        }
+        Map<String, List<Function>> functions = existingDb.getAllFunctions();
+        for (List<Function> fns: existingDb.getAllFunctions().values()) {
+          for (Function f: fns) {
+            newDb.addFunction(f);
+          }
+        }
+      }
       addDb(newDb);
       if (existingDb != null) {
         CatalogObjectVersionSet.INSTANCE.updateVersions(
{code}
Adding a lock to Catalog is another solution, but that change would be more 
complex. One possible problem with my change is that if the new database object 
has fewer table/view/function objects than the existing database object, the 
deleted objects might be left in the Catalog forever. According to my test, the 
deleted objects should be in sequencer.getDeletedObjects() and will be removed 
by 
[ImpaladCatalog.removeCatalogObject()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L229].
 So I think my change is fine. Please correct me if I'm wrong.
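
To illustrate the idea behind the patch, here is a toy Python model. It is only a sketch under stated assumptions: the Db and Catalog classes below are simplified stand-ins invented for this example, not Impala's classes. It shows why swapping in a freshly built, empty database object makes tables vanish until they are re-added, and how carrying the existing tables over closes that window.

```python
class Db:
    """Toy stand-in for a catalog database object (not Impala's Db)."""

    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.tables = set()

    def add_table(self, table_name):
        self.tables.add(table_name)


class Catalog:
    """Toy stand-in for the impalad-side catalog cache."""

    def __init__(self):
        self.dbs = {}

    def replace_db(self, new_db, migrate):
        existing = self.dbs.get(new_db.name)
        if existing is not None and existing.version >= new_db.version:
            return  # stale update, keep the current object
        if migrate and existing is not None:
            # Carry the known tables over so lookups keep working
            # until the individual table updates arrive.
            for table_name in existing.tables:
                new_db.add_table(table_name)
        self.dbs[new_db.name] = new_db

    def table_exists(self, db_name, table_name):
        db = self.dbs.get(db_name)
        return db is not None and table_name in db.tables


def demo(migrate):
    catalog = Catalog()
    old_db = Db("default", version=1)
    old_db.add_table("test")
    catalog.replace_db(old_db, migrate)
    # A newer, still empty database object arrives (e.g. after a reset).
    catalog.replace_db(Db("default", version=2), migrate)
    return catalog.table_exists("default", "test")
```

In this model demo(False) reports the table as missing right after the swap, while demo(True) keeps it visible. The trade-off noted above still applies: objects that were really dropped must be removed by a separate deletion path.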

 

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog 
> restart
> ---
>
> Key: IMPALA-7093
> URL: https://issues.apache.org/jira/browse/IMPALA-7093
> Project: IMPALA
>  Issue Type: 

[jira] [Commented] (IMPALA-4089) missing thrift function validation checks

2019-06-05 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856388#comment-16856388
 ] 

Robbie Zhang commented on IMPALA-4089:
--

An empty catalog could cause this exception.  Here is what I saw on a cluster:

{code:java}
E0604 07:59:21.678375 40945 MetaStoreUtils.java:1234] Got exception: 
org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve 
transation read-only status server
Java exception follows:
MetaException(message:Could not retrieve transation read-only status server)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_all_databases_result$get_all_databases_resultStandardScheme.read(ThriftHiveMetastore.java:18313)
...
at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:108)
E0604 07:59:21.678470 40945 MetaStoreUtils.java:1235] Converting exception to 
MetaException
E0604 07:59:21.678840 40945 CatalogServiceCatalog.java:702] 
MetaException(message:Got exception: 
org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve 
transation read-only status server)
E0604 07:59:21.679153 40945 JniCatalog.java:110] Error initializing Catalog. 
Please run 'invalidate metadata'
Java exception follows:
org.apache.impala.catalog.CatalogException: Error initializing Catalog. Catalog 
may be empty.
at 
org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:703)
at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:108)
Caused by: MetaException(message:Got exception: 
org.apache.hadoop.hive.metastore.api.MetaException Could not retrieve 
transation read-only status server)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1236)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1055)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:101)
at com.sun.proxy.$Proxy5.getAllDatabases(Unknown Source)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:686)
... 1 more
...
I0604 07:59:21.718371 41215 jni-util.cc:169] 
org.apache.thrift.protocol.TProtocolException: Required field 
'update_fn_symbol' was not present! Struct: 
TAggregateFunction(intermediate_type:TColumnType(types:[TTypeNode(type:SCALAR, 
scalar_type:TScalarType(type:STRING))]), update_fn_symbol:null, 
init_fn_symbol:null, ignores_distinct:false)
at 
org.apache.impala.thrift.TAggregateFunction.validate(TAggregateFunction.java:948)
at org.apache.impala.thrift.TFunction.validate(TFunction.java:1164)
at 
org.apache.impala.thrift.TCatalogObject.validate(TCatalogObject.java:1058)
at 
org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1213)
at 
org.apache.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1098)
at 
org.apache.impala.thrift.TCatalogObject.write(TCatalogObject.java:938)
at 
org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:487)
at 
org.apache.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:421)
at 
org.apache.impala.thrift.TGetAllCatalogObjectsResponse.write(TGetAllCatalogObjectsResponse.java:365)
at org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
at 
org.apache.impala.service.JniCatalog.getCatalogObjects(JniCatalog.java:124)
{code}
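
The last trace above is a required-field check failing during serialization. The following Python sketch shows that kind of check in miniature. It is a toy model only: the class, the exception and the serialize() helper are hypothetical and are not Thrift's generated code.

```python
from dataclasses import dataclass
from typing import Optional


class ProtocolError(Exception):
    """Hypothetical stand-in for a protocol-level validation error."""


@dataclass
class AggregateFunction:
    """Toy struct with one required field, update_fn_symbol."""

    update_fn_symbol: Optional[str] = None
    init_fn_symbol: Optional[str] = None

    def validate(self):
        # A required field that was never populated fails validation.
        if self.update_fn_symbol is None:
            raise ProtocolError(
                "Required field 'update_fn_symbol' was not present!")


def serialize(fn: AggregateFunction) -> str:
    """Validate first, then produce a (toy) serialized form."""
    fn.validate()
    return repr(fn)
```

In this model an object built from incomplete metadata fails only when it is written out, which matches the symptom in the log: the catalog initialization error comes first, and the validation error appears later when the objects are serialized.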

Here is the error from HMS log file:

{code:java}
2019-06-04 07:59:21,670 WARN  
org.apache.hadoop.hive.metastore.MetaStoreDirectSql: [pool-6-thread-49]: 
Database initialization failed; direct SQL is disabled
javax.jdo.JDODataStoreException: Could not retrieve transation read-only status 
server
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
...
java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:998)
...
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: 
Communications link failure

The last packet successfully received from the server was 561,221 milliseconds 
ago.  The last packet sent successfully to the server was 561,221 milliseconds 
ago.
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown 
Source)
...

[jira] [Resolved] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-06-03 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang resolved IMPALA-8595.
--
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> THRIFT-3505 breaks IMPALA-5775
> --
>
> Key: IMPALA-8595
> URL: https://issues.apache.org/jira/browse/IMPALA-8595
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> IMPALA-5690 replaced thrift 0.9.0 with 0.9.3, in which THRIFT-3505 changed 
> transport/TSSLSocket.py. 
> In thrift 0.9.3, if the Python version is lower than 2.7.9, TSSLSocket uses 
> PROTOCOL_TLSv1 by default:
> {code:java}
>   # For Python >= 2.7.9, use latest TLS that both client and server supports.
>   # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
>   # For Python < 2.7.9, use TLS 1.0 since TLSv1_X and OP_NO_SSLvX are unavailable.
>   _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else ssl.PROTOCOL_TLSv1
> {code}
> The SSL version should now be passed as an argument to TSSLSocket.__init__ 
> instead of being set by overriding self.SSL_VERSION in 
> TSSLSocketWithWildcardSAN.__init__, so the fix for IMPALA-5775 doesn't work 
> against thrift 0.9.3. As a result, if we use a Python lower than 2.7.9 (for 
> example, Python 2.7.5 on Red Hat/CentOS 7.5) and set ssl_minimum_version to 
> tlsv1.2, the impala-shell command can't connect to impalad:
>  
> {code:java}
> # impala-shell -i impalad01.example.com -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
> SSL is enabled
> No handlers could be found for logger "thrift.transport.TSSLSocket"
> Error connecting: TTransportException, Could not connect to impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
> {code}
>  
>  
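
The difference between overriding an attribute and passing a constructor argument can be shown with a toy model. This is a sketch under stated assumptions: the classes below are invented stand-ins and do not reproduce thrift's TSSLSocket or Impala's TSSLSocketWithWildcardSAN. The only assumption carried over from the description is that the base constructor fixes the protocol from its argument or from a default.

```python
import ssl


class ToySSLSocket:
    """Stand-in base class: the protocol is decided inside __init__."""

    _default_protocol = ssl.PROTOCOL_TLSv1

    def __init__(self, host, port, ssl_version=None):
        self.host = host
        self.port = port
        # Fixed here, from the argument or the default; attributes set
        # on the instance afterwards are never consulted.
        self.effective_protocol = (
            self._default_protocol if ssl_version is None else ssl_version)


class OverrideAttributeSocket(ToySSLSocket):
    """Old approach: set SSL_VERSION after the base constructor ran."""

    def __init__(self, host, port):
        super().__init__(host, port)
        self.SSL_VERSION = ssl.PROTOCOL_TLSv1_2  # too late, has no effect


class PassArgumentSocket(ToySSLSocket):
    """New approach: hand the protocol to the base constructor."""

    def __init__(self, host, port):
        super().__init__(host, port, ssl_version=ssl.PROTOCOL_TLSv1_2)
```

In this model OverrideAttributeSocket still ends up with the TLS 1.0 default, which is the situation the description reports when the server requires tlsv1.2, while PassArgumentSocket gets the requested protocol.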



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Issue Comment Deleted] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-06-03 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang updated IMPALA-8595:
-
Comment: was deleted

(was: 
IMPALA-8595: Support TLSv1.2 with Python < 2.7.9 in shell

IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505
changed transport/TSSLSocket.py.
In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket
uses PROTOCOL_TLSv1 by default and the SSL version is passed to
TSSLSocket as a parameter when calling TSSLSocket.__init__.
Although TLSv1.2 is supported by Python from 2.7.9, Red Hat/CentOS
support TLSv1.2 from 2.7.5 with upgraded python-libs. We need to get
impala-shell support TLSv1.2 with Python 2.7.5 on Red Hat/CentOS.

TESTING:
impala-py.test tests/custom_cluster/test_client_ssl.py

Change-Id: I3fb6510f4b556bd8c6b1e86380379aba8be4b805
Reviewed-on: http://gerrit.cloudera.org:8080/13457
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins )




[jira] [Commented] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-06-03 Thread Robbie Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16855071#comment-16855071
 ] 

Robbie Zhang commented on IMPALA-8595:
--


IMPALA-8595: Support TLSv1.2 with Python < 2.7.9 in shell

IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505
changed transport/TSSLSocket.py.
In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket
uses PROTOCOL_TLSv1 by default and the SSL version is passed to
TSSLSocket as a parameter when calling TSSLSocket.__init__.
Although TLSv1.2 is supported by Python from 2.7.9, Red Hat/CentOS
support TLSv1.2 from 2.7.5 with upgraded python-libs. We need to get
impala-shell support TLSv1.2 with Python 2.7.5 on Red Hat/CentOS.

TESTING:
impala-py.test tests/custom_cluster/test_client_ssl.py

Change-Id: I3fb6510f4b556bd8c6b1e86380379aba8be4b805
Reviewed-on: http://gerrit.cloudera.org:8080/13457
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 




[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang updated IMPALA-8595:
-
Description: 
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For pythoon >= 2.7.9, use latest TLS that both client and server supports.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are 
unavailable.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else 
ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 

  was:
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For pythoon >= 2.7.9, use latest TLS that both client and server supports.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are 
unavailable.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else 
ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 





[jira] [Updated] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang updated IMPALA-8595:
-
Description: 
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
PROTOCOL_TLSv1 by default:
{code:java}
  # For pythoon >= 2.7.9, use latest TLS that both client and server supports.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are 
unavailable.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else 
ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}
 

 

  was:
IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
**PROTOCOL_TLSv1 by default:
{code:java}
  # For pythoon >= 2.7.9, use latest TLS that both client and server supports.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are 
unavailable.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else 
ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}

  

 





[jira] [Assigned] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang reassigned IMPALA-8595:


Assignee: Robbie Zhang




[jira] [Created] (IMPALA-8595) THRIFT-3505 breaks IMPALA-5775

2019-05-29 Thread Robbie Zhang (JIRA)
Robbie Zhang created IMPALA-8595:


 Summary: THRIFT-3505 breaks IMPALA-5775
 Key: IMPALA-8595
 URL: https://issues.apache.org/jira/browse/IMPALA-8595
 Project: IMPALA
  Issue Type: Bug
Affects Versions: Impala 3.1.0
Reporter: Robbie Zhang


IMPALA-5690 replaced thrift  0.9.0 with 0.9.3 in which THRIFT-3505 changed 
transport/TSSLSocket.py. 

In thrift 0.9.3, if the python version is lower than 2.9.7, TSSLSocket uses 
**PROTOCOL_TLSv1 by default:
{code:java}
  # For pythoon >= 2.7.9, use latest TLS that both client and server supports.
  # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
  # For pythoon < 2.7.9, use TLS 1.0 since TLSv1_X nare OP_NO_SSLvX are 
unavailable.
  _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else 
ssl.PROTOCOL_TLSv1
{code}
And the SSL version should be passed as an argument to TSSLSocket.__init__ 
instead of overriding self.SSL_VERSION in TSSLSocketWithWildcardSAN.__init__. 
The fix for IMPALA-5775 doesn't work against thrift 0.9.3. So if we use python 
lower than 2.7.9 (for example, it's python2.7.5 on Red Hat/CentOS 7.5) and set 
ssl_minimum_version to tlsv1.2, impala-shell command can't connect to impalad:

 
{code:java}
# impala-shell -i impalad01.example.com
 -k --ssl --ca_cert=/etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
SSL is enabled
No handlers could be found for logger "thrift.transport.TSSLSocket"
Error connecting: TTransportException, Could not connect to 
impalad01.example.com:21000: EOF occurred in violation of protocol (_ssl.c:579)
{code}

  

 






[jira] [Assigned] (IMPALA-8108) Impala query returns TIMESTAMP values in different types

2019-02-14 Thread Robbie Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang reassigned IMPALA-8108:


Assignee: Robbie Zhang

> Impala query returns TIMESTAMP values in different types
> 
>
> Key: IMPALA-8108
> URL: https://issues.apache.org/jira/browse/IMPALA-8108
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> When a timestamp's fractional part is all zeros (for example .000, .00 or .0), 
> the timestamp is displayed with no fractional seconds. For example:
> {code:java}
> select cast(ts as timestamp) from 
>  (values 
>  ('2019-01-11 10:40:18' as ts),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'), 
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.00'),
>  ('2019-01-11 10:40:19.000'),
>  ('2019-01-11 10:40:19.'),
>  ('2019-01-11 10:40:19.0'),
>  ('2019-01-11 10:40:19.1')
>  ) t;{code}
> The output is:
> {code:java}
> +-----------------------+
> | cast(ts as timestamp) |
> +-----------------------+
> | 2019-01-11 10:40:18   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19   |
> | 2019-01-11 10:40:19.1 |
> +-----------------------+
> {code}
> As we can see, values of the same column are returned in two different 
> formats (with and without fractional seconds). 
> The inconsistency breaks some downstream use cases. 
> The reason is that Impala uses the function 
> boost::posix_time::to_simple_string(time_duration) to convert a timestamp to 
> a string, and to_simple_string() removes the fractional seconds if they are 
> all zeros. Perhaps we can append ".0" if the length of the string is 8 
> (HH:MM:SS).
> For now, we can work around it either by using 
> from_timestamp(ts, '-mm-dd hh:mm.ss.s') to unify the output (as a string) or 
> by using millisecond(ts) to get the fractional seconds.




-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8108) Impala query returns TIMESTAMP values in different types

2019-01-24 Thread Robbie Zhang (JIRA)
Robbie Zhang created IMPALA-8108:


 Summary: Impala query returns TIMESTAMP values in different types
 Key: IMPALA-8108
 URL: https://issues.apache.org/jira/browse/IMPALA-8108
 Project: IMPALA
 Issue Type: Improvement
 Components: Backend
 Reporter: Robbie Zhang


When a timestamp's fractional part is .000, .00, or .0 (that is, all 
zeros), the timestamp is displayed without fractional seconds. For example:
{code:java}
select cast(ts as timestamp) from 
 (values 
 ('2019-01-11 10:40:18' as ts),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.00'), 
 ('2019-01-11 10:40:19.000'),
 ('2019-01-11 10:40:19.'),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.00'),
 ('2019-01-11 10:40:19.000'),
 ('2019-01-11 10:40:19.'),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.1')
 ) t;{code}
The output is:
{code:java}
+-----------------------+
| cast(ts as timestamp) |
+-----------------------+
| 2019-01-11 10:40:18   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19.1 |
+-----------------------+
{code}

As we can see, values of the same column are returned in two different 
formats (with and without fractional seconds). 
The inconsistency breaks some downstream use cases. 
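
As an illustration of how such a downstream break might look, here is a minimal Python sketch. The consumer and its STRICT_FORMAT are hypothetical (they are not taken from any real use case in this issue); the sketch only shows that a parser expecting fractional seconds in every row rejects the rows where they were dropped.

```python
from datetime import datetime

# Hypothetical downstream parser that expects fractional seconds in every row.
STRICT_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parses_with_strict_format(value):
    """Return True if value matches STRICT_FORMAT, False otherwise."""
    try:
        datetime.strptime(value, STRICT_FORMAT)
        return True
    except ValueError:
        return False


# Two rows taken from the output above: one without and one with a fraction.
rows = ["2019-01-11 10:40:19", "2019-01-11 10:40:19.1"]
results = {row: parses_with_strict_format(row) for row in rows}
```

Here only the row ending in ".1" parses; the row without a fraction raises ValueError inside the helper and is reported as False.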

The reason is that Impala uses the function 
boost::posix_time::to_simple_string(time_duration) to convert a timestamp to a 
string, and to_simple_string() removes the fractional seconds if they are all 
zeros. Perhaps we can append ".0" if the length of the string is 8 (HH:MM:SS).
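
The proposed padding rule can be sketched as follows. This is only an illustration in Python of the "append .0 when the length is 8" idea, not the Impala backend change itself (which would be in C++); the function names pad_fractional_seconds and normalize_timestamp are hypothetical.

```python
def pad_fractional_seconds(time_of_day):
    """Append ".0" when the string is exactly HH:MM:SS (8 characters)."""
    if len(time_of_day) == 8:
        return time_of_day + ".0"
    return time_of_day


def normalize_timestamp(value):
    """Pad the time portion of a 'YYYY-MM-DD HH:MM:SS[.fraction]' string.

    Strings without a space-separated time portion are returned unchanged.
    """
    date_part, separator, time_part = value.partition(" ")
    if not separator:
        return value
    return date_part + separator + pad_fractional_seconds(time_part)
```

With this rule, "2019-01-11 10:40:19" becomes "2019-01-11 10:40:19.0", while "2019-01-11 10:40:19.1" is left as it is, so every row carries a fractional part.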

For now, we can work around it either by using 
from_timestamp(ts, '-mm-dd hh:mm.ss.s') to unify the output (as a string) or 
by using millisecond(ts) to get the fractional seconds.





