[jira] [Created] (IMPALA-9059) Add UNPIVOT operator

2019-10-16 Thread Greg Rahn (Jira)
Greg Rahn created IMPALA-9059:
-

 Summary: Add UNPIVOT operator
 Key: IMPALA-9059
 URL: https://issues.apache.org/jira/browse/IMPALA-9059
 Project: IMPALA
  Issue Type: New Feature
Reporter: Greg Rahn


References:
* 
https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#CHDCEJJE
* 
https://docs.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-ver15
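
For orientation, the operator described in the references above rotates columns into rows. The sketch below mimics that behavior in plain Python on a list of dicts. The function name `unpivot`, the sample columns, and the choice to drop NULL values (the default in the Oracle and SQL Server documentation linked above) are illustrative assumptions, not a proposed Impala design.

```python
def unpivot(rows, id_cols, value_cols, name_col="name", value_col="value"):
    """Turn each listed value column into its own output row."""
    out = []
    for row in rows:
        for col in value_cols:
            if row[col] is None:  # assumption: NULL values are excluded
                continue
            new_row = {c: row[c] for c in id_cols}
            new_row[name_col] = col       # which source column the value came from
            new_row[value_col] = row[col]
            out.append(new_row)
    return out


# Hypothetical wide table: one row per store, one column per quarter.
sales = [
    {"store": 1, "q1": 10, "q2": 20},
    {"store": 2, "q1": None, "q2": 5},
]
long_rows = unpivot(sales, ["store"], ["q1", "q2"], "quarter", "amount")
for r in long_rows:
    print(r)
```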



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9018) ORDER BY using an expression + column alias fails with "Could not resolve column/field"

2019-10-07 Thread Greg Rahn (Jira)
Greg Rahn created IMPALA-9018:
-

 Summary: ORDER BY using an expression + column alias fails with 
"Could not resolve column/field"
 Key: IMPALA-9018
 URL: https://issues.apache.org/jira/browse/IMPALA-9018
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Greg Rahn
Assignee: Shant Hovsepian


Test case:

{noformat}
select version();

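+-----------------------------------------------------------------------------------------+
| version()                                                                               |
+-----------------------------------------------------------------------------------------+
| impalad version 3.4.0-SNAPSHOT RELEASE (build f047e967d099119717d1d3bbb7a235554707513f) |
| Built on Mon Oct  7 10:07:27 UTC 2019                                                   |
+-----------------------------------------------------------------------------------------+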

-- works
with t as (select a from (values(1 as a),(2),(3)) t)
select
  a, 
  a + 10 as alias_of_a
from t
order by abs(a);

+---+------------+
| a | alias_of_a |
+---+------------+
| 1 | 11         |
| 2 | 12         |
| 3 | 13         |
+---+------------+

-- fails with 
-- AnalysisException: Could not resolve column/field reference: 'alias_of_a'
with t as (select a from (values(1 as a),(2),(3)) t)
select
  a, 
  a + 10 as alias_of_a
from t
order by abs(alias_of_a);

{noformat}
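
One possible workaround, sketched below, is to compute the alias in an inline view so that the outer ORDER BY sees `alias_of_a` as an ordinary column. This is an assumption about how the query could be restructured; it has not been verified against Impala. SQLite (through Python's `sqlite3` module) is used only so the sketch runs, and the CTE is written in SQLite's `VALUES` syntax rather than the form used in the test case.

```python
import sqlite3

# Assumption: the alias is defined one level down, so the outer query
# orders by a plain column instead of an alias inside an expression.
QUERY = """
with t(a) as (values (1), (2), (3))
select a, alias_of_a
from (select a, a + 10 as alias_of_a from t) v
order by abs(alias_of_a)
"""

conn = sqlite3.connect(":memory:")
rows = conn.execute(QUERY).fetchall()
conn.close()

for row in rows:
    print(row)
```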





[jira] [Resolved] (IMPALA-3759) Add function for unix_timestamp with date format yyyyDDD

2019-10-04 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-3759.
---
Resolution: Fixed

> Add function for unix_timestamp with date format yyyyDDD
> 
>
> Key: IMPALA-3759
> URL: https://issues.apache.org/jira/browse/IMPALA-3759
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.3.0
>Reporter: Pavas Garg
>Priority: Minor
>  Labels: built-in-function, ramp-up
>
> Add function for unix_timestamp with date format yyyyDDD.
> This works well from the Hive beeline shell:
> Select unix_timestamp('2000123', 'yyyyDDD') as julian_unix
> , from_unixtime(unix_timestamp('2000123', 'yyyyDDD')) as julian_date
> , unix_timestamp('2502', 'yyMMDD') as yymmdd_unix
> , from_unixtime(unix_timestamp('2502', 'yyMMDD')) as yymmdd_date
> But it does not work from impala-shell, which reports:
> WARNINGS: Bad date/time conversion format: yyyyDDD
> Require the Hive functionality in Impala to deal with Julian dates.
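
To make the requested behavior concrete: in the Java-style pattern used by Hive, `DDD` is the day of the year, so `2000123` means day 123 of the year 2000. The sketch below reproduces that interpretation with Python's `%Y%j` format. The helper name `julian_to_unix` is hypothetical, and treating the value as UTC midnight is an assumption (Hive uses the session time zone).

```python
import calendar
from datetime import datetime, timezone


def julian_to_unix(text):
    """Parse a yyyyDDD string (year plus day of year) as UTC midnight."""
    parsed = datetime.strptime(text, "%Y%j")   # %j = day of year, 001..366
    return calendar.timegm(parsed.timetuple())  # seconds since the epoch, UTC


julian_unix = julian_to_unix("2000123")
julian_date = datetime.fromtimestamp(julian_unix, tz=timezone.utc).date().isoformat()
print(julian_unix, julian_date)
```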





[jira] [Resolved] (IMPALA-2860) allow Impala to interpret "0001-01-01" and later as a date

2019-10-04 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-2860.
---
Resolution: Fixed

> allow Impala to interpret "0001-01-01" and later as a date
> --
>
> Key: IMPALA-2860
> URL: https://issues.apache.org/jira/browse/IMPALA-2860
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.2.2
>Reporter: Boris Gitline
>Priority: Minor
>
> scenario:
> create table foo
> stored as parquet
> as
> select cast('0001-01-01 00:00:00' as timestamp) ts;
> result:
> Year is out of valid range: 1400..10000
> We are importing lots of data from mainframe systems; the default date on these 
> mainframes is "0001-01-01". As it is, when Impala converts these dates to 
> a timestamp field, they become NULL.





[jira] [Resolved] (IMPALA-4627) selective query optimizations

2019-10-04 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-4627.
---
Resolution: Fixed

> selective query optimizations
> -
>
> Key: IMPALA-4627
> URL: https://issues.apache.org/jira/browse/IMPALA-4627
> Project: IMPALA
>  Issue Type: Epic
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Greg Rahn
>Priority: Major
>
> Epic for tracking optimizations that will help with very selective queries.





[jira] [Resolved] (IMPALA-6057) Cache Remote Reads

2019-09-04 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-6057.
---
Resolution: Duplicate

Seems like a duplicate of IMPALA-8341 now.

> Cache Remote Reads
> --
>
> Key: IMPALA-6057
> URL: https://issues.apache.org/jira/browse/IMPALA-6057
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Jim Apple
>Priority: Major
>
> Reads from local disk might be cached in-memory by the OS. Reads from S3 
> should be cached when possible, using {{getETag()}} and {{getLastModified()}} 
> for coherence/invalidation.
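
The coherence idea in the description can be sketched as a cache whose entries are only reused while the object's ETag and last-modified time are unchanged. Everything below is a toy illustration: the class `RemoteReadCache`, its `read` signature, and the `fetch` callback are hypothetical names, and the real design is tracked in IMPALA-8341.

```python
class RemoteReadCache:
    """Toy in-memory cache validated by (etag, last_modified) per path."""

    def __init__(self):
        self._entries = {}  # path -> (etag, last_modified, data)
        self.fetches = 0    # number of times remote storage was actually read

    def read(self, path, etag, last_modified, fetch):
        entry = self._entries.get(path)
        if entry is not None and entry[0] == etag and entry[1] == last_modified:
            return entry[2]  # object version unchanged: serve cached bytes
        data = fetch(path)   # miss or stale entry: read from remote storage
        self.fetches += 1
        self._entries[path] = (etag, last_modified, data)
        return data


# Hypothetical usage with a fake remote store.
store = {"s3://bucket/file": b"v1"}
cache = RemoteReadCache()
first = cache.read("s3://bucket/file", "etag-1", 100, store.get)
second = cache.read("s3://bucket/file", "etag-1", 100, store.get)  # cache hit
store["s3://bucket/file"] = b"v2"
third = cache.read("s3://bucket/file", "etag-2", 200, store.get)   # invalidated
print(first, second, third, cache.fetches)
```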



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (IMPALA-8709) Add Damerau-Levenshtein edit distance built-in function

2019-06-25 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8709:
-

 Summary: Add Damerau-Levenshtein edit distance built-in function
 Key: IMPALA-8709
 URL: https://issues.apache.org/jira/browse/IMPALA-8709
 Project: IMPALA
  Issue Type: New Feature
Reporter: Greg Rahn
Assignee: Greg Rahn


Algo:
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance

References:
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions_expressions_fuzzy_funcs.html
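
As a reference point for the algorithm linked above, here is a minimal sketch of the restricted variant (optimal string alignment), which extends Levenshtein distance with transpositions of adjacent characters. The function name `osa_distance` is illustrative, and whether the built-in should use this variant or the unrestricted Damerau-Levenshtein distance is not specified in this ticket.

```python
def osa_distance(s, t):
    """Optimal string alignment distance: insert, delete, substitute,
    or transpose two adjacent characters (each substring edited at most once)."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]


print(osa_distance("ab", "ba"))
print(osa_distance("kitten", "sitting"))
```

The two variants can differ: for the pair "ca" and "abc" the restricted form gives 3, while the unrestricted distance is 2.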






[jira] [Created] (IMPALA-8165) Planner does not push through predicates when there is a disjunction

2019-02-05 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8165:
-

 Summary: Planner does not push through predicates when there is a 
disjunction
 Key: IMPALA-8165
 URL: https://issues.apache.org/jira/browse/IMPALA-8165
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: Greg Rahn


If we take a simple query such as:

{noformat}
select avg(ss_quantity)
from store_sales 
join household_demographics on (ss_hdemo_sk=hd_demo_sk)
where (ss_sales_price between 0 and 100 and hd_dep_count = 1)
 or (ss_sales_price between 100 and 200 and hd_dep_count = 2);
{noformat}

and look at the plan, we see that neither table scan has any predicates pushed 
to it; the only filter is in the join.

(from impalad version 2.12.0-cdh5.16.x RELEASE (build 
3f68649c7bf8a01fb6ba0cbe35dd2492adb836dd))
{noformat}
PLAN-ROOT SINK
|
06:AGGREGATE [FINALIZE]
| output: avg:merge(ss_quantity)
|
05:EXCHANGE [UNPARTITIONED]
|
03:AGGREGATE
| output: avg(ss_quantity)
|
02:HASH JOIN [INNER JOIN, BROADCAST]
| hash predicates: ss_hdemo_sk = hd_demo_sk
| other predicates: (ss_sales_price >= 0 AND ss_sales_price <= 100 AND hd_dep_count = 1) OR (ss_sales_price >= 100 AND ss_sales_price <= 200 AND hd_dep_count = 2)
| runtime filters: RF000 <- hd_demo_sk
|
|--04:EXCHANGE [BROADCAST]
| |
| 01:SCAN HDFS [tpcds_1000_parquet.household_demographics]
| partitions=1/1 files=1 size=41.08KB
|
00:SCAN HDFS [tpcds_1000_parquet.store_sales]
 partitions=1824/1824 files=1824 size=189.24GB
 runtime filters: RF000 -> ss_hdemo_sk
{noformat}

If we look at PostgreSQL 11.1, we see that the join applies the filter and, in 
addition, each table scan has the appropriate filters pushed to it.

{noformat}
 Finalize Aggregate  (cost=67549.69..67549.70 rows=1 width=32)
   ->  Gather  (cost=67549.47..67549.68 rows=2 width=32)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=66549.47..66549.48 rows=1 width=32)
               ->  Hash Join  (cost=113.12..66547.64 rows=734 width=4)
                     Hash Cond: (store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk)
                     Join Filter: (((store_sales.ss_sales_price >= '0'::numeric) AND (store_sales.ss_sales_price <= '100'::numeric) AND (household_demographics.hd_dep_count = 1)) OR ((store_sales.ss_sales_price >= '100'::numeric) AND (store_sales.ss_sales_price <= '200'::numeric) AND (household_demographics.hd_dep_count = 2)))
                     ->  Parallel Seq Scan on store_sales  (cost=0.00..66343.20 rows=7305 width=22)
                           Filter: (((ss_sales_price >= '0'::numeric) AND (ss_sales_price <= '100'::numeric)) OR ((ss_sales_price >= '100'::numeric) AND (ss_sales_price <= '200'::numeric)))
                     ->  Hash  (cost=112.62..112.62 rows=40 width=8)
                           ->  Seq Scan on household_demographics  (cost=0.00..112.62 rows=40 width=8)
                                 Filter: ((hd_dep_count = 1) OR (hd_dep_count = 2))
{noformat}
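
The scan filters in the PostgreSQL plan follow from a simple implication: if the WHERE clause is `(A1 AND B1) OR (A2 AND B2)`, where the A terms touch only one table and the B terms only the other, then every surviving row satisfies `A1 OR A2` on the first table and `B1 OR B2` on the second. The sketch below derives those weaker per-table filters. The function name and the dict-based predicate representation are illustrative assumptions, not Impala planner code, and the original disjunction must still be evaluated at the join.

```python
def derive_scan_predicates(disjuncts):
    """disjuncts: list of {table_name: [conjunct, ...]}, one dict per OR branch."""
    # A table gets a derived filter only if every OR branch constrains it;
    # otherwise some branch would accept all of that table's rows.
    tables = set(disjuncts[0])
    for branch in disjuncts[1:]:
        tables &= set(branch)
    derived = {}
    for table in sorted(tables):
        branches = ["(" + " AND ".join(b[table]) + ")" for b in disjuncts]
        derived[table] = " OR ".join(branches)
    return derived


where_clause = [
    {"store_sales": ["ss_sales_price between 0 and 100"],
     "household_demographics": ["hd_dep_count = 1"]},
    {"store_sales": ["ss_sales_price between 100 and 200"],
     "household_demographics": ["hd_dep_count = 2"]},
]
scan_filters = derive_scan_predicates(where_clause)
for table, predicate in scan_filters.items():
    print(table, "->", predicate)
```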







[jira] [Closed] (IMPALA-2200) [Cloudera][ODBC] (10000) General error: Unexpected exception has been caught.

2019-02-04 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn closed IMPALA-2200.
-
Resolution: Information Provided

> [Cloudera][ODBC] (10000) General error: Unexpected exception has been caught.
> -
>
> Key: IMPALA-2200
> URL: https://issues.apache.org/jira/browse/IMPALA-2200
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.2
> Environment: Server:
> OS: CentOS 6.6
> CDH 5.4.4 cluster with Cloudera manager - 
> Version: Cloudera Express 5.4.3 (#258 built by jenkins on 20150625-2046 git: 
> 1139ffb081360fbbb1be8a19a87ca3c52ee4b1cf)
> Impala: impalad version 2.2.0-cdh5.4.4 RELEASE (build 
> a13d3c6b203e79a284b509df821bffbe229e6dc3)
> Client:
> OS: Windows 7/8
> client app: Tableau Desktop
> Latest, at the moment, Impala-ODBC driver - 
> http://www.cloudera.com/content/cloudera/en/downloads/connectors/impala/odbc/impala-odbc-v2-5-29.html
> Tableau desktop: version 8/9
>Reporter: Boyan Bonev
>Assignee: Syed A. Hashmi
>Priority: Minor
>  Labels: impala, odbc
> Attachments: SimbaImpalaODBC_driver.log
>
>
> Hi,
> We have a CDH 5.4.4 cluster and use Impala for ad hoc data analysis. We 
> also use Tableau for data visualization. Tableau connects to Impala using 
> your ODBC driver. We experience several issues, but the most important one is 
> that sometimes (1 out of 10 queries) our queries fail for an unknown reason. When 
> we retry, the issue is gone until the next occurrence. There is no information in the 
> server logs and almost none in the ODBC client log (see attachment). The 
> only message that the client shows is logged in the ODBC log:
> Aug 10 14:02:43 ERROR 8284 Statement::SQLPrepareW: [Cloudera][ODBC] (10000) 
> General error: Unexpected exception has been caught.
> Judging by this, we suspect an ODBC driver issue in the client's 
> prepared-statement logic.





[jira] [Closed] (IMPALA-5040) User Name authentication (i.e. AuthMech=2) causes JDBC connection to get HUNG.

2019-02-04 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn closed IMPALA-5040.
-
Resolution: Information Provided

> User Name authentication (i.e. AuthMech=2) causes JDBC connection to get HUNG.
> --
>
> Key: IMPALA-5040
> URL: https://issues.apache.org/jira/browse/IMPALA-5040
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 2.2.4
>Reporter: Sudarshan
>Priority: Major
> Attachments: ImpalaJDBC_driver.log, Impala_connection_0.log, 
> impalad.INFO
>
>
> Creating a JDBC connection with "AuthMech=2" (see [jdbc_url] below) using Cloudera JDBC 
> Driver for Impala version 2.5.36 causes the connection attempt to hang; 
> it never returns. netstat shows that a TCP connection is 
> established and that it remains in this state for a long time.
> {code:java}
> [root@nightly59-unsecure-2 impalad]# netstat -an | grep 21050
> tcp        0      0 :::21050                      :::*                        LISTEN
> tcp        0      0 ::ffff:172.31.113.152:21050   ::ffff:172.16.1.48:50028    ESTABLISHED
> [root@nightly59-unsecure-2 impalad]#
> {code}
> [jdbc_url]
> jdbc:impala://nightly59-unsecure-2.gce.cloudera.com:21050/default;AuthMech=2;REQUEST_POOL=sudarshan_pool;UID=sudarshan;LogLevel=6;LogPath=/tmp/IMPALA_TEST
> Entire source code is as follows.
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.sql.Statement;
>
> public class SampleCode_Impala_Unsecure {
>     public static final String JDBC_4_DRIVER = "com.cloudera.impala.jdbc4.Driver";
>     public static final String DRIVER_CLASS = JDBC_4_DRIVER;
>     public static final String CONNECTION_URL =
>         "jdbc:impala://nightly59-unsecure-2.gce.cloudera.com:21050/default;AuthMech=2;REQUEST_POOL=sudarshan_pool;UID=sudarshan;LogLevel=6;LogPath=/tmp/IMPALA_TEST";
>
>     public static void main(String[] args) throws Exception {
>         Class.forName(DRIVER_CLASS);
>         Connection connection = DriverManager.getConnection(CONNECTION_URL);
>         Statement st = connection.createStatement();
>         ResultSet rs = st.executeQuery("select 'hello' ");
>         while (rs.next()) {
>             System.out.println(rs.getString(1));
>         }
>         rs.close();
>         st.close();
>         connection.close();
>     }
> }
> {code}





[jira] [Closed] (IMPALA-3784) status-benchmark is broken

2019-01-08 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn closed IMPALA-3784.
-
   Resolution: Fixed
Fix Version/s: Impala 2.11.0

> status-benchmark is broken
> --
>
> Key: IMPALA-3784
> URL: https://issues.apache.org/jira/browse/IMPALA-3784
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Jim Apple
>Assignee: Jinchul Kim
>Priority: Minor
>  Labels: newbie
> Fix For: Impala 2.11.0
>
>
> status-benchmark in debug mode crashes:
> {noformat}
> status-benchmark: 
> /opt/Impala-Toolchain/boost-1.57.0/include/boost/optional/optional.hpp:992: 
> boost::optional::reference_type boost::optional::get() [with T = 
> large_type; boost::optional::reference_type = large_type&]: Assertion 
> `this->is_initialized()' failed.
> {noformat}
> In release mode, with the toolchain gcc, compilation of just that one file 
> runs for over 15 minutes. After that, I killed it. I tried this multiple 
> times.





[jira] [Created] (IMPALA-8054) Implicit cast fails with {const INT} BETWEEN FLOAT and INT

2019-01-07 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8054:
-

 Summary: Implicit cast fails with {const INT} BETWEEN FLOAT and INT
 Key: IMPALA-8054
 URL: https://issues.apache.org/jira/browse/IMPALA-8054
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


In the following query the literal number 10 needs to be compared to two 
different types, INT and FLOAT, but Impala fails to make the implicit cast for 
the FLOAT comparison.

The predicates should be
{noformat}
predicates: (cast(10 as float) >= col4) AND (10 <= col3)
{noformat}

*Test case:*
{noformat}
sql> describe tab4
+------+--------+---------+
| name | type   | comment |
+------+--------+---------+
| pk   | int    |         |
| col0 | int    |         |
| col1 | float  |         |
| col2 | string |         |
| col3 | int    |         |
| col4 | float  |         |
| col5 | string |         |
+------+--------+---------+

sql> SELECT col0 FROM tab4 WHERE 10 BETWEEN col4 AND col3;
ERROR: IllegalStateException: child 0 type: FLOAT child 1 type: DOUBLE
{noformat}
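
The expected predicates above can be reproduced by rewriting BETWEEN into two comparisons and choosing the cast for each comparison independently, instead of picking one type for all three operands. The sketch below illustrates this under a deliberately simplified, hypothetical widening rule (FLOAT wins over INT); the function names are not Impala's.

```python
def wider(type_a, type_b):
    """Hypothetical widening rule: a comparison involving FLOAT is done in FLOAT."""
    return "FLOAT" if "FLOAT" in (type_a, type_b) else "INT"


def operand(expr, expr_type, target_type):
    """Wrap expr in a cast only when its type differs from the comparison type."""
    if expr_type == target_type:
        return expr
    return f"cast({expr} as {target_type.lower()})"


def rewrite_between(expr, lower, upper):
    """Rewrite `expr BETWEEN lower AND upper`; each argument is (sql_text, type)."""
    parts = []
    for op, bound in ((">=", lower), ("<=", upper)):
        target = wider(expr[1], bound[1])  # decided per comparison
        left = operand(expr[0], expr[1], target)
        right = operand(bound[0], bound[1], target)
        parts.append(f"({left} {op} {right})")
    return " AND ".join(parts)


rewritten = rewrite_between(("10", "INT"), ("col4", "FLOAT"), ("col3", "INT"))
print(rewritten)
```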





[jira] [Created] (IMPALA-8029) Support DISTINCT with aggregates

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8029:
-

 Summary: Support DISTINCT with aggregates
 Key: IMPALA-8029
 URL: https://issues.apache.org/jira/browse/IMPALA-8029
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


The following is valid syntax, but throws an error in Impala 3.2.0:
{noformat}
sql> select distinct sum(col0) from tab0;
ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate 
functions or GROUP BY
{noformat}
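
For comparison, the same statement can be run against another engine. The sketch below uses SQLite through Python's `sqlite3` module purely as a stand-in; the table contents are made up for illustration. The aggregate collapses the input to one row, so DISTINCT has nothing left to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tab0 (col0 int)")
# Hypothetical sample data; the ticket does not show the contents of tab0.
conn.executemany("insert into tab0 values (?)", [(1,), (2,), (2,)])

# The aggregate yields a single row, and DISTINCT is applied to that row.
result = conn.execute("select distinct sum(col0) from tab0").fetchall()
conn.close()
print(result)
```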





[jira] [Created] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8028:
-

 Summary: BETWEEN predicate failures - BetweenPredicate needs to be 
rewritten into a CompoundPredicate
 Key: IMPALA-8028
 URL: https://issues.apache.org/jira/browse/IMPALA-8028
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


Impala fails to handle some BETWEEN predicates. These result in the error:
{noformat}
IllegalStateException: BetweenPredicate needs to be rewritten into a 
CompoundPredicate.
{noformat}
Attaching test cases in file.

Tested on
{noformat}
impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
7effb62de5add60eb071ae5331e80a42cf7b0dc1)
{noformat}





[jira] [Resolved] (IMPALA-7759) Add Levenshtein edit distance built-in function

2018-12-02 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-7759.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Add Levenshtein edit distance built-in function
> ---
>
> Key: IMPALA-7759
> URL: https://issues.apache.org/jira/browse/IMPALA-7759
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Greg Rahn
>Assignee: Greg Rahn
>Priority: Major
>  Labels: built-in-function
> Fix For: Impala 3.2.0
>
>
> References:
>  * [Netezza - 
> le_dst()|https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions_expressions_fuzzy_funcs.html]
>  * [Postgres - 
> levenshtein()|https://www.postgresql.org/docs/current/static/fuzzystrmatch.html#id-1.11.7.24.6]
> One notable difference:
>  * Netezza: if either value is NULL, returns the length of non-NULL value
>  * Postgres: if either value is NULL, returns NULL
> Preference is to implement Postgres version due to ease of cross-system 
> testing.





[jira] [Created] (IMPALA-7919) Add predicates line in plan output for partition key predicates

2018-12-01 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-7919:
-

 Summary: Add predicates line in plan output for partition key 
predicates
 Key: IMPALA-7919
 URL: https://issues.apache.org/jira/browse/IMPALA-7919
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Reporter: Greg Rahn


When there is a predicate on a partitioned table's partition key column, the 
SCAN node does not print the "predicates" line as it would if the table were not 
partitioned. IMO predicates should always be shown in the nodes where they 
are applied, regardless of partitioning, to make this clear.

Query:
{noformat}
select * from t1 where part_key=42;
{noformat}

From a non-partitioned table:
{noformat}
00:SCAN HDFS [default.t1]
   partitions=1/1 files=2 size=10B
   predicates: default.t1.part_key = 42
{noformat}

From a partitioned table:
{noformat}
00:SCAN HDFS [default.t1]
   partitions=1/2 files=1 size=2B
{noformat}





[jira] [Closed] (IMPALA-7850) INSERT using VALUES with "CAST" can cause trailing spaces.

2018-11-14 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn closed IMPALA-7850.
-
Resolution: Not A Bug

> INSERT using VALUES with "CAST" can cause trailing spaces.
> --
>
> Key: IMPALA-7850
> URL: https://issues.apache.org/jira/browse/IMPALA-7850
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sudarshan
>Priority: Major
>
> INSERT using VALUES with "CAST" can cause trailing spaces. Please see below.
>  
> Schema :-
>  
>  
> {code:java}
> create database tmp;
>  CREATE TABLE tmp.tablename ( col_id int,col_second string, col_third 
> string);{code}
>  
> Insert statement :-
>  =
>  
> {code:java}
> INSERT INTO tmp.tablename(col_id, col_second, col_third) values (100, 
> CAST('AWESOME' AS CHAR(7)), CAST('TEST' AS CHAR(4))), (1, CAST('I' AS 
> CHAR(1)), CAST('AI' AS CHAR(2))){code}
>  
>  
> File on HDFS :-
>  
>  
> {noformat}
> [admin@host-10-17-101-151 ~]$ cat 
> 9d42419642cbf42e-ffb7c99c_1661109707_data.0.
>  100,AWESOME,TEST
>  1,I ,AI <== Trailing space
>  [admin@host-10-17-101-151 ~]${noformat}
>  
> Query showing length of "I" as 7
>  
> {noformat}
> Query: select col_id, length(col_second), col_second from tmp.tablename
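> +--------+--------------------+------------+
> | col_id | length(col_second) | col_second |
> +--------+--------------------+------------+
> | 100    | 7                  | AWESOME    |
> | 1      | 7                  | I          |
> +--------+--------------------+------------+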
> [host-10-17-102-128.coe.cloudera.com:21000] >{noformat}
>  
> Workaround :-
> =
> Workaround would be to remove CAST from above statements.
>  
>  
>  
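
One plausible reading of the "Not A Bug" resolution, stated here as an assumption rather than something the ticket spells out, is that all rows of a VALUES clause share one column type, so `CHAR(1)` and `CHAR(7)` unify to `CHAR(7)` and shorter values are blank-padded to that width. The sketch below only models that padding; `pad_char` is a hypothetical helper.

```python
def pad_char(value, length):
    """Model fixed-length CHAR(length): pad with spaces, truncate if longer."""
    return value.ljust(length)[:length]


# (value, declared CHAR length) pairs from the col_second casts in the INSERT.
declared = [("AWESOME", 7), ("I", 1)]

# Assumption: the column takes the widest CHAR type seen across all rows.
unified_length = max(length for _, length in declared)
col_second = [pad_char(value, unified_length) for value, _ in declared]

for value in col_second:
    print(repr(value), len(value))
```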





[jira] [Created] (IMPALA-7759) Add Levenshtein edit distance function

2018-10-25 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-7759:
-

 Summary: Add Levenshtein edit distance function
 Key: IMPALA-7759
 URL: https://issues.apache.org/jira/browse/IMPALA-7759
 Project: IMPALA
  Issue Type: New Feature
Reporter: Greg Rahn
Assignee: Greg Rahn


References:
 * [Netezza - 
(le_dst())|https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions_expressions_fuzzy_funcs.html]
 * [Postgres - 
levenshtein()|https://www.postgresql.org/docs/current/static/fuzzystrmatch.html#id-1.11.7.24.6]

One notable difference:
* Netezza: if either value is NULL, returns the length of non-NULL value
* Postgres: if either value is NULL, returns NULL 

Preference is to implement Postgres version due to ease of cross-system testing.
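
The preferred NULL behavior can be sketched as follows, with Python's `None` standing in for SQL NULL. This is a plain dynamic-programming Levenshtein distance for illustration only, not the Impala implementation.

```python
def levenshtein(a, b):
    """Edit distance with Postgres-style NULL handling: NULL in, NULL out."""
    if a is None or b is None:
        return None
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


print(levenshtein("kitten", "sitting"))
print(levenshtein("abc", None))
```

Under the Netezza behavior described above, the second call would instead return 3, the length of the non-NULL argument.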






[jira] [Resolved] (IMPALA-6537) Add missing ODBC scalar functions

2018-02-23 Thread Greg Rahn (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn resolved IMPALA-6537.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0
   Impala 3.0

> Add missing ODBC scalar functions
> -
>
> Key: IMPALA-6537
> URL: https://issues.apache.org/jira/browse/IMPALA-6537
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Greg Rahn
>Assignee: Greg Rahn
>Priority: Major
> Fix For: Impala 3.0, Impala 2.12.0
>
>
> Add the following scalar functions to remove the need for ODBC driver 
> translation for them:
>  - add LEFT() [alias for STRLEFT()]
>  - add RIGHT() [alias for STRRIGHT()]
>  - add WEEK() [alias for WEEKOFYEAR()]
>  - add QUARTER() [new]
>  - add MONTHNAME() [new] 
> And while we're at it, we'll add QUARTER to date_part() and extract() as well.
>   





[jira] [Created] (IMPALA-6537) Add missing ODBC scalar functions

2018-02-18 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-6537:
-

 Summary: Add missing ODBC scalar functions
 Key: IMPALA-6537
 URL: https://issues.apache.org/jira/browse/IMPALA-6537
 Project: IMPALA
  Issue Type: New Feature
Reporter: Greg Rahn
Assignee: Greg Rahn


Add the following scalar functions to remove the need for ODBC driver 
translation for them:
 
- add LEFT() [alias for STRLEFT()]
- add RIGHT() [alias for STRRIGHT()]
- add WEEK() [alias for WEEKOFYEAR()]
- add QUARTER() [new]
- add MONTHNAME()[new]
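
For the two functions marked [new], the intended results can be sketched in a few lines. The Python helpers below are illustrative stand-ins, assuming the usual definitions (quarters numbered 1 to 4, full English month names); they are not the Impala implementation.

```python
import calendar
from datetime import date


def quarter(d):
    """Quarter of the year, 1..4 (January to March is quarter 1)."""
    return (d.month - 1) // 3 + 1


def monthname(d):
    """Full month name; English in Python's default locale."""
    return calendar.month_name[d.month]


sample = date(2018, 2, 18)
print(quarter(sample), monthname(sample))
```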
 
 


