[jira] [Created] (HIVE-24903) Change String.getBytes() to DFSUtil.string2Bytes(String) to avoid Unsupported Encoding Exception

2021-03-17 Thread dbgp2021 (Jira)
dbgp2021 created HIVE-24903:
---

 Summary: Change String.getBytes() to DFSUtil.string2Bytes(String) 
to avoid Unsupported Encoding Exception
 Key: HIVE-24903
 URL: https://issues.apache.org/jira/browse/HIVE-24903
 Project: Hive
  Issue Type: Bug
Reporter: dbgp2021


Hello,
I found that DFSUtil.string2Bytes(String) can be used here instead of 
String.getBytes(). The no-argument String.getBytes() encodes with the platform 
default charset, and its behavior when the string cannot be encoded in that 
charset is unspecified; the getBytes(String charsetName) overload additionally 
declares UnsupportedEncodingException. One recommended API is 
DFSUtil.string2Bytes(String), which provides more control over the encoding 
process and avoids this exception.
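
As a rough illustration of the difference, here is a JDK-only sketch. It assumes (not verified in this report) that DFSUtil.string2Bytes encodes as UTF-8; the class and method names below are hypothetical and stand in for that helper.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a helper that always encodes with an explicit charset.
public class EncodingSketch {

    // Explicit charset: no checked exception, no dependence on the platform default.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String s = "h\u00e9llo"; // contains one non-ASCII character

        byte[] explicit = toUtf8(s);
        // The charset-name overload declares UnsupportedEncodingException.
        byte[] named = s.getBytes("UTF-8");
        // The no-argument overload uses Charset.defaultCharset(), which varies by platform.
        byte[] platform = s.getBytes();

        System.out.println("explicit UTF-8 length: " + explicit.length);
        System.out.println("named == explicit: " + Arrays.equals(explicit, named));
        System.out.println("platform default charset: " + Charset.defaultCharset());
        System.out.println("platform length: " + platform.length);
    }
}
```

Passing a Charset constant rather than a charset name keeps the call free of checked exceptions and makes the result independent of the JVM's default encoding.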



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24902) Incorrect result due to ReduceExpressionsRule

2021-03-17 Thread Nemon Lou (Jira)
Nemon Lou created HIVE-24902:


 Summary: Incorrect result due to ReduceExpressionsRule
 Key: HIVE-24902
 URL: https://issues.apache.org/jira/browse/HIVE-24902
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 3.1.2, 4.0.0
Reporter: Nemon Lou


The following SQL returns only one record (20210308), but we expect two 
(20210308 and 20210309).

{code:sql}
select * from (
  select
    case when b.a = 1
      then cast(from_unixtime(unix_timestamp(cast(20210309 as string), 'yyyyMMdd') - 86400, 'yyyyMMdd') as bigint)
      else 20210309
    end as col
  from (select stack(2, 1, 2) as (a)) as b
) t
where t.col is not null;
{code}

After debugging, I found that ReduceExpressionsRule rewrites the expression 
incorrectly.
Original expression:

{code:sql}
IS NOT NULL(CASE(=($0, 1),
CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", _UTF-16LE'yyyyMMdd'), CAST(86400):BIGINT), _UTF-16LE'yyyyMMdd')):BIGINT,
20210309))
{code}

After reducing expressions:
{code:sql}
CASE(=($0, 1),
IS NOT NULL(CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", _UTF-16LE'yyyyMMdd'), CAST(86400):BIGINT), _UTF-16LE'yyyyMMdd')):BIGINT),
true)
{code}

The query plan on the main branch:
{code:sql}
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE
          Select Operator
            expressions: 2 (type: int), 1 (type: int), 2 (type: int)
            outputColumnNames: _col0, _col1, _col2
            Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE Column stats: COMPLETE
            UDTF Operator
              Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE Column stats: COMPLETE
              function name: stack
              Filter Operator
                predicate: COALESCE((col0 = 1),false) (type: boolean)
                Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE Column stats: COMPLETE
                Select Operator
                  expressions: CASE WHEN ((col0 = 1)) THEN (20210308L) ELSE (20210309L) END (type: bigint)
                  outputColumnNames: _col0
                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                  ListSink

Time taken: 0.155 seconds, Fetched: 28 row(s)

{code}
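
To make the expected result concrete, the sketch below evaluates two predicates over the two rows produced by stack(2,1,2): the original IS NOT NULL(CASE ...) test, and the COALESCE((col0 = 1), false) predicate that appears in the Filter Operator of the plan. The class and method names are hypothetical, and the THEN-branch constant 20210308 is taken from the folded value shown in the plan. It only illustrates the mismatch; it is not Hive or Calcite code.

```java
import java.util.List;

// Hypothetical illustration of the two predicates; not Hive or Calcite code.
public class FilterPredicateSketch {

    // Original predicate: evaluate the CASE first, then test the result for NULL.
    static boolean originalPredicate(int a) {
        Long col = (a == 1) ? Long.valueOf(20210308L) : Long.valueOf(20210309L);
        return col != null;
    }

    // Predicate from the plan's Filter Operator: COALESCE((col0 = 1), false).
    static boolean planPredicate(Integer a) {
        Boolean eq = (a == null) ? null : Boolean.valueOf(a == 1);
        return eq != null && eq;
    }

    public static void main(String[] args) {
        // stack(2, 1, 2) produces two rows: a = 1 and a = 2.
        for (int a : List.of(1, 2)) {
            System.out.println("a=" + a
                    + " original=" + originalPredicate(a)
                    + " plan=" + planPredicate(a));
        }
    }
}
```

Both rows satisfy the original predicate, since neither CASE branch yields NULL, while the plan's predicate keeps only the row with a = 1. That matches the single record reported.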






[jira] [Created] (HIVE-24901) Re-enable tests in TestBeeLineWithArgs

2021-03-17 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-24901:
--

 Summary: Re-enable tests in TestBeeLineWithArgs
 Key: HIVE-24901
 URL: https://issues.apache.org/jira/browse/HIVE-24901
 Project: Hive
  Issue Type: Test
  Components: Test
Reporter: Zhihua Deng


Re-enable the tests in TestBeeLineWithArgs, because they are stable on master now:

http://ci.hive.apache.org/job/hive-flaky-check/219/





[jira] [Created] (HIVE-24900) Failed compaction does not cleanup the directories

2021-03-17 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-24900:
---

 Summary: Failed compaction does not cleanup the directories
 Key: HIVE-24900
 URL: https://issues.apache.org/jira/browse/HIVE-24900
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


A failed compaction does not clean up the directories.





[jira] [Created] (HIVE-24899) create database event does not include managedLocation URI

2021-03-17 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24899:
--

 Summary: create database event does not include managedLocation URI
 Key: HIVE-24899
 URL: https://issues.apache.org/jira/browse/HIVE-24899
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


I noticed that when a database is created, the notification event generated by 
the Metastore doesn't have the managed location set. If I do a getDatabase call 
later, the Metastore returns the managedLocationUri. This seems like an 
inconsistency, and it would be good if the generated event included the 
managedLocationUri as well.





[jira] [Created] (HIVE-24898) Beeline does not honor the credential provided in property-file

2021-03-17 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24898:
-

 Summary: Beeline does not honor the credential provided in 
property-file
 Key: HIVE-24898
 URL: https://issues.apache.org/jira/browse/HIVE-24898
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Beeline reads the parameters correctly from the property file but then falls 
back to the default Beeline connection, which requires the user to provide a 
username and password.





Re: Hive meetup on March 17

2021-03-17 Thread Zoltan Haindrich

Hey All!

We have our first online Hive meetup today!

We will start at 5pm UTC; for other time zones, see this site:
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2021&month=3&day=17&hour=17&min=0&sec=0&p1=50&p2=137&p3=136&p4=70&p5=176

If you don't yet have the meeting url - it will be held in a zoom room at:
https://cloudera.zoom.us/j/91452267238
Most likely there will be a recording of it - which will be shared afterwards.

I was thinking of using GitHub Discussions to (also) ask questions during the event, because it could help untangle "question time" from "answer time". We may of course 
choose not to use it, but I've experimented with it: if we add discussions to the "Q&A" section we may even answer them, and people thinking about the same thing may 
extend the question by adding further comments... or just vote on the question...

Not sure how well it will work, but it might be worth a try!
I've set it up on my own fork for now: 
https://github.com/kgyrtkirk/hive/discussions

The meetup url is here:
https://www.meetup.com/Hive-User-Group-Meeting/events/276886707

Meet you there!

cheers,
Zoltan

On 3/16/21 3:29 PM, Zoltan Haindrich wrote:

Hey All!

Our meetup is also available as a meetup.com event:
https://www.meetup.com/Hive-User-Group-Meeting/events/276886707/

In case you want to add it to the calendar or something... :)

cheers,
Zoltan


On 3/11/21 3:00 PM, Zoltan Haindrich wrote:

Hey All!

I would like to invite you to our (first?) online Hive meetup! It will be held 
on March 17 at 17:00 UTC.
I'll send out a zoom url before the event starts!

The planned topics are accessible here:
https://docs.google.com/document/d/12jaWa7e6jvVjUaxoMWNJcjvTjnNoqwdCAMyswY1OiUg/edit?usp=sharing

Meet you there!

cheers,
Zoltan






[jira] [Created] (HIVE-24897) Is null filter on partitioning column leads to non-vectorized execution

2021-03-17 Thread Jira
László Bodor created HIVE-24897:
---

 Summary: Is null filter on partitioning column leads to 
non-vectorized execution
 Key: HIVE-24897
 URL: https://issues.apache.org/jira/browse/HIVE-24897
 Project: Hive
  Issue Type: Improvement
Reporter: László Bodor








Add my profile in Hive mailing list and contributor list

2021-03-17 Thread Ranith Sardar
Hi,

These are my profile details; please add me to the mailing list and
contributor list of Hive.
Email id: "ranithsardar.90@gmail"
User Name: RANith


Thanks
Ranith Sardar


[jira] [Created] (HIVE-24896) External table having same name as dropped managed table fails to replicate

2021-03-17 Thread Pravin Sinha (Jira)
Pravin Sinha created HIVE-24896:
---

 Summary: External table having same name as dropped managed table 
fails to replicate
 Key: HIVE-24896
 URL: https://issues.apache.org/jira/browse/HIVE-24896
 Project: Hive
  Issue Type: Bug
Reporter: Pravin Sinha
Assignee: Pravin Sinha








[jira] [Created] (HIVE-24895) Add a DataCopyEnd task for external table replication

2021-03-17 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-24895:
---

 Summary: Add a DataCopyEnd task for external table replication
 Key: HIVE-24895
 URL: https://issues.apache.org/jira/browse/HIVE-24895
 Project: Hive
  Issue Type: Improvement
Reporter: Ayush Saxena
Assignee: Ayush Saxena


Add a task to mark the end of external table copy.





[jira] [Created] (HIVE-24894) transform_acid is unstable

2021-03-17 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-24894:
-

 Summary: transform_acid is unstable
 Key: HIVE-24894
 URL: https://issues.apache.org/jira/browse/HIVE-24894
 Project: Hive
  Issue Type: Improvement
Reporter: Krisztian Kasa


[http://ci.hive.apache.org/job/hive-flaky-check/217]
{code}
Client execution failed with error code = 2 
running 

SELECT transform(*) USING 'transform_acid_grep.sh' AS (col string) FROM 
transform_acid 
fname=transform_acid.q
{code}





[jira] [Created] (HIVE-24893) Download data from Thriftserver through JDBC

2021-03-17 Thread Yuming Wang (Jira)
Yuming Wang created HIVE-24893:
--

 Summary: Download data from Thriftserver through JDBC
 Key: HIVE-24893
 URL: https://issues.apache.org/jira/browse/HIVE-24893
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2, JDBC
Affects Versions: 4.0.0
Reporter: Yuming Wang




Snowflake supports downloading data files directly from an internal stage to a 
stream:
https://docs.snowflake.com/en/user-guide/jdbc-using.html#label-jdbc-download-from-stage-to-stream
https://github.com/snowflakedb/snowflake-jdbc/blob/95a7d8a03316093430dc3960df6635643208b6fd/src/main/java/net/snowflake/client/jdbc/SnowflakeConnectionV1.java#L886






[jira] [Created] (HIVE-24892) Replace getContentSummary::getLength with listStatus(recursive) for blobstores

2021-03-17 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24892:
---

 Summary: Replace getContentSummary::getLength with 
listStatus(recursive) for blobstores
 Key: HIVE-24892
 URL: https://issues.apache.org/jira/browse/HIVE-24892
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Rajesh Balamohan


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStats.java#L219

For blobstores, getContentSummary() is super slow. It would be good to replace 
this with "fs.listFiles(path, true)".






[jira] [Created] (HIVE-24891) Timestamp field returns a value different from what was inserted using PreparedStatement.setLong

2021-03-17 Thread Anurag Shekhar (Jira)
Anurag Shekhar created HIVE-24891:
-

 Summary: Timestamp field returns a value different from what was 
inserted using PreparedStatement.setLong
 Key: HIVE-24891
 URL: https://issues.apache.org/jira/browse/HIVE-24891
 Project: Hive
  Issue Type: Bug
  Components: JDBC
 Environment: Setup

Hive Cluster Timezone - UTC

JDBC Client Timezone - IST

Create a timestamp: "ts = Timestamp.valueOf("2021-03-16 00:00:00.000");"

Insert it using a PreparedStatement (call setLong(index, ts.getTime())).

Query the same field.

The returned timestamp differs from the one inserted.

Reproduction Code 
{code:java}
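import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Class name is illustrative; the original report shows only the two methods.
public class TimestampRepro {

    // Formats a timestamp in the JVM default time zone, including the zone name.
    private static String getFormattedTimestamp(Timestamp ts) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS z");
        return format.format(ts);
    }

    public static void main(String[] arg) throws SQLException {
        // The JDBC client runs in IST; the Hive cluster runs in UTC.
        TimeZone.setDefault(TimeZone.getTimeZone("IST"));
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://anurag-hwc-2.anurag-hwc.root.hwx.site:1", "hive", "hive");
        Statement stmt = conn.createStatement();
        stmt.execute("drop table if exists ts_table");
        stmt.execute("create table ts_table (ts timestamp) stored as orc");

        // Insert the timestamp as epoch milliseconds via setLong.
        PreparedStatement pStmt = conn.prepareStatement("insert into ts_table (ts) values (?)");
        Timestamp ts = Timestamp.valueOf("2021-03-16 00:00:00.000");
        pStmt.setLong(1, ts.getTime());
        pStmt.execute();
        pStmt.close();

        System.out.println("Inserted " + getFormattedTimestamp(ts) + " In millis " + ts.getTime());

        // Read the value back and compare it with what was inserted.
        ResultSet rs = stmt.executeQuery("Select * from ts_table");
        rs.next();
        Timestamp resultTs = rs.getTimestamp(1);
        System.out.println("Retrieved " + getFormattedTimestamp(resultTs) + " In millis " + resultTs.getTime());
        rs.close();
    }
}
{code}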
 

Output of the above code:
 Inserted 2021-03-16T00:00:00.000 IST In millis 1615833000000
 Retrieved 2021-03-15T18:30:00.000 IST In millis 1615813200000
Reporter: Anurag Shekhar
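
The inserted and retrieved values differ by exactly 5 hours 30 minutes, the IST offset from UTC. One possible explanation, which the report does not confirm, is that the epoch milliseconds are turned into a wall-clock time in the cluster's zone (UTC) and that wall-clock time is then reinterpreted in the client's zone (IST). The JDK-only sketch below reproduces that arithmetic; the class and method names are hypothetical, and "Asia/Kolkata" is assumed as the zone ID for IST.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical sketch of a double time-zone conversion; not Hive JDBC code.
public class TimestampShiftSketch {

    // Assumed zone ID for the IST client (UTC+05:30).
    static final ZoneId CLIENT_ZONE = ZoneId.of("Asia/Kolkata");

    // Epoch millis for a wall-clock time as interpreted in the client zone.
    static long clientMillis(LocalDateTime wallClock) {
        return wallClock.atZone(CLIENT_ZONE).toInstant().toEpochMilli();
    }

    // Wall-clock time that a UTC server would derive from epoch millis.
    static LocalDateTime serverWallClock(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        // Client side: midnight IST on 2021-03-16, sent as epoch millis.
        long inserted = clientMillis(LocalDateTime.of(2021, 3, 16, 0, 0));
        // Server side (assumed): stored as the corresponding UTC wall-clock time.
        LocalDateTime stored = serverWallClock(inserted);
        // Client side (assumed): the stored wall-clock time is read back as IST.
        long retrieved = clientMillis(stored);

        System.out.println("inserted  millis: " + inserted);
        System.out.println("stored wall clock (UTC): " + stored);
        System.out.println("retrieved millis: " + retrieved);
        System.out.println("difference in minutes: " + (inserted - retrieved) / 60000);
    }
}
```

Under these assumptions the sketch yields the wall-clock time 2021-03-15T18:30 and a 330-minute difference, which is consistent with the retrieved timestamp shown in the output.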





