Re: Review Request 49655: HIVE-12646: beeline and HIVE CLI do not parse ; in quote properly

2016-07-11 Thread Sahil Takiar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49655/
---

(Updated July 12, 2016, 4:04 a.m.)


Review request for hive, Sergio Pena and Yongzhi Chen.


Bugs: HIVE-12646
https://issues.apache.org/jira/browse/HIVE-12646


Repository: hive-git


Description
---

HIVE-12646: beeline and HIVE CLI do not parse ; in quote properly

Approach:

  * Modified the `Commands.execute(...)` method to iterate through the given 
input line character by character
  * It looks for single and double quotes in order to track when the iterator 
is inside a quotation block
  * If the iterator is inside a quotation block and it finds a semicolon, it 
ignores it; otherwise it treats it as it normally would
  * Moved the line-parsing logic into a helper method called `getCmList(...)`, 
which is responsible for returning a `List` of commands that need to be run 
(a minimal sketch of this quote-aware splitting follows below)
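
A minimal, self-contained sketch of the quote-aware splitting described above 
(illustrative only, not the actual Commands.java code; the class name is made up):

    import java.util.ArrayList;
    import java.util.List;

    public class QuoteAwareSplitter {
      // Split a line on ';' while ignoring semicolons inside '...' or "..." blocks.
      public static List<String> split(String line) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;                        // 0 means we are outside any quotation block
        for (int i = 0; i < line.length(); i++) {
          char c = line.charAt(i);
          if (quote == 0 && (c == '\'' || c == '"')) {
            quote = c;                         // entering a quotation block
          } else if (quote != 0 && c == quote) {
            quote = 0;                         // leaving the quotation block
          } else if (quote == 0 && c == ';') {
            commands.add(current.toString());  // semicolon outside quotes ends a command
            current.setLength(0);
            continue;
          }
          current.append(c);
        }
        if (current.length() > 0) {
          commands.add(current.toString());
        }
        return commands;
      }

      public static void main(String[] args) {
        System.out.println(split("select ';'; select 1"));  // prints [select ';',  select 1]
      }
    }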


Diffs
-

  beeline/src/java/org/apache/hive/beeline/Commands.java 3a204c0 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 
ecfeddb 

Diff: https://reviews.apache.org/r/49655/diff/


Testing
---

Added a unit test which checks that Beeline can successfully run queries that 
contain semicolons inside quotation blocks. Confirmed existing unit tests pass.


Thanks,

Sahil Takiar



[jira] [Created] (HIVE-14212) hbase_queries result out of date on branch-2.1

2016-07-11 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14212:
---

 Summary: hbase_queries result out of date on branch-2.1
 Key: HIVE-14212
 URL: https://issues.apache.org/jira/browse/HIVE-14212
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14211) AcidUtils.getAcidState()/Cleaner - make it consistent wrt multiple base files etc

2016-07-11 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-14211:
-

 Summary: AcidUtils.getAcidState()/Cleaner - make it consistent wrt 
multiple base files etc
 Key: HIVE-14211
 URL: https://issues.apache.org/jira/browse/HIVE-14211
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Blocker


The JavaDoc on getAcidState() reads, in part:

"Note that because major compactions don't
   preserve the history, we can't use a base directory that includes a
   transaction id that we must exclude."

which is correct but there is nothing in the code that does this.

And if we detect a situation where txn X must be excluded and there are 
deltas that contain X, we'll have to abort the txn.  This can't (reasonably) 
happen with auto-commit mode, but with multi-statement txns it's possible.
Suppose some long-running txn starts and locks in a snapshot at 17 (HWM).  An hour 
later it decides to access some partition for which all txns < 20 (for example) 
have already been compacted (i.e. GC'd).  

==
Here is a more concrete example.  Let's say the files for table A are as follows, 
created in the order listed.
delta_4_4
delta_5_5
delta_4_5
base_5
delta_16_16
delta_17_17
base_17  (for example user ran major compaction)

Let's say getAcidState() is called with ValidTxnList(20:16), i.e. with HWM=20 
and ExceptionList=<16>.
Assume that all txns <= 20 commit.

The reader can't use base_17 because it has the result of txn 16.  So it should choose 
base_5 as "TxnBase bestBase" in _getChildState()_.
Then the rest of the logic in _getAcidState()_ should choose delta_16_16 and 
delta_17_17 in the _Directory_ object.  This would represent an acceptable snapshot 
for such a reader.

The issue arises if the Cleaner process is running at the same time.  It will see 
everything with txnid < 17 as obsolete.  Then it will check the lock manager state and 
decide to delete (as there may not be any locks in the LM for table A).  The order 
in which the files are deleted is undefined right now.  It may delete 
delta_16_16 and delta_17_17 first, and right at this moment the read request 
with ValidTxnList(20:16) arrives (such a snapshot may have been locked in by some 
multi-stmt txn that started some time ago; it acquires locks after the Cleaner 
checks LM state and calls getAcidState()).  This request will choose base_5 but 
it won't see delta_16_16 and delta_17_17, and thus returns the snapshot w/o 
modifications made by those txns.
[This is not possible currently since we only support autoCommit=true.  The 
reason is that a query (0) opens a txn (if appropriate), (1) acquires locks, (2) 
locks in the snapshot.  The Cleaner won't delete anything for a given 
compaction (partition) if there are locks on it.  Thus for the duration of the 
transaction, nothing will be deleted, so it's safe to use base_5.]


This is a subtle race condition but possible.

1. So the safest thing to do to ensure correctness is to use the latest base_x 
as the "best", check against the exceptions in ValidTxnList, and throw an 
exception if there is an exception <= x (see the sketch below).

2. A better option is to keep 2 exception lists, aborted and open, and only 
throw if there is an open txn <= x.  Compaction throws away data from aborted 
txns, so there is no harm in using a base with aborted txns in its range.

3. You could make each txn record the lowest open txn id at its start and 
prevent the Cleaner from cleaning any delta whose id range includes that open 
txn id for any txn that is still running.  This has the drawback of potentially 
delaying GC of old files for arbitrarily long periods, so this should be a user 
config choice.  The implementation is not trivial.

I would go with 1 now and do 2/3 together with multi-statement txn work.
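
A small sketch of the option-1 check above (illustrative only; the class and method 
names are made up and this is not the actual AcidUtils code):

{noformat}
import java.util.Arrays;
import java.util.List;

public class BestBaseCheck {
  // Option 1: after picking the newest base_x, fail fast if the snapshot's exception
  // list contains a txn id <= x, since that base may include data the reader must exclude.
  static void checkBaseAgainstExceptions(long bestBaseTxnId, List<Long> exceptions) {
    for (long excluded : exceptions) {
      if (excluded <= bestBaseTxnId) {
        throw new IllegalStateException("base_" + bestBaseTxnId
            + " may contain excluded txn " + excluded
            + "; no consistent snapshot can be produced from it");
      }
    }
  }

  public static void main(String[] args) {
    // Example from the description: base_17 with ValidTxnList(20:16) must be rejected.
    checkBaseAgainstExceptions(17L, Arrays.asList(16L));
  }
}
{noformat}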



Side note:  if 2 deltas have overlapping ID range, then 1 must be a subset of 
the other



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14210) SSLFactory truststore reloader threads leaking in HiveServer2

2016-07-11 Thread Thomas Friedrich (JIRA)
Thomas Friedrich created HIVE-14210:
---

 Summary: SSLFactory truststore reloader threads leaking in 
HiveServer2
 Key: HIVE-14210
 URL: https://issues.apache.org/jira/browse/HIVE-14210
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 2.1.0, 2.0.0, 1.2.1
Reporter: Thomas Friedrich


We found an issue in a customer environment where HS2 crashed after a few 
days and the Java core dump contained several thousand truststore reloader 
threads:

"Truststore reloader thread" #126 daemon prio=5 os_prio=0 
tid=0x7f680d2e3000 nid=0x98fd waiting on 
condition [0x7f67e482c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run
(ReloadingX509TrustManager.java:225)
at java.lang.Thread.run(Thread.java:745)

We found the issue to be caused by a bug in Hadoop where the TimelineClientImpl 
is not destroying the SSLFactory if SSL is enabled in Hadoop and the timeline 
server is running. I opened YARN-5309 which has more details on the problem, 
and a patch was submitted a few days back.

In addition to the changes in Hadoop, a couple of Hive changes are required:
- ExecDriver needs to call jobclient.close() to trigger the clean-up of 
resources after the submitted job is done/failed (a sketch of this pattern 
follows below)
- Hive needs to pick up a newer release of Hadoop that includes MAPREDUCE-6618 and 
MAPREDUCE-6621, which fixed issues with calling jobclient.close(). Both fixes are 
included in Hadoop 2.6.4.
However, since we also need to pick up YARN-5309, we need to wait for a new 
release of Hadoop.
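
An illustrative sketch of the close-on-completion pattern mentioned above (not the 
actual ExecDriver change; it only shows the JobClient being closed once the submitted 
job finishes or fails):

{noformat}
import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class CloseJobClientExample {
  public static void runAndClose(JobConf conf) throws IOException {
    JobClient jc = new JobClient(conf);
    try {
      RunningJob rj = jc.submitJob(conf);
      rj.waitForCompletion();  // returns once the job is done or failed
    } finally {
      jc.close();              // releases resources, including the truststore reloader thread
    }
  }
}
{noformat}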



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14209) Add some logging info for session and operation management

2016-07-11 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14209:
---

 Summary: Add some logging info for session and operation management
 Key: HIVE-14209
 URL: https://issues.apache.org/jira/browse/HIVE-14209
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


It's hard to track session and operation open/close events in a multi-user 
environment. Add some logging info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 49919: HIVE-14135 : beeline output not formatted correctly for large column widths

2016-07-11 Thread Vihang Karajgaonkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49919/
---

Review request for hive, Mohit Sabharwal, Sergio Pena, Sahil Takiar, and Thejas 
Nair.


Repository: hive-git


Description
---

HIVE-14135 : beeline output not formatted correctly for large column widths


Diffs
-

  beeline/pom.xml a720d0835314221ec3bd9e8d354d148498ff794c 
  beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 
5aaa38527734d46de037352ff51e54e0ae1cede0 
  beeline/src/java/org/apache/hive/beeline/BufferedRows.java 
962c5319bb7e6e448979e1cef80a086cadd2ecc6 
  beeline/src/test/org/apache/hive/beeline/TestBufferedRows.java PRE-CREATION 

Diff: https://reviews.apache.org/r/49919/diff/


Testing
---


Thanks,

Vihang Karajgaonkar



Re: Review Request 49288: HIVE-11402 HS2 - disallow parallel query execution within a single Session

2016-07-11 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49288/
---

(Updated July 11, 2016, 8:12 p.m.)


Review request for hive and Thejas Nair.


Repository: hive-git


Description
---

.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 283ef2e 
  
itests/hive-unit/src/test/java/org/apache/hive/service/cli/session/TestHiveSessionImpl.java
 d58a913 
  
service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
 ff46ed8 
  service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
44463c9 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java ba034ab 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
28c4553 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 78ff388 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
7341635 
  
service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
 f7b3412 

Diff: https://reviews.apache.org/r/49288/diff/


Testing
---


Thanks,

Sergey Shelukhin



Review Request 49917: HIVE-13966: Added Transactional listener support

2016-07-11 Thread Rahul Sharma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49917/
---

Review request for hive, Colin McCabe, Sravya Tirukkovalur, and Nachiket Vaidya.


Repository: hive-git


Description
---

HIVE-13966: Added Transactional listener support


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
283ef2ef725d9df7a8359145c141b2494e718529 
  
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
 172f58d435ba06b4c3df0344a3f1f6567a5e970c 
  metastore/src/java/org/apache/hadoop/hive/metastore/AlterHandler.java 
dedd4497adfcc9d57090a943f6bb4f35ea87fa61 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
7b8459556f54ad8d6e38526796c2ca0c48525cfb 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
c945bf73c068630f5e7a6ac28ae386a2e74e1755 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalMetaStoreEventListener.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/49917/diff/


Testing
---


Thanks,

Rahul Sharma



[jira] [Created] (HIVE-14208) Outer MapJoin uses key data of outer input and Converter

2016-07-11 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-14208:
--

 Summary: Outer MapJoin uses key data of outer input and Converter
 Key: HIVE-14208
 URL: https://issues.apache.org/jira/browse/HIVE-14208
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Jesus Camacho Rodriguez


Consider a left outer MapJoin operator. The OIs for the outputs are created for the 
outer and inner sides from their inputs. However, when there is a match in the 
join, the data for the key is always taken from the outer side (as it is done 
currently). Thus, we need to apply the Converter logic on the data to get the 
correct type.

This issue is to explore whether a better solution would be to use the key from 
the correct inputs of the join, eliminating the need for Converters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 49782: HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used

2016-07-11 Thread Sahil Takiar


> On July 11, 2016, 4:46 p.m., Sergio Pena wrote:
> > beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java, line 75
> > 
> >
> > Should we use a constant variable for the 1000 number?

Yes, that's a good idea; updated.


> On July 11, 2016, 4:46 p.m., Sergio Pena wrote:
> > beeline/src/java/org/apache/hive/beeline/IncrementalRows.java, line 36
> > 
> >
> > Is '--incremental' still working? 
> > Also, I don't see this config in the --help message. I think that 
> > adding it to BeeLine.properties will add it, as Peter Vary mentioned.

Yes, the `--incremental` option still works, and in fact the new parameter only 
takes effect if `--incremental` is set to `true`. Right now, it is set to 
`false` by default, but there are efforts to set it to `true` by default.

I updated the `BeeLine.properties` file and confirmed that the `--help` option 
displays the new parameter.


- Sahil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49782/#review141714
---


On July 11, 2016, 5:29 p.m., Sahil Takiar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49782/
> ---
> 
> (Updated July 11, 2016, 5:29 p.m.)
> 
> 
> Review request for hive, Sergio Pena, Thejas Nair, and Vaibhav Gumashta.
> 
> 
> Bugs: HIVE-14170
> https://issues.apache.org/jira/browse/HIVE-14170
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Added a new BeeLine option called `incrementalBufferRows` which controls 
> the number of `Row`s the `IncrementalRows` class should buffer; by default it 
> is 1000
> * Modified `BufferedRows` so that it can accept a limit on the number of 
> `Row`s it buffers
> * Modified `IncrementalRows` to read the value of `incrementalBufferRows` and 
> buffer rows as per HIVE-14170
> * The class delegates all buffering work to a `BufferedRows` class
> * This has the advantage that all the width calculation that spans multiple 
> rows can be encapsulated in the `BufferedRows` class; there is no need to 
> re-implement the logic in `IncrementalRows`
> * `IncrementalRows` will buffer `incrementalBufferRows` rows at a time; when 
> the buffer is depleted, it will fetch the next buffer and re-calculate the 
> width for that buffer
> 
> 
> Diffs
> -
> 
>   beeline/pom.xml a720d08 
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 5aaa385 
>   beeline/src/java/org/apache/hive/beeline/BufferedRows.java 962c531 
>   beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 8aef976 
>   beeline/src/java/org/apache/hive/beeline/Rows.java 453f685 
>   beeline/src/main/resources/BeeLine.properties 7500df9 
>   beeline/src/test/org/apache/hive/beeline/TestIncrementalRows.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49782/diff/
> 
> 
> Testing
> ---
> 
> * Unit Test added for `IncrementalRows`
> * Tested locally
> 
> 
> Thanks,
> 
> Sahil Takiar
> 
>



Re: Review Request 49782: HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used

2016-07-11 Thread Sahil Takiar


> On July 8, 2016, 9:24 p.m., Peter Vary wrote:
> > I have checked your code, and I think it is nice and clean.
> > I think we should add the new parameter to 
> > beeline/src/main/resources/BeeLine.properties, and later in the 
> > documentation.
> > 
> > Keeping in mind that I am new to Hive, I think your patch is otherwise good.
> > 
> > Thanks,
> > Peter

Thanks for the review Peter! I have updated the `BeeLine.properties` file so 
that the `--help` option displays the new parameter. I will investigate how to 
update the external documentation.


- Sahil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49782/#review141335
---


On July 11, 2016, 5:29 p.m., Sahil Takiar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49782/
> ---
> 
> (Updated July 11, 2016, 5:29 p.m.)
> 
> 
> Review request for hive, Sergio Pena, Thejas Nair, and Vaibhav Gumashta.
> 
> 
> Bugs: HIVE-14170
> https://issues.apache.org/jira/browse/HIVE-14170
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Added a new BeeLine option called `incrementalBufferRows` which controls 
> the number of `Row`s the `IncrementalRows` class should buffer; by default it 
> is 1000
> * Modified `BufferedRows` so that it can accept a limit on the number of 
> `Row`s it buffers
> * Modified `IncrementalRows` to read the value of `incrementalBufferRows` and 
> buffer rows as per HIVE-14170
> * The class delegates all buffering work to a `BufferedRows` class
> * This has the advantage that all the width calculation that spans multiple 
> rows can be encapsulated in the `BufferedRows` class; there is no need to 
> re-implement the logic in `IncrementalRows`
> * `IncrementalRows` will buffer `incrementalBufferRows` rows at a time; when 
> the buffer is depleted, it will fetch the next buffer and re-calculate the 
> width for that buffer
> 
> 
> Diffs
> -
> 
>   beeline/pom.xml a720d08 
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 5aaa385 
>   beeline/src/java/org/apache/hive/beeline/BufferedRows.java 962c531 
>   beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 8aef976 
>   beeline/src/java/org/apache/hive/beeline/Rows.java 453f685 
>   beeline/src/main/resources/BeeLine.properties 7500df9 
>   beeline/src/test/org/apache/hive/beeline/TestIncrementalRows.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49782/diff/
> 
> 
> Testing
> ---
> 
> * Unit Test added for `IncrementalRows`
> * Tested locally
> 
> 
> Thanks,
> 
> Sahil Takiar
> 
>



Re: Review Request 49782: HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used

2016-07-11 Thread Sahil Takiar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49782/
---

(Updated July 11, 2016, 5:29 p.m.)


Review request for hive, Sergio Pena, Thejas Nair, and Vaibhav Gumashta.


Changes
---

Addressing comments


Bugs: HIVE-14170
https://issues.apache.org/jira/browse/HIVE-14170


Repository: hive-git


Description
---

* Added a new BeeLine option called `incrementalBufferRows` which controls the 
number of `Row`s the `IncrementalRows` class should buffer; by default it is 
1000
* Modified `BufferedRows` so that it can accept a limit on the number of `Row`s 
it buffers
* Modified `IncrementalRows` to read the value of `incrementalBufferRows` and 
buffer rows as per HIVE-14170
* The class delegates all buffering work to a `BufferedRows` class
* This has the advantage that all the width calculation that spans multiple 
rows can be encapsulated in the `BufferedRows` class; there is no need to 
re-implement the logic in `IncrementalRows`
* `IncrementalRows` will buffer `incrementalBufferRows` rows at a time; when 
the buffer is depleted, it will fetch the next buffer and re-calculate the 
width for that buffer (a minimal sketch of this buffering loop follows below)
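
A minimal, self-contained sketch of the per-batch buffering described above 
(illustrative only; it is not the actual BeeLine `IncrementalRows`/`BufferedRows` code):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;

    public class IncrementalBufferSketch {
      // Read up to bufferSize rows, compute column widths over that batch, print it, repeat.
      static void printIncrementally(Iterator<String[]> rows, int bufferSize) {
        while (rows.hasNext()) {
          List<String[]> batch = new ArrayList<>(bufferSize);
          while (rows.hasNext() && batch.size() < bufferSize) {
            batch.add(rows.next());
          }
          int[] widths = computeWidths(batch);               // widths re-calculated per batch
          for (String[] row : batch) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < row.length; i++) {
              line.append(String.format("%-" + widths[i] + "s ", row[i]));
            }
            System.out.println(line);
          }
        }
      }

      static int[] computeWidths(List<String[]> batch) {
        int cols = batch.isEmpty() ? 0 : batch.get(0).length;
        int[] widths = new int[cols];
        Arrays.fill(widths, 1);                              // minimum width of 1 per column
        for (String[] row : batch) {
          for (int i = 0; i < cols; i++) {
            widths[i] = Math.max(widths[i], row[i].length());
          }
        }
        return widths;
      }

      public static void main(String[] args) {
        List<String[]> data = Arrays.asList(new String[]{"id", "name"}, new String[]{"1", "Alice"});
        printIncrementally(data.iterator(), 1);              // batch size 1: widths per batch
      }
    }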


Diffs (updated)
-

  beeline/pom.xml a720d08 
  beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 5aaa385 
  beeline/src/java/org/apache/hive/beeline/BufferedRows.java 962c531 
  beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 8aef976 
  beeline/src/java/org/apache/hive/beeline/Rows.java 453f685 
  beeline/src/main/resources/BeeLine.properties 7500df9 
  beeline/src/test/org/apache/hive/beeline/TestIncrementalRows.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/49782/diff/


Testing
---

* Unit Test added for `IncrementalRows`
* Tested locally


Thanks,

Sahil Takiar



Re: Review Request 49782: HIVE-14170: Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used

2016-07-11 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49782/#review141714
---




beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java (line 75)


Should we use a constant variable for the 1000 number?



beeline/src/java/org/apache/hive/beeline/IncrementalRows.java (line 36)


Is '--incremental' still working? 
Also, I don't see this config in the --help message. I think adding it 
to BeeLine.properties will add it, as Peter Vary mentioned.


- Sergio Pena


On July 8, 2016, 8 p.m., Sahil Takiar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49782/
> ---
> 
> (Updated July 8, 2016, 8 p.m.)
> 
> 
> Review request for hive, Sergio Pena, Thejas Nair, and Vaibhav Gumashta.
> 
> 
> Bugs: HIVE-14170
> https://issues.apache.org/jira/browse/HIVE-14170
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Added a new BeeLine option called `incrementalBufferRows` which controls 
> the number of `Row`s the `IncrementalRows` class should buffer; by default it 
> is 1000
> * Modified `BufferedRows` so that it can accept a limit on the number of 
> `Row`s it buffers
> * Modified `IncrementalRows` to read the value of `incrementalBufferRows` and 
> buffer rows as per HIVE-14170
> * The class delegates all buffering work to a `BufferedRows` class
> * This has the advantage that all the width calculation that spans multiple 
> rows can be encapsulated in the `BufferedRows` class; there is no need to 
> re-implement the logic in `IncrementalRows`
> * `IncrementalRows` will buffer `incrementalBufferRows` rows at a time; when 
> the buffer is depleted, it will fetch the next buffer and re-calculate the 
> width for that buffer
> 
> 
> Diffs
> -
> 
>   beeline/pom.xml a720d08 
>   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 5aaa385 
>   beeline/src/java/org/apache/hive/beeline/BufferedRows.java 962c531 
>   beeline/src/java/org/apache/hive/beeline/IncrementalRows.java 8aef976 
>   beeline/src/java/org/apache/hive/beeline/Rows.java 453f685 
>   beeline/src/test/org/apache/hive/beeline/TestIncrementalRows.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49782/diff/
> 
> 
> Testing
> ---
> 
> * Unit Test added for `IncrementalRows`
> * Tested locally
> 
> 
> Thanks,
> 
> Sahil Takiar
> 
>



[jira] [Created] (HIVE-14207) Strip HiveConf hidden params in webui conf

2016-07-11 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-14207:
---

 Summary: Strip HiveConf hidden params in webui conf
 Key: HIVE-14207
 URL: https://issues.apache.org/jira/browse/HIVE-14207
 Project: Hive
  Issue Type: Bug
  Components: Web UI
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


HIVE-12338 introduced a new web UI, which has a page that displays the current 
HiveConf being used by HS2. However, it does not strip entries that are 
considered "hidden" conf parameters before displaying that config, thus 
exposing those values through the HS2 web UI. We need to add stripping to this.
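
A generic sketch of the intended stripping (illustrative only; it does not use the 
actual HiveConf API, and the hidden-key example is hypothetical):

{noformat}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class HiddenConfStripper {
  // Return a copy of the config with hidden entries removed before the web UI renders it.
  static Map<String, String> stripHidden(Map<String, String> conf, Set<String> hiddenKeys) {
    Map<String, String> visible = new HashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      if (!hiddenKeys.contains(e.getKey())) {
        visible.put(e.getKey(), e.getValue());
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("some.keystore.password", "secret");   // hypothetical hidden entry
    conf.put("hive.execution.engine", "mr");
    Set<String> hidden = Collections.singleton("some.keystore.password");
    System.out.println(stripHidden(conf, hidden));  // only the non-hidden entry remains
  }
}
{noformat}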



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14206) CASE boolean output is NULL when SUM predicate returns NULL

2016-07-11 Thread Stathis Fotiadis (JIRA)
Stathis Fotiadis created HIVE-14206:
---

 Summary: CASE boolean output is NULL when SUM predicate returns 
NULL
 Key: HIVE-14206
 URL: https://issues.apache.org/jira/browse/HIVE-14206
 Project: Hive
  Issue Type: Bug
Reporter: Stathis Fotiadis
Priority: Critical


This query 
`
select
(case when 1=sum(1) then true else false end) bool_true,
(case when 1=sum(0) then true else false end) bool_false,
(case when 1=(1+null) then true else false end) bool_null,
(case when 1=sum(1+null) then true else false end) bool_null_sum,
(case when 1=sum(1) then 'true' else 'false' end) string_true,
(case when 1=sum(0) then 'true' else 'false' end) string_false,
(case when 1=(1+null) then 'true' else 'false' end) string_null,
(case when 1=sum(1+null) then 'true' else 'false' end) string_null_sum
`

results in

bool_true  bool_false  bool_null  bool_null_sum  string_true  string_false  string_null  string_null_sum
true       false       false      NULL           true         false         false        false

First of all, there is the inconsistency between (1+null) and SUM(1+null), and 
secondly the inconsistency between using boolean and string output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 49882: HIVE-14146: Column comments with "\n" character "corrupts" table metadata

2016-07-11 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49882/
---

(Updated July 11, 2016, 1:43 p.m.)


Review request for hive, Aihua Xu, Sergio Pena, Szehon Ho, and Vihang 
Karajgaonkar.


Changes
---

Incorporated Aihua's recommendations


Bugs: HIVE-14146
https://issues.apache.org/jira/browse/HIVE-14146


Repository: hive-git


Description
---

The patch contains:
- MetaDataFormatUtils changes - to escape the \n in index and column comments 
(table comments are already handled)
- TextMetaDataFormatter changes - to escape the \n in database comments
- DDLTask changes - to escape \n in the show create table result
- New query test, to test the escaping (a simple escaping sketch follows below)
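
A minimal sketch of the kind of escaping involved (illustrative only; the helper name 
is made up and this is not the actual HiveStringUtils change):

    public class CommentEscaper {
      // Escape backslashes and newlines so a comment renders on a single line of metadata output.
      static String escapeComment(String comment) {
        if (comment == null) {
          return null;
        }
        return comment.replace("\\", "\\\\")
                      .replace("\n", "\\n")
                      .replace("\r", "\\r");
      }

      public static void main(String[] args) {
        System.out.println(escapeComment("line one\nline two"));  // prints: line one\nline two
      }
    }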


Diffs (updated)
-

  common/src/java/org/apache/hive/common/util/HiveStringUtils.java 72c3fa9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bb43950 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 03803bb 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
 47d67b1 
  ql/src/test/queries/clientpositive/escape_comments.q PRE-CREATION 
  ql/src/test/results/clientpositive/escape_comments.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49882/diff/


Testing
---

New unit test and manually


Thanks,

Peter Vary



Re: Review Request 49619: sorting of tuple array using multiple fields

2016-07-11 Thread Simanchal Das

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49619/
---

(Updated July 11, 2016, 1:32 p.m.)


Review request for hive, Ashutosh Chauhan and Carl Steinbach.


Repository: hive-git


Description (updated)
---

https://issues.apache.org/jira/browse/HIVE-14159

Problem Statement:

When we are working with complex data structures like Avro, we often encounter 
an array that contains multiple tuples, where each tuple has a struct schema.
Suppose the struct schema is like below:
{
"name": "employee",
"type": [{
"type": "record",
"name": "Employee",
"namespace": "com.company.Employee",
"fields": [{
"name": "empId",
"type": "int"
}, {
"name": "empName",
"type": "string"
}, {
"name": "age",
"type": "int"
}, {
"name": "salary",
"type": "double"
}]
}]
}

Then, while running our Hive query, the complex array looks like an array of 
employee objects.
Example: 
//(array<struct<empId:int,empName:string,age:int,salary:double>>)

Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]

When implementing day-to-day business use cases, we encounter problems like 
sorting a tuple array by specific field[s], such as empId, name, salary, etc., 
in ASC or DESC order.
Proposal:
I have developed a UDF 'sort_array_by' which will sort a tuple array by one or 
more fields in ASC or DESC order provided by the user; the default is ascending 
order (a plain-Java sketch of this multi-field ordering follows the examples below).
Example:
1.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
output: 
array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]

2.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
output: 
array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]

3.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
output: 
array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
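
A plain-Java sketch of the multi-field ordering the UDF is meant to provide 
(illustrative only; the Employee class below just models the struct from the example, 
and this is not the UDF implementation):

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    public class SortTuplesByFields {
      // Simple value class standing in for struct<empId:int,empName:string,age:int,salary:double>.
      static class Employee {
        final int empId; final String empName; final int age; final double salary;
        Employee(int empId, String empName, int age, double salary) {
          this.empId = empId; this.empName = empName; this.age = age; this.salary = salary;
        }
        @Override public String toString() {
          return "struct(" + empId + "," + empName + "," + age + "," + salary + ")";
        }
      }

      public static void main(String[] args) {
        List<Employee> arr = Arrays.asList(
            new Employee(100, "Foo", 20, 20990),
            new Employee(500, "Boo", 30, 50990),
            new Employee(700, "Harry", 25, 40990),
            new Employee(100, "Tom", 35, 70990));
        // Multi-field ascending order: by empName, then by salary (mirrors example 2 above).
        Comparator<Employee> byNameThenSalary =
            Comparator.comparing((Employee e) -> e.empName)
                      .thenComparingDouble(e -> e.salary);
        arr.sort(byNameThenSalary);
        arr.forEach(System.out::println);
      }
    }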


Diffs
-

  itests/src/test/resources/testconfiguration.properties 1ab914d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 2f4a94c 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSortArrayByField.java
 PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong1.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong2.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_sort_array_by.q PRE-CREATION 
  ql/src/test/results/beelinepositive/show_functions.q.out 4f3ec40 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong2.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out a811747 
  ql/src/test/results/clientpositive/udf_sort_array_by.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49619/diff/


Testing
---

Junit test cases and query.q files are attached


Thanks,

Simanchal Das



Re: Review Request 49882: HIVE-14146: Column comments with "\n" character "corrupts" table metadata

2016-07-11 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49882/#review141686
---




common/src/java/org/apache/hive/common/util/HiveStringUtils.java (line 644)


Somehow I feel we should have a better name for this function, since we are 
escaping single quotes and \ as well. 

Maybe just the original one, escapeHiveCommand()?


- Aihua Xu


On July 11, 2016, 9:36 a.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49882/
> ---
> 
> (Updated July 11, 2016, 9:36 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Sergio Pena, Szehon Ho, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-14146
> https://issues.apache.org/jira/browse/HIVE-14146
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch contains:
> - MetaDataFormatUtils changes - to escape the \n in index, and column 
> comments (table comments are already handled)
> - TextMetaDataFormatter changes - to escape the \n in database comments
> - DDLTask changes - to escape \n in the show create table result
> - New query test, to test the escaping
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hive/common/util/HiveStringUtils.java 72c3fa9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bb43950 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
>  03803bb 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
>  47d67b1 
>   ql/src/test/queries/clientpositive/escape_comments.q PRE-CREATION 
>   ql/src/test/results/clientpositive/escape_comments.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/49882/diff/
> 
> 
> Testing
> ---
> 
> New unit test and manually
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



[jira] [Created] (HIVE-14205) Hive doesn't support union type with AVRO file format

2016-07-11 Thread Yibing Shi (JIRA)
Yibing Shi created HIVE-14205:
-

 Summary: Hive doesn't support union type with AVRO file format
 Key: HIVE-14205
 URL: https://issues.apache.org/jira/browse/HIVE-14205
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Yibing Shi


Reproduce steps:
{noformat}
hive> CREATE TABLE avro_union_test
> PARTITIONED BY (p int)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> TBLPROPERTIES ('avro.schema.literal'='{
>"type":"record",
>"name":"nullUnionTest",
>"fields":[
>   {
>  "name":"value",
>  "type":[
> "null",
> "int",
> "long"
>  ],
>  "default":null
>   }
>]
> }');
OK
Time taken: 0.105 seconds
hive> alter table avro_union_test add partition (p=1);
OK
Time taken: 0.093 seconds
hive> select * from avro_union_test;
FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
Failed with exception Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported 
yet.java.lang.RuntimeException: Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported yet.
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettablePrimitiveOI(ObjectInspectorUtils.java:1140)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:1149)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1187)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1220)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1200)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:581)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
at 
org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:79)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:482)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:218)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:170)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:381)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:773)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:691)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

Another test case to show this problem is:
{noformat}
hive> create table avro_union_test2 (value uniontype<int,bigint>) stored as 
avro;
OK
Time taken: 0.053 seconds
hive> show create table avro_union_test2;
OK
CREATE TABLE `avro_union_test2`(
  `value` uniontype<void,int,bigint> COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
  'hdfs://localhost/user/hive/warehouse/avro_union_test2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1468173589')
Time taken: 0.051 seconds, Fetched: 12 row(s)
{noformat}

Although column {{value}} is defined as {{uniontype<int,bigint>}} in the create 
table command, its type becomes {{uniontype<void,int,bigint>}} after the table is 
defined. Hive accidentally makes the nullable definition in the avro schema 
({{\["null", "int", 

Review Request 49882: HIVE-14146: Column comments with "\n" character "corrupts" table metadata

2016-07-11 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49882/
---

Review request for hive, Aihua Xu, Sergio Pena, Szehon Ho, and Vihang 
Karajgaonkar.


Bugs: HIVE-14146
https://issues.apache.org/jira/browse/HIVE-14146


Repository: hive-git


Description
---

The patch contains:
- MetaDataFormatUtils changes - to escape the \n in index, and column comments 
(table comments are already handled)
- TextMetaDataFormatter changes - to escape the \n in database comments
- DDLTask changes - to escape \n in the show create table result
- New query test, to test the escaping


Diffs
-

  common/src/java/org/apache/hive/common/util/HiveStringUtils.java 72c3fa9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bb43950 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 03803bb 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
 47d67b1 
  ql/src/test/queries/clientpositive/escape_comments.q PRE-CREATION 
  ql/src/test/results/clientpositive/escape_comments.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49882/diff/


Testing
---

New unit test and manually


Thanks,

Peter Vary



Review Request 49885: HIVE-14027: NULL values produced by left outer join do not behave as NULL (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)

2016-07-11 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49885/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-14027
https://issues.apache.org/jira/browse/HIVE-14027


Repository: hive-git


Description
---

HIVE-14027: NULL values produced by left outer join do not behave as NULL 
(Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 
ff292a34dbbcd6f95de34275b108229afdace289 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/UnwrapRowContainer.java 
e7771e678a5aaff354637e08f985344fadea339c 
  ql/src/test/queries/clientpositive/mapjoin2.q PRE-CREATION 
  ql/src/test/results/clientpositive/mapjoin2.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49885/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



Re: Review Request 49881: HIVE-14204: Optimize loading dynamic partitions

2016-07-11 Thread Rajesh Balamohan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49881/
---

(Updated July 11, 2016, 9:01 a.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-14204
https://issues.apache.org/jira/browse/HIVE-14204


Repository: hive-git


Description
---

A lot of time is spent loading dynamically partitioned datasets sequentially 
on the driver side.

E.g. a simple dynamic partitioned load as follows takes 300+ seconds:

INSERT INTO web_sales_test partition(ws_sold_date_sk) select * from 
tpcds_bin_partitioned_orc_200.web_sales;

Time taken to load dynamic partitions: 309.22 seconds
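
An illustrative sketch of the general idea of parallelizing the per-partition work 
(made-up helper names; not the actual Hive.java change):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelPartitionLoadSketch {
      interface PartitionLoader {
        void load(String partitionPath) throws Exception;
      }

      // Load the discovered partition directories through a small thread pool
      // instead of one-by-one on the driver.
      static void loadAll(List<String> partitionPaths, PartitionLoader loader, int threads)
          throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
          List<Future<?>> futures = new ArrayList<>();
          for (final String p : partitionPaths) {
            futures.add(pool.submit(() -> { loader.load(p); return null; }));
          }
          for (Future<?> f : futures) {
            f.get();  // surface any per-partition failure
          }
        } finally {
          pool.shutdown();
        }
      }
    }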


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 82abd52 

Diff: https://reviews.apache.org/r/49881/diff/


Testing
---


Thanks,

Rajesh Balamohan



[jira] [Created] (HIVE-14204) Optimize loading dynamic partitions

2016-07-11 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-14204:
---

 Summary: Optimize loading dynamic partitions
 Key: HIVE-14204
 URL: https://issues.apache.org/jira/browse/HIVE-14204
 Project: Hive
  Issue Type: Improvement
Reporter: Rajesh Balamohan
Priority: Minor


A lot of time is spent loading dynamically partitioned datasets sequentially 
on the driver side. E.g. a simple dynamic partitioned load as follows takes 
300+ seconds:

{noformat}
INSERT INTO web_sales_test partition(ws_sold_date_sk) select * from 
tpcds_bin_partitioned_orc_200.web_sales;


Time taken to load dynamic partitions: 309.22 seconds
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 49619: sorting of tuple array using multiple fields

2016-07-11 Thread Simanchal Das

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49619/
---

(Updated July 11, 2016, 8:37 a.m.)


Review request for hive, Ashutosh Chauhan and Carl Steinbach.


Repository: hive-git


Description
---

Problem Statement:

When we are working with complex data structures like Avro, we often encounter 
an array that contains multiple tuples, where each tuple has a struct schema.
Suppose the struct schema is like below:
{
"name": "employee",
"type": [{
"type": "record",
"name": "Employee",
"namespace": "com.company.Employee",
"fields": [{
"name": "empId",
"type": "int"
}, {
"name": "empName",
"type": "string"
}, {
"name": "age",
"type": "int"
}, {
"name": "salary",
"type": "double"
}]
}]
}

Then, while running our Hive query, the complex array looks like an array of 
employee objects.
Example: 
//(array<struct<empId:int,empName:string,age:int,salary:double>>)

Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]

When implementing day-to-day business use cases, we encounter problems like 
sorting a tuple array by specific field[s], such as empId, name, salary, etc., 
in ASC or DESC order.
Proposal:
I have developed a UDF 'sort_array_by' which will sort a tuple array by one or 
more fields in ASC or DESC order provided by the user; the default is ascending order.
Example:
1.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
output: 
array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]

2.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
output: 
array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]

3.Select 
sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
output: 
array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 1ab914d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 2f4a94c 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSortArrayByField.java
 PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong1.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong2.q PRE-CREATION 
  ql/src/test/queries/clientnegative/udf_sort_array_by_wrong3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_sort_array_by.q PRE-CREATION 
  ql/src/test/results/beelinepositive/show_functions.q.out 4f3ec40 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong2.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/udf_sort_array_by_wrong3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out a811747 
  ql/src/test/results/clientpositive/udf_sort_array_by.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49619/diff/


Testing
---

Junit test cases and query.q files are attached


Thanks,

Simanchal Das



Re: Review Request 49619: sorting of tuple array using multiple fields

2016-07-11 Thread Simanchal Das


> On July 9, 2016, 8:53 p.m., Carl Steinbach wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java,
> >  line 56
> > 
> >
> > A couple notes:
> > 
> > 1. I think the example should actually return a value that is different 
> > than the input. It would also be good to include more than two elements in 
> > the input. If screen space is an issue I recommend only including a single 
> > element in each of the structs in the example, which I think has the added 
> > benefit of making the example clearer by not distracting the reader with 
> > irrelevant details.
> > 
> > 2. It looks like the default sorting order (ASC) is actually the 
> > reverse of what I would expect it to be, i.e. I expect 'b' to come before 
> > 'g'.
> > 
> > 3. Related to point (2), I think it's important to ensure that the 
> > sorting order of this UDF is consisent with ORDER BY, e.g. for a table t 
> > containing a single row with a single array of struct field a_struct_array, 
> > the queries "SELECT a_struct FROM t LATERAL VIEW explode(a_struct_array) 
> > structTable AS a_struct ORDER BY a_struct.col1 DESC" should return the same 
> > results as "SELECT a_struct FROM t LATERAL VIEW 
> > explode(sort_array_by(a_struct_array, 'col1', 'DESC')) structTable AS 
> > a_struct". Note that I probably didn't get the syntax for LATERAL VIEW and 
> > explode() correct.

1. Sorry, there was a typo.
2. Corrected the sorting order in the example.
3. As per your instruction, I have added a test example of LATERAL VIEW 
explode(array) and explode(udf), which gives the same results.


> On July 9, 2016, 8:53 p.m., Carl Steinbach wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java,
> >  line 93
> > 
> >
> > Unnecessary string concatenation operators.

removed


> On July 9, 2016, 8:53 p.m., Carl Steinbach wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java,
> >  line 104
> > 
> >
> > Unnecessary "+" operator.

removed


> On July 9, 2016, 8:53 p.m., Carl Steinbach wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFSortArrayByField.java,
> >  line 1
> > 
> >
> > Does this unit test provide any additional coverage or advantages over 
> > the q file tests? Is it necessary to have both?
> > 
> > Note that I am a strong advocate of end-to-end qfile tests over unit 
> > tests, which is an opinion that not everyone holds.

These are largely the same as the q file tests. I feel test cases in Test 
classes are good for unit testing during development and take less time to 
identify problems compared to q files.
Anyway, I have removed some test cases from the Test class.


- Simanchal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49619/#review141413
---


On July 8, 2016, 12:35 p.m., Simanchal Das wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49619/
> ---
> 
> (Updated July 8, 2016, 12:35 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Carl Steinbach.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Problem Statement:
> 
> When we are working with complex data structures like Avro, we often encounter 
> an array that contains multiple tuples, where each tuple has a struct schema.
> Suppose the struct schema is like below:
> {
>   "name": "employee",
>   "type": [{
>   "type": "record",
>   "name": "Employee",
>   "namespace": "com.company.Employee",
>   "fields": [{
>   "name": "empId",
>   "type": "int"
>   }, {
>   "name": "empName",
>   "type": "string"
>   }, {
>   "name": "age",
>   "type": "int"
>   }, {
>   "name": "salary",
>   "type": "double"
>   }]
>   }]
> }
> 
> Then, while running our Hive query, the complex array looks like an array of 
> employee objects.
> Example: 
>   //(array<struct<empId:int,empName:string,age:int,salary:double>>)
>   
> Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> 
> When implementing day-to-day business use cases, we encounter 
> problems like sorting a 

Re: Request write access permissions to Hive Wiki

2016-07-11 Thread Lefty Leverenz
Done.  Welcome to the Hive wiki team, Sergio!

-- Lefty

On Sun, Jul 10, 2016 at 5:36 PM, Sergio Pena 
wrote:

> Hi,
>
> Can I get write access permissions to the Hive wiki? My username is: spena
>
> Thanks,
> - Sergio
>


Re: Review Request 49644: Support masking and filtering of rows/columns: deal with derived column names

2016-07-11 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49644/
---

(Updated July 11, 2016, 6:28 a.m.)


Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.


Repository: hive-git


Description
---

HIVE-14158


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 1e44ccf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/MaskAndFilterInfo.java f5a12a3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 96ad809 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TableMask.java 1686f36 
  ql/src/test/queries/clientpositive/masking_6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/masking_7.q PRE-CREATION 
  ql/src/test/queries/clientpositive/masking_8.q PRE-CREATION 
  ql/src/test/queries/clientpositive/view_alias.q PRE-CREATION 
  ql/src/test/results/clientpositive/masking_1.q.out 3b63550 
  ql/src/test/results/clientpositive/masking_1_newdb.q.out 51c2619 
  ql/src/test/results/clientpositive/masking_2.q.out ff045a9 
  ql/src/test/results/clientpositive/masking_3.q.out 1925dce 
  ql/src/test/results/clientpositive/masking_4.q.out 7e923e8 
  ql/src/test/results/clientpositive/masking_6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/masking_7.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/masking_8.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/masking_disablecbo_1.q.out 6717527 
  ql/src/test/results/clientpositive/masking_disablecbo_3.q.out 6aaab20 
  ql/src/test/results/clientpositive/view_alias.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49644/diff/


Testing
---


Thanks,

pengcheng xiong