Re: Timeline for release of Hive 0.14

2014-08-21 Thread Lefty Leverenz
Release 0.14 should include HIVE-6586 (various fixes to
HiveConf.java parameters).  I'll do that as soon as possible.

72 JIRAs have the TODOC14 label now, although my own tally is 99.  This is
more than mere mortals can accomplish in a few weeks.  Therefore I
recommend that you all plead with your managers to allocate some
tech-writer resources to Hive wikidocs for the 0.14.0 release.

I'll send out a state-of-the-docs message in a separate thread.

-- Lefty


On Fri, Aug 22, 2014 at 2:28 AM, Alan Gates  wrote:

> +1, Eugene and I are working on getting HIVE-5317 (insert, update, delete)
> done and would like to get it in.
>
> Alan.
>
>   Nick Dimiduk 
>  August 20, 2014 at 12:27
> It'd be great to get HIVE-4765 included in 0.14. The proposed changes are a
> big improvement for us HBase folks. Would someone mind having a look in
> that direction?
>
> Thanks,
> Nick
>
>
>
>   Thejas Nair 
>  August 19, 2014 at 15:20
> +1
> Sounds good to me.
> It's already almost 4 months since the last release. It is time to
> start preparing for the next one.
> Thanks for volunteering!
>
>
>   Vikram Dixit 
>  August 19, 2014 at 14:02
> Hi Folks,
>
> I was thinking that it was about time that we had a release of hive 0.14
> given our commitment to having a release of hive on a periodic basis. We
> could cut a branch and start working on a release in say 2 weeks time
> around September 5th (Friday). After branching, we can focus on stabilizing
> for the release and hopefully have an RC in about 2 weeks post that. I
> would like to volunteer myself for the duties of the release manager for
> this version if the community agrees.
>
> Thanks
> Vikram.
>
>
> --
> Sent with Postbox 
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>


[jira] [Commented] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106558#comment-14106558
 ] 

Hive QA commented on HIVE-6245:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663248/HIVE-6245.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6116 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/451/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/451/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-451/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663248

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Chaoyu Tang
>Assignee: Venki Korukanti
> Attachments: HIVE-6245.2.patch.txt, HIVE-6245.3.patch.txt, 
> HIVE-6245.4.patch, HIVE-6245.patch
>
>
> The following combination of settings is valid but does not work correctly in 
> the current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or 
> a JDBC application) and create DBs/Tables with that user's ownership.
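For readers setting up this scenario, the four settings above can be rendered as a hive-site.xml fragment. The property names and values are exactly those listed in the report; only the XML rendering is added here for convenience:

```xml
<!-- hive-site.xml fragment for the scenario described in HIVE-6245 -->
<property>
  <name>hive.server2.authentication</name>
  <value>NONE</value> <!-- or LDAP -->
</property>
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.execute.setugi</name>
  <value>true</value>
</property>
```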



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7832) Do ORC dictionary check at a finer level and preserve encoding across stripes

2014-08-21 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106554#comment-14106554
 ] 

Gopal V commented on HIVE-7832:
---

Minor refactoring comments on RB.

LGTM +1, pending tests pass.

> Do ORC dictionary check at a finer level and preserve encoding across stripes
> -
>
> Key: HIVE-7832
> URL: https://issues.apache.org/jira/browse/HIVE-7832
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7832.1.patch
>
>
> Currently the ORC dictionary check happens while writing the stripe. Just 
> before writing a stripe, if the ratio of dictionary entries to total non-null 
> rows is greater than a threshold, the dictionary is discarded. Also, the 
> decision whether to use a dictionary is preserved across stripes. This 
> sometimes leads to a costly O(log n) insertion cost per stripe when there are 
> too many distinct keys.
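The ratio check described above can be sketched as follows. This is a minimal, hypothetical illustration — the class and method names are not the actual ORC writer internals, and the 0.8 default assumes the `hive.exec.orc.dictionary.key.size.threshold` setting:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the dictionary-vs-direct encoding decision.
// Names are illustrative, not the real ORC writer code.
public class DictionaryCheck {
    // Assumed default of hive.exec.orc.dictionary.key.size.threshold
    static final double THRESHOLD = 0.8;

    static boolean useDictionary(String[] nonNullValues) {
        Set<String> distinct = new HashSet<>();
        for (String v : nonNullValues) {
            distinct.add(v);
        }
        double ratio = (double) distinct.size() / nonNullValues.length;
        // Keep the dictionary only when the column is repetitive enough.
        return ratio <= THRESHOLD;
    }

    public static void main(String[] args) {
        String[] repetitive = {"a", "a", "b", "a", "b"};    // 2/5 distinct
        String[] mostlyUnique = {"a", "b", "c", "d", "e"};  // 5/5 distinct
        System.out.println(useDictionary(repetitive));      // true
        System.out.println(useDictionary(mostlyUnique));    // false
    }
}
```

The point of the JIRA is that making this decision once per stripe (and carrying it forward) can be too coarse when the distinct-key ratio varies.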



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Xiaomeng Huang


> On Aug. 22, 2014, 6:14 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java, line 54
> > 
> >
> > Can we make this final, and not have a setter?  The caller can just add 
> > to the list.  It'll make the code a bit simpler.
> > 
> > Also, should it be a Set?

Thanks, I think it is better to be a list. I get the accessed columns from 
tableToColumnAccessMap, which is a Map from table name to column list. Hive's 
native authorization uses this list too.
I get the column list by table name and then set it on the ReadEntity directly, 
without needing to add each column in a loop, so a setter is necessary.
BTW, I can also add an API addAccessedColumn(String column) to add one column 
to this column list.
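A minimal sketch of the API shape being discussed — a final list with add methods instead of a replacing setter. The class and method names are illustrative only, not the actual org.apache.hadoop.hive.ql.hooks.ReadEntity source:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the reviewed API shape; not the real ReadEntity.
public class ReadEntitySketch {
    // final field: callers mutate the list rather than replacing it
    private final List<String> accessedColumns = new ArrayList<>();

    public List<String> getAccessedColumns() {
        return accessedColumns;
    }

    // convenience discussed in the thread: add one column at a time
    public void addAccessedColumn(String column) {
        accessedColumns.add(column);
    }

    // bulk variant: avoids a caller-side loop without needing a setter
    public void addAccessedColumns(List<String> columns) {
        accessedColumns.addAll(columns);
    }

    public static void main(String[] args) {
        ReadEntitySketch e = new ReadEntitySketch();
        e.addAccessedColumn("state");
        e.addAccessedColumns(List.of("locid", "zip"));
        System.out.println(e.getAccessedColumns()); // [state, locid, zip]
    }
}
```

With this shape the caller that already holds a whole column list (e.g. from tableToColumnAccessMap) calls the bulk method once, which addresses the "don't add one by one in a loop" concern without exposing a setter.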


> On Aug. 22, 2014, 6:14 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9521
> > 
> >
> > No need for '==true' part.

fixed. Thanks.


> On Aug. 22, 2014, 6:14 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9539
> > 
> >
> > Can we indent this code block inside {}?

fixed. thanks.


- Xiaomeng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/#review51257
---


On Aug. 22, 2014, 6:47 a.m., Xiaomeng Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24962/
> ---
> 
> (Updated Aug. 22, 2014, 6:47 a.m.)
> 
> 
> Review request for hive, Prasad Mujumdar and Szehon Ho.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> External authorization model can not get accessed columns from query. Hive 
> should store accessed columns to ReadEntity 
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 
> 
> Diff: https://reviews.apache.org/r/24962/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Xiaomeng Huang
> 
>



Re: Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Xiaomeng Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/
---

(Updated Aug. 22, 2014, 6:47 a.m.)


Review request for hive, Prasad Mujumdar and Szehon Ho.


Repository: hive-git


Description
---

External authorization model can not get accessed columns from query. Hive 
should store accessed columns to ReadEntity 


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 

Diff: https://reviews.apache.org/r/24962/diff/


Testing
---


Thanks,

Xiaomeng Huang



[jira] [Assigned] (HIVE-7794) Enable tests on Spark branch (4) [Spark Branch]

2014-08-21 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam reassigned HIVE-7794:
--

Assignee: Chinna Rao Lalam

> Enable tests on Spark branch (4) [Spark Branch]
> 
>
> Key: HIVE-7794
> URL: https://issues.apache.org/jira/browse/HIVE-7794
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Chinna Rao Lalam
>
> This jira is to enable *most* of the tests below. If tests don't pass because 
> of some unsupported feature, ensure that a JIRA exists and move on.
> {noformat}
>   vector_cast_constant.q,\
>   vector_data_types.q,\
>   vector_decimal_aggregate.q,\
>   vector_left_outer_join.q,\
>   vector_string_concat.q,\
>   vectorization_12.q,\
>   vectorization_13.q,\
>   vectorization_14.q,\
>   vectorization_15.q,\
>   vectorization_9.q,\
>   vectorization_part_project.q,\
>   vectorization_short_regress.q,\
>   vectorized_mapjoin.q,\
>   vectorized_nested_mapjoin.q,\
>   vectorized_ptf.q,\
>   vectorized_shufflejoin.q,\
>   vectorized_timestamp_funcs.q
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7821) StarterProject: enable groupby4.q

2014-08-21 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106546#comment-14106546
 ] 

Chinna Rao Lalam commented on HIVE-7821:


Hi [~brocknoland],

I didn't know you had created this for Suhas. I am handling the group-by 
queries in the previous JIRA, so I assigned it to myself to avoid duplicate work. 
I don't mind letting Suhas work on this.


> StarterProject: enable groupby4.q
> -
>
> Key: HIVE-7821
> URL: https://issues.apache.org/jira/browse/HIVE-7821
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7702) Start running .q file tests on spark [Spark Branch]

2014-08-21 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106542#comment-14106542
 ] 

Chinna Rao Lalam commented on HIVE-7702:


Hi [~brocknoland],

Compared against MR, most of the differences are due to sorting order only.





> Start running .q file tests on spark [Spark Branch]
> ---
>
> Key: HIVE-7702
> URL: https://issues.apache.org/jira/browse/HIVE-7702
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Chinna Rao Lalam
> Attachments: HIVE-7702-spark.patch, HIVE-7702.1-spark.patch
>
>
> Spark can currently support only a few queries; however, there are some .q 
> file tests which will pass today. The basic idea is that we should get some 
> number of these actually working (10-20) so we can actually start testing the 
> project.
> A good starting point might be the udf*, varchar*, or alter* tests:
> https://github.com/apache/hive/tree/spark/ql/src/test/queries/clientpositive
> To generate the output file for test XXX.q, you'd do:
> {noformat}
> mvn clean install -DskipTests -Phadoop-2
> cd itests
> mvn clean install -DskipTests -Phadoop-2
> cd qtest-spark
> mvn test -Dtest=TestCliDriver -Dqfile=XXX.q -Dtest.output.overwrite=true 
> -Phadoop-2
> {noformat}
> which would generate XXX.q.out which we can check-in to source control as a 
> "golden file".
> Multiple tests can be run at a given time like so:
> {noformat}
> mvn test -Dtest=TestCliDriver -Dqfile=X1.q,X2.q -Dtest.output.overwrite=true 
> -Phadoop-2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Timeline for release of Hive 0.14

2014-08-21 Thread Alan Gates
+1, Eugene and I are working on getting HIVE-5317 (insert, update, 
delete) done and would like to get it in.


Alan.


Nick Dimiduk 
August 20, 2014 at 12:27
It'd be great to get HIVE-4765 included in 0.14. The proposed changes are a
big improvement for us HBase folks. Would someone mind having a look in
that direction?

Thanks,
Nick



Thejas Nair 
August 19, 2014 at 15:20
+1
Sounds good to me.
It's already almost 4 months since the last release. It is time to
start preparing for the next one.
Thanks for volunteering!


Vikram Dixit 
August 19, 2014 at 14:02
Hi Folks,

I was thinking that it was about time that we had a release of hive 0.14
given our commitment to having a release of hive on a periodic basis. We
could cut a branch and start working on a release in say 2 weeks time
around September 5th (Friday). After branching, we can focus on stabilizing
for the release and hopefully have an RC in about 2 weeks post that. I
would like to volunteer myself for the duties of the release manager for
this version if the community agrees.

Thanks
Vikram.



--
Sent with Postbox 



[jira] [Commented] (HIVE-7654) A method to extrapolate columnStats for partitions of a table

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106528#comment-14106528
 ] 

Szehon Ho commented on HIVE-7654:
-

Sorry if this is a dumb question, but I was curious: is it typical behavior to 
extrapolate in all cases? I can see it would be a good approximation in some 
cases, but would it ever be undesirable in others?

> A method to extrapolate columnStats for partitions of a table
> -
>
> Key: HIVE-7654
> URL: https://issues.apache.org/jira/browse/HIVE-7654
> Project: Hive
>  Issue Type: New Feature
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: Extrapolate the Column Status.docx, HIVE-7654.0.patch, 
> HIVE-7654.1.patch, HIVE-7654.4.patch, HIVE-7654.6.patch, HIVE-7654.7.patch, 
> HIVE-7654.8.patch
>
>
> In a PARTITIONED table, there are many partitions. For example, 
> create table if not exists loc_orc (
>   state string,
>   locid int,
>   zip bigint
> ) partitioned by(year string) stored as orc;
> We assume there are 4 partitions, partition(year='2000'), 
> partition(year='2001'), partition(year='2002') and partition(year='2003').
> We can use the following command to compute statistics for columns 
> state,locid of partition(year='2001')
> analyze table loc_orc partition(year='2001') compute statistics for columns 
> state,locid;
> We need to know the “aggregated” column stats for the whole table loc_orc. 
> However, we may not have the column stats for some partitions, e.g., 
> partition(year='2002'), and we may not have the column stats for some 
> columns, e.g., zip bigint for partition(year='2001').
> We propose a method to extrapolate the missing column stats for the 
> partitions.
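As a toy illustration of the kind of extrapolation being proposed (the actual method in the attached design doc may differ), a missing per-partition statistic such as a row count could be filled in with the average of the partitions that do have it:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: estimate an aggregate statistic when some
// partitions lack column stats, using the average of the known values.
// The real extrapolation method is described in the attached design doc.
public class StatsExtrapolation {
    static double estimateTotal(Map<String, Double> rowsByPartition, int totalPartitions) {
        double sum = 0;
        for (double v : rowsByPartition.values()) sum += v;
        double avg = sum / rowsByPartition.size();
        // known values plus the average, once per missing partition
        int missing = totalPartitions - rowsByPartition.size();
        return sum + missing * avg;
    }

    public static void main(String[] args) {
        Map<String, Double> rows = new HashMap<>();
        rows.put("2000", 40.0);
        rows.put("2001", 60.0);
        rows.put("2003", 50.0);
        // partition year='2002' has no stats; estimate the 4-partition total
        System.out.println(estimateTotal(rows, 4)); // 200.0
    }
}
```

Szehon's question above is exactly about when such a fill-in is a good approximation and when it might mislead (e.g. if the missing partition is much larger or skewed relative to the known ones).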



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106532#comment-14106532
 ] 

Szehon Ho commented on HIVE-7384:
-

Thanks [~lianhuiwang] for the information.

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complications in reduce-side join, which extensively utilizes key tags and 
> shuffle behavior. Our design principle prefers making the Hive implementation 
> work out of the box as well, which might require new functionality from Spark. 
> The task is to research this area, identifying requirements for the Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on the wiki.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106523#comment-14106523
 ] 

Szehon Ho commented on HIVE-7730:
-

Thanks Xiaomeng, patch looks good overall, I put some minor comments on rb.

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a HiveSemanticAnalyzerHook, we may want to get more things from the 
> hookContext (e.g. the columns needed by the query).-
> -So we should get the instance of HiveSemanticAnalyzerHookContext from 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update() and put 
> what you want into the class.-
> Hive should store accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a new confVar we could add) is set to true.
> Then an external authorization model can get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get accessed columns from ReadEntity too.
> Here is the quick implementation in SemanticAnalyzer.analyzeInternal() below:
> {code}   boolean isColumnInfoNeedForAuth = 
> SessionState.get().isAuthorizationModeV2()
> && HiveConf.getBoolVar(conf, 
> HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
> || HiveConf.getBoolVar(this.conf, 
> HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS) == true) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new 
> ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess()); 
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: 
> // after compile, we can put accessed column list to ReadEntity getting 
> from columnAccessInfo if HIVE_AUTHORIZATION_ENABLED is set true
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/#review51257
---


Hi Xiaomeng, patch looks good, just had some style comments.


ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java


Can we make this final, and not have a setter?  The caller can just add to 
the list.  It'll make the code a bit simpler.

Also, should it be a Set?



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


No need for '==true' part.



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java


Can we indent this code block inside {}?


- Szehon Ho


On Aug. 22, 2014, 6:01 a.m., Xiaomeng Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24962/
> ---
> 
> (Updated Aug. 22, 2014, 6:01 a.m.)
> 
> 
> Review request for hive, Prasad Mujumdar and Szehon Ho.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> External authorization model can not get accessed columns from query. Hive 
> should store accessed columns to ReadEntity 
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 
> 
> Diff: https://reviews.apache.org/r/24962/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Xiaomeng Huang
> 
>



[jira] [Updated] (HIVE-7847) query orc partitioned table fail when table column type change

2014-08-21 Thread Zhichun Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichun Wu updated HIVE-7847:
-

Status: Patch Available  (was: Open)

> query orc partitioned table fail when table column type change
> --
>
> Key: HIVE-7847
> URL: https://issues.apache.org/jira/browse/HIVE-7847
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.12.0, 0.11.0
>Reporter: Zhichun Wu
>Assignee: Zhichun Wu
> Fix For: 0.14.0
>
> Attachments: HIVE-7847.1.patch
>
>
> I use the following script to test ORC column type changes with a partitioned 
> table on branch-0.13:
> {code}
> use test;
> DROP TABLE if exists orc_change_type_staging;
> DROP TABLE if exists orc_change_type;
> CREATE TABLE orc_change_type_staging (
> id int
> );
> CREATE TABLE orc_change_type (
> id int
> ) PARTITIONED BY (`dt` string)
> stored as orc;
> --- load staging table
> LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE 
> orc_change_type_staging;
> --- populate orc hive table
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM 
> orc_change_type_staging limit 1;
> --- change column id from int to bigint
> ALTER TABLE orc_change_type CHANGE id id bigint;
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM 
> orc_change_type_staging limit 1;
> SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
> {code}
> It fails in the last query "SELECT id FROM orc_change_type where dt between 
> '20140718' and '20140719';" with the exception:
> {code}
> Error: java.io.IOException: java.io.IOException: 
> java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
> to org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.io.IntWritable cannot be cast to 
> org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
> ... 11 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
> cannot be cast to org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInp

[jira] [Updated] (HIVE-7847) query orc partitioned table fail when table column type change

2014-08-21 Thread Zhichun Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichun Wu updated HIVE-7847:
-

Attachment: HIVE-7847.1.patch

> query orc partitioned table fail when table column type change
> --
>
> Key: HIVE-7847
> URL: https://issues.apache.org/jira/browse/HIVE-7847
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.11.0, 0.12.0, 0.13.0
>Reporter: Zhichun Wu
>Assignee: Zhichun Wu
> Fix For: 0.14.0
>
> Attachments: HIVE-7847.1.patch
>
>
> I use the following script to test ORC column type changes with a partitioned 
> table on branch-0.13:
> {code}
> use test;
> DROP TABLE if exists orc_change_type_staging;
> DROP TABLE if exists orc_change_type;
> CREATE TABLE orc_change_type_staging (
> id int
> );
> CREATE TABLE orc_change_type (
> id int
> ) PARTITIONED BY (`dt` string)
> stored as orc;
> --- load staging table
> LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE 
> orc_change_type_staging;
> --- populate orc hive table
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM 
> orc_change_type_staging limit 1;
> --- change column id from int to bigint
> ALTER TABLE orc_change_type CHANGE id id bigint;
> INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM 
> orc_change_type_staging limit 1;
> SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
> {code}
> It fails in the last query "SELECT id FROM orc_change_type where dt between 
> '20140718' and '20140719';" with the exception:
> {code}
> Error: java.io.IOException: java.io.IOException: 
> java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
> to org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.io.IOException: java.lang.ClassCastException: 
> org.apache.hadoop.io.IntWritable cannot be cast to 
> org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
> ... 11 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
> cannot be cast to org.apache.hadoop.io.LongWritable
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputForma

[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106515#comment-14106515
 ] 

Xiaomeng Huang commented on HIVE-7730:
--

Thanks [~szehon], I have linked it to the review board.

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
> hookContext (e.g., the needed columns from the query).-
> -So we should get the instance of HiveSemanticAnalyzerHookContext from the 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update(), and put 
> what you want into the class.-
> Hive should store the accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a confVar we could add) is set to true.
> Then an external authorization model can get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get the accessed columns from ReadEntity too.
> Here is a quick implementation in SemanticAnalyzer.analyzeInternal() below:
> {code}
> boolean isColumnInfoNeedForAuth = SessionState.get().isAuthorizationModeV2()
>     && HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
>     || HiveConf.getBoolVar(this.conf, HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS)) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess());
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: after compile, we can put the accessed column list into the ReadEntity
> // obtained from columnAccessInfo if HIVE_AUTHORIZATION_ENABLED is set to true
> {code}
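As a rough illustration of that flow, here is a self-contained Java sketch. ReadEntity and the column-access map below are simplified hypothetical stand-ins, not Hive's real classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: simplified stand-ins for Hive's ReadEntity / ColumnAccessInfo.
public class ColumnAccessSketch {
    static class ReadEntity {
        final String table;
        final List<String> accessedColumns = new ArrayList<>();
        ReadEntity(String table) { this.table = table; }
    }

    // After compilation, copy the scanned columns (table -> columns) onto the
    // ReadEntity inputs so an authorizer can see them without re-analyzing.
    static Map<String, ReadEntity> attachAccessedColumns(Map<String, List<String>> columnAccessInfo) {
        Map<String, ReadEntity> inputs = new HashMap<>();
        for (Map.Entry<String, List<String>> e : columnAccessInfo.entrySet()) {
            ReadEntity entity = new ReadEntity(e.getKey());
            entity.accessedColumns.addAll(e.getValue());
            inputs.put(e.getKey(), entity);
        }
        return inputs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> info = new HashMap<>();
        info.put("default.sales", Arrays.asList("id", "amount"));
        ReadEntity entity = attachAccessedColumns(info).get("default.sales");
        System.out.println(entity.table + " -> " + entity.accessedColumns);
    }
}
```

The point of the sketch is only that the authorizer reads columns from the inputs it already receives, rather than re-walking the query plan.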



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7847) query orc partitioned table fail when table column type change

2014-08-21 Thread Zhichun Wu (JIRA)
Zhichun Wu created HIVE-7847:


 Summary: query orc partitioned table fail when table column type 
change
 Key: HIVE-7847
 URL: https://issues.apache.org/jira/browse/HIVE-7847
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.0, 0.12.0, 0.11.0
Reporter: Zhichun Wu
Assignee: Zhichun Wu
 Fix For: 0.14.0


I use the following script to test an ORC column type change on a partitioned 
table on branch-0.13:

{code}
use test;
DROP TABLE if exists orc_change_type_staging;
DROP TABLE if exists orc_change_type;
CREATE TABLE orc_change_type_staging (
id int
);
CREATE TABLE orc_change_type (
id int
) PARTITIONED BY (`dt` string)
stored as orc;
--- load staging table
LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE 
orc_change_type_staging;
--- populate orc hive table
INSERT OVERWRITE TABLE orc_change_type partition(dt='20140718') select * FROM 
orc_change_type_staging limit 1;
--- change column id from int to bigint
ALTER TABLE orc_change_type CHANGE id id bigint;
INSERT OVERWRITE TABLE orc_change_type partition(dt='20140719') select * FROM 
orc_change_type_staging limit 1;
SELECT id FROM orc_change_type where dt between '20140718' and '20140719';
{code}

It fails in the last query "SELECT id FROM orc_change_type where dt between 
'20140718' and '20140719';" with this exception:
{code}
Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: 
org.apache.hadoop.io.IntWritable cannot be cast to 
org.apache.hadoop.io.LongWritable
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.io.IOException: java.lang.ClassCastException: 
org.apache.hadoop.io.IntWritable cannot be cast to 
org.apache.hadoop.io.LongWritable
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
... 11 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
cannot be cast to org.apache.hadoop.io.LongWritable
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:127)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
... 15 more
{code}

The value object is reused each time we deserialize a row, so it fails when we 
start to process the next path, which has a different schema. Resetting the 
value each time we finish reading one path would solve this problem.





Review Request 24962: HIVE-7730: Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Xiaomeng Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24962/
---

Review request for hive, Prasad Mujumdar and Szehon Ho.


Repository: hive-git


Description
---

The external authorization model cannot get the accessed columns from a query. 
Hive should store the accessed columns in ReadEntity.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 7ed50b4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b05d3b4 

Diff: https://reviews.apache.org/r/24962/diff/


Testing
---


Thanks,

Xiaomeng Huang



[jira] [Commented] (HIVE-7735) Implement Char, Varchar in ParquetSerDe

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106505#comment-14106505
 ] 

Szehon Ho commented on HIVE-7735:
-

Thanks Lefty, sounds good to me, adding char/varchar to second bullet and 
removing from first.

> Implement Char, Varchar in ParquetSerDe
> ---
>
> Key: HIVE-7735
> URL: https://issues.apache.org/jira/browse/HIVE-7735
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>  Labels: Parquet, TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7735.1.patch, HIVE-7735.1.patch, HIVE-7735.2.patch, 
> HIVE-7735.2.patch, HIVE-7735.3.patch, HIVE-7735.patch
>
>
> This JIRA is to implement CHAR and VARCHAR support in Parquet SerDe.
> Both are represented in Parquet as PrimitiveType binary and OriginalType UTF8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7736) improve the columns stats update speed for all the partitions of a table

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106491#comment-14106491
 ] 

Hive QA commented on HIVE-7736:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663447/HIVE-7736.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6116 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/450/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/450/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-450/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663447

> improve the columns stats update speed for all the partitions of a table
> 
>
> Key: HIVE-7736
> URL: https://issues.apache.org/jira/browse/HIVE-7736
> Project: Hive
>  Issue Type: Improvement
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: HIVE-7736.0.patch, HIVE-7736.1.patch, HIVE-7736.2.patch, 
> HIVE-7736.3.patch, HIVE-7736.4.patch
>
>
> The current implementation of column stats update for all the partitions of 
> a table takes a long time when there are thousands of partitions. 
> For example, on a given cluster, it took 600+ seconds to update all the 
> partitions' column stats for a table with 2 columns but 2000 partitions:
> ANALYZE TABLE src_stat_part partition (partitionId) COMPUTE STATISTICS for 
> columns;
> We would like to improve the column stats update speed for all the 
> partitions of a table.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2014-08-21 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106482#comment-14106482
 ] 

Dong Chen commented on HIVE-4629:
-

Hi, [~thejas], [~brocknoland], [~cwsteinbach], thanks for your valuable 
comments. I am working on an updated patch to address them, and will let you 
know when it is uploaded.

bq. Earlier patch also had a method in HiveStatement to get the log. I think 
that will be convenient for many users, though we need to be careful and 
specify that is the only non jdbc function that is part of a public API in it. 
But this can also be done as follow up work in separate jira.

[~thejas], you are right, adding a method in HiveStatement to get the log is 
convenient for users. I filed HIVE-7615 and plan to add the log-retrieval API 
at the JDBC level there.

bq. I didn't see this in the patch. Are you referring to something in the 
Thrift IDL file or something else?

[~cwsteinbach], [~brocknoland], the latest patch still does not address the 
comments about backward compatibility. I will update the patch soon and let you 
know. :)
The Thrift-level interface TCLIService is OK. 
As for the client- and service-layer interface ICLIService, although it is not 
RPC and not a public API of Hive, I think making it follow the single 
request/response struct model is also good. I will make the new fetchResults 
method follow that model, and then remove the old fetchResults methods.
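For context, a sketch of what the single request/response struct model buys. The FetchResultsRequest/FetchResultsResponse names below are illustrative, not Hive's actual classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the single request/response struct pattern: one method takes one
// request object and returns one response object, so adding a field later
// (e.g. a fetchType for logs) never changes the method signature.
public class FetchResultsSketch {
    static class FetchResultsRequest {
        final String operationHandle;
        final int maxRows;
        FetchResultsRequest(String handle, int maxRows) {
            this.operationHandle = handle;
            this.maxRows = maxRows;
        }
    }

    static class FetchResultsResponse {
        final List<String> rows;
        final boolean hasMoreRows;
        FetchResultsResponse(List<String> rows, boolean hasMoreRows) {
            this.rows = rows;
            this.hasMoreRows = hasMoreRows;
        }
    }

    // Toy implementation over a fixed result set, to show the shape only.
    static FetchResultsResponse fetchResults(FetchResultsRequest req) {
        List<String> data = Arrays.asList("row1", "row2", "row3");
        List<String> page = data.subList(0, Math.min(req.maxRows, data.size()));
        return new FetchResultsResponse(new ArrayList<>(page), req.maxRows < data.size());
    }

    public static void main(String[] args) {
        FetchResultsResponse r = fetchResults(new FetchResultsRequest("op-1", 2));
        System.out.println(r.rows + " more=" + r.hasMoreRows);
    }
}
```

The design choice being discussed is exactly this: growing the request struct is backward compatible, while growing a parameter list is not.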

> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Shreepadma Venugopalan
>Assignee: Dong Chen
> Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, 
> HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, 
> HIVE-4629.5.patch, HIVE-4629.6.patch, HIVE-4629.7.patch
>
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4523) round() function with specified decimal places not consistent with mysql

2014-08-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106478#comment-14106478
 ] 

Lefty Leverenz commented on HIVE-4523:
--

This change of behavior for round() should be documented in the wiki here (with 
version information, of course, and a link back to HIVE-4523):

* [Hive Operators and UDFs -- Mathematical Functions | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-MathematicalFunctions]

> round() function with specified decimal places not consistent with mysql 
> -
>
> Key: HIVE-4523
> URL: https://issues.apache.org/jira/browse/HIVE-4523
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.1
>Reporter: Fred Desing
>Assignee: Xuefu Zhang
>Priority: Minor
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, 
> HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, 
> HIVE-4523.8.patch, HIVE-4523.patch
>
>
> // hive
> hive> select round(150.000, 2) from temp limit 1;
> 150.0
> hive> select round(150, 2) from temp limit 1;
> 150.0
> // mysql
> mysql> select round(150.000, 2) from DUAL limit 1;
> round(150.000, 2)
> 150.00
> mysql> select round(150, 2) from DUAL limit 1;
> round(150, 2)
> 150
> http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round
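The difference above is largely one of types: a double cannot carry trailing zeros, while a fixed-scale decimal (what MySQL prints) can. A small Java illustration, not Hive's implementation:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustration: rounding through a double loses the requested scale,
// rounding through a fixed-scale decimal keeps it.
public class RoundScaleSketch {
    // Double-based rounding: the result 150.0 cannot remember "2 places".
    static String roundAsDouble(double v, int places) {
        double scale = Math.pow(10, places);
        return Double.toString(Math.round(v * scale) / scale);
    }

    // Decimal-based rounding: the scale is part of the value, so 150.00 prints.
    static String roundAsDecimal(String v, int places) {
        return new BigDecimal(v).setScale(places, RoundingMode.HALF_UP).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(roundAsDouble(150.000, 2));    // prints 150.0
        System.out.println(roundAsDecimal("150.000", 2)); // prints 150.00
    }
}
```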



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4523) round() function with specified decimal places not consistent with mysql

2014-08-21 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-4523:
-

Labels: TODOC13  (was: )

> round() function with specified decimal places not consistent with mysql 
> -
>
> Key: HIVE-4523
> URL: https://issues.apache.org/jira/browse/HIVE-4523
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 0.7.1
>Reporter: Fred Desing
>Assignee: Xuefu Zhang
>Priority: Minor
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-4523.1.patch, HIVE-4523.2.patch, HIVE-4523.3.patch, 
> HIVE-4523.4.patch, HIVE-4523.5.patch, HIVE-4523.6.patch, HIVE-4523.7.patch, 
> HIVE-4523.8.patch, HIVE-4523.patch
>
>
> // hive
> hive> select round(150.000, 2) from temp limit 1;
> 150.0
> hive> select round(150, 2) from temp limit 1;
> 150.0
> // mysql
> mysql> select round(150.000, 2) from DUAL limit 1;
> round(150.000, 2)
> 150.00
> mysql> select round(150, 2) from DUAL limit 1;
> round(150, 2)
> 150
> http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7735) Implement Char, Varchar in ParquetSerDe

2014-08-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106460#comment-14106460
 ] 

Lefty Leverenz commented on HIVE-7735:
--

Document this in the wiki here:

* [Parquet -- Limitations | 
https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations]

Either follow the pattern of the second bullet (timestamp & decimal) or add 
char/varchar to that bullet.  Also revise the first bullet.

> Implement Char, Varchar in ParquetSerDe
> ---
>
> Key: HIVE-7735
> URL: https://issues.apache.org/jira/browse/HIVE-7735
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>  Labels: Parquet, TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7735.1.patch, HIVE-7735.1.patch, HIVE-7735.2.patch, 
> HIVE-7735.2.patch, HIVE-7735.3.patch, HIVE-7735.patch
>
>
> This JIRA is to implement CHAR and VARCHAR support in Parquet SerDe.
> Both are represented in Parquet as PrimitiveType binary and OriginalType UTF8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7735) Implement Char, Varchar in ParquetSerDe

2014-08-21 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-7735:
-

Labels: Parquet TODOC14  (was: Parquet)

> Implement Char, Varchar in ParquetSerDe
> ---
>
> Key: HIVE-7735
> URL: https://issues.apache.org/jira/browse/HIVE-7735
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>  Labels: Parquet, TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7735.1.patch, HIVE-7735.1.patch, HIVE-7735.2.patch, 
> HIVE-7735.2.patch, HIVE-7735.3.patch, HIVE-7735.patch
>
>
> This JIRA is to implement CHAR and VARCHAR support in Parquet SerDe.
> Both are represented in Parquet as PrimitiveType binary and OriginalType UTF8.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6987) Metastore qop settings won't work with Hadoop-2.4

2014-08-21 Thread skrho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

skrho updated HIVE-6987:


Labels: patch  (was: )
Status: Patch Available  (was: Open)

> Metastore qop settings won't work with Hadoop-2.4
> -
>
> Key: HIVE-6987
> URL: https://issues.apache.org/jira/browse/HIVE-6987
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>  Labels: patch
> Fix For: 0.14.0
>
> Attachments: HIVE-6987.txt
>
>
>  [HADOOP-10211|https://issues.apache.org/jira/browse/HADOOP-10211] made a 
> backward incompatible change due to which the following hive call returns a 
> null map:
> {code}
> Map hadoopSaslProps =  ShimLoader.getHadoopThriftAuthBridge().
> getHadoopSaslProperties(conf); 
> {code}
> Metastore uses the underlying hadoop.rpc.protection values to set the qop 
> between metastore client/server. 
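For illustration, one defensive approach is to derive the SASL properties directly from hadoop.rpc.protection, so a null map from the shim layer is tolerated. The helper below is hypothetical; only the property key (javax.security.sasl.qop, i.e. Sasl.QOP) and the QOP strings are the standard SASL ones:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical defensive helper: map hadoop.rpc.protection levels to SASL QOP
// values directly, instead of relying on the (possibly null) shim result.
public class SaslQopSketch {
    static Map<String, String> saslPropsFor(String rpcProtection) {
        Map<String, String> qopNames = new HashMap<>();
        qopNames.put("authentication", "auth");      // standard SASL QOP strings
        qopNames.put("integrity", "auth-int");
        qopNames.put("privacy", "auth-conf");

        // hadoop.rpc.protection may list several levels, comma separated;
        // fall back to "authentication" when unset.
        String levels = (rpcProtection == null) ? "authentication" : rpcProtection;
        List<String> qops = new ArrayList<>();
        for (String level : levels.split(",")) {
            String qop = qopNames.get(level.trim().toLowerCase(Locale.ROOT));
            if (qop != null) {
                qops.add(qop);
            }
        }

        Map<String, String> props = new HashMap<>();
        props.put("javax.security.sasl.qop", String.join(",", qops));  // Sasl.QOP key
        return props;
    }

    public static void main(String[] args) {
        System.out.println(saslPropsFor("authentication,privacy"));
    }
}
```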



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6987) Metastore qop settings won't work with Hadoop-2.4

2014-08-21 Thread skrho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

skrho updated HIVE-6987:


Attachment: HIVE-6987.txt

Please review my patch and give me a chance to contribute. ^^

> Metastore qop settings won't work with Hadoop-2.4
> -
>
> Key: HIVE-6987
> URL: https://issues.apache.org/jira/browse/HIVE-6987
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>  Labels: patch
> Fix For: 0.14.0
>
> Attachments: HIVE-6987.txt
>
>
>  [HADOOP-10211|https://issues.apache.org/jira/browse/HADOOP-10211] made a 
> backward incompatible change due to which the following hive call returns a 
> null map:
> {code}
> Map hadoopSaslProps =  ShimLoader.getHadoopThriftAuthBridge().
> getHadoopSaslProperties(conf); 
> {code}
> Metastore uses the underlying hadoop.rpc.protection values to set the qop 
> between metastore client/server. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7654) A method to extrapolate columnStats for partitions of a table

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106422#comment-14106422
 ] 

Hive QA commented on HIVE-7654:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663445/HIVE-7654.8.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6117 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/449/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/449/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-449/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663445

> A method to extrapolate columnStats for partitions of a table
> -
>
> Key: HIVE-7654
> URL: https://issues.apache.org/jira/browse/HIVE-7654
> Project: Hive
>  Issue Type: New Feature
>Reporter: pengcheng xiong
>Assignee: pengcheng xiong
>Priority: Minor
> Attachments: Extrapolate the Column Status.docx, HIVE-7654.0.patch, 
> HIVE-7654.1.patch, HIVE-7654.4.patch, HIVE-7654.6.patch, HIVE-7654.7.patch, 
> HIVE-7654.8.patch
>
>
> In a PARTITIONED table, there are many partitions. For example, 
> create table if not exists loc_orc (
>   state string,
>   locid int,
>   zip bigint
> ) partitioned by(year string) stored as orc;
> We assume there are 4 partitions, partition(year='2000'), 
> partition(year='2001'), partition(year='2002') and partition(year='2003').
> We can use the following command to compute statistics for columns 
> state,locid of partition(year='2001')
> analyze table loc_orc partition(year='2001') compute statistics for columns 
> state,locid;
> We need to know the "aggregated" column stats for the whole table loc_orc. 
> However, we may not have the column stats for some partitions, e.g., 
> partition(year='2002'), and we may also not have the column stats for some 
> columns, e.g., zip bigint for partition(year='2001').
> We propose a method to extrapolate the missing column stats for the 
> partitions.
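One possible extrapolation scheme, sketched in self-contained Java. This assumes linear interpolation between the nearest partitions that do have stats; it is only an illustration, not necessarily the algorithm in the attached patch:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustration only: estimate a missing partition's max value by linearly
// interpolating between the nearest partitions that do have column stats.
public class StatsExtrapolationSketch {
    static double extrapolateMax(TreeMap<Integer, Double> knownMax, int missing) {
        Map.Entry<Integer, Double> lo = knownMax.floorEntry(missing);
        Map.Entry<Integer, Double> hi = knownMax.ceilingEntry(missing);
        if (lo == null) {
            return hi.getValue();   // before the first known partition: clamp
        }
        if (hi == null) {
            return lo.getValue();   // after the last known partition: clamp
        }
        if (lo.getKey().equals(hi.getKey())) {
            return lo.getValue();   // stat already known for this partition
        }
        double span = hi.getKey() - lo.getKey();
        double w = (missing - lo.getKey()) / span;
        return lo.getValue() * (1 - w) + hi.getValue() * w;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> maxLocid = new TreeMap<>();
        maxLocid.put(2000, 100.0);
        maxLocid.put(2001, 110.0);
        maxLocid.put(2003, 130.0);   // year 2002 has no column stats
        System.out.println("estimated max for 2002: " + extrapolateMax(maxLocid, 2002));
    }
}
```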



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Lianhui Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106407#comment-14106407
 ] 

Lianhui Wang commented on HIVE-7384:


I think this is the same as the ideas you mentioned before, like HIVE-7158, 
which will auto-calculate the number of reducers based on some input from Hive 
(upper/lower bound).

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complications in reduce-side join, which extensively utilizes key tags and 
> shuffle behavior. Our design principle prefers making the Hive implementation 
> work out of the box as well, which might require new functionality from Spark. 
> The task is to research this area, identifying requirements for the Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on wiki.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7809) Fix ObjectRegistry to work with Tez 0.5

2014-08-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-7809:
-

Attachment: HIVE-7809.1.patch

A little ugly, but seems like the simplest fix - considering a Tez*Context is 
not available everywhere that the cache is required (InputFormats, code shared 
with MR). [~hagleitn] - please take a look.

> Fix ObjectRegistry to work with Tez 0.5
> ---
>
> Key: HIVE-7809
> URL: https://issues.apache.org/jira/browse/HIVE-7809
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Reporter: Siddharth Seth
> Attachments: HIVE-7809.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-7809) Fix ObjectRegistry to work with Tez 0.5

2014-08-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-7809:


Assignee: Siddharth Seth

> Fix ObjectRegistry to work with Tez 0.5
> ---
>
> Key: HIVE-7809
> URL: https://issues.apache.org/jira/browse/HIVE-7809
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-7809.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7846) authorization api should not assume case insensitive role names

2014-08-21 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-7846:
---

 Summary: authorization api should not assume case insensitive role 
names
 Key: HIVE-7846
 URL: https://issues.apache.org/jira/browse/HIVE-7846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair


The case-insensitive behavior of role names should be specific to SQL standard 
authorization.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6250) sql std auth - view authorization should not underlying table. More tests and fixes.

2014-08-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106387#comment-14106387
 ] 

Thejas M Nair commented on HIVE-6250:
-

[~THEcreationist]
Please open a new jira with an example including the configuration used.




> sql std auth - view authorization should not underlying table. More tests and 
> fixes.
> 
>
> Key: HIVE-6250
> URL: https://issues.apache.org/jira/browse/HIVE-6250
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6250.1.patch, HIVE-6250.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This patch adds more tests for table and view authorization and also fixes a 
> number of issues found during testing -
> - View authorization should happen on only on the view, and not the 
> underlying table (Change in ReadEntity to indicate if it is a direct/indirect 
> dependency)
> - table owner in metadata should be the user as per SessionState 
> authentication provider
> - added utility function for finding the session state authentication 
> provider user
> - authorization should be based on current roles
> - admin user should have all permissions
> - error message improvements



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7792) Enable tests on Spark branch (2) [Sparch Branch]

2014-08-21 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-7792:
--

Attachment: HIVE-7792.1-spark.patch


The attached patch enables the following tests:
{noformat}
 * metadata_only_queries.q
 * load_dyn_part2.q
 * load_dyn_part3.q
 * mapreduce1.q
 * mapreduce2.q
 * limit_pushdown.q (Order of results is different from MR, but order is 
deterministic)
{noformat}


{noformat}
 * load_dyn_part1.q - Failure - tracked by HIVE-7842
 * mapjoin_mapjoin.q: Results are wrong, probably because MapJoin is not 
supported yet (HIVE-7613).
 * optimize_nullscan.q: Differences in table serdes and plan. Looks like the 
plan is not optimized for limit 0 cases - tracked by HIVE-7844
 * orc_analyze.q - Failure - tracked by HIVE-7843
{noformat}

The remaining .q files show stats differences such as the one below.
{noformat}
< Statistics: Num rows: 1000 Data size: 94000 Basic stats: COMPLETE Column 
stats: NONE
---
> Statistics: Num rows: 46 Data size: 4920 Basic stats: COMPLETE Column stats: 
> NONE
{noformat}

Not sure if this is because we don't have stats collection from the Spark job 
yet. Still investigating. The affected files are:
{noformat}
 * orc_merge1.q
 * orc_merge2.q
 * orc_merge3.q
 * orc_merge4.q
 * merge1.q
 * merge2.q
{noformat}


> Enable tests on Spark branch (2) [Sparch Branch]
> 
>
> Key: HIVE-7792
> URL: https://issues.apache.org/jira/browse/HIVE-7792
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Venki Korukanti
> Attachments: HIVE-7792.1-spark.patch
>
>
> This jira is to enable *most* of the tests below. If tests don't pass because 
> of some unsupported feature, ensure that a JIRA exists and move on.
> {noformat}
> limit_pushdown.q,\
>   load_dyn_part1.q,\
>   load_dyn_part2.q,\
>   load_dyn_part3.q,\
>   mapjoin_mapjoin.q,\
>   mapreduce1.q,\
>   mapreduce2.q,\
>   merge1.q,\
>   merge2.q,\
>   metadata_only_queries.q,\
>   optimize_nullscan.q,\
>   orc_analyze.q,\
>   orc_merge1.q,\
>   orc_merge2.q,\
>   orc_merge3.q,\
>   orc_merge4.q,\
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7680) Do not throw SQLException for HiveStatement getMoreResults and setEscapeProcessing(false)

2014-08-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106371#comment-14106371
 ] 

Thejas M Nair commented on HIVE-7680:
-

+1


> Do not throw SQLException for HiveStatement getMoreResults and 
> setEscapeProcessing(false)
> -
>
> Key: HIVE-7680
> URL: https://issues.apache.org/jira/browse/HIVE-7680
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.13.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7680.2.patch, HIVE-7680.2.patch, HIVE-7680.patch
>
>
> 1. Some JDBC clients (e.g. SQL Workbench) call the method 
> setEscapeProcessing(false).
> It looks like setEscapeProcessing(false) should do nothing, so let's do 
> nothing instead of throwing SQLException.
> 2. getMoreResults is needed in case a Statement returns several ResultSets.
> Hive does not support multiple ResultSets, so this method can safely always 
> return false.
> 3. getUpdateCount. Currently this method always returns 0. Hive cannot tell 
> us how many rows were inserted. According to the JDBC spec it should return -1 
> "if the current result is a ResultSet object or there are no more results".
> If this method returns 0, then after executing an insert statement the JDBC 
> client shows "0 rows were inserted", which is not true.
> If this method returns -1, the JDBC client runs the insert statement and shows 
> that it was executed successfully and no results were returned.
> I think the latter behaviour is more correct.
> 4. Some methods in the Statement class should throw 
> SQLFeatureNotSupportedException if they are not supported. The current 
> implementation throws SQLException instead, which indicates a database access 
> error.
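The semantics proposed above can be sketched in a few lines. This is a hypothetical illustration, not the actual `HiveStatement` code; the class name and `setCursorName` example are assumptions chosen to show the `SQLFeatureNotSupportedException` pattern.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical sketch of the proposed JDBC semantics; the real
// HiveStatement in org.apache.hive.jdbc carries much more state.
class StatementBehaviorSketch {
    // Hive never returns multiple ResultSets, so this is always false.
    boolean getMoreResults() {
        return false;
    }

    // Hive cannot report affected row counts; per the JDBC spec, -1
    // means "no update count available".
    int getUpdateCount() {
        return -1;
    }

    // A silent no-op instead of throwing SQLException.
    void setEscapeProcessing(boolean enable) {
        // intentionally empty: escape processing is simply not applied
    }

    // A genuinely unsupported feature signals that explicitly, rather
    // than masquerading as a database access error.
    void setCursorName(String name) throws SQLException {
        throw new SQLFeatureNotSupportedException("named cursors are not supported");
    }
}
```

A client can then distinguish "feature not supported" from a real access error by catching `SQLFeatureNotSupportedException` separately.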





[jira] [Commented] (HIVE-7681) qualified tablenames usage does not work with several alter-table commands

2014-08-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106369#comment-14106369
 ] 

Thejas M Nair commented on HIVE-7681:
-

Thanks for the changes and for updating all those tests!

+1. Pending some minor comments in review board.




> qualified tablenames usage does not work with several alter-table commands
> --
>
> Key: HIVE-7681
> URL: https://issues.apache.org/jira/browse/HIVE-7681
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Navis
> Attachments: HIVE-7681.1.patch.txt, HIVE-7681.2.patch.txt, 
> HIVE-7681.3.patch.txt, HIVE-7681.4.patch.txt
>
>
> Changes were made in HIVE-4064 for use of qualified table names in more types 
> of queries. But several alter table commands don't work with qualified table 
> names:
> - alter table default.tmpfoo set tblproperties ("bar" = "bar value")
> - ALTER TABLE default.kv_rename_test CHANGE a a STRING
> - add,drop partition
> - alter index rebuild
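For illustration, splitting a possibly-qualified name into database and table parts might look like the sketch below. `getDbTableName` is a hypothetical helper: the real fix operates on the parser AST in `BaseSemanticAnalyzer` and `DDLSemanticAnalyzer`, not on raw strings, and "default" stands in for the current database.

```java
// Hypothetical helper: resolve "db.table" vs. a bare "table" name.
class QualifiedNameSketch {
    static String[] getDbTableName(String name) {
        int dot = name.indexOf('.');
        if (dot < 0) {
            // unqualified: fall back to the current/default database
            return new String[] { "default", name };
        }
        return new String[] { name.substring(0, dot), name.substring(dot + 1) };
    }
}
```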





[jira] [Comment Edited] (HIVE-7681) qualified tablenames usage does not work with several alter-table commands

2014-08-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106369#comment-14106369
 ] 

Thejas M Nair edited comment on HIVE-7681 at 8/22/14 2:26 AM:
--

Thanks for the changes and for updating all those tests!

+1. Pending some minor comments in review board to be addressed.






was (Author: thejas):
Thanks for the changes and for updating all those tests!

+1. Pending some minor comments in review board.




> qualified tablenames usage does not work with several alter-table commands
> --
>
> Key: HIVE-7681
> URL: https://issues.apache.org/jira/browse/HIVE-7681
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Navis
> Attachments: HIVE-7681.1.patch.txt, HIVE-7681.2.patch.txt, 
> HIVE-7681.3.patch.txt, HIVE-7681.4.patch.txt
>
>
> Changes were made in HIVE-4064 for use of qualified table names in more types 
> of queries. But several alter table commands don't work with qualified table 
> names:
> - alter table default.tmpfoo set tblproperties ("bar" = "bar value")
> - ALTER TABLE default.kv_rename_test CHANGE a a STRING
> - add,drop partition
> - alter index rebuild





[jira] [Commented] (HIVE-7829) Entity.getLocation can throw an NPE

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106368#comment-14106368
 ] 

Hive QA commented on HIVE-7829:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663432/HIVE-7829.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/447/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/447/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-447/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663432

> Entity.getLocation can throw an NPE
> ---
>
> Key: HIVE-7829
> URL: https://issues.apache.org/jira/browse/HIVE-7829
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7829.1.patch, HIVE-7892.patch
>
>
> It's possible for the getDataLocation methods that Entity.getLocation calls 
> to return null, which leads to an NPE.
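The shape of the fix can be sketched as a null guard before dereferencing the location. `EntitySketch` is a hypothetical stand-in, assuming `getDataLocation` may return null (e.g. for a view); it is not the actual patch.

```java
import java.net.URI;

// Hypothetical stand-in: the real Entity delegates to
// Table/Partition getDataLocation(), which may return null.
class EntitySketch {
    private final URI dataLocation;

    EntitySketch(URI dataLocation) {
        this.dataLocation = dataLocation;
    }

    // Guard before dereferencing; the bug was calling a method on the
    // location without a check like this.
    URI getLocation() {
        if (dataLocation == null) {
            return null;
        }
        return dataLocation.normalize();
    }
}
```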





Re: Review Request 24833: qualified tablenames usage does not work with several alter-table commands

2014-08-21 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24833/#review51243
---



hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java


What does this additional switch-case statement do? It looks like it has no 
impact; it does not matter what the child token is. I think we can get 
rid of it.



hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java


The same applies to this switch-case.


- Thejas Nair


On Aug. 19, 2014, 1:20 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24833/
> ---
> 
> (Updated Aug. 19, 2014, 1:20 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-7681
> https://issues.apache.org/jira/browse/HIVE-7681
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Changes were made in HIVE-4064 for use of qualified table names in more types 
> of queries. But several alter table commands don't work with qualified table 
> names:
> - alter table default.tmpfoo set tblproperties ("bar" = "bar value")
> - ALTER TABLE default.kv_rename_test CHANGE a a STRING
> - add,drop partition
> - alter index rebuild
> 
> 
> Diffs
> -
> 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java
>  ff0f210 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
>  4d338b5 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/cli/TestSemanticAnalysis.java
>  1e25ed3 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java 
> ae89182 
>   ql/src/java/org/apache/hadoop/hive/ql/index/IndexMetadataChangeTask.java 
> 1e01001 
>   ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
> 27e251c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
> e7434a3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
> 60d490f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> f31a409 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g a76cad7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 8527239 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7a71ec7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
> 3dfce99 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 20d863b 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 67be666 
>   
> ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
>  29ae4a0 
>   
> ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
>  45404fe 
>   ql/src/test/queries/clientpositive/add_part_exist.q d176661 
>   ql/src/test/queries/clientpositive/alter1.q 312a017 
>   ql/src/test/queries/clientpositive/alter_char1.q d391138 
>   ql/src/test/queries/clientpositive/alter_index.q 2aa13da 
>   ql/src/test/queries/clientpositive/alter_partition_coltype.q 115eaf9 
>   ql/src/test/queries/clientpositive/alter_skewed_table.q 216bbb5 
>   ql/src/test/queries/clientpositive/alter_varchar1.q 6f644a0 
>   ql/src/test/queries/clientpositive/alter_view_as_select.q dcab3ca 
>   ql/src/test/queries/clientpositive/alter_view_rename.q 68cf9d6 
>   ql/src/test/queries/clientpositive/archive_multi.q 2c1a6d8 
>   ql/src/test/queries/clientpositive/create_or_replace_view.q a8f59b7 
>   ql/src/test/queries/clientpositive/drop_multi_partitions.q 14e2356 
>   ql/src/test/queries/clientpositive/exchange_partition.q 4be6e3f 
>   ql/src/test/queries/clientpositive/index_auto_empty.q 41f4a40 
>   ql/src/test/queries/clientpositive/touch.q 8a661ef 
>   ql/src/test/queries/clientpositive/unset_table_view_property.q f838cd1 
>   ql/src/test/results/clientpositive/add_part_exist.q.out 4c22d6a 
>   ql/src/test/results/clientpositive/alter1.q.out 1cfaf75 
>   ql/src/test/results/clientpositive/alter_char1.q.out 017da60 
>   ql/src/test/results/clientpositive/alter_index.q.out 2093e2f 
>   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 25eb48c 
>   ql/src/test/results/clientpositive/alter_skewed_table.q.out e6bfc5a 
>   ql/src/test/results/clientpositive/alter_varchar1.q.out e74a7ed 
>   ql/src/test/results/clientpositive/alter_view_as_select.q.out 53a6b37 
>   ql/src/test/results/clientpositive/alter_view_rename.q.out 0f3dd14 
>   ql/src/test/results/clientpositive/archive_multi.q.out 7e84def 
>   ql/src/test/results/clientpositive/create_or

[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106361#comment-14106361
 ] 

Szehon Ho commented on HIVE-7384:
-

Sorry Lianhui, I didn't see your reply before I typed my comment. Please do 
let us know about Spark's thoughts on auto-parallelism if you have any.

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complication in reduce-side join, which extensively utilizes key tags and 
> shuffle behavior. Our design principle is to make the Hive implementation 
> work out of the box as well, which might require new functionality from Spark. 
> The task is to research this area, identifying requirements for the Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on the wiki.





[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106358#comment-14106358
 ] 

Szehon Ho commented on HIVE-7384:
-

1. I thought that TotalOrderPartitioner was only used in the order-by case, for 
hive.optimize.sampling.orderby=true, and not for joins? That's just my reading 
of it; I'll take a second look and update if wrong.

2. Auto-parallelism looks like a Tez feature that auto-calculates the 
number of reducers based on some input from Hive (upper/lower bound).

Today in Spark we take numReducers from what the Hive query optimizer gives 
us, and during the shuffle stage use that as the number of RDD partitions of the 
shuffle output (reducer input). Spark has some defaults if we don't set it 
explicitly (the doc says it's based on the number of partitions already in the 
largest parent RDD): 
http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism
I don't know off the top of my head of any option for Spark to give us a better 
number, if that's what you're asking.
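As a concrete illustration of the current scheme: the reducer count chosen by the Hive optimizer becomes the shuffle partition count, and keys land in partitions by hash. This plain-Java sketch assumes Spark's default HashPartitioner semantics (non-negative `key.hashCode()` modulo the partition count); the class and method names are hypothetical, not Hive or Spark code.

```java
// Hypothetical sketch of HashPartitioner-style placement: the
// numReducers value from the Hive optimizer is used as the number
// of RDD partitions of the shuffle output.
class ShufflePartitioningSketch {
    static int partitionFor(Object key, int numReducers) {
        int mod = key.hashCode() % numReducers;
        // hashCode can be negative in Java; partition ids must not be
        return mod < 0 ? mod + numReducers : mod;
    }
}
```

The point of auto-parallelism is that this count would no longer be fixed up front but adjusted from observed map-output sizes, which is what Tez does.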

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complication in reduce-side join, which extensively utilizes key tags and 
> shuffle behavior. Our design principle is to make the Hive implementation 
> work out of the box as well, which might require new functionality from Spark. 
> The task is to research this area, identifying requirements for the Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on the wiki.





[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Lianhui Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106343#comment-14106343
 ] 

Lianhui Wang commented on HIVE-7384:


@Szehon Ho Yes, I read the OrderedRDDFunctions code and discovered that sortByKey 
actually does a range partition. We need to replace the range partition with a 
hash partition, so Spark should perhaps create a new interface, for example 
partitionSortByKey.
@Brock Noland The code in 1) means that when Hive samples data and there is more 
than one reducer, it does a total order sort. A join does not sample data, so it 
does not need a total order sort.
2) I think we really need auto-parallelism. I talked about this with Reynold Xin 
before: Spark needs to support re-partitioning map output data the same way Tez 
does.

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complication in reduce-side join, which extensively utilizes key tags and 
> shuffle behavior. Our design principle is to make the Hive implementation 
> work out of the box as well, which might require new functionality from Spark. 
> The task is to research this area, identifying requirements for the Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on the wiki.





[jira] [Commented] (HIVE-7772) Add tests for order/sort/distribute/cluster by query [Spark Branch]

2014-08-21 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106297#comment-14106297
 ] 

Rui Li commented on HIVE-7772:
--

Thanks [~brocknoland], let me rebase my branch and see if I can add more tests.

> Add tests for order/sort/distribute/cluster by query [Spark Branch]
> ---
>
> Key: HIVE-7772
> URL: https://issues.apache.org/jira/browse/HIVE-7772
> Project: Hive
>  Issue Type: Test
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-7772-spark.patch
>
>
> Now that these queries are supported, we should have tests to catch any 
> problems we may have.





[jira] [Created] (HIVE-7845) "Failed to locate the winutils binary" when loading JDBC driver on Windows

2014-08-21 Thread Ben White (JIRA)
Ben White created HIVE-7845:
---

 Summary: "Failed to locate the winutils binary" when loading JDBC 
driver on Windows
 Key: HIVE-7845
 URL: https://issues.apache.org/jira/browse/HIVE-7845
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Ben White


This ERROR is thrown on Windows platforms when loading the JDBC driver; 
subsequent attempts will succeed. The Hadoop binaries are indeed not 
available, but they shouldn't be required when just using JDBC.

13:20:00 [ERROR pool-2-thread-4 Shell.getWinUtilsPath] Failed to locate the 
winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the 
Hadoop binaries.
   at 
org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:324)
   at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:339)
   at org.apache.hadoop.util.Shell.&lt;clinit&gt;(Shell.java:332)
   at 
org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:918)
   at 
org.apache.hadoop.hive.conf.HiveConf$ConfVars.&lt;clinit&gt;(HiveConf.java:228)
   at 
org.apache.hive.jdbc.HiveConnection.isHttpTransportMode(HiveConnection.java:304)
   at 
org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:181)
   at 
org.apache.hive.jdbc.HiveConnection.&lt;init&gt;(HiveConnection.java:164)
   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown 
Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at com.onseven.dbvis.d.B.D.ā(Z:1548)
   at com.onseven.dbvis.d.B.F$A.call(Z:278)
   at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
   at java.util.concurrent.FutureTask.run(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
   at java.lang.Thread.run(Unknown Source)





[jira] [Commented] (HIVE-7680) Do not throw SQLException for HiveStatement getMoreResults and setEscapeProcessing(false)

2014-08-21 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106286#comment-14106286
 ] 

Alexander Pivovarov commented on HIVE-7680:
---

Patch #2 was built by build #446. It has 3 failed tests, which looks like the 
normal situation: I checked the previous builds, 445 and 444, and they have 
these 3 failed tests as well.
TestJdbcDriver2.testSelectAll and the other selectAll tests passed this time.

> Do not throw SQLException for HiveStatement getMoreResults and 
> setEscapeProcessing(false)
> -
>
> Key: HIVE-7680
> URL: https://issues.apache.org/jira/browse/HIVE-7680
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.13.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7680.2.patch, HIVE-7680.2.patch, HIVE-7680.patch
>
>
> 1. Some JDBC clients (e.g. SQL Workbench) call the method 
> setEscapeProcessing(false).
> It looks like setEscapeProcessing(false) should do nothing, so let's do 
> nothing instead of throwing SQLException.
> 2. getMoreResults is needed in case a Statement returns several ResultSets.
> Hive does not support multiple ResultSets, so this method can safely always 
> return false.
> 3. getUpdateCount. Currently this method always returns 0. Hive cannot tell 
> us how many rows were inserted. According to the JDBC spec it should return -1 
> "if the current result is a ResultSet object or there are no more results".
> If this method returns 0, then after executing an insert statement the JDBC 
> client shows "0 rows were inserted", which is not true.
> If this method returns -1, the JDBC client runs the insert statement and shows 
> that it was executed successfully and no results were returned.
> I think the latter behaviour is more correct.
> 4. Some methods in the Statement class should throw 
> SQLFeatureNotSupportedException if they are not supported. The current 
> implementation throws SQLException instead, which indicates a database access 
> error.





[jira] [Updated] (HIVE-7843) orc_analyze.q fails with an assertion in FileSinkOperator [Spark Branch]

2014-08-21 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-7843:
--

Summary: orc_analyze.q fails with an assertion in FileSinkOperator [Spark 
Branch]  (was: orc_analyze.q fails with an assertion in FileSinkOperator 
[SparkBranch])

> orc_analyze.q fails with an assertion in FileSinkOperator [Spark Branch]
> 
>
> Key: HIVE-7843
> URL: https://issues.apache.org/jira/browse/HIVE-7843
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: spark-branch
>
>
> {code}
> java.lang.AssertionError: data length is different from num of DP columns
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:809)
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:730)
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.startGroup(FileSinkOperator.java:829)
> org.apache.hadoop.hive.ql.exec.Operator.defaultStartGroup(Operator.java:502)
> org.apache.hadoop.hive.ql.exec.Operator.startGroup(Operator.java:525)
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:198)
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:47)
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:27)
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> scala.collection.Iterator$class.foreach(Iterator.scala:727)
> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
> org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
> org.apache.spark.scheduler.Task.run(Task.scala:54)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> java.lang.Thread.run(Thread.java:744)
> {code}





[jira] [Commented] (HIVE-7384) Research into reduce-side join [Spark Branch]

2014-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106253#comment-14106253
 ] 

Brock Noland commented on HIVE-7384:


1) I noticed recently that latest Hive, when there is more than one reducer, 
does a total order sort:
https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java#L374

2) Should we do some investigation into Tez auto-parallelism (HIVE-7158)? Let me 
know your thoughts.

> Research into reduce-side join [Spark Branch]
> -
>
> Key: HIVE-7384
> URL: https://issues.apache.org/jira/browse/HIVE-7384
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Szehon Ho
> Attachments: Hive on Spark Reduce Side Join.docx, sales_items.txt, 
> sales_products.txt, sales_stores.txt
>
>
> Hive's join operator is very sophisticated, especially for reduce-side join. 
> While we expect that other types of join, such as map-side join and SMB 
> map-side join, will work out of the box with our design, there may be some 
> complication in reduce-side join, which extensively utilizes key tag and 
> shuffle behavior. Our design principle prefers to making Hive implementation 
> work out of box also, which might requires new functionality from Spark. The 
> tasks is to research into this area, identifying requirements for Spark 
> community and the work to be done on Hive to make reduce-side join work.
> A design doc might be needed for this. For more information, please refer to 
> the overall design doc on wiki.





[jira] [Created] (HIVE-7844) optimize_nullscan.q fails due to differences in explain plan [Spark Branch]

2014-08-21 Thread Venki Korukanti (JIRA)
Venki Korukanti created HIVE-7844:
-

 Summary: optimize_nullscan.q fails due to differences in explain 
plan [Spark Branch]
 Key: HIVE-7844
 URL: https://issues.apache.org/jira/browse/HIVE-7844
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: spark-branch


Looks like on the Spark branch we are not optimizing query plans for limit 0 cases.





[jira] [Created] (HIVE-7843) orc_analyze.q fails with an assertion in FileSinkOperator [SparkBranch]

2014-08-21 Thread Venki Korukanti (JIRA)
Venki Korukanti created HIVE-7843:
-

 Summary: orc_analyze.q fails with an assertion in FileSinkOperator 
[SparkBranch]
 Key: HIVE-7843
 URL: https://issues.apache.org/jira/browse/HIVE-7843
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: spark-branch


{code}
java.lang.AssertionError: data length is different from num of DP columns
org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynPartDirectory(FileSinkOperator.java:809)
org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:730)
org.apache.hadoop.hive.ql.exec.FileSinkOperator.startGroup(FileSinkOperator.java:829)
org.apache.hadoop.hive.ql.exec.Operator.defaultStartGroup(Operator.java:502)
org.apache.hadoop.hive.ql.exec.Operator.startGroup(Operator.java:525)
org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:198)
org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:47)
org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:27)
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
scala.collection.Iterator$class.foreach(Iterator.scala:727)
scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:744)
{code}





[jira] [Created] (HIVE-7842) load_dyn_part1.q fails with an assertion [Spark Branch]

2014-08-21 Thread Venki Korukanti (JIRA)
Venki Korukanti created HIVE-7842:
-

 Summary: load_dyn_part1.q fails with an assertion [Spark Branch]
 Key: HIVE-7842
 URL: https://issues.apache.org/jira/browse/HIVE-7842
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: spark-branch
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: spark-branch


On the Spark branch, load_dyn_part1.q fails with the following assertion. Looks 
like the SerDe is receiving an invalid ByteWritable buffer.

{code}
java.lang.AssertionError
"org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:205)"
"org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:187)"
"org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:186)"
"org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:47)"
"org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:27)"
"org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)"
"scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)"
"scala.collection.Iterator$class.foreach(Iterator.scala:727)"
"scala.collection.AbstractIterator.foreach(Iterator.scala:1157)"
"org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)"
"org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:759)"
"org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)"
"org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)"
"org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)"
"org.apache.spark.scheduler.Task.run(Task.scala:54)"
"org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)"
"java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)"
"java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)"
"java.lang.Thread.run(Thread.java:744)"
{code}





[jira] [Updated] (HIVE-7598) Potential null pointer dereference in MergeTask#closeJob()

2014-08-21 Thread SUYEON LEE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SUYEON LEE updated HIVE-7598:
-

Attachment: (was: HIVE-7598.patch)

> Potential null pointer dereference in MergeTask#closeJob()
> --
>
> Key: HIVE-7598
> URL: https://issues.apache.org/jira/browse/HIVE-7598
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: SUYEON LEE
>Priority: Minor
> Attachments: HIVE-7598.patch
>
>
> The call to Utilities.mvFileToFinalPath() passes null as the second-to-last 
> parameter, conf.
> The null gets passed to createEmptyBuckets(), which dereferences conf directly:
> {code}
> boolean isCompressed = conf.getCompressed();
> TableDesc tableInfo = conf.getTableInfo();
> {code}
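A minimal sketch of the defensive direction is shown below, under the assumption that the fix supplies or validates conf before the call; `MergeCloseSketch` and `requireConf` are hypothetical names, not the actual MergeTask patch.

```java
// Hypothetical guard: fail fast instead of letting a null conf reach
// code that dereferences it (conf.getCompressed(), conf.getTableInfo()).
class MergeCloseSketch {
    static void requireConf(Object conf) {
        if (conf == null) {
            throw new IllegalArgumentException(
                "conf must be non-null before createEmptyBuckets is called");
        }
    }
}
```

Failing with a descriptive exception at the call site is easier to debug than the NPE surfacing deep inside createEmptyBuckets().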





[jira] [Updated] (HIVE-7598) Potential null pointer dereference in MergeTask#closeJob()

2014-08-21 Thread SUYEON LEE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SUYEON LEE updated HIVE-7598:
-

Attachment: HIVE-7598.patch

> Potential null pointer dereference in MergeTask#closeJob()
> --
>
> Key: HIVE-7598
> URL: https://issues.apache.org/jira/browse/HIVE-7598
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: SUYEON LEE
>Priority: Minor
> Attachments: HIVE-7598.patch
>
>
> Call to Utilities.mvFileToFinalPath() passes null as the second-to-last 
> parameter, conf.
> The null then reaches createEmptyBuckets(), which dereferences conf directly:
> {code}
> boolean isCompressed = conf.getCompressed();
> TableDesc tableInfo = conf.getTableInfo();
> {code}





[jira] [Commented] (HIVE-7680) Do not throw SQLException for HiveStatement getMoreResults and setEscapeProcessing(false)

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106247#comment-14106247
 ] 

Hive QA commented on HIVE-7680:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663425/HIVE-7680.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6115 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/446/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/446/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-446/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663425

> Do not throw SQLException for HiveStatement getMoreResults and 
> setEscapeProcessing(false)
> -
>
> Key: HIVE-7680
> URL: https://issues.apache.org/jira/browse/HIVE-7680
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.13.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7680.2.patch, HIVE-7680.2.patch, HIVE-7680.patch
>
>
> 1. Some JDBC clients call setEscapeProcessing(false)  (e.g. SQL 
> Workbench).
> It looks like setEscapeProcessing(false) should do nothing, so let's do nothing 
> instead of throwing SQLException.
> 2. getMoreResults is needed in case a Statement returns several ResultSets.
> Hive does not support multiple ResultSets, so this method can safely always 
> return false.
> 3. getUpdateCount. Currently this method always returns 0. Hive cannot tell 
> us how many rows were inserted. According to the JDBC spec it should return "-1 
> if the current result is a ResultSet object or there are no more results". 
> If this method returns 0, then after executing an insert statement the JDBC 
> client shows "0 rows were inserted", which is not true.
> If this method returns -1, the JDBC client runs the insert statement and shows 
> that it executed successfully with no results returned. 
> I think the latter behaviour is more correct.
> 4. Some methods in the Statement class should throw 
> SQLFeatureNotSupportedException when they are not supported. The current 
> implementation throws SQLException instead, which signals a database access error.
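A minimal sketch of the behavior proposed in points 1-3 of the description. This is not Hive's actual HiveStatement; the class and method bodies are illustrative stand-ins:

```java
public class JdbcBehaviorSketch {
    // Point 1: treat setEscapeProcessing(false) as a no-op instead of
    // throwing SQLException.
    public void setEscapeProcessing(boolean enable) {
        // intentionally does nothing
    }

    // Point 2: Hive never produces multiple ResultSets, so there are
    // never "more results".
    public boolean getMoreResults() {
        return false;
    }

    // Point 3: per the JDBC spec, -1 means the current result is a
    // ResultSet or there are no more results.
    public int getUpdateCount() {
        return -1;
    }

    public static void main(String[] args) {
        JdbcBehaviorSketch stmt = new JdbcBehaviorSketch();
        stmt.setEscapeProcessing(false);            // no exception thrown
        System.out.println(stmt.getMoreResults());  // prints: false
        System.out.println(stmt.getUpdateCount());  // prints: -1
    }
}
```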





[jira] [Updated] (HIVE-7800) Parquet Column Index Access Schema Size Checking

2014-08-21 Thread Daniel Weeks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Weeks updated HIVE-7800:
---

Status: Open  (was: Patch Available)

Working on one other issue, so pulling this back for now.

> Parquet Column Index Access Schema Size Checking
> ---
>
> Key: HIVE-7800
> URL: https://issues.apache.org/jira/browse/HIVE-7800
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Daniel Weeks
>Assignee: Daniel Weeks
> Attachments: HIVE-7800.1.patch
>
>
> In the case that a Parquet-formatted table has partitions whose files 
> have schemas of different sizes, using column index access can result in an 
> index-out-of-bounds exception.





[jira] [Resolved] (HIVE-7761) Failed to analyze stats with CounterStatsAggregator [SparkBranch]

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-7761.


Resolution: Duplicate

> Failed to analyze stats with CounterStatsAggregator [SparkBranch]
> -
>
> Key: HIVE-7761
> URL: https://issues.apache.org/jira/browse/HIVE-7761
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chengxiang Li
>
> CounterStatsAggregator analyzes stats with MR counters; we need to implement 
> another CounterStatsAggregator based on Spark-specific counters to analyze 
> table stats. Here is the error information:
> {noformat}
> 2014-08-17 23:46:34,436 ERROR stats.CounterStatsAggregator 
> (CounterStatsAggregator.java:connect(51)) - Failed to get Job instance for 
> null
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.spark.SparkTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.stats.CounterStatsAggregator.connect(CounterStatsAggregator.java:46)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.createStatsAggregator(StatsTask.java:282)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:142)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:118)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1534)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1301)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1113)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:937)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:927)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
> {noformat}
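The ClassCastException above comes from an unconditional cast of the source task to MapRedTask. A minimal sketch of a type guard for that cast (Task, MapRedTask, and SparkTask below are stand-in classes, purely illustrative; the real fix would be a Spark-specific counter-based aggregator):

```java
public class TaskCastSketch {
    // Stand-ins for the real Hive task classes.
    static class Task {}
    static class MapRedTask extends Task {}
    static class SparkTask extends Task {}

    // Only take the MR-counter path when the source task really is an MR
    // task; a SparkTask would need its own counter-based aggregator.
    static boolean canUseMrCounters(Task sourceTask) {
        return sourceTask instanceof MapRedTask;
    }

    public static void main(String[] args) {
        System.out.println(canUseMrCounters(new MapRedTask())); // prints: true
        System.out.println(canUseMrCounters(new SparkTask()));  // prints: false
    }
}
```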





[jira] [Assigned] (HIVE-7816) Enable join tests which Tez executes

2014-08-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-7816:
---

Assignee: Szehon Ho

> Enable join tests which Tez executes
> 
>
> Key: HIVE-7816
> URL: https://issues.apache.org/jira/browse/HIVE-7816
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Szehon Ho
>
>  
> {noformat}
>   auto_join0.q,\
>   auto_join1.q,\
>   cross_join.q,\
>   cross_product_check_1.q,\
>   cross_product_check_2.q,\
> {noformat}
> {noformat}
> filter_join_breaktask.q,\
> filter_join_breaktask2.q
> {noformat}





[jira] [Commented] (HIVE-7761) Failed to analyze stats with CounterStatsAggregator [SparkBranch]

2014-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106240#comment-14106240
 ] 

Brock Noland commented on HIVE-7761:


Although HIVE-7819 doesn't fix stats, it does fix this error.

> Failed to analyze stats with CounterStatsAggregator [SparkBranch]
> -
>
> Key: HIVE-7761
> URL: https://issues.apache.org/jira/browse/HIVE-7761
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chengxiang Li
>
> CounterStatsAggregator analyzes stats with MR counters; we need to implement 
> another CounterStatsAggregator based on Spark-specific counters to analyze 
> table stats. Here is the error information:
> {noformat}
> 2014-08-17 23:46:34,436 ERROR stats.CounterStatsAggregator 
> (CounterStatsAggregator.java:connect(51)) - Failed to get Job instance for 
> null
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.spark.SparkTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.stats.CounterStatsAggregator.connect(CounterStatsAggregator.java:46)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.createStatsAggregator(StatsTask.java:282)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:142)
> at 
> org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:118)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1534)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1301)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1113)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:937)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:927)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:246)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:198)
> {noformat}





[jira] [Commented] (HIVE-6250) sql std auth - view authorization should not check underlying table. More tests and fixes.

2014-08-21 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106234#comment-14106234
 ] 

Ashu Pachauri commented on HIVE-6250:
-

Oh, I think I should have been clearer in my comment. We have a Hive 0.13.1 
deployment that we upgraded from Hive 0.12 (using the Hive upgrade script). When a user 
creates a table and then tries to drop it, the operation fails with the 
error "No Drop Privileges found". 
The user can grant the privileges to himself explicitly, and then it works. 
I am not sure whether this is caused by this change or by a problem in the 
upgrade; I just inferred it from the code changes I saw in this patch.

> sql std auth - view authorization should not check underlying table. More tests 
> and fixes.
> 
>
> Key: HIVE-6250
> URL: https://issues.apache.org/jira/browse/HIVE-6250
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6250.1.patch, HIVE-6250.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This patch adds more tests for table and view authorization and also fixes a 
> number of issues found during testing -
> - View authorization should happen only on the view, and not the 
> underlying table (change in ReadEntity to indicate whether it is a direct/indirect 
> dependency)
> - table owner in metadata should be the user as per SessionState 
> authentication provider
> - added utility function for finding the session state authentication 
> provider user
> - authorization should be based on current roles
> - admin user should have all permissions
> - error message improvements





[jira] [Commented] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106228#comment-14106228
 ] 

Szehon Ho commented on HIVE-7839:
-

+1

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.1-spark.patch
>
>






[jira] [Updated] (HIVE-7839) Update union_null results now that it's deterministic [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7839:


Summary: Update union_null results now that it's deterministic [Spark 
Branch]  (was: Update union_null results now that it's deterministic)

Put spark-branch label on the JIRA for clarity.

> Update union_null results now that it's deterministic [Spark Branch]
> 
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.1-spark.patch
>
>






[jira] [Updated] (HIVE-7672) Potential resource leak in EximUtil#createExportDump()

2014-08-21 Thread SUYEON LEE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SUYEON LEE updated HIVE-7672:
-

Status: Patch Available  (was: Open)

> Potential resource leak in EximUtil#createExportDump()
> --
>
> Key: HIVE-7672
> URL: https://issues.apache.org/jira/browse/HIVE-7672
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: SUYEON LEE
>Priority: Minor
> Attachments: HIVE-7672.patch
>
>
> Here is related code:
> {code}
>   OutputStream out = fs.create(metadataPath);
>   out.write(jsonContainer.toString().getBytes("UTF-8"));
>   out.close();
> {code}
> If out.write() throws an exception, out is left unclosed;
> out.close() should be enclosed in a finally block.
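One way to sketch the fix is try-with-resources, which guarantees close() even when write() throws. ByteArrayOutputStream stands in here for the real FileSystem stream, so this is an illustrative sketch rather than the actual EximUtil code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SafeCloseSketch {
    // try-with-resources closes `out` even if write() throws, which is the
    // guarantee the proposed finally block would provide.
    static byte[] writeMetadata(String json) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = buffer) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeMetadata("{}").length); // prints: 2
    }
}
```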





[jira] [Updated] (HIVE-7672) Potential resource leak in EximUtil#createExportDump()

2014-08-21 Thread SUYEON LEE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SUYEON LEE updated HIVE-7672:
-

Status: Open  (was: Patch Available)

> Potential resource leak in EximUtil#createExportDump()
> --
>
> Key: HIVE-7672
> URL: https://issues.apache.org/jira/browse/HIVE-7672
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: SUYEON LEE
>Priority: Minor
> Attachments: HIVE-7672.patch
>
>
> Here is related code:
> {code}
>   OutputStream out = fs.create(metadataPath);
>   out.write(jsonContainer.toString().getBytes("UTF-8"));
>   out.close();
> {code}
> If out.write() throws an exception, out is left unclosed;
> out.close() should be enclosed in a finally block.





[jira] [Commented] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106225#comment-14106225
 ] 

Szehon Ho commented on HIVE-7815:
-

Thanks Brock!

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: spark-branch
>
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.





[jira] [Updated] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7815:
---

   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Thank you Szehon! I have committed this to spark!!

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Fix For: spark-branch
>
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.





[jira] [Commented] (HIVE-7730) Extend ReadEntity to add accessed columns from query

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106215#comment-14106215
 ] 

Szehon Ho commented on HIVE-7730:
-

I think it's reasonable.  Xiaomeng, can you put up a review request at 
[https://reviews.apache.org/dashboard/|https://reviews.apache.org/dashboard/] 
for some comments?

> Extend ReadEntity to add accessed columns from query
> 
>
> Key: HIVE-7730
> URL: https://issues.apache.org/jira/browse/HIVE-7730
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
> Attachments: HIVE-7730.001.patch, HIVE-7730.002.patch
>
>
> -Now what we get from HiveSemanticAnalyzerHookContextImpl is limited. If we 
> have a hook of HiveSemanticAnalyzerHook, we may want to get more things from 
> hookContext (e.g. the columns needed by the query).-
> -So we should get the instance of HiveSemanticAnalyzerHookContext from the 
> configuration, extend HiveSemanticAnalyzerHookContext with a new 
> implementation, override HiveSemanticAnalyzerHookContext.update(), and put 
> what you want into the class.-
> Hive should store accessed columns in ReadEntity when 
> HIVE_STATS_COLLECT_SCANCOLS (or a confVar we could add) is set to true.
> Then an external authorization model can get the accessed columns when doing 
> authorization at compile time, before execution. Maybe we will remove 
> columnAccessInfo from BaseSemanticAnalyzer; the old authorization and 
> AuthorizationModeV2 can get accessed columns from ReadEntity too.
> Here is a quick implementation in SemanticAnalyzer.analyzeInternal():
> {code}   boolean isColumnInfoNeedForAuth = 
> SessionState.get().isAuthorizationModeV2()
> && HiveConf.getBoolVar(conf, 
> HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED);
> if (isColumnInfoNeedForAuth
> || HiveConf.getBoolVar(this.conf, 
> HiveConf.ConfVars.HIVE_STATS_COLLECT_SCANCOLS) == true) {
>   ColumnAccessAnalyzer columnAccessAnalyzer = new 
> ColumnAccessAnalyzer(pCtx);
>   setColumnAccessInfo(columnAccessAnalyzer.analyzeColumnAccess()); 
> }
> compiler.compile(pCtx, rootTasks, inputs, outputs);
> // TODO: 
> // after compile, we can put accessed column list to ReadEntity getting 
> from columnAccessInfo if HIVE_AUTHORIZATION_ENABLED is set true
> {code}





[jira] [Updated] (HIVE-7841) Case, When, Lead, Lag UDF is missing annotation

2014-08-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7841:
-

Status: Patch Available  (was: Open)

> Case, When, Lead, Lag UDF is missing annotation
> ---
>
> Key: HIVE-7841
> URL: https://issues.apache.org/jira/browse/HIVE-7841
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7841.patch
>
>






[jira] [Commented] (HIVE-7841) Case, When, Lead, Lag UDF is missing annotation

2014-08-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106211#comment-14106211
 ] 

Ashutosh Chauhan commented on HIVE-7841:


+1

> Case, When, Lead, Lag UDF is missing annotation
> ---
>
> Key: HIVE-7841
> URL: https://issues.apache.org/jira/browse/HIVE-7841
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7841.patch
>
>






Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho


> On Aug. 21, 2014, 9:41 p.m., Brock Noland wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, 
> > line 217
> > 
> >
> > I know you didn't write this method but can we remove it?
> 
> Szehon Ho wrote:
> For this one, it's actually used so it can't be removed; not sure if you meant 
> something else?
> 
> Brock Noland wrote:
> It's a private method which only calls "new Object()" so I was thinking 
> we would delete it and replace it with "new Object()"?
> 
> Brock Noland wrote:
> We can do this, if we decide to, in a follow on...

Yeah, let's keep it for now; I don't know whether UnionTran will need to become 
more than a marker object in the future.


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/#review51222
---


On Aug. 21, 2014, 10:44 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24919/
> ---
> 
> (Updated Aug. 21, 2014, 10:44 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-7815
> https://issues.apache.org/jira/browse/HIVE-7815
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is the first part of the reduce-side join work.  See HIVE-7384 for the 
> overall design doc.
> 
> This patch inserts a UnionTran after the two join inputs, and thus leverages 
> the Union-all code path to run the Spark RDD.  I also made the following 
> changes:
> 
> 1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
> so no need for multiple calls.
> 2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
> false after close if there are more rows, so Spark calls hasNext again and 
> close thus gets called twice.  CommonJoinOperator throws an exception if close 
> is called more than once, so a check is added there. 
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 63af01d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  6568a76 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> d16f1be 
>   ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24919/diff/
> 
> 
> Testing
> ---
> 
> Added three join tests to the TestSparkCliDriver suite.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>
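The double-close fix described in point 2 of the review description boils down to making close idempotent. A minimal sketch of such a guard (illustrative only, not the actual HiveBaseFunctionResultList code):

```java
public class IdempotentCloseSketch {
    private boolean closed = false;
    private int releaseCount = 0;

    // A second (or later) close() call becomes a no-op, so callers that
    // close twice do no harm.
    public void close() {
        if (closed) {
            return;
        }
        closed = true;
        releaseCount++; // stands in for one-time resource release
    }

    public int timesReleased() {
        return releaseCount;
    }

    public static void main(String[] args) {
        IdempotentCloseSketch r = new IdempotentCloseSketch();
        r.close();
        r.close(); // safe: the second call is ignored
        System.out.println(r.timesReleased()); // prints: 1
    }
}
```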



[jira] [Commented] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106204#comment-14106204
 ] 

Hive QA commented on HIVE-7815:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663538/HIVE-7815.2-spark.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5980 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_null
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/78/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/78/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-78/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663538

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.





[jira] [Updated] (HIVE-7841) Case, When, Lead, Lag UDF is missing annotation

2014-08-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7841:
-

Attachment: HIVE-7841.patch

> Case, When, Lead, Lag UDF is missing annotation
> ---
>
> Key: HIVE-7841
> URL: https://issues.apache.org/jira/browse/HIVE-7841
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7841.patch
>
>






[jira] [Created] (HIVE-7841) Case, When, Lead, Lag UDF is missing annotation

2014-08-21 Thread Laljo John Pullokkaran (JIRA)
Laljo John Pullokkaran created HIVE-7841:


 Summary: Case, When, Lead, Lag UDF is missing annotation
 Key: HIVE-7841
 URL: https://issues.apache.org/jira/browse/HIVE-7841
 Project: Hive
  Issue Type: Bug
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran








[jira] [Commented] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106183#comment-14106183
 ] 

Brock Noland commented on HIVE-7815:


+1 pending tests

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.





Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Brock Noland


> On Aug. 21, 2014, 9:41 p.m., Brock Noland wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, 
> > line 217
> > 
> >
> > I know you didn't write this method but can we remove it?
> 
> Szehon Ho wrote:
> For this one, it's actually used so it can't be removed; not sure if you meant 
> something else?
> 
> Brock Noland wrote:
> It's a private method which only calls "new Object()" so I was thinking 
> we would delete it and replace it with "new Object()"?

We can do this, if we decide to, in a follow on...


- Brock


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/#review51222
---


On Aug. 21, 2014, 10:44 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24919/
> ---
> 
> (Updated Aug. 21, 2014, 10:44 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-7815
> https://issues.apache.org/jira/browse/HIVE-7815
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is the first part of the reduce-side join work.  See HIVE-7384 for the 
> overall design doc.
> 
> This patch inserts a UnionTran after the two join inputs, and thus leverages 
> the Union-all code path to run the Spark RDD.  I also made the following 
> changes:
> 
> 1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
> so no need for multiple calls.
> 2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
> false after close if there are more rows, so Spark calls hasNext again and 
> close thus gets called twice.  CommonJoinOperator throws an exception if close 
> is called more than once, so a check is added there. 
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 63af01d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  6568a76 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> d16f1be 
>   ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24919/diff/
> 
> 
> Testing
> ---
> 
> Added three join tests to the TestSparkCliDriver suite.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/#review51239
---

Ship it!


Ship It!

- Brock Noland


On Aug. 21, 2014, 10:44 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24919/
> ---
> 
> (Updated Aug. 21, 2014, 10:44 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-7815
> https://issues.apache.org/jira/browse/HIVE-7815
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is the first part of the reduce-side join work.  See HIVE-7384 for the 
> overall design doc.
> 
> This patch inserts a UnionTran after the two join inputs, and thus leverages 
> the Union-all code path to run the Spark RDD.  I also made the following 
> changes:
> 
> 1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
> so no need for multiple calls.
> 2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
> false after close if there are more rows, so Spark calls hasNext again and 
> close thus gets called twice.  CommonJoinOperator throws an exception if close 
> is called more than once, so a check is added there. 
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 63af01d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  6568a76 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> d16f1be 
>   ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24919/diff/
> 
> 
> Testing
> ---
> 
> Added three join tests to the TestSparkCliDriver suite.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2014-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106168#comment-14106168
 ] 

Brock Noland commented on HIVE-4629:


bq. I didn't see this in the patch. Are you referring to something in the 
Thrift IDL file or something else?

You are right...I was confused between ICLIService and TCLIService. The patch 
only adds a new option member to TFetchResultsReq which is exactly what I 
wanted.

Nevermind me :)

> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Shreepadma Venugopalan
>Assignee: Dong Chen
> Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, 
> HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, 
> HIVE-4629.5.patch, HIVE-4629.6.patch, HIVE-4629.7.patch
>
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.
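The progress-reporting pattern such an API enables can be sketched as follows. This is a hedged, self-contained illustration: `HiveClient`, `fetchResults`, and the `FetchType` values are simplified stand-ins for the Thrift client and the new `TFetchResultsReq` member, not the actual HiveServer2 interface.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a log-oriented fetch type lets a client report
// progress while a query executes asynchronously. HiveClient is a mock;
// a real client would issue Thrift TFetchResultsReq calls over the wire.
public class QueryLogPollingSketch {
    enum FetchType { QUERY_OUTPUT, LOG }

    static class HiveClient {
        private int logCursor = 0;
        private final List<String> logs =
            Arrays.asList("Compiling query", "Launching job 1", "Job 1 done");

        List<String> fetchResults(FetchType type) {
            if (type == FetchType.LOG) {
                List<String> batch = logs.subList(logCursor, logs.size());
                logCursor = logs.size();  // next poll returns only new lines
                return batch;
            }
            return Arrays.asList("row1", "row2");
        }
    }

    public static void main(String[] args) {
        HiveClient client = new HiveClient();
        // Poll the operation's logs for progress while the query runs...
        for (String line : client.fetchResults(FetchType.LOG)) {
            System.out.println("[progress] " + line);
        }
        // ...then fetch the actual query output.
        System.out.println(client.fetchResults(FetchType.QUERY_OUTPUT));
    }
}
```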



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6250) sql std auth - view authorization should not check underlying table. More tests and fixes.

2014-08-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106156#comment-14106156
 ] 

Thejas M Nair commented on HIVE-6250:
-

bq. After this change, hive.security.authorization.createtable.owner.grants 
becomes ineffective.
Can you elaborate on how this makes it ineffective ?
I am not aware of any issues with 
hive.security.authorization.createtable.owner.grants.


> sql std auth - view authorization should not check underlying table. More 
> tests and fixes.
> 
>
> Key: HIVE-6250
> URL: https://issues.apache.org/jira/browse/HIVE-6250
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6250.1.patch, HIVE-6250.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This patch adds more tests for table and view authorization and also fixes a 
> number of issues found during testing -
> - View authorization should happen only on the view, and not the 
> underlying table (Change in ReadEntity to indicate if it is a direct/indirect 
> dependency)
> - table owner in metadata should be the user as per SessionState 
> authentication provider
> - added utility function for finding the session state authentication 
> provider user
> - authorization should be based on current roles
> - admin user should have all permissions
> - error message improvements
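The direct/indirect distinction in the first bullet can be sketched as follows. This is a hedged illustration under stated assumptions: `ReadEntity` here is a simplified stand-in for Hive's actual hook entity class, not the real implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: authorization runs only on entities read directly (the view),
// skipping indirect reads (the underlying table reached through the view).
public class ViewAuthSketch {
    static class ReadEntity {
        final String name;
        final boolean direct;  // false when reached only through a view
        ReadEntity(String name, boolean direct) {
            this.name = name;
            this.direct = direct;
        }
    }

    // Authorize only direct reads; indirect dependencies are skipped.
    static List<String> entitiesToAuthorize(List<ReadEntity> inputs) {
        return inputs.stream()
            .filter(e -> e.direct)
            .map(e -> e.name)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ReadEntity> inputs = Arrays.asList(
            new ReadEntity("db.v_sales", true),    // the view queried directly
            new ReadEntity("db.t_sales", false));  // its underlying table
        System.out.println(entitiesToAuthorize(inputs));  // [db.v_sales]
    }
}
```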



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7836) Ease-out denominator for multi-attribute join case in statistics annotation

2014-08-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106146#comment-14106146
 ] 

Ashutosh Chauhan commented on HIVE-7836:


+1

> Ease-out denominator for multi-attribute join case in statistics annotation
> ---
>
> Key: HIVE-7836
> URL: https://issues.apache.org/jira/browse/HIVE-7836
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7836.1.patch
>
>
> In cases where the number of relations involved in a join is less than the 
> number of join attributes, the denominator of the join rule can get larger, 
> resulting in aggressive row count estimation.
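The inflation can be illustrated with the textbook join-cardinality formula. This is a hedged sketch of the general technique, with illustrative numbers; it is not Hive's actual statistics-annotation code.

```java
// Classic join size estimate: |R join S| ~ (|R| * |S|) / denom, where denom
// multiplies the distinct-value count (NDV) of every join attribute. With
// only two relations but many join attributes, denom grows multiplicatively
// and the estimate collapses toward 1 row -- the "aggressive" estimate.
public class JoinDenominatorSketch {
    static long estimate(long rowsR, long rowsS, long[] ndvPerKey) {
        long denom = 1;
        for (long ndv : ndvPerKey) {
            denom *= ndv;  // each extra join attribute multiplies the denominator
        }
        return (rowsR * rowsS) / denom;
    }

    public static void main(String[] args) {
        // One join key: a plausible estimate of 100 million rows.
        System.out.println(estimate(1_000_000, 1_000_000, new long[]{10_000}));
        // Three join keys on the same two relations: denom reaches 10^12 and
        // the estimate drops all the way to 1 row.
        System.out.println(estimate(1_000_000, 1_000_000,
            new long[]{10_000, 10_000, 10_000}));
    }
}
```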



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6250) sql std auth - view authorization should not check underlying table. More tests and fixes.

2014-08-21 Thread Ashu Pachauri (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106142#comment-14106142
 ] 

Ashu Pachauri commented on HIVE-6250:
-

After this change, hive.security.authorization.createtable.owner.grants  
becomes ineffective. Is it fixed in any subsequent change?

> sql std auth - view authorization should not check underlying table. More 
> tests and fixes.
> 
>
> Key: HIVE-6250
> URL: https://issues.apache.org/jira/browse/HIVE-6250
> Project: Hive
>  Issue Type: Sub-task
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6250.1.patch, HIVE-6250.2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This patch adds more tests for table and view authorization and also fixes a 
> number of issues found during testing -
> - View authorization should happen only on the view, and not the 
> underlying table (Change in ReadEntity to indicate if it is a direct/indirect 
> dependency)
> - table owner in metadata should be the user as per SessionState 
> authentication provider
> - added utility function for finding the session state authentication 
> provider user
> - authorization should be based on current roles
> - admin user should have all permissions
> - error message improvements



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7836) Ease-out denominator for multi-attribute join case in statistics annotation

2014-08-21 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7836:
-

Fix Version/s: (was: 0.13.0)

> Ease-out denominator for multi-attribute join case in statistics annotation
> ---
>
> Key: HIVE-7836
> URL: https://issues.apache.org/jira/browse/HIVE-7836
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: HIVE-7836.1.patch
>
>
> In cases where the number of relations involved in a join is less than the 
> number of join attributes, the denominator of the join rule can get larger, 
> resulting in aggressive row count estimation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7836) Ease-out denominator for multi-attribute join case in statistics annotation

2014-08-21 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7836:
-

Status: Patch Available  (was: Open)

> Ease-out denominator for multi-attribute join case in statistics annotation
> ---
>
> Key: HIVE-7836
> URL: https://issues.apache.org/jira/browse/HIVE-7836
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Fix For: 0.13.0
>
> Attachments: HIVE-7836.1.patch
>
>
> In cases where the number of relations involved in a join is less than the 
> number of join attributes, the denominator of the join rule can get larger, 
> resulting in aggressive row count estimation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Brock Noland


> On Aug. 21, 2014, 9:41 p.m., Brock Noland wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, 
> > line 217
> > 
> >
> > I know you didn't write this method but can we remove it?
> 
> Szehon Ho wrote:
> For this one, it's actually used so it can't be removed; not sure if you 
> meant something else?

It's a private method which only calls "new Object()" so I was thinking we 
would delete it and replace it with "new Object()"?


- Brock


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/#review51222
---


On Aug. 21, 2014, 10:44 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24919/
> ---
> 
> (Updated Aug. 21, 2014, 10:44 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-7815
> https://issues.apache.org/jira/browse/HIVE-7815
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is the first part of the reduce-side join work.  See HIVE-7384 for the 
> overall design doc.
> 
> This patch inserts a UnionTran after the two join inputs, and thus leverages 
> the Union-all code path to run the Spark RDD.  I also made the following 
> changes:
> 
> 1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
> so no need for multiple calls.
> 2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
> false after close even if there are more rows, so Spark calls hasNext again and 
> close thus gets called twice.  CommonJoinOperator throws an exception if close 
> gets called more than once, so this adds a check there. 
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 63af01d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  6568a76 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> d16f1be 
>   ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24919/diff/
> 
> 
> Testing
> ---
> 
> Added three join tests to the TestSparkCliDriver suite.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



[jira] [Commented] (HIVE-7840) Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s

2014-08-21 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106126#comment-14106126
 ] 

Brock Noland commented on HIVE-7840:


Nice find! I was quite confused :) +1 pending tests

> Generated hive-default.xml.template mistakenly refers to property "name"s as 
> "key"s
> ---
>
> Key: HIVE-7840
> URL: https://issues.apache.org/jira/browse/HIVE-7840
> Project: Hive
>  Issue Type: Bug
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7840.patch
>
>
> When Hive is built with Maven, the default template for hive-site.xml 
> (hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
>  uses the <key> tag as opposed to the correct <name> tag. If a user were to 
> create a custom hive-site.xml using this template, then it results in a 
> rather confusing situation in which Hive logs that it has loaded 
> hive-site.xml, but in reality none of those properties are registering 
> correctly.
> *Wrong:*
> {quote}
> <configuration>
>   ...
>   <property>
>     <key>hive.exec.script.wrapper</key>
>     <value/>
>     <description/>
>   </property>
>   ...
> {quote}
> *Right:*
> {quote}
> <configuration>
>   ...
>   <property>
>     <name>hive.exec.script.wrapper</name>
>     <value/>
>     <description/>
>   </property>
>   ...
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7840) Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s

2014-08-21 Thread Wilbur Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilbur Yang updated HIVE-7840:
--

Description: 
When Hive is built with Maven, the default template for hive-site.xml 
(hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
 uses the <key> tag as opposed to the correct <name> tag. If a user were to 
create a custom hive-site.xml using this template, then it results in a rather 
confusing situation in which Hive logs that it has loaded hive-site.xml, but in 
reality none of those properties are registering correctly.

*Wrong:*

{quote}
<configuration>
  ...
  <property>
    <key>hive.exec.script.wrapper</key>
    <value/>
    <description/>
  </property>
  ...
{quote}

*Right:*

{quote}
<configuration>
  ...
  <property>
    <name>hive.exec.script.wrapper</name>
    <value/>
    <description/>
  </property>
  ...
{quote}

  was:When Hive is built with Maven, the default template for hive-site.xml 
(hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
 uses the <key> tag as opposed to the correct <name> tag. If a user were to 
create a custom hive-site.xml using this template, then it results in a rather 
confusing situation in which Hive logs that it has loaded hive-site.xml, but in 
reality none of those properties are registering.


> Generated hive-default.xml.template mistakenly refers to property "name"s as 
> "key"s
> ---
>
> Key: HIVE-7840
> URL: https://issues.apache.org/jira/browse/HIVE-7840
> Project: Hive
>  Issue Type: Bug
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7840.patch
>
>
> When Hive is built with Maven, the default template for hive-site.xml 
> (hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
>  uses the <key> tag as opposed to the correct <name> tag. If a user were to 
> create a custom hive-site.xml using this template, then it results in a 
> rather confusing situation in which Hive logs that it has loaded 
> hive-site.xml, but in reality none of those properties are registering 
> correctly.
> *Wrong:*
> {quote}
> <configuration>
>   ...
>   <property>
>     <key>hive.exec.script.wrapper</key>
>     <value/>
>     <description/>
>   </property>
>   ...
> {quote}
> *Right:*
> {quote}
> <configuration>
>   ...
>   <property>
>     <name>hive.exec.script.wrapper</name>
>     <value/>
>     <description/>
>   </property>
>   ...
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106103#comment-14106103
 ] 

Hive QA commented on HIVE-7839:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663518/HIVE-7839.1-spark.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5977 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/77/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/77/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-77/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663518

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7812) Disable CombineHiveInputFormat when ACID format is used

2014-08-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106100#comment-14106100
 ] 

Hive QA commented on HIVE-7812:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12663421/HIVE-7812.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6113 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/445/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/445/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-445/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12663421

> Disable CombineHiveInputFormat when ACID format is used
> ---
>
> Key: HIVE-7812
> URL: https://issues.apache.org/jira/browse/HIVE-7812
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-7812.patch, HIVE-7812.patch
>
>
> Currently the HiveCombineInputFormat complains when called on an ACID 
> directory. Modify HiveCombineInputFormat so that HiveInputFormat is used 
> instead if the directory is ACID format.
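The fallback described above can be sketched as a simple selection rule. This is a hedged illustration of the intent, not Hive's actual `CombineHiveInputFormat` code; the method and class names below are stand-ins.

```java
// Sketch: CombineHiveInputFormat merges many small splits for efficiency,
// but it cannot handle ACID delta directories, so for an ACID-format
// directory the plain HiveInputFormat is used instead.
public class InputFormatSelector {
    static String chooseInputFormat(boolean directoryIsAcid) {
        return directoryIsAcid ? "HiveInputFormat" : "CombineHiveInputFormat";
    }

    public static void main(String[] args) {
        System.out.println(chooseInputFormat(true));   // HiveInputFormat
        System.out.println(chooseInputFormat(false));  // CombineHiveInputFormat
    }
}
```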



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2014-08-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106098#comment-14106098
 ] 

Carl Steinbach commented on HIVE-4629:
--

bq. I tried posting this on RB but it went down. Thank you very much for 
removing the thrift enum compatibility problem! I had another comment with 
regards to the method signature which I think I did not explain well. I think 
the new method should be...

[~brocknoland], I totally agree with this, but I didn't see this in the patch. 
Are you referring to something in the Thrift IDL file or something else?

> HS2 should support an API to retrieve query logs
> 
>
> Key: HIVE-4629
> URL: https://issues.apache.org/jira/browse/HIVE-4629
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Shreepadma Venugopalan
>Assignee: Dong Chen
> Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, 
> HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, 
> HIVE-4629.5.patch, HIVE-4629.6.patch, HIVE-4629.7.patch
>
>
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106097#comment-14106097
 ] 

Szehon Ho commented on HIVE-7815:
-

Thanks Brock, new patch with more cleanup.

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs

2014-08-21 Thread Carl Steinbach


> On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote:
> > service/if/TCLIService.thrift, line 1043
> > 
> >
> > I know that no one else does it yet in this file and I haven't gotten 
> > around to finishing my patch.
> > 
> > But could you use this style of comments instead:
> > 
> > /** Get the output result of a query. */
> > 
> > Thank you! That will be automatically moved into a comment section 
> > (python, javadoc etc.) by the Thrift compiler.
> 
> Dong Chen wrote:
> Thanks for the reminder. This comment style makes the generated code 
> look better.
> I wasn't sure whether you are working on changing all the comment styles 
> in the TCLIService.thrift file, so I only changed the 3 comments related to 
> this fix. 
> If not, I'm glad to change all of the comments in the Thrift file through 
> this patch or another new JIRA.

This sounds like a nice feature to exploit, but I think it should be saved for 
a separate patch that updates all of the comments at once.


- Carl


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24293/#review49573
---


On Aug. 14, 2014, 3:09 p.m., Dong Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24293/
> ---
> 
> (Updated Aug. 14, 2014, 3:09 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-4629: HS2 should support an API to retrieve query logs
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 
>   service/if/TCLIService.thrift 80086b4 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java
>  808b73f 
>   service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 
>   service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 
>   service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 
>   service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 
>   service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 
> f665146 
>   service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
>  c9fd5f9 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java
>  caf413d 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
>  fd4e94d 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java
>  ebca996 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
>  05991e0 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java
>  315dbea 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
>  0ec2543 
>   
> service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
>  3d3fddc 
>   
> service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java 
> PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
> e0d17a1 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 45fbd61 
>   service/src/java/org/apache/hive/service/cli/operation/OperationLog.java 
> PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 21c33bc 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> de54ca1 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 9785e95 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 4c3164e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> b39d64d 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> 816bea4 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 5c87bcb 
>   
> service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
>  e3384d3 
>   
> service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java
>  PRE-CREA

[jira] [Updated] (HIVE-7840) Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s

2014-08-21 Thread Wilbur Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilbur Yang updated HIVE-7840:
--

Status: Patch Available  (was: Open)

> Generated hive-default.xml.template mistakenly refers to property "name"s as 
> "key"s
> ---
>
> Key: HIVE-7840
> URL: https://issues.apache.org/jira/browse/HIVE-7840
> Project: Hive
>  Issue Type: Bug
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7840.patch
>
>
> When Hive is built with Maven, the default template for hive-site.xml 
> (hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
>  uses the <key> tag as opposed to the correct <name> tag. If a user were to 
> create a custom hive-site.xml using this template, then it results in a 
> rather confusing situation in which Hive logs that it has loaded 
> hive-site.xml, but in reality none of those properties are registering.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7815) Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7815:


Attachment: HIVE-7815.2-spark.patch

> Reduce Side Join with single reducer [Spark Branch]
> ---
>
> Key: HIVE-7815
> URL: https://issues.apache.org/jira/browse/HIVE-7815
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7815-spark.patch, HIVE-7815.2-spark.patch
>
>
> This is the first part of the reduce-side join work, see HIVE-7384 for 
> details.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7840) Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s

2014-08-21 Thread Wilbur Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilbur Yang updated HIVE-7840:
--

Attachment: HIVE-7840.patch

> Generated hive-default.xml.template mistakenly refers to property "name"s as 
> "key"s
> ---
>
> Key: HIVE-7840
> URL: https://issues.apache.org/jira/browse/HIVE-7840
> Project: Hive
>  Issue Type: Bug
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7840.patch
>
>
> When Hive is built with Maven, the default template for hive-site.xml 
> (hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
>  uses the <key> tag as opposed to the correct <name> tag. If a user were to 
> create a custom hive-site.xml using this template, then it results in a 
> rather confusing situation in which Hive logs that it has loaded 
> hive-site.xml, but in reality none of those properties are registering.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho


> On Aug. 21, 2014, 9:41 p.m., Brock Noland wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java, 
> > line 217
> > 
> >
> > I know you didn't write this method but can we remove it?

For this one, it's actually used so it can't be removed; not sure if you meant 
something else?


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/#review51222
---


On Aug. 21, 2014, 10:44 p.m., Szehon Ho wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24919/
> ---
> 
> (Updated Aug. 21, 2014, 10:44 p.m.)
> 
> 
> Review request for hive and Brock Noland.
> 
> 
> Bugs: HIVE-7815
> https://issues.apache.org/jira/browse/HIVE-7815
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is the first part of the reduce-side join work.  See HIVE-7384 for the 
> overall design doc.
> 
> This patch inserts a UnionTran after the two join inputs, and thus leverages 
> the Union-all code path to run the Spark RDD.  I also made the following 
> changes:
> 
> 1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
> so no need for multiple calls.
> 2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
> false after close even if there are more rows, so Spark calls hasNext again and 
> close thus gets called twice.  CommonJoinOperator throws an exception if close 
> gets called more than once, so this adds a check there. 
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 63af01d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  6568a76 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
> d16f1be 
>   ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24919/diff/
> 
> 
> Testing
> ---
> 
> Added three join tests to the TestSparkCliDriver suite.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>



Re: Review Request 24919: HIVE-7815 : Reduce Side Join with single reducer [Spark Branch]

2014-08-21 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24919/
---

(Updated Aug. 21, 2014, 10:44 p.m.)


Review request for hive and Brock Noland.


Changes
---

Thanks Brock for the suggestion.  Nope, I don't mind; happy to do more unrelated 
cleanup of that class.


Bugs: HIVE-7815
https://issues.apache.org/jira/browse/HIVE-7815


Repository: hive-git


Description
---

This is the first part of the reduce-side join work.  See HIVE-7384 for the 
overall design doc.

This patch inserts a UnionTran after the two join inputs, and thus leverages 
the Union-all code path to run the Spark RDD.  I also made the following 
changes:

1.  Some API cleanup of GraphTran.  Connect will automatically add the child, 
so no need for multiple calls.
2.  Fix a bug in HiveBaseReduceFunction.  HIVE-7652 made the iterator return 
false after close if there are more rows, so Spark calls hasNext again and close 
thus gets called twice.  CommonJoinOperator throws an exception if close is 
called more than once, so this patch adds a check there. 
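
The double-close fix described above is an instance of the standard idempotent-close guard. A minimal sketch of the pattern (class and method names here are illustrative, not the actual Hive code):

```java
// Sketch of an idempotent close() guard: a second call is a no-op
// instead of throwing, matching the fix described in the patch.
class RowProcessor {
    private boolean closed = false;
    private int closeCount = 0;

    void close() {
        if (closed) {
            return; // guard: ignore repeated close() calls
        }
        closed = true;
        closeCount++;
        // ... release operator resources here ...
    }

    int timesClosed() {
        return closeCount;
    }
}

public class CloseGuardDemo {
    public static void main(String[] args) {
        RowProcessor p = new RowProcessor();
        p.close();
        p.close(); // second call from the caller is harmless
        System.out.println(p.timesClosed()); // prints 1
    }
}
```

With the guard in place, a caller such as Spark invoking close twice (once via the normal path, once after hasNext returns false) no longer triggers an exception.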


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 63af01d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
 6568a76 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
d16f1be 
  ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/spark/join_casesensitive.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24919/diff/


Testing
---

Added three join tests to the TestSparkCliDriver suite.


Thanks,

Szehon Ho



[jira] [Created] (HIVE-7840) Generated hive-default.xml.template mistakenly refers to property "name"s as "key"s

2014-08-21 Thread Wilbur Yang (JIRA)
Wilbur Yang created HIVE-7840:
-

 Summary: Generated hive-default.xml.template mistakenly refers to 
property "name"s as "key"s
 Key: HIVE-7840
 URL: https://issues.apache.org/jira/browse/HIVE-7840
 Project: Hive
  Issue Type: Bug
Reporter: Wilbur Yang
Assignee: Wilbur Yang
Priority: Minor
 Fix For: 0.14.0


When Hive is built with Maven, the default template for hive-site.xml 
(hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/conf/hive-default.xml.template)
 uses the <key> tag as opposed to the correct <name> tag. If a user creates a 
custom hive-site.xml from this template, the result is a rather confusing 
situation in which Hive logs that it has loaded hive-site.xml, but in reality 
none of those properties register.
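
For reference, Hadoop-style configuration loaders only recognize the <name> tag inside each <property> element, so entries written with <key> are silently dropped. A correct entry looks like this (the property shown is just an example):

```xml
<property>
  <!-- must be <name>, not <key>, or the loader ignores the entry -->
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>Scratch space for Hive jobs.</description>
</property>
```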



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs

2014-08-21 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24293/#review51230
---



service/if/TCLIService.thrift


Please use // for comments (like the rest of the file).


- Carl Steinbach


On Aug. 14, 2014, 3:09 p.m., Dong Chen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24293/
> ---
> 
> (Updated Aug. 14, 2014, 3:09 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-4629: HS2 should support an API to retrieve query logs
> HiveServer2 should support an API to retrieve query logs. This is 
> particularly relevant because HiveServer2 supports async execution but 
> doesn't provide a way to report progress. Providing an API to retrieve query 
> logs will help report progress to the client.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 
>   service/if/TCLIService.thrift 80086b4 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 
>   service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 
>   
> service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java
>  808b73f 
>   service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 
>   service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 
>   service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 
>   service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 
>   service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java 
> f665146 
>   service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
>  c9fd5f9 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java
>  caf413d 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
>  fd4e94d 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java
>  ebca996 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
>  05991e0 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java
>  315dbea 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
>  0ec2543 
>   
> service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
>  3d3fddc 
>   
> service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java 
> PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
> e0d17a1 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 45fbd61 
>   service/src/java/org/apache/hive/service/cli/operation/OperationLog.java 
> PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 21c33bc 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> de54ca1 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 9785e95 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 4c3164e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> b39d64d 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> 816bea4 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 5c87bcb 
>   
> service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
>  e3384d3 
>   
> service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24293/diff/
> 
> 
> Testing
> ---
> 
> UT passed.
> 
> 
> Thanks,
> 
> Dong Chen
> 
>
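
The progress-reporting use case in the quoted HIVE-4629 description boils down to a client-side polling loop: fetch new log lines while the async query runs, then drain once more after completion. A hypothetical sketch (the interface below is illustrative, not the actual CLIServiceClient API):

```java
// Hypothetical client-side polling pattern enabled by a query-log API.
// The QueryHandle interface is a stand-in, not a real Hive type.
interface QueryHandle {
    boolean isFinished();
    String fetchNewLogLines();
}

public class LogPollingDemo {
    static String pollLogs(QueryHandle h) {
        StringBuilder all = new StringBuilder();
        while (!h.isFinished()) {
            all.append(h.fetchNewLogLines()); // report progress while running
        }
        all.append(h.fetchNewLogLines()); // drain remaining logs at the end
        return all.toString();
    }

    public static void main(String[] args) {
        // Fake handle: "finishes" after two fetches, then yields one final batch.
        QueryHandle fake = new QueryHandle() {
            int calls = 0;
            public boolean isFinished() { return calls >= 2; }
            public String fetchNewLogLines() { calls++; return "line" + calls + "\n"; }
        };
        System.out.print(pollLogs(fake));
    }
}
```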



[jira] [Updated] (HIVE-7836) Ease-out denominator for multi-attribute join case in statistics annotation

2014-08-21 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7836:
-

Attachment: HIVE-7836.1.patch

> Ease-out denominator for multi-attribute join case in statistics annotation
> ---
>
> Key: HIVE-7836
> URL: https://issues.apache.org/jira/browse/HIVE-7836
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
> Fix For: 0.13.0
>
> Attachments: HIVE-7836.1.patch
>
>
> In cases where the number of relations involved in a join is less than the 
> number of join attributes, the denominator of the join rule can get larger, 
> resulting in overly aggressive row count estimation.
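
To see the effect, the textbook join-cardinality estimate divides the cross-product size by one distinct-value term per join key; with more keys than relations, the denominator outgrows the numerator. A rough sketch of the arithmetic (not Hive's actual statistics-annotation code):

```java
// Illustrative join row-count estimate: |R| * |S| divided by the
// product of per-key distinct-value counts (NDVs).
public class JoinEstimateDemo {
    static long estimate(long rows1, long rows2, long[] ndvPerKey) {
        long denom = 1;
        for (long ndv : ndvPerKey) {
            denom *= ndv; // one factor per join attribute
        }
        return Math.max(1, (rows1 * rows2) / denom);
    }

    public static void main(String[] args) {
        // Two relations joined on three attributes: the denominator
        // 100 * 100 * 100 swallows the 1000 * 1000 cross product,
        // collapsing the estimate to the floor value.
        System.out.println(estimate(1000, 1000, new long[]{100, 100, 100})); // prints 1
    }
}
```

Easing out the denominator, as the patch title suggests, keeps such multi-attribute estimates from collapsing this aggressively.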



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7839:
---

Attachment: (was: HIVE-7839.patch)

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7839:
---

Attachment: HIVE-7839.1-spark.patch

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7839:
---

Attachment: HIVE-7839.patch

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
> Attachments: HIVE-7839.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7839) Update union_null results now that it's deterministic

2014-08-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7839:
---

Assignee: Brock Noland
  Status: Patch Available  (was: Open)

> Update union_null results now that it's deterministic
> -
>
> Key: HIVE-7839
> URL: https://issues.apache.org/jira/browse/HIVE-7839
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-7839.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)

