[jira] [Commented] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-07 Thread william zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962637#comment-13962637
 ] 

william zhu commented on HIVE-6831:
---

The problem is that the union operation is resolved into two steps (A, B). Both 
steps have a single child step (C), which aggregates the output of A and B.
In the skew join conditional task, C is selected without any check that its 
parent steps (A, B) have completed.
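
To make the missing check concrete, here is a minimal, self-contained sketch of 
the scheduling rule being proposed: a shared child task may only be enqueued 
once, and only after all of its parents are done. The Task and queue types 
below are simplified stand-ins for illustration, not Hive's actual classes:

{code}
import java.util.*;

// Simplified stand-ins for illustration only (not Hive's Task/DriverContext).
class Task {
  final String id;
  final List<Task> parents = new ArrayList<Task>();
  boolean done;
  Task(String id) { this.id = id; }
}

public class ScheduleSketch {
  static final Set<Task> runnable = new LinkedHashSet<Task>();

  // Enqueue only if not already queued and every parent has completed.
  static boolean tryEnqueue(Task t) {
    if (runnable.contains(t)) {
      return false;                       // already selected by another branch
    }
    for (Task p : t.parents) {
      if (!p.done) {
        return false;                     // a parent (e.g. Step4 or Step5) is pending
      }
    }
    return runnable.add(t);
  }

  public static void main(String[] args) {
    Task step4 = new Task("Step4"), step5 = new Task("Step5");
    Task step3 = new Task("Step3");
    step3.parents.add(step4);
    step3.parents.add(step5);

    System.out.println(tryEnqueue(step3)); // false: neither parent finished
    step4.done = true;
    System.out.println(tryEnqueue(step3)); // false: Step5 still pending
    step5.done = true;
    System.out.println(tryEnqueue(step3)); // true: runs exactly once
  }
}
{code}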

> The job schedule in condition task could not be correct with skewed join 
> optimization
> -
>
> Key: HIVE-6831
> URL: https://issues.apache.org/jira/browse/HIVE-6831
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive 0.11.0
>Reporter: william zhu
> Attachments: 6831.patch
>
>
> Code snippet in ConditionalTask.java, as below:
> // resolved task
> if (driverContext.addToRunnable(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
> }
> The selected task is added to the runnable queue immediately, without any 
> dependency check. If the selected task is the original task and its parent 
> tasks have not been executed, the result will be incorrect.
> For example:
> 1. Before skew join optimization:
> Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
> 2. After skew join optimization:
> Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
> Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
> 3. Running:
> Step3 is selected in both Step4 and Step5.
> Step3 is executed immediately after Step4, which is not correct.
> Step3 is executed again after Step5, which is not correct either.
> 4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
> 5. So, I added a check to the snippet, as below:
> if (!driverContext.getRunnable().contains(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
>   if (DriverContext.isLaunchable(tsk)) {
>     driverContext.addToRunnable(tsk);
>   }
> }
> This works correctly in my environment. I am not sure whether it will have 
> problems under other conditions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-07 Thread william zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962630#comment-13962630
 ] 

william zhu commented on HIVE-6831:
---

1. The query can be simplified like this:
select * from 
(select * from TableA union all select * from TableB) a
2. TableA and TableB are themselves built from other select queries.
3. The Hive parameters are set as follows:
set hive.auto.convert.join=false;
set hive.optimize.skewjoin=true;
set hive.skewjoin.key=50;
set hive.mapjoin.smalltable.filesize=5000;



> The job schedule in condition task could not be correct with skewed join 
> optimization
> -
>
> Key: HIVE-6831
> URL: https://issues.apache.org/jira/browse/HIVE-6831
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive 0.11.0
>Reporter: william zhu
> Attachments: 6831.patch
>
>
> Code snippet in ConditionalTask.java, as below:
> // resolved task
> if (driverContext.addToRunnable(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
> }
> The selected task is added to the runnable queue immediately, without any 
> dependency check. If the selected task is the original task and its parent 
> tasks have not been executed, the result will be incorrect.
> For example:
> 1. Before skew join optimization:
> Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
> 2. After skew join optimization:
> Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
> Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
> 3. Running:
> Step3 is selected in both Step4 and Step5.
> Step3 is executed immediately after Step4, which is not correct.
> Step3 is executed again after Step5, which is not correct either.
> 4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
> 5. So, I added a check to the snippet, as below:
> if (!driverContext.getRunnable().contains(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
>   if (DriverContext.isLaunchable(tsk)) {
>     driverContext.addToRunnable(tsk);
>   }
> }
> This works correctly in my environment. I am not sure whether it will have 
> problems under other conditions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6865) Failed to load data into Hive from Pig using HCatStorer()

2014-04-07 Thread Bing Li (JIRA)
Bing Li created HIVE-6865:
-

 Summary: Failed to load data into Hive from Pig using HCatStorer()
 Key: HIVE-6865
 URL: https://issues.apache.org/jira/browse/HIVE-6865
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Bing Li
Assignee: Bing Li


Steps to reproduce:
1. Create a Hive table:
hive> create table t1 (c1 int, c2 int, c3 int);

2. Start the Pig shell:
grunt> register $HIVE_HOME/lib/*.jar
grunt> register $HIVE_HOME/hcatalog/share/hcatalog/*.jar
grunt> A = load 'pig.txt' as (c1:int, c2:int, c3:int);
grunt> store A into 't1' using org.apache.hive.hcatalog.pig.HCatStorer();

Error Message:
ERROR [main] org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: 
Unable to recreate exception from backend error: 
org.apache.hcatalog.common.HCatException : 2004 : HCatOutputFormat not 
initialized, setOutput has to be called
at 
org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:111)
at 
org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:97)
at 
org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:85)
at 
org.apache.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:75)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:187)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:1000)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:963)
at 
java.security.AccessController.doPrivileged(AccessController.java:310)
at javax.security.auth.Subject.doAs(Subject.java:573)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:963)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:616)
at 
org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at 
org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at 
org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:191)
at java.lang.Thread.run(Thread.java:738)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)
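
For context on the error: HCatBaseOutputFormat.getJobInfo() throws "2004 : 
HCatOutputFormat not initialized" when no output job info has been recorded in 
the job configuration by HCatOutputFormat.setOutput(). Below is a minimal 
sketch of the direct MapReduce usage where setOutput() is called explicitly; 
the database, table, and job names are placeholders, not taken from this 
report:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

public class SetOutputSketch {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "hcat-write-sketch");
    // Record the target table in the job configuration; getJobInfo() reads
    // this back later and throws the 2004 error if it was never set.
    HCatOutputFormat.setOutput(job,
        OutputJobInfo.create("default", "t1", null)); // null = unpartitioned
    job.setOutputFormatClass(HCatOutputFormat.class);
    // ... configure mapper and output schema, then submit as usual.
  }
}
{code}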



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-07 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962616#comment-13962616
 ] 

Navis commented on HIVE-6831:
-

Could you provide the query which induces the situation you've described?

> The job schedule in condition task could not be correct with skewed join 
> optimization
> -
>
> Key: HIVE-6831
> URL: https://issues.apache.org/jira/browse/HIVE-6831
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive 0.11.0
>Reporter: william zhu
> Attachments: 6831.patch
>
>
> Code snippet in ConditionalTask.java, as below:
> // resolved task
> if (driverContext.addToRunnable(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
> }
> The selected task is added to the runnable queue immediately, without any 
> dependency check. If the selected task is the original task and its parent 
> tasks have not been executed, the result will be incorrect.
> For example:
> 1. Before skew join optimization:
> Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
> 2. After skew join optimization:
> Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
> Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
> 3. Running:
> Step3 is selected in both Step4 and Step5.
> Step3 is executed immediately after Step4, which is not correct.
> Step3 is executed again after Step5, which is not correct either.
> 4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
> 5. So, I added a check to the snippet, as below:
> if (!driverContext.getRunnable().contains(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
>   if (DriverContext.isLaunchable(tsk)) {
>     driverContext.addToRunnable(tsk);
>   }
> }
> This works correctly in my environment. I am not sure whether it will have 
> problems under other conditions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6864) HiveServer2 unsecured mode concurrency errors

2014-04-07 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6864:
--

 Summary: HiveServer2 unsecured mode concurrency errors
 Key: HIVE-6864
 URL: https://issues.apache.org/jira/browse/HIVE-6864
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


Concurrent queries create table with wrong ownership



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6831) The job schedule in condition task could not be correct with skewed join optimization

2014-04-07 Thread william zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

william zhu updated HIVE-6831:
--

Description: 
Code snippet in ConditionalTask.java, as below:

// resolved task
if (driverContext.addToRunnable(tsk)) {
  console.printInfo(tsk.getId() + " is selected by condition resolver.");
}

The selected task is added to the runnable queue immediately, without any 
dependency check. If the selected task is the original task and its parent 
tasks have not been executed, the result will be incorrect.

For example:
1. Before skew join optimization:
Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
2. After skew join optimization:
Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
3. Running:
Step3 is selected in both Step4 and Step5.
Step3 is executed immediately after Step4, which is not correct.
Step3 is executed again after Step5, which is not correct either.
4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
5. So, I added a check to the snippet, as below:

if (!driverContext.getRunnable().contains(tsk)) {
  console.printInfo(tsk.getId() + " is selected by condition resolver.");
  if (DriverContext.isLaunchable(tsk)) {
    driverContext.addToRunnable(tsk);
  }
}

This works correctly in my environment. I am not sure whether it will have 
problems under other conditions.




  was:
Code snippet in ConditionalTask.java, as below:

// resolved task
if (driverContext.addToRunnable(tsk)) {
  console.printInfo(tsk.getId() + " is selected by condition resolver.");
}

The selected task is added to the runnable queue immediately, without any 
dependency check. If the selected task is the original task and its parent 
tasks have not been executed, the result will be incorrect.

For example:
1. Before skew join optimization:
Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
2. After skew join optimization:
Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
3. Running:
Step3 is selected in both Step4 and Step5.
Step3 is executed immediately after Step4, which is not correct.
Step3 is executed again after Step5, which is not correct either.
4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
5. So, I added a check to the snippet, as below:

if (!driverContext.getRunnable().contains(tsk)) {
  console.printInfo(tsk.getId() + " is selected by condition resolver.");
  if (DriverContext.isLaunchable(tsk)) {
    // run the original task now
    driverContext.addToRunnable(tsk);
  }
}

This works correctly in my environment. I am not sure whether it will have 
problems under other conditions.





> The job schedule in condition task could not be correct with skewed join 
> optimization
> -
>
> Key: HIVE-6831
> URL: https://issues.apache.org/jira/browse/HIVE-6831
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
> Environment: Hive 0.11.0
>Reporter: william zhu
> Attachments: 6831.patch
>
>
> Code snippet in ConditionalTask.java, as below:
> // resolved task
> if (driverContext.addToRunnable(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
> }
> The selected task is added to the runnable queue immediately, without any 
> dependency check. If the selected task is the original task and its parent 
> tasks have not been executed, the result will be incorrect.
> For example:
> 1. Before skew join optimization:
> Step1, Step2 <-- Step3   (Step1 and Step2 are Step3's parents)
> 2. After skew join optimization:
> Step1 <- Step4 (ConditionalTask) <- consists of [Step3, Step10]
> Step2 <- Step5 (ConditionalTask) <- consists of [Step3, Step11]
> 3. Running:
> Step3 is selected in both Step4 and Step5.
> Step3 is executed immediately after Step4, which is not correct.
> Step3 is executed again after Step5, which is not correct either.
> 4. The correct schedule is for Step3 to execute only after both Step4 and Step5.
> 5. So, I added a check to the snippet, as below:
> if (!driverContext.getRunnable().contains(tsk)) {
>   console.printInfo(tsk.getId() + " is selected by condition resolver.");
>   if (DriverContext.isLaunchable(tsk)) {
>     driverContext.addToRunnable(tsk);
>   }
> }
> This works correctly in my environment. I am not sure whether it will have 
> problems under other conditions.

[jira] [Updated] (HIVE-3972) Support using multiple reducer for fetching order by results

2014-04-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3972:


Attachment: HIVE-3972.8.patch.txt

> Support using multiple reducer for fetching order by results
> 
>
> Key: HIVE-3972
> URL: https://issues.apache.org/jira/browse/HIVE-3972
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: D8349.5.patch, D8349.6.patch, D8349.7.patch, 
> HIVE-3972.8.patch.txt, HIVE-3972.D8349.1.patch, HIVE-3972.D8349.2.patch, 
> HIVE-3972.D8349.3.patch, HIVE-3972.D8349.4.patch
>
>
> Queries whose final clause is "order by" make the last MR job run with a 
> single reducer, which can be too much. For example: 
> {code}
> select value, sum(key) as sum from src group by value order by sum;
> {code}
> If the number of reducers is reasonable, the multiple result files could be 
> merged into a single sorted stream at the fetch level.
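
For intuition, merging several already-sorted reducer outputs into one sorted 
stream is a standard k-way merge. Below is a minimal, self-contained sketch of 
the idea using a priority queue; it is a simplified stand-in for the fetch-side 
merge, not the implementation in the attached patch:

{code}
import java.util.*;

public class KWayMergeSketch {
  // Merge already-sorted lists (one per reducer output) into one sorted stream.
  static List<Integer> merge(List<List<Integer>> sortedRuns) {
    // Queue entries: {value, runIndex, offsetInRun}, ordered by value.
    PriorityQueue<int[]> heap = new PriorityQueue<int[]>(
        Math.max(1, sortedRuns.size()),
        new Comparator<int[]>() {
          public int compare(int[] a, int[] b) { return a[0] - b[0]; }
        });
    for (int r = 0; r < sortedRuns.size(); r++) {
      if (!sortedRuns.get(r).isEmpty()) {
        heap.add(new int[]{sortedRuns.get(r).get(0), r, 0});
      }
    }
    List<Integer> out = new ArrayList<Integer>();
    while (!heap.isEmpty()) {
      int[] top = heap.poll();           // smallest head among all runs
      out.add(top[0]);
      int r = top[1], next = top[2] + 1;
      if (next < sortedRuns.get(r).size()) {
        heap.add(new int[]{sortedRuns.get(r).get(next), r, next});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<Integer>> runs = Arrays.asList(
        Arrays.asList(1, 4, 9), Arrays.asList(2, 3, 8), Arrays.asList(5, 7));
    System.out.println(merge(runs));     // [1, 2, 3, 4, 5, 7, 8, 9]
  }
}
{code}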



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6863:
---

Attachment: HIVE-6863.1.patch

Minor patch. 

cc [~thejas] [~rhbutani] This is a bug for 0.13.

> HiveServer2 binary mode throws exception with PAM
> -
>
> Key: HIVE-6863
> URL: https://issues.apache.org/jira/browse/HIVE-6863
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
> Attachments: HIVE-6863.1.patch
>
>
> Works fine in http mode



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6863:
---

Status: Patch Available  (was: Open)

> HiveServer2 binary mode throws exception with PAM
> -
>
> Key: HIVE-6863
> URL: https://issues.apache.org/jira/browse/HIVE-6863
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
> Attachments: HIVE-6863.1.patch
>
>
> Works fine in http mode



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6863:
---

Description: Works fine in http mode

> HiveServer2 binary mode throws exception with PAM
> -
>
> Key: HIVE-6863
> URL: https://issues.apache.org/jira/browse/HIVE-6863
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
>
> Works fine in http mode



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6863) HiveServer2 binary mode throws exception with PAM

2014-04-07 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6863:
--

 Summary: HiveServer2 binary mode throws exception with PAM
 Key: HIVE-6863
 URL: https://issues.apache.org/jira/browse/HIVE-6863
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6862) add DB schema DDL statements for MS SQL Server

2014-04-07 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6862:


 Summary: add DB schema DDL statements for MS SQL Server
 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Need to add a unified 0.13 script and a separate script for ACID support.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6843) INSTR for UTF-8 returns incorrect position

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962583#comment-13962583
 ] 

Hive QA commented on HIVE-6843:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639052/HIVE-6843.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5549 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2170/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2170/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639052

> INSTR for UTF-8 returns incorrect position
> --
>
> Key: HIVE-6843
> URL: https://issues.apache.org/jira/browse/HIVE-6843
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.11.0, 0.12.0
>Reporter: Clif Kranish
>Assignee: Szehon Ho
>Priority: Minor
> Attachments: HIVE-6843.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns

2014-04-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4790:


Attachment: HIVE-4790.7.patch.txt

> MapredLocalTask task does not make virtual columns
> --
>
> Key: HIVE-4790
> URL: https://issues.apache.org/jira/browse/HIVE-4790
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.5.patch.txt, 
> HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, HIVE-4790.D11511.1.patch, 
> HIVE-4790.D11511.2.patch
>
>
> From mailing list, 
> http://www.mail-archive.com/user@hive.apache.org/msg08264.html
> {noformat}
> SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON 
> b.rownumber = a.number;
> fails with this error:
>  
> > SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = 
> a.number;
> Automatically selecting local only mode for query
> Total MapReduce jobs = 1
> setting HADOOP_USER_NAMEpmarron
> 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property 
> hive.metastore.local no longer has any effect. Make sure to provide a valid 
> value for hive.metastore.uris if you are connecting to a remote metastore.
> Execution log at: /tmp/pmarron/.log
> 2013-06-25 10:52:56 Starting to launch local task to process map join;
>   maximum memory = 932118528
> java.lang.RuntimeException: cannot find field block__offset__inside__file 
> from [0:rownumber, 1:offset]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
> at 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
> at 
> org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394)
> at 
> org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
> at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Execution failed with exit status: 2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6490) Hooks to map reduce task output (OutputCommitter) for Hive Storage Handlers

2014-04-07 Thread Yash Datta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yash Datta updated HIVE-6490:
-

Description: 
If the user sets the "mapred.output.committer.class" property in the jobConf 
via the storage handler implementation, then he/she can override the 
NullOutputCommitter to get hooks for commitJob, cleanupJob, etc.

Hence we can make use of the OutputCommitter class in Hive:
http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/mapred/OutputCommitter.html

commitJob provides a place to do job-scoped work if needed by the storage 
handler.

  was:
If the user sets the "mapred.output.committer.class" property in the jobConf 
via the storage handler implementation, then he/she can override the 
NullOutputCommitter to get hooks for commitJob, cleanupJob, etc.

commitJob provides a place to do job-scoped work if needed by the storage 
handler.

Summary: Hooks to map reduce task output (OutputCommitter) for Hive 
Storage Handlers  (was: Override the default NullOutputCommitter in hive hadoop 
shims class)

> Hooks to map reduce task output (OutputCommitter) for Hive Storage Handlers
> ---
>
> Key: HIVE-6490
> URL: https://issues.apache.org/jira/browse/HIVE-6490
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 0.10.0
>Reporter: Yash Datta
>Priority: Minor
>  Labels: patch
> Fix For: 0.10.1
>
> Attachments: HIVE-6490-branch-0.10.patch
>
>
> If the user sets the "mapred.output.committer.class" property in the jobConf 
> via the storage handler implementation, then he/she can override the 
> NullOutputCommitter to get hooks for commitJob, cleanupJob, etc.
> Hence we can make use of the OutputCommitter class in Hive:
> http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/mapred/OutputCommitter.html
> commitJob provides a place to do job-scoped work if needed by the storage 
> handler.
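
As a concrete illustration of the hook described above, here is a minimal 
sketch of a custom committer plus the property wiring; the class and method 
names here are hypothetical examples, not part of the attached patch:

{code}
import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapred.TaskAttemptContext;

// Hypothetical committer: overrides the no-op default to get job-level hooks.
public class MyStorageHandlerCommitter extends OutputCommitter {
  @Override public void setupJob(JobContext context) throws IOException { }

  @Override public void commitJob(JobContext context) throws IOException {
    // Job-scoped finalization for the storage handler, e.g. publishing the
    // written output or flipping a "ready" marker.
  }

  @Override public void setupTask(TaskAttemptContext context) throws IOException { }
  @Override public boolean needsTaskCommit(TaskAttemptContext context) { return false; }
  @Override public void commitTask(TaskAttemptContext context) throws IOException { }
  @Override public void abortTask(TaskAttemptContext context) throws IOException { }

  // Wiring: the storage handler would set this on the job configuration.
  static void install(JobConf conf) {
    conf.set("mapred.output.committer.class",
        MyStorageHandlerCommitter.class.getName());
  }
}
{code}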



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962554#comment-13962554
 ] 

Lefty Leverenz commented on HIVE-6782:
--

This adds *hive.localize.resource.wait.interval* and 
*hive.localize.resource.num.wait.attempts* to HiveConf.java.

They need descriptions in hive-default.xml.template or in a release note (since 
hive-default.xml.template will be generated from the new HiveConf.java after 
HIVE-6037 gets committed). When the time comes, I'll add them to the 
post-HIVE-6037 list (in HIVE-6586) and put them in the Configuration Properties 
wikidoc.

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
> HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, HIVE-6782.5.patch, 
> HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, HIVE-6782.9.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6319) Insert, update, delete functionality needs a compactor

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962543#comment-13962543
 ] 

Hive QA commented on HIVE-6319:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639048/HIVE-6319.patch

{color:green}SUCCESS:{color} +1 5591 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2169/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2169/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639048

> Insert, update, delete functionality needs a compactor
> --
>
> Key: HIVE-6319
> URL: https://issues.apache.org/jira/browse/HIVE-6319
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: 6319.wip.patch, HIVE-6319.patch, HIVE-6319.patch, 
> HIVE-6319.patch, HIVE-6319.patch, HiveCompactorDesign.pdf
>
>
> In order to keep the number of delta files from spiraling out of control we 
> need a compactor to collect these delta files together, and eventually 
> rewrite the base file when the deltas get large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6861) more hadoop2 only golden files to fix

2014-04-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6861:
-

Status: Patch Available  (was: Open)

> more hadoop2 only golden files to fix
> -
>
> Key: HIVE-6861
> URL: https://issues.apache.org/jira/browse/HIVE-6861
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-6861.1.patch
>
>
> More hadoop2 golden files to fix due to HIVE-6643, HIVE-6642, HIVE-6808, 
> HIVE-6144.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6861) more hadoop2 only golden files to fix

2014-04-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6861:
-

Attachment: HIVE-6861.1.patch

patch v1.

> more hadoop2 only golden files to fix
> -
>
> Key: HIVE-6861
> URL: https://issues.apache.org/jira/browse/HIVE-6861
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-6861.1.patch
>
>
> More hadoop2 golden files to fix due to HIVE-6643, HIVE-6642, HIVE-6808, 
> HIVE-6144.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6861) more hadoop2 only golden files to fix

2014-04-07 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6861:


 Summary: more hadoop2 only golden files to fix
 Key: HIVE-6861
 URL: https://issues.apache.org/jira/browse/HIVE-6861
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere
Assignee: Jason Dere


More hadoop2 golden files to fix due to HIVE-6643, HIVE-6642, HIVE-6808, 
HIVE-6144.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6561) Beeline should accept -i option to Initializing a SQL file

2014-04-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6561:


Attachment: HIVE-6561.2.patch.txt

> Beeline should accept -i option to Initializing a SQL file
> --
>
> Key: HIVE-6561
> URL: https://issues.apache.org/jira/browse/HIVE-6561
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0, 0.11.0, 0.12.0
>Reporter: Xuefu Zhang
>Assignee: Navis
> Attachments: HIVE-6561.1.patch.txt, HIVE-6561.2.patch.txt
>
>
> Hive CLI has an -i option. From the Hive CLI help:
> {code}
> ...
>  -i Initialization SQL file
> ...
> {code}
> However, Beeline has no such option:
> {code}
> xzhang@xzlt:~/apa/hive3$ 
> ./packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/bin/beeline
>  -u jdbc:hive2:// -i hive.rc
> ...
> Connected to: Apache Hive (version 0.14.0-SNAPSHOT)
> Driver: Hive JDBC (version 0.14.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> -i (No such file or directory)
> Property "url" is required
> Beeline version 0.14.0-SNAPSHOT by Apache Hive
> ...
> {code}
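
For context, a minimal sketch of what an initialization file is expected to do, 
assuming the semantics mirror Hive CLI's -i (run each statement from the file 
over the connection before entering the interactive loop); the file name and 
the naive semicolon splitting below are illustrative only, not Beeline's 
implementation:

{code}
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class InitScriptSketch {
  public static void main(String[] args) throws Exception {
    // Requires the Hive JDBC driver on the classpath; jdbc:hive2:// is the
    // embedded-mode URL used in the report above.
    Connection conn = DriverManager.getConnection("jdbc:hive2://");
    Statement stmt = conn.createStatement();
    String script = new String(Files.readAllBytes(Paths.get("hive.rc")));
    for (String sql : script.split(";")) {           // naive statement split
      if (!sql.trim().isEmpty()) {
        stmt.execute(sql.trim());                    // e.g. "set" or "use" commands
      }
    }
    stmt.close();
    conn.close();
  }
}
{code}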



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19984: Beeline should accept -i option to Initializing a SQL file

2014-04-07 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19984/
---

(Updated April 8, 2014, 2:07 a.m.)


Review request for hive.


Changes
---

addressed comment


Bugs: HIVE-6561
https://issues.apache.org/jira/browse/HIVE-6561


Repository: hive-git


Description
---

Hive CLI has an -i option. From the Hive CLI help:
{code}
...
 -i Initialization SQL file
...
{code}

However, Beeline has no such option:
{code}
xzhang@xzlt:~/apa/hive3$ 
./packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/bin/beeline
 -u jdbc:hive2:// -i hive.rc
...
Connected to: Apache Hive (version 0.14.0-SNAPSHOT)
Driver: Hive JDBC (version 0.14.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
-i (No such file or directory)
Property "url" is required
Beeline version 0.14.0-SNAPSHOT by Apache Hive
...
{code}


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 5773109 
  beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 44cabdf 
  beeline/src/java/org/apache/hive/beeline/Commands.java 493f963 
  beeline/src/main/resources/BeeLine.properties 697c29a 

Diff: https://reviews.apache.org/r/19984/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962493#comment-13962493
 ] 

Szehon Ho commented on HIVE-6785:
-

Good catch Brock, I missed that.

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Serializers/Deserializers
>Affects Versions: 0.13.0
>Reporter: Tongjie Chen
> Fix For: 0.14.0
>
> Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt
>
>
> When a Hive table's SerDe is ParquetHiveSerDe, while some partitions use 
> another SerDe, AND the table has string column[s], Hive generates a confusing 
> error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because timestamp is mentioned even though it is not used 
> by the table. The reason is that when the table and partition SerDes differ, 
> Hive tries to convert between the object inspectors of the two SerDes. 
> ParquetHiveSerDe's object inspector for the string type is 
> ParquetStringInspector (newly introduced), which is a subclass of neither 
> WritableStringObjectInspector nor JavaStringObjectInspector, the types 
> ObjectInspectorConverters expects for a string-category object inspector. 
> There is no break statement in the STRING case, so the following TIMESTAMP 
> case is executed, generating the confusing error message.
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: just make ParquetStringInspector a subclass of 
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. 
> But because the constructor of JavaStringObjectInspector is package scope 
> instead of public or protected, we would need to move ParquetStringInspector 
> into the same package as JavaStringObjectInspector.
> Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List 
> data, since the corresponding setStructFieldData and create methods return a 
> list. This is also needed when the table SerDe is ParquetHiveSerDe and the 
> partition SerDe is something else.
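
For readers unfamiliar with the fall-through being described, here is a 
minimal, self-contained illustration of the pattern; the types and strings are 
simplified stand-ins, not the ObjectInspectorConverters source:

{code}
// Minimal illustration of the missing-break fall-through described above.
public class FallThroughDemo {
  enum Category { STRING, TIMESTAMP }

  static String convert(Category category, Object inspector) {
    switch (category) {
      case STRING:
        if (inspector instanceof String) {
          return "string converter";      // recognized inspector: handled here
        }
        // missing 'break': unrecognized string inspectors fall through...
      case TIMESTAMP:
        return "timestamp converter";     // ...and surface a timestamp error
      default:
        return "other";
    }
  }

  public static void main(String[] args) {
    // An unrecognized "string" inspector ends up on the TIMESTAMP path:
    System.out.println(convert(Category.STRING, new Object()));
  }
}
{code}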



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-1608) use sequencefile as the default for storing intermediate results

2014-04-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962487#comment-13962487
 ] 

Brock Noland commented on HIVE-1608:


A couple of notes on this one:

1) Most of the test failures appear to be related to the .q.out files being 
different (referencing the TextFile output class rather than SequenceFile).
2) This change as-is would be backwards incompatible for INSERT OVERWRITE 
DIRECTORY users.

Thus I think we need:
1) Leave the default of TextFile for INSERT OVERWRITE DIRECTORY
2) Update the .q.out files

> use sequencefile as the default for storing intermediate results
> 
>
> Key: HIVE-1608
> URL: https://issues.apache.org/jira/browse/HIVE-1608
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.7.0
>Reporter: Namit Jain
>Assignee: Brock Noland
> Fix For: 0.14.0
>
> Attachments: HIVE-1608.patch
>
>
> The only argument for having a text file for storing intermediate results 
> seems to be better debuggability.
> But tailing a sequence file is possible, and it should be more 
> space-efficient.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962482#comment-13962482
 ] 

Brock Noland commented on HIVE-6785:


Hi,

LGTM except I see we are using the parquet... class names when creating a 
table, which are soon to be removed.

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Serializers/Deserializers
>Affects Versions: 0.13.0
>Reporter: Tongjie Chen
> Fix For: 0.14.0
>
> Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt
>
>
> When a Hive table's SerDe is ParquetHiveSerDe, while some partitions use 
> another SerDe, AND the table has string column[s], Hive generates a confusing 
> error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because timestamp is mentioned even though it is not used 
> by the table. The reason is that when the table and partition SerDes differ, 
> Hive tries to convert between the object inspectors of the two SerDes. 
> ParquetHiveSerDe's object inspector for the string type is 
> ParquetStringInspector (newly introduced), which is a subclass of neither 
> WritableStringObjectInspector nor JavaStringObjectInspector, the types 
> ObjectInspectorConverters expects for a string-category object inspector. 
> There is no break statement in the STRING case, so the following TIMESTAMP 
> case is executed, generating the confusing error message.
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: just make ParquetStringInspector a subclass of 
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. 
> But because the constructor of JavaStringObjectInspector is package scope 
> instead of public or protected, we would need to move ParquetStringInspector 
> into the same package as JavaStringObjectInspector.
> Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List 
> data, since the corresponding setStructFieldData and create methods return a 
> list. This is also needed when the table SerDe is ParquetHiveSerDe and the 
> partition SerDe is something else.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6785) query fails when partitioned table's table level serde is ParquetHiveSerDe and partition level serde is of different SerDe

2014-04-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962473#comment-13962473
 ] 

Szehon Ho commented on HIVE-6785:
-

+1 (non-binding) , thanks for adding the q-test and address comments.

FYI [~brocknoland]

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --
>
> Key: HIVE-6785
> URL: https://issues.apache.org/jira/browse/HIVE-6785
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Serializers/Deserializers
>Affects Versions: 0.13.0
>Reporter: Tongjie Chen
> Fix For: 0.14.0
>
> Attachments: HIVE-6785.1.patch.txt, HIVE-6785.2.patch.txt
>
>
> When a Hive table's SerDe is ParquetHiveSerDe, while some partitions use 
> another SerDe, AND the table has string column[s], Hive generates a confusing 
> error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because timestamp is mentioned even though it is not used 
> by the table. The reason is that when the table and partition SerDes differ, 
> Hive tries to convert between the object inspectors of the two SerDes. 
> ParquetHiveSerDe's object inspector for the string type is 
> ParquetStringInspector (newly introduced), which is a subclass of neither 
> WritableStringObjectInspector nor JavaStringObjectInspector, the types 
> ObjectInspectorConverters expects for a string-category object inspector. 
> There is no break statement in the STRING case, so the following TIMESTAMP 
> case is executed, generating the confusing error message.
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: just make ParquetStringInspector a subclass of 
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. 
> But because the constructor of JavaStringObjectInspector is package scope 
> instead of public or protected, we would need to move ParquetStringInspector 
> into the same package as JavaStringObjectInspector.
> Also, ArrayWritableObjectInspector's setStructFieldData needs to accept List 
> data, since the corresponding setStructFieldData and create methods return a 
> list. This is also needed when the table SerDe is ParquetHiveSerDe and the 
> partition SerDe is something else.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962471#comment-13962471
 ] 

Hive QA commented on HIVE-6835:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639043/HIVE-6835.1.patch

{color:green}SUCCESS:{color} +1 5550 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2167/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2167/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639043

> Reading of partitioned Avro data fails if partition schema does not match 
> table schema
> --
>
> Key: HIVE-6835
> URL: https://issues.apache.org/jira/browse/HIVE-6835
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-6835.1.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
> } ] }')  STORED as INPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
>  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6394) Implement Timestamp in ParquetSerde

2014-04-07 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962470#comment-13962470
 ] 

Szehon Ho commented on HIVE-6394:
-

We upgraded parquet to get the new Int96 libraries, but there is a parquet 
exception when writing an actual Int96 type, with dictionary encoding on.

Filed https://github.com/Parquet/parquet-mr/issues/350, which is being worked 
on. We will need to wait for the fix and a new version of Parquet before we 
can proceed.

> Implement Timestamp in ParquetSerde
> ---
>
> Key: HIVE-6394
> URL: https://issues.apache.org/jira/browse/HIVE-6394
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Jarek Jarcec Cecho
>Assignee: Szehon Ho
>  Labels: Parquet
>
> This JIRA is to implement timestamp support in Parquet SerDe.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-04-07 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v6.patch

Addressing review comments from Alan and Owen, and some of Lars's.

Owen: DDL was used there mostly for convenience and correctness. The other 
places where the API is used cannot be accomplished via DDL.

> Streaming support in Hive
> -
>
> Key: HIVE-5687
> URL: https://issues.apache.org/jira/browse/HIVE-5687
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: ACID, Streaming
> Fix For: 0.13.0
>
> Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
> 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, 
> HIVE-5687-unit-test-fix.patch, HIVE-5687.patch, HIVE-5687.v2.patch, 
> HIVE-5687.v3.patch, HIVE-5687.v4.patch, HIVE-5687.v5.patch, 
> HIVE-5687.v6.patch, Hive Streaming Ingest API for v3 patch.pdf, Hive 
> Streaming Ingest API for v4 patch.pdf
>
>
> Implement support for Streaming data into HIVE.
> - Provide a client streaming API 
> - Transaction support: Clients should be able to periodically commit a batch 
> of records atomically
> - Immediate visibility: Records should be immediately visible to queries on 
> commit
> - Should not overload HDFS with too many small files
> Use Cases:
>  - Streaming logs into HIVE via Flume
>  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6648) Permissions are not inherited correctly when tables have multiple partition columns

2014-04-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-6648:
---

Assignee: Szehon Ho

> Permissions are not inherited correctly when tables have multiple partition 
> columns
> ---
>
> Key: HIVE-6648
> URL: https://issues.apache.org/jira/browse/HIVE-6648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Henry Robinson
>Assignee: Szehon Ho
>
> {{Warehouse.mkdirs()}} always looks at the immediate parent of the path that 
> it creates when determining what permissions to inherit. However, it may have 
> created that parent directory as well, in which case it will have the default 
> permissions and will not have inherited them.
> This is a problem when performing an {{INSERT}} into a table with more than 
> one partition column. E.g., in an empty table:
> {{INSERT INTO TABLE tbl PARTITION(p1=1, p2=2) ... }}
> A new subdirectory /p1=1/p2=2  will be created, and with permission 
> inheritance (per HIVE-2504) enabled, the intention is presumably for both new 
> directories to inherit the root table dir's permissions. However, 
> {{mkdirs()}} will only set the permission of the leaf directory (i.e. 
> /p2=2/), and then only to the permissions of /p1=1/, which was just created.
> {code}
> public boolean mkdirs(Path f) throws MetaException {
> FileSystem fs = null;
> try {
>   fs = getFs(f);
>   LOG.debug("Creating directory if it doesn't exist: " + f);
>   //Check if the directory already exists. We want to change the 
> permission
>   //to that of the parent directory only for newly created directories.
>   if (this.inheritPerms) {
> try {
>   return fs.getFileStatus(f).isDir();
> } catch (FileNotFoundException ignore) {
> }
>   }
>   boolean success = fs.mkdirs(f);
>   if (this.inheritPerms && success) {
> // Set the permission of parent directory.
> // HNR: This is the bug - getParent() may refer to a just-created 
> directory.
> fs.setPermission(f, fs.getFileStatus(f.getParent()).getPermission());
>   }
>   return success;
> } catch (IOException e) {
>   closeFs(fs);
>   MetaStoreUtils.logAndThrowMetaException(e);
> }
> return false;
>   }
> {code}
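
One possible shape of a fix, sketched here only to illustrate the idea (record 
which directories the call actually creates, and inherit from the nearest 
ancestor that existed beforehand); this is a sketch under that assumption, not 
the committed change:

{code}
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MkdirsSketch {
  // Create f and any missing parents, then give every newly created level
  // the permission of the nearest ancestor that existed beforehand.
  static boolean mkdirsInheriting(FileSystem fs, Path f) throws IOException {
    Deque<Path> created = new ArrayDeque<Path>();
    Path existing = f;
    while (!fs.exists(existing)) {       // walk up to the first existing ancestor
      created.push(existing);
      existing = existing.getParent();
    }
    FsPermission perm = fs.getFileStatus(existing).getPermission();
    boolean success = fs.mkdirs(f);
    while (success && !created.isEmpty()) {
      fs.setPermission(created.pop(), perm);  // e.g. both /p1=1 and /p1=1/p2=2
    }
    return success;
  }
}
{code}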



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6860) Issue with FS based stats collection on Tez

2014-04-07 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962445#comment-13962445
 ] 

Vikram Dixit K commented on HIVE-6860:
--

+1 LGTM.

> Issue with FS based stats collection on Tez
> ---
>
> Key: HIVE-6860
> URL: https://issues.apache.org/jira/browse/HIVE-6860
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Tez
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6860.patch
>
>
> Statistics from different tasks got overwritten while running on Tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6860) Issue with FS based stats collection on Tez

2014-04-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6860:
---

Status: Patch Available  (was: Open)

> Issue with FS based stats collection on Tez
> ---
>
> Key: HIVE-6860
> URL: https://issues.apache.org/jira/browse/HIVE-6860
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Tez
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6860.patch
>
>
> Statistics from different tasks got overwritten while running on Tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6860) Issue with FS based stats collection on Tez

2014-04-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6860:
---

Attachment: HIVE-6860.patch

> Issue with FS based stats collection on Tez
> ---
>
> Key: HIVE-6860
> URL: https://issues.apache.org/jira/browse/HIVE-6860
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Tez
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-6860.patch
>
>
> Statistics from different tasks got overwritten while running on Tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6860) Issue with FS based stats collection on Tez

2014-04-07 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6860:
--

 Summary: Issue with FS based stats collection on Tez
 Key: HIVE-6860
 URL: https://issues.apache.org/jira/browse/HIVE-6860
 Project: Hive
  Issue Type: Bug
  Components: Statistics, Tez
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Statistics from different tasks got overwritten while running on Tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-04-07 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6809:


Attachment: HIVE-6809.4.patch.txt

> Support bulk deleting directories for partition drop with partial spec
> --
>
> Key: HIVE-6809
> URL: https://issues.apache.org/jira/browse/HIVE-6809
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
> Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
> HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt
>
>
> In a busy Hadoop system, dropping many partitions takes much more time than 
> expected. In hive-0.11.0, removing 1700 partitions with a single partial spec 
> took 90 minutes, which was reduced to 3 minutes when deleteData was set to 
> false. I couldn't test this on recent Hive, which has HIVE-6256, but if the 
> time-consuming part is mostly removing directories, that alone seems unlikely 
> to reduce the whole processing time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6821) Fix some non-deterministic tests

2014-04-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6821:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13 & trunk. Thanks, Jason!

> Fix some non-deterministic tests 
> -
>
> Key: HIVE-6821
> URL: https://issues.apache.org/jira/browse/HIVE-6821
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.13.0
>
> Attachments: HIVE-6821.1.patch, HIVE-6821.2.patch, HIVE-6821.3.patch
>
>
> A bunch of qfile tests look like they need an ORDER BY added to the queries 
> so that the output is repeatable when testing with hadoop1/hadoop2.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6846:


Attachment: HIVE-6846.3.patch

HIVE-6846.3.patch - test-only changes. I have verified that the tests pass.


> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.
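
To illustrate what a "safe" set-command check could look like, a minimal
sketch with an example whitelist. The parameter names below are examples only,
not the list the patch actually ships.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SafeSetCheck {
  // Parameters that only shape a query's own execution are safe to
  // "set" even under SQL standard authorization; anything else stays
  // blocked. Illustrative entries, not the patch's actual whitelist.
  private static final Set<String> SAFE = new HashSet<String>(Arrays.asList(
      "hive.exec.reducers.bytes.per.reducer",
      "hive.exec.reducers.max",
      "mapred.reduce.tasks"));

  public static boolean isSafeToSet(String param) {
    return SAFE.contains(param);
  }
}
{code}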



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6858:
---

Attachment: HIVE-6858.1.patch

The attached patch modifies the tests so that they don't run into the jdk bug.

> Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
> ---
>
> Key: HIVE-6858
> URL: https://issues.apache.org/jira/browse/HIVE-6858
> Project: Hive
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HIVE-6858.1.patch
>
>
> Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
> {noformat}
> < -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
> ---
> > -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
> {noformat}
> The following code reproduces this behavior when run on jdk-7 vs jdk-6: jdk-7 
> produces -0.004 while jdk-6 produces -0.0040.
> {code}
> public class Main {
>   public static void main(String[] a) throws Exception {
>  double val = 0.004;
>  System.out.println("Value = "+val);
>   }
> }
> {code}
> This happens to be a bug in jdk6 that has been fixed in jdk7.
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6858:
---

Status: Patch Available  (was: Open)

> Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
> ---
>
> Key: HIVE-6858
> URL: https://issues.apache.org/jira/browse/HIVE-6858
> Project: Hive
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HIVE-6858.1.patch
>
>
> Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.
> {noformat}
> < -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
> ---
> > -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
> {noformat}
> The following code reproduces this behavior when run on jdk-7 vs jdk-6: jdk-7 
> produces -0.004 while jdk-6 produces -0.0040.
> {code}
> public class Main {
>   public static void main(String[] a) throws Exception {
>  double val = 0.004;
>  System.out.println("Value = "+val);
>   }
> }
> {code}
> This happens to be a bug in jdk6 that has been fixed in jdk7.
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962433#comment-13962433
 ] 

Vikram Dixit K commented on HIVE-6825:
--

LGTM +1 pending HiveQA test run.

> custom jars for Hive query should be uploaded to scratch dir per query; 
> and/or versioned
> 
>
> Key: HIVE-6825
> URL: https://issues.apache.org/jira/browse/HIVE-6825
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.14.0
>
> Attachments: HIVE-6825.01.patch, HIVE-6825.patch
>
>
> Currently the jars are uploaded to either the user directory or the global 
> one, whichever is configured, which is a mess and can cause collisions. We 
> can upload to the scratch directory, and/or version the jars. 
> There's a tradeoff between having to upload files every time (for example, 
> for commonly used things like the HBase input format), which is what is done 
> now into the global/user path, and having a mess of one-off custom jars and 
> files, versioned, sitting in .hiveJars.
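
For illustration, a minimal sketch of the versioning idea from the
description: key the uploaded jar by a content hash, so an identical jar is
uploaded once and different versions can never collide. The names and layout
are assumptions, not the patch's actual DagUtils changes.

{code}
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VersionedJarUploader {
  // Upload a local jar to <hiveJarsDir>/<sha256>.jar unless an
  // identical copy is already there, so re-uploads are skipped and
  // different jar versions never overwrite one another.
  public static Path upload(Configuration conf, Path localJar,
      Path hiveJarsDir) throws Exception {
    FileSystem localFs = FileSystem.getLocal(conf);
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    try (InputStream in = new DigestInputStream(localFs.open(localJar), md)) {
      byte[] buf = new byte[8192];
      while (in.read(buf) != -1) { /* digest the whole file */ }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : md.digest()) {
      hex.append(String.format("%02x", b));
    }
    FileSystem fs = hiveJarsDir.getFileSystem(conf);
    Path dest = new Path(hiveJarsDir, hex.toString() + ".jar");
    if (!fs.exists(dest)) {
      fs.copyFromLocalFile(localJar, dest);
    }
    return dest;
  }
}
{code}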



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6850) For FetchOperator, Driver uses the valid transaction list from the previous query

2014-04-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962417#comment-13962417
 ] 

Harish Butani commented on HIVE-6850:
-

+1 lgtm
+1 for 0.13

> For FetchOperator, Driver uses the valid transaction list from the previous 
> query
> -
>
> Key: HIVE-6850
> URL: https://issues.apache.org/jira/browse/HIVE-6850
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Alan Gates
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: HIVE-6850.patch
>
>
> The problem is two fold:
> * FetchTask.initialize, which is called during parsing of the query, converts 
> the HiveConf it is given into a JobConf by copying it.
> * Driver.recordValidTxns, which runs after parsing, adds the valid 
> transactions to the HiveConf.
> Thus fetch operators will use the transactions from the previous command.
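
A toy demonstration of the ordering problem described above: a configuration
snapshot taken during parsing cannot see keys set after parsing. The key name
here is illustrative, not Hive's actual transaction-list key.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

public class StaleConfDemo {
  public static void main(String[] args) {
    Configuration hiveConf = new Configuration();
    // FetchTask.initialize effectively does this while the query is
    // still being parsed: it snapshots the conf into a JobConf.
    JobConf jobConf = new JobConf(hiveConf);
    // Driver.recordValidTxns only runs after parsing, so the snapshot
    // above never sees the value; it keeps whatever the previous
    // query left behind.
    hiveConf.set("example.valid.txns.key", "42:");
    System.out.println(jobConf.get("example.valid.txns.key"));  // prints null
  }
}
{code}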



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6818) Array out of bounds when ORC is used with ACID and predicate push down

2014-04-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962380#comment-13962380
 ] 

Sergey Shelukhin commented on HIVE-6818:


Can you add a comment about that on commit? Otherwise +1

> Array out of bounds when ORC is used with ACID and predicate push down
> --
>
> Key: HIVE-6818
> URL: https://issues.apache.org/jira/browse/HIVE-6818
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: HIVE-6818.patch
>
>
> The user gets an ArrayIndexOutOfBoundsException when using ORC, ACID, and 
> predicate push down.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962369#comment-13962369
 ] 

Sergey Shelukhin commented on HIVE-6825:


https://reviews.apache.org/r/20110/

> custom jars for Hive query should be uploaded to scratch dir per query; 
> and/or versioned
> 
>
> Key: HIVE-6825
> URL: https://issues.apache.org/jira/browse/HIVE-6825
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.14.0
>
> Attachments: HIVE-6825.01.patch, HIVE-6825.patch
>
>
> Currently the jars are uploaded to either the user directory or the global 
> one, whichever is configured, which is a mess and can cause collisions. We 
> can upload to the scratch directory, and/or version the jars. 
> There's a tradeoff between having to upload files every time (for example, 
> for commonly used things like the HBase input format), which is what is done 
> now into the global/user path, and having a mess of one-off custom jars and 
> files, versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 20110: HIVE-6825 custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20110/
---

Review request for hive and Vikram Dixit Kumaraswamy.


Repository: hive-git


Description
---

See JIRA


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 14d188f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java 74940e6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java c355d5a 

Diff: https://reviews.apache.org/r/20110/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Commented] (HIVE-6757) Remove deprecated parquet classes from outside of org.apache package

2014-04-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962367#comment-13962367
 ] 

Owen O'Malley commented on HIVE-6757:
-

+1 thanks Harish!


> Remove deprecated parquet classes from outside of org.apache package
> 
>
> Key: HIVE-6757
> URL: https://issues.apache.org/jira/browse/HIVE-6757
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: HIVE-6757.2.patch, HIVE-6757.patch, parquet-hive.patch
>
>
> Apache shouldn't release projects with files outside of the org.apache 
> namespace.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-6859) 8

2014-04-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho resolved HIVE-6859.
-

Resolution: Invalid

Issue created by accident.

> 8
> -
>
> Key: HIVE-6859
> URL: https://issues.apache.org/jira/browse/HIVE-6859
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6859) 8

2014-04-07 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-6859:
---

 Summary: 8
 Key: HIVE-6859
 URL: https://issues.apache.org/jira/browse/HIVE-6859
 Project: Hive
  Issue Type: Bug
Reporter: Szehon Ho






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6818) Array out of bounds when ORC is used with ACID and predicate push down

2014-04-07 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962359#comment-13962359
 ] 

Owen O'Malley commented on HIVE-6818:
-

Sergey,
  My intention is to replace the current XML AST with the serialized 
SearchArgument. The serialized SearchArgument is much more compact and focused 
on predicate pushdown. However, in order for that to happen, we need to 
transition the clients from the old format to the new one. So, yes, the 
immediate patch only uses it for testing, but it should over time become the 
mainline path.

> Array out of bounds when ORC is used with ACID and predicate push down
> --
>
> Key: HIVE-6818
> URL: https://issues.apache.org/jira/browse/HIVE-6818
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
>Priority: Blocker
> Fix For: 0.13.0
>
> Attachments: HIVE-6818.patch
>
>
> The user gets an ArrayIndexOutOfBoundsException when using ORC, ACID, and 
> predicate push down.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962353#comment-13962353
 ] 

Hive QA commented on HIVE-6846:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639034/HIVE-6846.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5552 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.authorization.TestSessionUserName.testSessionConstructorUser
org.apache.hadoop.hive.ql.parse.authorization.TestSessionUserName.testSessionDefaultUser
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2165/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2165/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639034

> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962346#comment-13962346
 ] 

Sergey Shelukhin commented on HIVE-6825:


[~vikram.dixit] this is the jira

> custom jars for Hive query should be uploaded to scratch dir per query; 
> and/or versioned
> 
>
> Key: HIVE-6825
> URL: https://issues.apache.org/jira/browse/HIVE-6825
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.14.0
>
> Attachments: HIVE-6825.01.patch, HIVE-6825.patch
>
>
> Currently the jars are uploaded to either the user directory or the global 
> one, whichever is configured, which is a mess and can cause collisions. We 
> can upload to the scratch directory, and/or version the jars. 
> There's a tradeoff between having to upload files every time (for example, 
> for commonly used things like the HBase input format), which is what is done 
> now into the global/user path, and having a mess of one-off custom jars and 
> files, versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6825:
---

Status: Patch Available  (was: Open)

> custom jars for Hive query should be uploaded to scratch dir per query; 
> and/or versioned
> 
>
> Key: HIVE-6825
> URL: https://issues.apache.org/jira/browse/HIVE-6825
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.14.0
>
> Attachments: HIVE-6825.01.patch, HIVE-6825.patch
>
>
> Currently the jars are uploaded to either the user directory or the global 
> one, whichever is configured, which is a mess and can cause collisions. We 
> can upload to the scratch directory, and/or version the jars. 
> There's a tradeoff between having to upload files every time (for example, 
> for commonly used things like the HBase input format), which is what is done 
> now into the global/user path, and having a mess of one-off custom jars and 
> files, versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6825) custom jars for Hive query should be uploaded to scratch dir per query; and/or versioned

2014-04-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6825:
---

Attachment: HIVE-6825.01.patch

> custom jars for Hive query should be uploaded to scratch dir per query; 
> and/or versioned
> 
>
> Key: HIVE-6825
> URL: https://issues.apache.org/jira/browse/HIVE-6825
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.14.0
>
> Attachments: HIVE-6825.01.patch, HIVE-6825.patch
>
>
> Currently the jars are uploaded to either the user directory or the global 
> one, whichever is configured, which is a mess and can cause collisions. We 
> can upload to the scratch directory, and/or version the jars. 
> There's a tradeoff between having to upload files every time (for example, 
> for commonly used things like the HBase input format), which is what is done 
> now into the global/user path, and having a mess of one-off custom jars and 
> files, versioned, sitting in .hiveJars.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Attachment: HIVE-6782.10.patch

Needed rebase after HIVE-6739.

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, 
> HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, HIVE-6782.5.patch, 
> HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, HIVE-6782.9.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6134) Merging small files based on file size only works for CTAS queries

2014-04-07 Thread Eric Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962335#comment-13962335
 ] 

Eric Chu commented on HIVE-6134:


Hi [~xuefuz] and [~ashutoshc], it turns out this issue not only affects Hue 
but also the Hive CLI - results won't show up in the CLI until more than a 
minute has passed, with timeout errors for connections to nodes.

I'm trying to make the change myself in GenMRFileSink1.java to support a new 
property such that, when it's turned on, Hive will merge files for a regular 
(i.e., without mvTask), map-only job that uses more than X mappers (X being 
another property). I'm wondering if and how we could find out the number of 
mappers that will be used for that job when we are at that stage of the 
optimization. I want to set chDir to true when this number is greater than 
some threshold set via a new property. I notice that 
currWork.getMapWork().getNumMapTasks() actually returns null. Can you give me 
some pointers?

> Merging small files based on file size only works for CTAS queries
> --
>
> Key: HIVE-6134
> URL: https://issues.apache.org/jira/browse/HIVE-6134
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.10.0, 0.11.0, 0.12.0
>Reporter: Eric Chu
>
> According to the documentation, if we set hive.merge.mapfiles to true, Hive 
> will launch an additional MR job to merge the small output files at the end 
> of a map-only job when the average output file size is smaller than 
> hive.merge.smallfiles.avgsize. Similarly, by setting hive.merge.mapredfiles 
> to true, Hive will merge the output files of a map-reduce job. 
> My expectation is that this is true for all MR queries. However, my 
> observation is that this is only true for CTAS queries. In 
> GenMRFileSink1.java, HIVEMERGEMAPFILES and HIVEMERGEMAPREDFILES are only used 
> if ((ctx.getMvTask() != null) && (!ctx.getMvTask().isEmpty())). So, for a 
> regular SELECT query that doesn't have move tasks, these properties are not 
> used.
> Is my understanding correct and if so, what's the reasoning behind the logic 
> of not supporting this for regular SELECT queries? It seems to me that this 
> should be supported for regular SELECT queries as well. One scenario where 
> this hits us hard is when users try to download the result in HUE, and HUE 
> times out b/c there are thousands of output files. The workaround is to 
> re-run the query as CTAS, but it's a significant time sink.
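
A hypothetical shape for the extended guard discussed in the comment above.
All names here are illustrative stand-ins, not GenMRFileSink1's actual fields.

{code}
public class MergeDecision {
  // Merge not only when a move task exists (today's CTAS-only
  // behavior), but also for a plain map-only SELECT that used more
  // than maxMappers mappers.
  public static boolean shouldMerge(boolean mergeMapFilesEnabled,
      boolean hasMoveTask, Integer numMapTasks, int maxMappers) {
    if (!mergeMapFilesEnabled) {
      return false;
    }
    if (hasMoveTask) {
      return true;  // existing behavior for CTAS/INSERT queries
    }
    // numMapTasks may be unknown (null) at this optimization stage,
    // which is exactly the obstacle raised in the comment above.
    return numMapTasks != null && numMapTasks.intValue() > maxMappers;
  }
}
{code}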



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6739) Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits

2014-04-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6739:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and the 0.13 branch.

> Hive HBase query fails on Tez due to missing jars and then due to NPE in 
> getSplits
> --
>
> Key: HIVE-6739
> URL: https://issues.apache.org/jira/browse/HIVE-6739
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.13.0
>
> Attachments: HIVE-6739.01.patch, HIVE-6739.02.patch, 
> HIVE-6739.03.patch, HIVE-6739.04.patch, HIVE-6739.patch, 
> HIVE-6739.preliminary.patch
>
>
> Tez paths in Hive never call configure on the input/output operators, so 
> (among other things, potentially) requisite files never get added to the job.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962323#comment-13962323
 ] 

Ashutosh Chauhan commented on HIVE-6846:


+1

> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962315#comment-13962315
 ] 

Thejas M Nair commented on HIVE-6782:
-

+1 to the update as well.


> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, 
> HIVE-6782.4.patch, HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, 
> HIVE-6782.8.patch, HIVE-6782.9.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962280#comment-13962280
 ] 

Ashutosh Chauhan commented on HIVE-6855:


+1

> A couple of errors in MySQL db creation script for transaction tables
> -
>
> Key: HIVE-6855
> URL: https://issues.apache.org/jira/browse/HIVE-6855
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-6855.patch
>
>
> There are a few small issues in the database creation scripts for mysql.  A 
> couple of the tables don't set the engine to InnoDB.  None of the tables set 
> the default character set to latin1.  And the syntax "CREATE INDEX...USING 
> HASH" doesn't work on older versions of MySQL.  Instead the index creation 
> should be done without specifying a method (no USING clause).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962277#comment-13962277
 ] 

Thejas M Nair commented on HIVE-6846:
-

I will add this to the overall SQL standard authorization document. I will 
work on that in a day or two.


> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6837) HiveServer2 thrift/http mode & binary mode proxy user check fails reporting IP null for client

2014-04-07 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6837:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to the 0.13 branch and trunk. I made a minor edit to make it 
apply on the 0.13 branch.
Thanks for the contribution, Vaibhav. Thanks for the review, Dilli.


> HiveServer2 thrift/http mode & binary mode proxy user check fails reporting 
> IP null for client
> --
>
> Key: HIVE-6837
> URL: https://issues.apache.org/jira/browse/HIVE-6837
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Dilli Arumugam
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
> Attachments: HIVE-6837.1.patch, HIVE-6837.2.patch, HIVE-6837.3.patch, 
> hive.log
>
>
> Hive Server running thrift/http with Kerberos security.
> Kinited user knox attempting to proxy as sam.
> Beeline connection failed, reporting this error in the hive server logs:
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: 
> Unauthorized connection for super-user: knox from IP null



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-07 Thread Harish Butani


> On April 7, 2014, 6:03 p.m., John Pullokkaran wrote:
> > I took a look at this change; my knowledge of hive code is rather limited.
> > 1. Column Pruner doesn't cross Script operator boundary. Theoretically you 
> > could prune above and below the script op separately.
> > 2. It seems the column pruner assumes that the parent of a UDTF is always 
> > a select, but we haven't formalized this assumption. Other processors 
> > should throw an exception if they ever come across a child that is a UDTF. 
> > Theoretically you can push down certain filters below a builtin UDTF. We 
> > may not be doing that today.
> > 3.  In Select Pruner it seems like there is no difference between 
> > 'prunedCols' and 'columns'.

Thanks John. Here are responses to your points:

1. Column Pruner doesn't cross the Script operator boundary.
  The ColumnPrunerWalker explicitly stops at the SelectOp parent of a ScriptOp. 
This may have been ok when developed; as you point out, it now makes sense to 
continue pruning on the SelectOp's ancestors. Can you file a jira for this?
2. The check in ColumnPrunerSelectProc is needed for the LVJoin case, where for 
the UDTFOp you end up with an empty PrunedList. What I realized was that Navis's 
fix doesn't cover the LVJoin case. Yes, this should be revisited.


- Harish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/#review39706
---


On April 6, 2014, 1:33 a.m., Harish Butani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20051/
> ---
> 
> (Updated April 6, 2014, 1:33 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Navis Ryu.
> 
> 
> Bugs: HIVE-4904
> https://issues.apache.org/jira/browse/HIVE-4904
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently, CP context cannot be propagated over RS except for JOIN/EXT. A 
> little more CP is possible.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java 
> db36151 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
> 0690fb7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java
>  94224b3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2 
>   ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 1e982e6 
>   ql/src/test/results/clientpositive/auto_join18.q.out b8677f4 
>   ql/src/test/results/clientpositive/auto_join27.q.out a576190 
>   ql/src/test/results/clientpositive/auto_join30.q.out 8709198 
>   ql/src/test/results/clientpositive/auto_join31.q.out 1936e45 
>   ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac 
>   ql/src/test/results/clientpositive/count.q.out eb048b6 
>   ql/src/test/results/clientpositive/distinct_stats.q.out f715ea3 
>   ql/src/test/results/clientpositive/groupby2_map.q.out 291f196 
>   ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c 
>   ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08 
>   ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc 
>   ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4 
>   ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978 
>   ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75 
>   
> ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out 
> ad76252 
>   ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 
> 51a70c4 
>   ql/src/test/results/clientpositive/groupby_position.q.out 727bccb 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966 
>   ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa 
>   ql/src/test/results/clientpositive/join18.q.out 7975c79 
>   ql/src/test/results/clientpositive/limit_pushdown.q.out 9c93ada 
>   ql/src/test/results/clientpositive/limit_pushdown_negative.q.out 115b171 
>   ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf 
>   ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb 
>   ql/src/test/results/clientpositive/multi_insert_gby3.q.out 23ccebb 
>   ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4 
>   ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea 
>   ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1 
>   ql/src/test/results/clientpo

[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Status: Patch Available  (was: Open)

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, 
> HIVE-6782.4.patch, HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, 
> HIVE-6782.8.patch, HIVE-6782.9.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Attachment: HIVE-6782.9.patch

Fixes the case where a tez session is launched without a query.

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, 
> HIVE-6782.4.patch, HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, 
> HIVE-6782.8.patch, HIVE-6782.9.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing "org.apache.tez.dag.api.SessionNotRunning: Application not running" error

2014-04-07 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6782:
-

Status: Open  (was: Patch Available)

> HiveServer2Concurrency issue when running with tez intermittently, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error
> -
>
> Key: HIVE-6782
> URL: https://issues.apache.org/jira/browse/HIVE-6782
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.13.0, 0.14.0
>
> Attachments: HIVE-6782.1.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, 
> HIVE-6782.4.patch, HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, 
> HIVE-6782.8.patch
>
>
> HiveServer2 concurrency is failing intermittently when using tez, throwing 
> "org.apache.tez.dag.api.SessionNotRunning: Application not running" error



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6739) Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits

2014-04-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962236#comment-13962236
 ] 

Hive QA commented on HIVE-6739:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12639030/HIVE-6739.04.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5549 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTable
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2164/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2164/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12639030

> Hive HBase query fails on Tez due to missing jars and then due to NPE in 
> getSplits
> --
>
> Key: HIVE-6739
> URL: https://issues.apache.org/jira/browse/HIVE-6739
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.13.0
>
> Attachments: HIVE-6739.01.patch, HIVE-6739.02.patch, 
> HIVE-6739.03.patch, HIVE-6739.04.patch, HIVE-6739.patch, 
> HIVE-6739.preliminary.patch
>
>
> Tez paths in Hive never call configure on the input/output operators, so 
> (among other things, potentially) requisite files never get added to the job.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6837) HiveServer2 thrift/http mode & binary mode proxy user check fails reporting IP null for client

2014-04-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962215#comment-13962215
 ] 

Harish Butani commented on HIVE-6837:
-

+1 for 0.13

> HiveServer2 thrift/http mode & binary mode proxy user check fails reporting 
> IP null for client
> --
>
> Key: HIVE-6837
> URL: https://issues.apache.org/jira/browse/HIVE-6837
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Dilli Arumugam
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
> Attachments: HIVE-6837.1.patch, HIVE-6837.2.patch, HIVE-6837.3.patch, 
> hive.log
>
>
> Hive Server running thrift/http with Kerberos security.
> Kinited user knox attempting to proxy as sam.
> Beeline connection failed, reporting this error in the hive server logs:
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: 
> Unauthorized connection for super-user: knox from IP null



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6843) INSTR for UTF-8 returns incorrect position

2014-04-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6843:


Status: Patch Available  (was: Open)

> INSTR for UTF-8 returns incorrect position
> --
>
> Key: HIVE-6843
> URL: https://issues.apache.org/jira/browse/HIVE-6843
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.12.0, 0.11.0
>Reporter: Clif Kranish
>Assignee: Szehon Ho
>Priority: Minor
> Attachments: HIVE-6843.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6843) INSTR for UTF-8 returns incorrect position

2014-04-07 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6843:


Attachment: HIVE-6843.patch

This seems to work, let's see what folks think.

The original code was trying to avoid encoding the bytes by just doing 
byte counting, but I'm not sure that is possible when doing unicode char 
calculations.
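
A minimal sketch of why decoding first fixes the position: String.indexOf
counts characters, while byte counting returns an offset that drifts as soon
as multi-byte UTF-8 characters appear. This is an illustration only, not the
actual GenericUDFUtils change.

{code}
import java.nio.charset.StandardCharsets;

public class Utf8Instr {
  // Return the 1-based *character* position of sub within text, or 0
  // if absent. Decoding to String first makes the count correct for
  // multi-byte UTF-8 characters.
  public static int instr(byte[] textBytes, byte[] subBytes) {
    String text = new String(textBytes, StandardCharsets.UTF_8);
    String sub = new String(subBytes, StandardCharsets.UTF_8);
    return text.indexOf(sub) + 1;  // indexOf is char-based; -1 becomes 0
  }

  public static void main(String[] args) {
    byte[] text = "привет мир".getBytes(StandardCharsets.UTF_8);
    byte[] sub = "мир".getBytes(StandardCharsets.UTF_8);
    // Prints 8 (the character position); a byte-based count would
    // report 14 here, since each Cyrillic character is two bytes.
    System.out.println(instr(text, sub));
  }
}
{code}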

> INSTR for UTF-8 returns incorrect position
> --
>
> Key: HIVE-6843
> URL: https://issues.apache.org/jira/browse/HIVE-6843
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.11.0, 0.12.0
>Reporter: Clif Kranish
>Assignee: Szehon Ho
>Priority: Minor
> Attachments: HIVE-6843.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 20103: HIVE-6843 INSTR for UTF-8 returns incorrect position

2014-04-07 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20103/
---

Review request for hive.


Repository: hive-git


Description
---

It seems the original authors wanted to avoid encoding, but that is not 
possible if you want to handle Unicode characters.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java 
7f4a807 
  ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFUtils.java d9338a5 

Diff: https://reviews.apache.org/r/20103/diff/


Testing
---

Added some unicode tests using Cyrillic chars.


Thanks,

Szehon Ho



[jira] [Created] (HIVE-6858) Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

2014-04-07 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6858:
--

 Summary: Unit tests decimal_udf.q, vectorization_div0.q fail with 
jdk-7.
 Key: HIVE-6858
 URL: https://issues.apache.org/jira/browse/HIVE-6858
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


Unit tests decimal_udf.q, vectorization_div0.q fail with jdk-7.

{noformat}
< -250.0  6583411.236 1.0 6583411.236 -0.004  -0.0048
---
> -250.0  6583411.236 1.0 6583411.236 -0.0040 -0.0048
{noformat}


The following code reproduces this behavior when run on jdk-7 vs jdk-6: jdk-7 
produces -0.004 while jdk-6 produces -0.0040.
{code}
public class Main {
  public static void main(String[] a) throws Exception {
 double val = 0.004;
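 // jdk-6's Double.toString prints "Value = 0.0040" below; jdk-7 prints "Value = 0.004"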
 System.out.println("Value = "+val);
  }
}
{code}

This happens to be a bug in jdk6 that has been fixed in jdk7.
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4511638





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6319) Insert, update, delete functionality needs a compactor

2014-04-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6319:
-

Attachment: HIVE-6319.patch

Attaching a new version of the patch with changes as suggested by Ashutosh.  I 
don't think we need to re-run the tests, as the changes are very small.

> Insert, update, delete functionality needs a compactor
> --
>
> Key: HIVE-6319
> URL: https://issues.apache.org/jira/browse/HIVE-6319
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.13.0
>
> Attachments: 6319.wip.patch, HIVE-6319.patch, HIVE-6319.patch, 
> HIVE-6319.patch, HIVE-6319.patch, HiveCompactorDesign.pdf
>
>
> In order to keep the number of delta files from spiraling out of control we 
> need a compactor to collect these delta files together, and eventually 
> rewrite the base file when the deltas get large enough.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962176#comment-13962176
 ] 

Thejas M Nair commented on HIVE-6856:
-

+1

The hcat cli never runs any query on the cluster, so it never needs a runtime 
execution engine. Always using mr as the engine in the config works fine.
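
A minimal sketch of that idea: pin hive.execution.engine to mr for the HCat
CLI so no Tez session (and none of its scratch-directory setup) is ever
attempted. The wrapper class here is illustrative; the actual patch does this
through configuration.

{code}
import org.apache.hadoop.conf.Configuration;

public class HCatCliEngineOverride {
  // The HCat CLI performs only metadata operations, so it can always
  // pin the execution engine to "mr" and never trigger a Tez session
  // (or the HDFS scratch-directory write that fails in this report).
  public static Configuration pinEngineToMr(Configuration conf) {
    conf.set("hive.execution.engine", "mr");
    return conf;
  }
}
{code}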


> ddl commands fail with permissions issue when running using webhcat in secure 
> Tez cluster
> -
>
> Key: HIVE-6856
> URL: https://issues.apache.org/jira/browse/HIVE-6856
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-6856.patch
>
>
> curl -u : --negotiate -d "exec=show tables;" -X POST 
> http://server:50111/templeton/v1/ddl
> results in (when Tez is enabled in Secure cluster)
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
> at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission 
> denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.serve

[jira] [Assigned] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu reassigned HIVE-6835:
-

Assignee: Anthony Hsu

> Reading of partitioned Avro data fails if partition schema does not match 
> table schema
> --
>
> Key: HIVE-6835
> URL: https://issues.apache.org/jira/browse/HIVE-6835
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-6835.1.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
> } ] }')  STORED as INPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
>  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Assignee: (was: Anthony Hsu)
  Status: Patch Available  (was: Open)

> Reading of partitioned Avro data fails if partition schema does not match 
> table schema
> --
>
> Key: HIVE-6835
> URL: https://issues.apache.org/jira/browse/HIVE-6835
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Anthony Hsu
> Attachments: HIVE-6835.1.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
> } ] }')  STORED as INPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
>  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6835:
--

Attachment: HIVE-6835.1.patch

Uploaded a patch with a fix.  Review Board link: 
https://reviews.apache.org/r/20096/

> Reading of partitioned Avro data fails if partition schema does not match 
> table schema
> --
>
> Key: HIVE-6835
> URL: https://issues.apache.org/jira/browse/HIVE-6835
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-6835.1.patch
>
>
> To reproduce:
> {code}
> create table testarray (a array<string>);
> load data local inpath '/home/ahsu/test/array.txt' into table testarray;
> # create partitioned Avro table with one array column
> create table avroarray partitioned by (y string) row format serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties 
> ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type": 
> "record", "fields": [ { "name":"a", "type":{"type":"array","items":"string"} 
> } ] }')  STORED as INPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'  OUTPUTFORMAT  
> 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
> insert into table avroarray partition(y=1) select * from testarray;
> # add an int column with a default value of 0
> alter table avroarray set serde 
> 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with 
> serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":
>  "record", "fields": [ {"name":"intfield","type":"int","default":0},{ 
> "name":"a", "type":{"type":"array","items":"string"} } ] }');
> # fails with ClassCastException
> select * from avroarray;
> {code}
> The select * fails with:
> {code}
> Failed with exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
> cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-07 Thread Anthony Hsu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/
---

Review request for hive.


Repository: hive-git


Description
---

The problem occurs when you store the "avro.schema.(literal|url)" in the 
SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the 
table's schema, and then try reading from the old partition.

I fixed this problem by passing the table properties to the partition with a 
"table." prefix, and changing the Avro SerDe to always use the table properties 
when available.
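
A minimal sketch of the property-passing idea described above, using plain
java.util.Properties; the real change lives in PartitionDesc and the Avro
SerDe utilities, and the method names here are assumptions.

{code}
import java.util.Map;
import java.util.Properties;

public class TablePrefixedProperties {
  // Copy every table-level property into the partition's Properties
  // under a "table." prefix, so a SerDe can prefer the (current)
  // table schema over a stale partition schema.
  public static void addTableProperties(Properties partProps,
      Properties tableProps) {
    for (Map.Entry<Object, Object> e : tableProps.entrySet()) {
      partProps.put("table." + e.getKey(), e.getValue());
    }
  }

  public static String schemaLiteral(Properties partProps) {
    // Prefer the table schema when present; fall back to the
    // partition's own copy otherwise.
    String s = partProps.getProperty("table.avro.schema.literal");
    return s != null ? s : partProps.getProperty("avro.schema.literal");
  }
}
{code}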


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c 
  ql/src/test/queries/clientpositive/avro_partitioned.q 068a13c 
  ql/src/test/results/clientpositive/avro_partitioned.q.out 352ec0d 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13 
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
67d5570 

Diff: https://reviews.apache.org/r/20096/diff/


Testing
---

Added test cases


Thanks,

Anthony Hsu



[jira] [Updated] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6856:
-

Status: Patch Available  (was: Open)

> ddl commands fail with permissions issue when running using webhcat in secure 
> Tez cluster
> -
>
> Key: HIVE-6856
> URL: https://issues.apache.org/jira/browse/HIVE-6856
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-6856.patch
>
>
> curl -u : --negotiate -d "exec=show tables;" -X POST 
> http://server:50111/templeton/v1/ddl
> results in (when Tez is enabled in Secure cluster)
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
> at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission 
> denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.

[jira] [Updated] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6856:
-

Description: 
curl -u : --negotiate -d "exec=show tables;" -X POST 
http://server:50111/templeton/v1/ddl

results in (when Tez is enabled in Secure cluster)
{noformat}
Exception in thread "main" java.lang.RuntimeException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.security.AccessControlException: Permission 
denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 

[jira] [Updated] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6856:
-

Attachment: HIVE-6856.patch

> ddl commands fail with permissions issue when running using webhcat in secure 
> Tez cluster
> -
>
> Key: HIVE-6856
> URL: https://issues.apache.org/jira/browse/HIVE-6856
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.13.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-6856.patch
>
>
> curl -u : --negotiate -d "exec=show tables;" -X POST 
> http://server:50111/templeton/v1/ddl
> results in (when Tez is enabled in Secure cluster)
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
> at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission 
> denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566

[jira] [Created] (HIVE-6857) Consolidate HiveServer2 threadlocals

2014-04-07 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6857:
--

 Summary: Consolidate HiveServer2 threadlocals
 Key: HIVE-6857
 URL: https://issues.apache.org/jira/browse/HIVE-6857
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


Check the discussion here: HIVE-6837



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6837) HiveServer2 thrift/http mode & binary mode proxy user check fails reporting IP null for client

2014-04-07 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962151#comment-13962151
 ] 

Vaibhav Gumashta commented on HIVE-6837:


[~thejas] Thanks for taking a look.

Sure, I'll do that. There's another issue I noticed in SessionManager#openSession 
as a result of this:
{code}
public SessionHandle openSession(TProtocolVersion protocol, String username, String password,
    Map<String, String> sessionConf, boolean withImpersonation, String delegationToken)
    throws HiveSQLException {
  HiveSession session;
  if (withImpersonation) {
    HiveSessionImplwithUGI hiveSessionUgi = new HiveSessionImplwithUGI(protocol, username,
        password, hiveConf, sessionConf, TSetIpAddressProcessor.getUserIpAddress(),
        delegationToken);
    session = HiveSessionProxy.getProxy(hiveSessionUgi, hiveSessionUgi.getSessionUgi());
    hiveSessionUgi.setProxySession(session);
  } else {
    session = new HiveSessionImpl(protocol, username, password, hiveConf, sessionConf,
        TSetIpAddressProcessor.getUserIpAddress());
  }
  session.setSessionManager(this);
  session.setOperationManager(operationManager);
  session.open();
  handleToSession.put(session.getSessionHandle(), session);

  try {
    executeSessionHooks(session);
  } catch (Exception e) {
    throw new HiveSQLException("Failed to execute session hooks", e);
  }
  return session.getSessionHandle();
}
{code}

Notice that if withImpersonation is set to true, we're using 
TSetIpAddressProcessor.getUserIpAddress() to get the IP address, which is wrong 
for a kerberized setup (it should use HiveAuthFactory#getIpAddress).
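
A minimal sketch of the distinction (illustrative only; the class wrapper is mine, and the two static calls just mirror the methods named above without verifying their exact signatures):

{code}
// Hedged sketch: choose the client-IP source based on the transport's auth mode.
class ClientIpSketch {
  static String resolveClientIpAddress(boolean kerberosAuthMode) {
    if (kerberosAuthMode) {
      // SASL/Kerberos transport: the auth layer tracks the peer address.
      return HiveAuthFactory.getIpAddress();
    }
    // Plain thrift binary/http transport: the processor records the peer address.
    return TSetIpAddressProcessor.getUserIpAddress();
  }
}
{code}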

Also, in the case of a kerberized setup, we're wrapping the transport in a doAs 
(with the UGI of the HiveServer2 process), which doesn't make sense to me: 
https://github.com/apache/hive/blob/trunk/shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java#L335.
 

> HiveServer2 thrift/http mode & binary mode proxy user check fails reporting 
> IP null for client
> --
>
> Key: HIVE-6837
> URL: https://issues.apache.org/jira/browse/HIVE-6837
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0
>Reporter: Dilli Arumugam
>Assignee: Vaibhav Gumashta
> Fix For: 0.13.0
>
> Attachments: HIVE-6837.1.patch, HIVE-6837.2.patch, HIVE-6837.3.patch, 
> hive.log
>
>
> Hive Server running thrift/http with Kerberos security.
> Kinited user knox attempting to proxy as sam.
> Beeline connection failed reporting error on hive server logs:
> Caused by: org.apache.hadoop.security.authorize.AuthorizationException: 
> Unauthorized connection for super-user: knox from IP null



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-2016) alter partition should throw exception if the specified partition does not exist.

2014-04-07 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam resolved HIVE-2016.


Resolution: Implemented

This is implemented in trunk.

> alter partition should throw exception if the specified partition does not 
> exist. 
> --
>
> Key: HIVE-2016
> URL: https://issues.apache.org/jira/browse/HIVE-2016
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.8.0
> Environment: Hadoop 0.20.1, hive-0.8.0-SNAPSHOT and SUSE Linux 
> Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>
> To reproduce the issue follow the below steps
> {noformat}
>  set hive.exec.drop.ignorenonexistent=false;
>  create table page_test(view INT, userid INT, page_url STRING) PARTITIONED 
> BY(dt STRING, country STRING) STORED AS TEXTFILE;
>  LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
> PARTITION(dt='10-10-2010',country='US');
>  LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
> PARTITION(dt='10-12-2010',country='IN');
> {noformat}
> {noformat}
>  ALTER TABLE page_test DROP PARTITION (dt='23-02-2010',country='UK');
> {noformat}
>  This query should throw an exception because the requested partition doesn't 
> exist.
>  This issue is related to HIVE-1535.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6856:
-

Description: 
curl -u : --negotiate -d "exec=show tables;" -X POST 
http://server:50111/templeton/v1/ddl

results in (when Tez is enabled in Secure cluster)
{noformat}
Exception in thread "main" java.lang.RuntimeException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.security.AccessControlException: Permission 
denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 

[jira] [Created] (HIVE-6856) ddl commands fail with permissions issue when running using webhcat in secure Tez cluster

2014-04-07 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6856:


 Summary: ddl commands fail with permissions issue when running 
using webhcat in secure Tez cluster
 Key: HIVE-6856
 URL: https://issues.apache.org/jira/browse/HIVE-6856
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


curl -u : --negotiate -d "exec=show tables;" -X POST 
http://server:50111/templeton/v1/ddl

results in (when Tez is enabled in Secure cluster)

Exception in thread "main" java.lang.RuntimeException: 
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at 
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:354)
at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.security.AccessControlException: Permission 
denied: user=hrt_qa, access=WRITE, inode="/user/hcat":hcat:hcat:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:232)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:176)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5497)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5479)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:5453)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3596)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3566)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$Cli

[jira] [Updated] (HIVE-1996) "LOAD DATA INPATH" fails when the table already contains a file of the same name

2014-04-07 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-1996:
---

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

This issue is solved as part of HIVE-3300

> "LOAD DATA INPATH" fails when the table already contains a file of the same 
> name
> 
>
> Key: HIVE-1996
> URL: https://issues.apache.org/jira/browse/HIVE-1996
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0, 0.8.1
>Reporter: Kirk True
>Assignee: Chinna Rao Lalam
> Attachments: HIVE-1996.1.Patch, HIVE-1996.2.Patch, HIVE-1996.Patch
>
>
> Steps:
> 1. From the command line copy the kv2.txt data file into the current user's 
> HDFS directory:
> {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
> kv2.txt}}
> 2. In Hive, create the table:
> {{create table tst_src1 (key_ int, value_ string);}}
> 3. Load the data into the table from HDFS:
> {{load data inpath './kv2.txt' into table tst_src1;}}
> 4. Repeat step 1
> 5. Repeat step 3
> Expected:
> To have kv2.txt renamed in HDFS and then copied to the destination as per 
> HIVE-307.
> Actual:
> File is renamed, but {{Hive.copyFiles}} doesn't "see" the change in {{srcs}} 
> as it continues to use the same array elements (with the un-renamed, old file 
> names). It crashes with this error:
> {noformat}
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
> at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
> at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> {noformat}
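
The failure pattern described above, reduced to a hedged sketch (a hypothetical shape, not Hive's actual copyFiles):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFilesSketch {
  static void copyFiles(FileSystem fs, Path srcDir, Path destDir) throws IOException {
    // Listed once up front; never refreshed.
    FileStatus[] srcs = fs.listStatus(srcDir);
    for (FileStatus src : srcs) {
      Path target = new Path(destDir, src.getPath().getName());
      if (fs.exists(target)) {
        // The clashing source gets renamed here (per HIVE-307), but 'src' still
        // holds the pre-rename FileStatus. Unless 'srcs' is re-listed or the
        // entry is updated, the copy below dereferences a path that no longer
        // exists, matching the crash in the stack trace above.
      }
      // copy src.getPath() -> target
    }
  }
}
{code}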



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6841) Vectorized execution throws NPE for partitioning columns with __HIVE_DEFAULT_PARTITION__

2014-04-07 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6841:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

The failed tests are not related to the patch and passed when run locally.

Committed to trunk and branch-0.13.

> Vectorized execution throws NPE for partitioning columns with 
> __HIVE_DEFAULT_PARTITION__
> 
>
> Key: HIVE-6841
> URL: https://issues.apache.org/jira/browse/HIVE-6841
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Critical
> Fix For: 0.13.0
>
> Attachments: HIVE-6841.1.patch, HIVE-6841.2.patch, HIVE-6841.3.patch
>
>
> If partitioning columns have __HIVE_DEFAULT_PARTITION__ or null, vectorized 
> execution throws NPE.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6841) Vectorized execution throws NPE for partitioning columns with __HIVE_DEFAULT_PARTITION__

2014-04-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962126#comment-13962126
 ] 

Harish Butani commented on HIVE-6841:
-

+1 for 0.13

> Vectorized execution throws NPE for partitioning columns with 
> __HIVE_DEFAULT_PARTITION__
> 
>
> Key: HIVE-6841
> URL: https://issues.apache.org/jira/browse/HIVE-6841
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Critical
> Attachments: HIVE-6841.1.patch, HIVE-6841.2.patch, HIVE-6841.3.patch
>
>
> If partitioning columns have __HIVE_DEFAULT_PARTITION__ or null, vectorized 
> execution throws NPE.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962115#comment-13962115
 ] 

Lefty Leverenz commented on HIVE-6846:
--

What documentation does this need?

> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.
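
One way to read "safe set commands" is a whitelist check before applying the command, sketched below (hedged; the SafeSetSketch class and the parameter names are examples, not the actual list or implementation):

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SafeSetSketch {
  // Example whitelist; the real set of safe parameters would be curated.
  private static final Set<String> SAFE_SET_PARAMS = new HashSet<>(Arrays.asList(
      "hive.exec.reducers.bytes.per.reducer",
      "mapreduce.job.queuename"));

  static void checkSetCommand(String param) {
    if (!SAFE_SET_PARAMS.contains(param)) {
      throw new SecurityException(
          "set " + param + " is disabled when SQL standard authorization is on");
    }
  }
}
{code}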



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962113#comment-13962113
 ] 

Alan Gates commented on HIVE-6855:
--

The transaction tables aren't currently in  hive-schema-0.14.0.mysql.sql 
because we wanted to figure out a better method than adding them by hand for 
0.14.

> A couple of errors in MySQL db creation script for transaction tables
> -
>
> Key: HIVE-6855
> URL: https://issues.apache.org/jira/browse/HIVE-6855
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-6855.patch
>
>
> There are a few small issues in the database creation scripts for mysql.  A 
> couple of the tables don't set the engine to InnoDB.  None of the tables set 
> default character set to latin1.  And the syntax "CREATE INDEX...USING HASH" 
> doesn't work on older versions of MySQL.  Instead the index creation should 
> be done without specifying a method (no USING clause).
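
The corrected DDL shape described above, as a hedged sketch (hypothetical table and index names, executed here over JDBC):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class TxnSchemaSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:mysql://localhost/metastore", "hive", "hive");
         Statement stmt = conn.createStatement()) {
      // Pin the engine to InnoDB and the default charset to latin1 explicitly.
      stmt.execute("CREATE TABLE EXAMPLE_TXNS ("
          + " TXN_ID bigint PRIMARY KEY,"
          + " TXN_STATE char(1) NOT NULL"
          + ") ENGINE=InnoDB DEFAULT CHARSET=latin1");
      // Portable across older MySQL versions: no USING clause on the index.
      stmt.execute("CREATE INDEX EXAMPLE_TXNS_IDX ON EXAMPLE_TXNS (TXN_STATE)");
    }
  }
}
{code}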



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6841) Vectorized execution throws NPE for partitioning columns with __HIVE_DEFAULT_PARTITION__

2014-04-07 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962108#comment-13962108
 ] 

Jitendra Nath Pandey commented on HIVE-6841:


[~rhbutani] This is a critical issue in hive-0.13 and fails many queries on 
partitioned tables in vectorized execution. It should be fixed in branch-0.13 
as well.

> Vectorized execution throws NPE for partitioning columns with 
> __HIVE_DEFAULT_PARTITION__
> 
>
> Key: HIVE-6841
> URL: https://issues.apache.org/jira/browse/HIVE-6841
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Critical
> Attachments: HIVE-6841.1.patch, HIVE-6841.2.patch, HIVE-6841.3.patch
>
>
> If partitioning columns have __HIVE_DEFAULT_PARTITION__ or null, vectorized 
> execution throws NPE.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6846) allow safe set commands with sql standard authorization

2014-04-07 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6846:


Attachment: HIVE-6846.2.patch

Fixed test failures and added another JDBC test.


> allow safe set commands with sql standard authorization
> ---
>
> Key: HIVE-6846
> URL: https://issues.apache.org/jira/browse/HIVE-6846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.13.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.13.0
>
> Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch
>
>
> HIVE-6827 disables all set commands when SQL standard authorization is turned 
> on, but not all set commands are unsafe. We should allow safe set commands.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6739) Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits

2014-04-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6739:
---

Attachment: HIVE-6739.04.patch

The previous patch was incomplete; updating. The tests that failed on 02 pass 
locally for me, which stands to reason, as only the Tez path is changed here. 
This patch does not need Tez 0.4.

> Hive HBase query fails on Tez due to missing jars and then due to NPE in 
> getSplits
> --
>
> Key: HIVE-6739
> URL: https://issues.apache.org/jira/browse/HIVE-6739
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.13.0
>
> Attachments: HIVE-6739.01.patch, HIVE-6739.02.patch, 
> HIVE-6739.03.patch, HIVE-6739.04.patch, HIVE-6739.patch, 
> HIVE-6739.preliminary.patch
>
>
> Tez paths in Hive never call configure on the input/output operators, so 
> (among other things, potentially) requisite files never get added to the job
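
A hedged illustration of the missing step (JobConfigurable is Hadoop's standard hook; the handler list and call site here are hypothetical):

{code}
import java.util.List;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobConfigurable;

class TezConfigureSketch {
  // Before launching, give each input/output handler a chance to register its
  // resources (extra jars, files) on the JobConf, as the MR path does.
  static void configureAll(List<Object> ioHandlers, JobConf job) {
    for (Object handler : ioHandlers) {
      if (handler instanceof JobConfigurable) {
        ((JobConfigurable) handler).configure(job);
      }
    }
  }
}
{code}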



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6739) Hive HBase query fails on Tez due to missing jars and then due to NPE in getSplits

2014-04-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962087#comment-13962087
 ] 

Sergey Shelukhin commented on HIVE-6739:


Will commit later today

> Hive HBase query fails on Tez due to missing jars and then due to NPE in 
> getSplits
> --
>
> Key: HIVE-6739
> URL: https://issues.apache.org/jira/browse/HIVE-6739
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.13.0
>
> Attachments: HIVE-6739.01.patch, HIVE-6739.02.patch, 
> HIVE-6739.03.patch, HIVE-6739.04.patch, HIVE-6739.patch, 
> HIVE-6739.preliminary.patch
>
>
> Tez paths in Hive never call configure on the input/output operators, so 
> (among other things, potentially) requisite files never get added to the job



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962077#comment-13962077
 ] 

Ashutosh Chauhan commented on HIVE-6855:


Seems like we also need to update hive-schema-0.14.0.mysql.sql

> A couple of errors in MySQL db creation script for transaction tables
> -
>
> Key: HIVE-6855
> URL: https://issues.apache.org/jira/browse/HIVE-6855
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-6855.patch
>
>
> There are a few small issues in the database creation scripts for mysql.  A 
> couple of the tables don't set the engine to InnoDB.  None of the tables set 
> default character set to latin1.  And the syntax "CREATE INDEX...USING HASH" 
> doesn't work on older versions of MySQL.  Instead the index creation should 
> be done without specifying a method (no USING clause).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6855:
-

Attachment: HIVE-6855.patch

> A couple of errors in MySQL db creation script for transaction tables
> -
>
> Key: HIVE-6855
> URL: https://issues.apache.org/jira/browse/HIVE-6855
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-6855.patch
>
>
> There are a few small issues in the database creation scripts for mysql.  A 
> couple of the tables don't set the engine to InnoDB.  None of the tables set 
> default character set to latin1.  And the syntax "CREATE INDEX...USING HASH" 
> doesn't work on older versions of MySQL.  Instead the index creation should 
> be done without specifying a method (no USING clause).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 20051: HIVE-4904: A little more CP crossing RS boundaries

2014-04-07 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20051/#review39706
---


I took a look at this change; my knowledge of the Hive code is rather limited.
1. The Column Pruner doesn't cross the Script operator boundary. Theoretically you 
could prune above and below the script op separately.
2. It seems the column pruner assumes that the parent of a UDTF is always a Select, 
but we haven't formalized this assumption. Other processors should throw an 
exception if they ever come across a child that is a UDTF. Theoretically you can 
push certain filters down below builtin UDTFs. We may not be doing that today.
3. In the Select Pruner it seems like there is no difference between 'prunedCols' 
and 'columns'.

- John Pullokkaran


On April 6, 2014, 1:33 a.m., Harish Butani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20051/
> ---
> 
> (Updated April 6, 2014, 1:33 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Navis Ryu.
> 
> 
> Bugs: HIVE-4904
> https://issues.apache.org/jira/browse/HIVE-4904
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently, CP context cannot be propagated over RS except for JOIN/EXT. A 
> little more CP is possible.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java 58a9b59 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcCtx.java 
> db36151 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 
> 0690fb7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 3f16dc2 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java
>  94224b3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b33dc2 
>   ql/src/test/queries/clientpositive/order_within_subquery.q PRE-CREATION 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 1e982e6 
>   ql/src/test/results/clientpositive/auto_join18.q.out b8677f4 
>   ql/src/test/results/clientpositive/auto_join27.q.out a576190 
>   ql/src/test/results/clientpositive/auto_join30.q.out 8709198 
>   ql/src/test/results/clientpositive/auto_join31.q.out 1936e45 
>   ql/src/test/results/clientpositive/auto_join32.q.out 05f53e6 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_10.q.out 8882aac 
>   ql/src/test/results/clientpositive/count.q.out eb048b6 
>   ql/src/test/results/clientpositive/distinct_stats.q.out f715ea3 
>   ql/src/test/results/clientpositive/groupby2_map.q.out 291f196 
>   ql/src/test/results/clientpositive/groupby2_map_skew.q.out d005b6c 
>   ql/src/test/results/clientpositive/groupby3_map.q.out 1dfee08 
>   ql/src/test/results/clientpositive/groupby3_map_skew.q.out 7af59bc 
>   ql/src/test/results/clientpositive/groupby_cube1.q.out 92d81f4 
>   ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out b405978 
>   ql/src/test/results/clientpositive/groupby_map_ppr.q.out 27eff75 
>   
> ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out 
> ad76252 
>   ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out 
> 51a70c4 
>   ql/src/test/results/clientpositive/groupby_position.q.out 727bccb 
>   ql/src/test/results/clientpositive/groupby_rollup1.q.out 36bf966 
>   ql/src/test/results/clientpositive/groupby_sort_11.q.out 8ee7571 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out a27cfaa 
>   ql/src/test/results/clientpositive/join18.q.out 7975c79 
>   ql/src/test/results/clientpositive/limit_pushdown.q.out 9c93ada 
>   ql/src/test/results/clientpositive/limit_pushdown_negative.q.out 115b171 
>   ql/src/test/results/clientpositive/metadataonly1.q.out 917efdf 
>   ql/src/test/results/clientpositive/multi_insert_gby2.q.out ab758cb 
>   ql/src/test/results/clientpositive/multi_insert_gby3.q.out 23ccebb 
>   ql/src/test/results/clientpositive/multi_insert_lateral_view.q.out 35e70b4 
>   ql/src/test/results/clientpositive/nullgroup.q.out 2ac7dea 
>   ql/src/test/results/clientpositive/nullgroup2.q.out cf31dc1 
>   ql/src/test/results/clientpositive/nullgroup4.q.out feae138 
>   ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out 2ee357f 
>   ql/src/test/results/clientpositive/order_within_subquery.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/reduce_deduplicate_extended.q.out 
> 9c6d14e 
>   ql/src/test/results/clientpositive/udf_count.q.out fb45708 
>   ql/src/test/results/clientpositive/union11.q.out f226f35 
>   ql/src/test/results/clientpositive/union14.q.out a6d349b 
>   ql/src/test/results/clientpositive/union15.q.out 88c9553 
>   ql/src/test/results/clientpositive/union16.q.out 2bd8d5e 
>   ql/src/test/results/clientpositive/union2.q.out 0fac9d9 
>   

[jira] [Updated] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6855:
-

Status: Patch Available  (was: Open)

NO PRECOMMIT TESTS 

Updated the MySQL scripts.

> A couple of errors in MySQL db creation script for transaction tables
> -
>
> Key: HIVE-6855
> URL: https://issues.apache.org/jira/browse/HIVE-6855
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.0
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-6855.patch
>
>
> There are a few small issues in the database creation scripts for mysql.  A 
> couple of the tables don't set the engine to InnoDB.  None of the tables set 
> default character set to latin1.  And the syntax "CREATE INDEX...USING HASH" 
> doesn't work on older versions of MySQL.  Instead the index creation should 
> be done without specifying a method (no USING clause).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6855) A couple of errors in MySQL db creation script for transaction tables

2014-04-07 Thread Alan Gates (JIRA)
Alan Gates created HIVE-6855:


 Summary: A couple of errors in MySQL db creation script for 
transaction tables
 Key: HIVE-6855
 URL: https://issues.apache.org/jira/browse/HIVE-6855
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates


There are a few small issues in the database creation scripts for mysql.  A 
couple of the tables don't set the engine to InnoDB.  None of the tables set 
default character set to latin1.  And the syntax "CREATE INDEX...USING HASH" 
doesn't work on older versions of MySQL.  Instead the index creation should be 
done without specifying a method (no USING clause).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6848) Importing into an existing table fails

2014-04-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962030#comment-13962030
 ] 

Harish Butani commented on HIVE-6848:
-

Sure, added a new JIRA for this: HIVE-6854.

> Importing into an existing table fails
> --
>
> Key: HIVE-6848
> URL: https://issues.apache.org/jira/browse/HIVE-6848
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Arpit Gupta
>Assignee: Harish Butani
> Fix For: 0.13.0
>
> Attachments: HIVE-6848.1.patch
>
>
> This is because ImportSemanticAnalyzer:checkTable doesn't account for the 
> renaming of OutputFormat class and the setting of a default value for 
> Serialization.Format
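
A tolerant comparison along the lines described, as a hedged sketch (the old-to-new class-name mapping shown is an assumption about the rename, not a verified list):

{code}
import java.util.HashMap;
import java.util.Map;

class ImportCheckSketch {
  private static final Map<String, String> RENAMED_OUTPUT_FORMATS = new HashMap<>();
  static {
    RENAMED_OUTPUT_FORMATS.put(
        "org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat",
        "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat");
  }

  // Treat a renamed OutputFormat class as equal to its new name when comparing
  // the imported table's metadata against the existing table's.
  static boolean outputFormatMatches(String imported, String existing) {
    return normalize(imported).equals(normalize(existing));
  }

  private static String normalize(String className) {
    return RENAMED_OUTPUT_FORMATS.getOrDefault(className, className);
  }
}
{code}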



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6854) Add unit test for Reimport use case

2014-04-07 Thread Harish Butani (JIRA)
Harish Butani created HIVE-6854:
---

 Summary: Add unit test for Reimport use case
 Key: HIVE-6854
 URL: https://issues.apache.org/jira/browse/HIVE-6854
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani


As a follow-up to HIVE-6848.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

