[jira] [Commented] (HIVE-8045) SQL standard auth with cli - Errors and configuration issues

2014-09-19 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140115#comment-14140115
 ] 

Jason Dere commented on HIVE-8045:
--

+1

> SQL standard auth with cli - Errors and configuration issues
> 
>
> Key: HIVE-8045
> URL: https://issues.apache.org/jira/browse/HIVE-8045
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Jagruti Varia
>Assignee: Thejas M Nair
> Attachments: HIVE-8045.1.patch, HIVE-8045.2.patch, HIVE-8045.3.patch
>
>
> HIVE-7533 enabled sql std authorization to be set in hive cli (without 
> enabling authorization checks). This updates hive configuration so that 
> create-table and create-views set permissions appropriately for the owner of 
> the table.
> HIVE-7209 added a metastore authorization provider that can be used to 
> restrict calls made to the authorization api, so that only HS2 can make 
> those calls (when HS2 uses embedded metastore).
> Some issues were found with this.
> # Even if hive.security.authorization.enabled=false, authorization checks 
> were happening for non-sql statements such as add/delete/dfs/compile, which 
> results in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing an error.
> # Create table from hive-cli ended up making a metastore server api call 
> (getRoles), which resulted in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing 
> an error.
> # Some users prefer to enable authorization using hive-site.xml for 
> hive-server2 (hive.security.authorization.enabled param). If this file is 
> shared by hive-cli and hive-server2, SQL std authorizer throws an error 
> because its use in hive-cli is not allowed.
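
A minimal sketch of the configuration split the last point suggests: keep the 
enforcement settings in an HS2-only file (for example hiveserver2-site.xml) 
rather than in the shared hive-site.xml. The fragment below is illustrative 
only and is not taken from the patch.

```xml
<!-- Illustrative hiveserver2-site.xml fragment (not from the patch):
     enable enforcement only where HS2 reads its config, so a shared
     hive-site.xml no longer makes hive-cli trip the SQL std authorizer. -->
<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
  </property>
</configuration>
```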



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7984) AccumuloOutputFormat Configuration items from StorageHandler not re-set in Configuration in Tez

2014-09-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7984:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~elserj]!

> AccumuloOutputFormat Configuration items from StorageHandler not re-set in 
> Configuration in Tez
> ---
>
> Key: HIVE-7984
> URL: https://issues.apache.org/jira/browse/HIVE-7984
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler, Tez
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 0.14.0
>
> Attachments: HIVE-7984-1.diff, HIVE-7984-1.patch, HIVE-7984.1.patch
>
>
> Ran AccumuloStorageHandler queries with Tez and found that configuration 
> elements that are pulled from the {{-hiveconf}} and passed to the 
> inputJobProperties or outputJobProperties by the AccumuloStorageHandler 
> aren't available inside the Tez container.
> I'm guessing there is a disconnect between the configuration that the 
> StorageHandler creates and what the Tez container sees.
> The HBaseStorageHandler likely doesn't run into this because it expects to 
> have hbase-site.xml available via tmpjars (and can extrapolate connection 
> information from that file). Accumulo's site configuration file is not meant 
> to be shared with consumers, which means that this exact approach is not 
> sufficient.





[jira] [Commented] (HIVE-7946) CBO: Merge CBO changes to Trunk

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140144#comment-14140144
 ] 

Hive QA commented on HIVE-7946:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669918/HIVE-7946.13.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 6295 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_if
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/871/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/871/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-871/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669918

> CBO: Merge CBO changes to Trunk
> ---
>
> Key: HIVE-7946
> URL: https://issues.apache.org/jira/browse/HIVE-7946
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7946.1.patch, HIVE-7946.10.patch, 
> HIVE-7946.11.patch, HIVE-7946.12.patch, HIVE-7946.13.patch, 
> HIVE-7946.2.patch, HIVE-7946.3.patch, HIVE-7946.4.patch, HIVE-7946.5.patch, 
> HIVE-7946.6.patch, HIVE-7946.7.patch, HIVE-7946.8.patch, HIVE-7946.9.patch, 
> HIVE-7946.patch
>
>






[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140152#comment-14140152
 ] 

Vaibhav Gumashta commented on HIVE-8138:


[~dongc] I fixed the param naming as part of HIVE-7935. 

> Global Init file should allow specifying file name, not only directory
> --
>
> Key: HIVE-8138
> URL: https://issues.apache.org/jira/browse/HIVE-8138
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8138.patch
>
>
> HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
> However, since .hiverc is a hidden file, this can be confusing. The property 
> should allow a path to either a file or a directory.
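
The resolution rule the description asks for can be sketched in a few lines. 
Class and method names below are illustrative, not Hive's actual code; the 
directory check is passed in explicitly to keep the sketch self-contained.

```java
import java.io.File;

public class GlobalInitResolver {
    // Hypothetical sketch: treat the configured value as either a directory
    // (append the hidden .hiverc name) or a direct path to the init file.
    static String resolve(String configured, boolean isDirectory) {
        return isDirectory ? configured + File.separator + ".hiverc"
                           : configured;
    }
}
```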





[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140154#comment-14140154
 ] 

Prasanth J commented on HIVE-8188:
--

I think it's because hash-aggregation needs to estimate the size of the hash 
map. The values of the hash map are UDAFs whose aggregation buffer size can be 
estimated if the aggregation buffer has the annotation 
"@AggregationType(estimable = true)". GroupByOperator.shouldBeFlushed() is 
called for every row that is added to the hash map. shouldBeFlushed() calls 
the isEstimable() helper function, which uses reflection every time to see if 
the aggregation function is estimable. Not sure why it is done this way, but 
yes, this will be slow as hell. This needs to be fixed.
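
The fix hinted at above, caching the reflective annotation lookup per class 
instead of repeating it per row, can be sketched as below. The annotation and 
class names here are stand-ins, not Hive's actual types (the real annotation 
lives in GenericUDAFEvaluator).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

public class EstimableCache {

    // Stand-in for Hive's AggregationType annotation; illustrative only.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface AggregationType {
        boolean estimable() default false;
    }

    @AggregationType(estimable = true)
    static class EstimableBuffer {}

    static class PlainBuffer {}

    // The slow pattern: a reflective annotation lookup on every row.
    static boolean isEstimableSlow(Object buf) {
        AggregationType a = buf.getClass().getAnnotation(AggregationType.class);
        return a != null && a.estimable();
    }

    // The fix: resolve the annotation once per class and reuse the cached
    // answer inside the per-row loop.
    private static final Map<Class<?>, Boolean> CACHE = new HashMap<>();

    static boolean isEstimableCached(Object buf) {
        return CACHE.computeIfAbsent(buf.getClass(), c -> {
            AggregationType a = c.getAnnotation(AggregationType.class);
            return a != null && a.estimable();
        });
    }
}
```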

> ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
> loop
> -
>
> Key: HIVE-8188
> URL: https://issues.apache.org/jira/browse/HIVE-8188
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.14.0
>Reporter: Gopal V
> Attachments: udf-deterministic.png
>
>
> When running a near-constant UDF, most of the CPU is burnt within the VM 
> trying to read the class annotations for every row.
> !udf-deterministic.png!





[jira] [Commented] (HIVE-7980) Hive on spark issue..

2014-09-19 Thread alton.jung (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140164#comment-14140164
 ] 

alton.jung commented on HIVE-7980:
--

Thanks for that.

I got really confused about the current version (hive on spark).
I succeeded with the query through the hive cli, but when I tested it with 
beeline or jdbc I always met an error.
I wonder whether the current version can support queries via jdbc and beeline.
[Error]
java.lang.NullPointerException
at 
org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
at 
org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
at 
org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
at 
org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
at 
org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:76)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

> Hive on spark issue..
> -
>
> Key: HIVE-7980
> URL: https://issues.apache.org/jira/browse/HIVE-7980
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Spark
>Affects Versions: spark-branch
> Environment: Test Environment is..
> . hive 0.14.0(spark branch version)
> . spark 
> (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
> . hadoop 2.4.0 (yarn)
>Reporter: alton.jung
>Assignee: Chao
> Fix For: spark-branch
>
>
> I followed this 
> guide(https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) 
> and compiled hive from the spark branch. In the next step I met the below 
> error.
> (*I typed the hive query on beeline; I used a simple query with "order 
> by" to invoke the parallel work, 
>ex) select * from test where id = 1 order by id;
> )
> [Error list is]
> 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
> (SparkClient.java:execute(158)) - Error generating Spark Plan
> java.lang.NullPointerException
>   at 
> org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
>   at 
> org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
>   at 
> org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824527954 end=1409824691182 duration=163228 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824691223 end=1409824691224 duration=1 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,306 ERROR operation.Operation 
> (SQLOperation.java:run(199)) - Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
>   at 
> o

[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Status: Patch Available  (was: Open)

> Fetch task conversion: Remove some dependencies on AST
> --
>
> Key: HIVE-8179
> URL: https://issues.apache.org/jira/browse/HIVE-8179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch
>
>
> fetch task conversion does some strange things:
> For instance: select * from (select * from x) t won't get converted even 
> though it's the exact same operator plan as: select * from x.
> Or: select * from foo will get converted with minimal, but select <columns 
> of foo> from foo won't.
> We also check the AST for group by etc., but then do the same thing in the 
> operator tree again.
> I'm also wondering why we ship with "more" as default, but test with 
> "minimal" in the unit tests.





[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Status: Open  (was: Patch Available)

> Fetch task conversion: Remove some dependencies on AST
> --
>
> Key: HIVE-8179
> URL: https://issues.apache.org/jira/browse/HIVE-8179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch
>
>
> fetch task conversion does some strange things:
> For instance: select * from (select * from x) t won't get converted even 
> though it's the exact same operator plan as: select * from x.
> Or: select * from foo will get converted with minimal, but select <columns 
> of foo> from foo won't.
> We also check the AST for group by etc., but then do the same thing in the 
> operator tree again.
> I'm also wondering why we ship with "more" as default, but test with 
> "minimal" in the unit tests.





[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Attachment: HIVE-8179.2.patch

> Fetch task conversion: Remove some dependencies on AST
> --
>
> Key: HIVE-8179
> URL: https://issues.apache.org/jira/browse/HIVE-8179
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch
>
>
> fetch task conversion does some strange things:
> For instance: select * from (select * from x) t won't get converted even 
> though it's the exact same operator plan as: select * from x.
> Or: select * from foo will get converted with minimal, but select <columns 
> of foo> from foo won't.
> We also check the AST for group by etc., but then do the same thing in the 
> operator tree again.
> I'm also wondering why we ship with "more" as default, but test with 
> "minimal" in the unit tests.
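
The conversion level the description refers to is controlled by 
hive.fetch.task.conversion; a quick session sketch, assuming the usual 
none/minimal/more value set of this time frame:

```sql
-- Sketch: switching fetch task conversion levels in a Hive session.
set hive.fetch.task.conversion=more;
select * from foo;                   -- eligible for conversion under "minimal"
select * from (select * from x) t;   -- the case the issue says is missed
```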





[jira] [Commented] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140209#comment-14140209
 ] 

Hive QA commented on HIVE-7482:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669838/HIVE-7482.5.patch

{color:red}ERROR:{color} -1 due to 68 failed/errored test(s), 6310 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketizedhiveinputformat_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_mine
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_bucketmapjoin1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testEqualOpClass
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testGreaterThanOpClass
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testGreaterThanOrEqualOpClass
org.apache.hadoop.hive.ql.io.TestHive

Re: Review Request 25575: HIVE-7615: Beeline should have an option for user to see the query progress

2014-09-19 Thread Dong Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25575/
---

(Updated Sept. 19, 2014, 9:22 a.m.)


Review request for hive.


Changes
---

Update the patch based on comments. Mainly reduce the public API exposed by 
HiveStatement to a minimum, so the QueryState is removed.


Repository: hive-git


Description
---

When executing a query in Beeline, the user should have an option to see the 
progress through the output. Beeline could use the API introduced in HIVE-4629 
to get and display the logs to the client.


Diffs (updated)
-

  beeline/pom.xml 45fa02b 
  beeline/src/java/org/apache/hive/beeline/Commands.java a92d69f 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 
1e66542 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
daf8e9e 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 2cbf58c 

Diff: https://reviews.apache.org/r/25575/diff/


Testing
---

UT passed.


Thanks,

Dong Chen



[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140232#comment-14140232
 ] 

Prasanth J commented on HIVE-8188:
--

I tried to avoid repeating this reflection invocation in the inner loop by 
computing the total aggregation size once and reusing it in the inner loop. I 
ran the following query
{code}
select ss_quantity, ss_store_sk, ss_promo_sk, count(ss_list_price), 
count(ss_sales_price), sum(ss_ext_sales_price) from store_sales_orc group by 
ss_quantity,ss_store_sk,ss_promo_sk;
{code}

store_sales had 2880404 rows. The original execution time was 18.5s, and with 
the above changes the time went down to 15.5s, a ~17% gain, which is 
consistent with the reflection cost shown in the attached image.

> ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
> loop
> -
>
> Key: HIVE-8188
> URL: https://issues.apache.org/jira/browse/HIVE-8188
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.14.0
>Reporter: Gopal V
> Attachments: udf-deterministic.png
>
>
> When running a near-constant UDF, most of the CPU is burnt within the VM 
> trying to read the class annotations for every row.
> !udf-deterministic.png!





[jira] [Updated] (HIVE-7615) Beeline should have an option for user to see the query progress

2014-09-19 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-7615:

Attachment: HIVE-7615.2.patch

Hi [~thejas], I have updated the patch based on your comments. I agree that we 
should minimize the exposed public API to avoid confusion for users. This is a 
valuable comment that makes the interface design better.

In the patch, I removed the QueryState and used a private boolean 
isExecuteStatementFailed to let the getQueryLog() method specify the thrown 
exceptions. The caller can know exactly what happened from the exceptions.
The reason I did not use a boolean isRunning is that, when it is false, the 
not-running state can actually be divided into two states: not running before 
the statement is executed, and not running after the statement has executed 
successfully. If we used this boolean to control getQueryLog and made it fail 
when not running, a jdbc user might not get logs after the query is done.

Could you please take a look and see how that sounds? :)



> Beeline should have an option for user to see the query progress
> 
>
> Key: HIVE-7615
> URL: https://issues.apache.org/jira/browse/HIVE-7615
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Dong Chen
>Assignee: Dong Chen
> Attachments: HIVE-7615.1.patch, HIVE-7615.2.patch, HIVE-7615.patch, 
> complete_logs, simple_logs
>
>
> When executing query in Beeline, user should have a option to see the 
> progress through the outputs.
> Beeline could use the API introduced in HIVE-4629 to get and display the logs 
> to the client.
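
The client-side polling loop the description implies can be sketched against a 
hypothetical minimal statement interface; the real HiveStatement API from the 
patch may differ in names and signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class LogPoller {
    // Hypothetical minimal view of the statement being polled;
    // the real HiveStatement method names may differ.
    interface QueryLogSource {
        boolean isRunning();
        List<String> fetchNewLogLines();
    }

    // Drain incremental logs while the query runs, then drain once more so
    // lines emitted just before completion are not lost.
    static List<String> collect(QueryLogSource src) {
        List<String> out = new ArrayList<>();
        while (src.isRunning()) {
            out.addAll(src.fetchNewLogLines());
        }
        out.addAll(src.fetchNewLogLines());
        return out;
    }
}
```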





[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140286#comment-14140286
 ] 

Damien Carol commented on HIVE-8185:


I wonder if this bug is only present in the CBO branch.
I use trunk and do not hit this bug, but when I'm using the CBO branch the 
metastore, hiveserver2, and beeline all throw this error.

> hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
> build
> ---
>
> Key: HIVE-8185
> URL: https://issues.apache.org/jira/browse/HIVE-8185
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Priority: Critical
> Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch
>
>
> In the current build, running
> {code}
> jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
> Jar verification failed.
> {code}
> unless that jar is removed from the lib dir, all hive queries throw the 
> following error 
> {code}
> Exception in thread "main" java.lang.SecurityException: Invalid signature 
> file digest for Manifest main attributes
>   at 
> sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
>   at 
> sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
>   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
>   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
>   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
>   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
>   at 
> sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
> {code}
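
One common cause of this symptom, offered here only as a hypothesis, is a 
shaded/standalone jar that bundles dependencies from signed jars, leaving 
stale META-INF signature files behind. A typical maven-shade-plugin filter 
that strips them looks like:

```xml
<!-- Hypothetical shade-plugin filter: drop signature files carried over
     from signed dependency jars so the merged jar verifies cleanly. -->
<filter>
  <artifact>*:*</artifact>
  <excludes>
    <exclude>META-INF/*.SF</exclude>
    <exclude>META-INF/*.DSA</exclude>
    <exclude>META-INF/*.RSA</exclude>
  </excludes>
</filter>
```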





[jira] [Commented] (HIVE-7974) Notification Event Listener movement to a new top level repl/ module

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140297#comment-14140297
 ] 

Hive QA commented on HIVE-7974:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669536/HIVE-7974.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/873/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/873/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-873/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669536

> Notification Event Listener movement to a new top level repl/ module
> 
>
> Key: HIVE-7974
> URL: https://issues.apache.org/jira/browse/HIVE-7974
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-7974.patch
>
>
> We need to create a new hive module (say hive-repl? ) to subsume the 
> NotificationListener from HCatalog.





[jira] [Commented] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140399#comment-14140399
 ] 

Hive QA commented on HIVE-8184:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669842/HIVE-8184.1.patch

{color:red}ERROR:{color} -1 due to 471 failed/errored test(s), 6294 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_queries
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_array_map_access_nonconstant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_nested_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_insert_outputformat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliD

[jira] [Commented] (HIVE-6799) HiveServer2 needs to map kerberos name to local name before proxy check

2014-09-19 Thread LINTE (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140444#comment-14140444
 ] 

LINTE commented on HIVE-6799:
-

I commented out hive.metastore.uris in hive-site.xml and then restarted 
hiveserver2 with an embedded metastore and a local Derby database.

I see many exceptions for each Hive request from Knox, but it works.







> HiveServer2 needs to map kerberos name to local name before proxy check
> ---
>
> Key: HIVE-6799
> URL: https://issues.apache.org/jira/browse/HIVE-6799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 0.13.1
>Reporter: Dilli Arumugam
>Assignee: Dilli Arumugam
> Fix For: 0.14.0
>
> Attachments: HIVE-6799.1.patch, HIVE-6799.2.patch, HIVE-6799.patch
>
>
> HiveServer2 does not map kerberos name of authenticated principal to local 
> name.
> Due to this, I get error like the following in HiveServer log:
> Failed to validate proxy privilage of knox/hdps.example.com for sam
> I have KINITED as knox/hdps.example@example.com
> I do have the following in core-site.xml
>   <property>
>     <name>hadoop.proxyuser.knox.groups</name>
>     <value>users</value>
>   </property>
>   <property>
>     <name>hadoop.proxyuser.knox.hosts</name>
>     <value>*</value>
>   </property>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
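The kerberos-to-local name mapping discussed above can be sketched in Python. This is a minimal illustration of the idea, not Hive's or Hadoop's actual implementation: it approximates only Hadoop's DEFAULT auth_to_local behavior by stripping the instance and realm components, so that knox/hdps.example.com@EXAMPLE.COM maps to the local name knox before the proxyuser check.

```python
def kerberos_short_name(principal: str) -> str:
    """Approximate the DEFAULT auth_to_local rule: drop the
    instance ("/host") and realm ("@REALM") components."""
    return principal.split("@", 1)[0].split("/", 1)[0]

# The principal from the report above should map to "knox",
# which is the name the hadoop.proxyuser.knox.* settings refer to.
print(kerberos_short_name("knox/hdps.example.com@EXAMPLE.COM"))  # knox
print(kerberos_short_name("sam@EXAMPLE.COM"))                    # sam
```

Real deployments can define custom auth_to_local rewrite rules, so this sketch only covers the default case.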


[jira] [Commented] (HIVE-8115) Hive select query hang when fields contain map

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140455#comment-14140455
 ] 

Hive QA commented on HIVE-8115:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669865/HIVE-8115.2.patch

{color:green}SUCCESS:{color} +1 6293 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/875/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/875/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-875/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669865

> Hive select query hang when fields contain map
> --
>
> Key: HIVE-8115
> URL: https://issues.apache.org/jira/browse/HIVE-8115
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-8115.1.patch, HIVE-8115.2.patch, createTable.hql, 
> data
>
>
> Attached is a repro of the issue. After creating a table and loading the 
> attached data, every Hive query against the table hangs, even a simple 
> select * from the table.
> Repro steps:
> 1. run createTable.hql
> 2. hadoop fs -put data /data
> 3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
> 4. SELECT * FROM testtable;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with "-e"

2014-09-19 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140486#comment-14140486
 ] 

Naveen Gangam commented on HIVE-7647:
-

[~xuefuz] I will attach a new patch today.

> Beeline does not honor --headerInterval and --color when executing with "-e"
> 
>
> Key: HIVE-7647
> URL: https://issues.apache.org/jira/browse/HIVE-7647
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7647.1.patch
>
>
> --showHeader is being honored
> [root@localhost ~]# beeline --showHeader=false -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> -hiveconf (No such file or directory)
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
> | 11-2021  | Marketing managers   | 166790 | 118160  |
> | 11-2022  | Sales managers   | 333910 | 110390  |
> | 11-2031  | Public relations managers| 51730  | 101220  |
> | 11-3011  | Administrative services managers | 246930 | 79500   |
> +--+--++-+
> 10 rows selected (0.838 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> --outputFormat is being honored.
> [root@localhost ~]# beeline --outputFormat=csv -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 'code','description','total_emp','salary'
> '00-','All Occupations','135185230','42270'
> '11-','Management occupations','6152650','100310'
> '11-1011','Chief executives','301930','160440'
> '11-1021','General and operations managers','1697690','107970'
> '11-1031','Legislators','64650','37980'
> '11-2011','Advertising and promotions managers','36100','94720'
> '11-2021','Marketing managers','166790','118160'
> '11-2022','Sales managers','333910','110390'
> '11-2031','Public relations managers','51730','101220'
> '11-3011','Administrative services managers','246930','79500'
> 10 rows selected (0.664 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> both --color & --headerInterval are being honored when executing using "-f" 
> option (reads query from a file rather than the commandline) (cannot really 
> see the color here but use the terminal colors)
> [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
> -u 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> 0: jdbc:hive2://localhost> select * from sample_07 limit 8;
> +--+--++-+
> |   code   | description  | total_emp  | salary  |
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> +--+--++-+
> |   code   | description  | total_emp  | salary  |
> +---

[jira] [Commented] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140680#comment-14140680
 ] 

Hive QA commented on HIVE-7420:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669860/HIVE-7420.6.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6401 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[1]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[2]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[4]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[5]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/876/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/876/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669860

> Parameterize tests for HCatalog Pig interfaces for testing against all 
> storage formats
> --
>
> Key: HIVE-7420
> URL: https://issues.apache.org/jira/browse/HIVE-7420
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Reporter: David Chen
>Assignee: David Chen
> Attachments: HIVE-7420-without-HIVE-7457.2.patch, 
> HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420-without-HIVE-7457.4.patch, 
> HIVE-7420-without-HIVE-7457.5.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, 
> HIVE-7420.3.patch, HIVE-7420.4.patch, HIVE-7420.5.patch, HIVE-7420.6.patch
>
>
> Currently, HCatalog tests only test against RCFile with a few testing against 
> ORC. The tests should be covering other Hive storage formats as well.
> HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with 
> all Hive storage formats and with that patch, all test suites built on 
> HCatMapReduceTest are running and passing against Sequence File, Text, and 
> ORC in addition to RCFile.
> Similar changes should be made to make the tests for HCatLoader and 
> HCatStorer generic so that they can be run against all Hive storage formats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with "-e"

2014-09-19 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-7647:

Attachment: HIVE-7647.2.patch

> Beeline does not honor --headerInterval and --color when executing with "-e"
> 
>
> Key: HIVE-7647
> URL: https://issues.apache.org/jira/browse/HIVE-7647
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7647.1.patch, HIVE-7647.2.patch
>
>
> --showHeader is being honored
> [root@localhost ~]# beeline --showHeader=false -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> -hiveconf (No such file or directory)
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
> | 11-2021  | Marketing managers   | 166790 | 118160  |
> | 11-2022  | Sales managers   | 333910 | 110390  |
> | 11-2031  | Public relations managers| 51730  | 101220  |
> | 11-3011  | Administrative services managers | 246930 | 79500   |
> +--+--++-+
> 10 rows selected (0.838 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> --outputFormat is being honored.
> [root@localhost ~]# beeline --outputFormat=csv -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 'code','description','total_emp','salary'
> '00-','All Occupations','135185230','42270'
> '11-','Management occupations','6152650','100310'
> '11-1011','Chief executives','301930','160440'
> '11-1021','General and operations managers','1697690','107970'
> '11-1031','Legislators','64650','37980'
> '11-2011','Advertising and promotions managers','36100','94720'
> '11-2021','Marketing managers','166790','118160'
> '11-2022','Sales managers','333910','110390'
> '11-2031','Public relations managers','51730','101220'
> '11-3011','Administrative services managers','246930','79500'
> 10 rows selected (0.664 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> both --color & --headerInterval are being honored when executing using "-f" 
> option (reads query from a file rather than the commandline) (cannot really 
> see the color here but use the terminal colors)
> [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
> -u 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> 0: jdbc:hive2://localhost> select * from sample_07 limit 8;
> +--+--++-+
> |   code   | description  | total_emp  | salary  |
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> +--+--++-+
> |   code   | description  | total_emp  | salary  |
> +--+--+-

[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with "-e"

2014-09-19 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140695#comment-14140695
 ] 

Naveen Gangam commented on HIVE-7647:
-

[~xuefuz] Patch has been rebased to the latest trunk. Thank you 

> Beeline does not honor --headerInterval and --color when executing with "-e"
> 
>
> Key: HIVE-7647
> URL: https://issues.apache.org/jira/browse/HIVE-7647
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.14.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7647.1.patch, HIVE-7647.2.patch
>
>
> --showHeader is being honored
> [root@localhost ~]# beeline --showHeader=false -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> -hiveconf (No such file or directory)
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
> | 11-2021  | Marketing managers   | 166790 | 118160  |
> | 11-2022  | Sales managers   | 333910 | 110390  |
> | 11-2031  | Public relations managers| 51730  | 101220  |
> | 11-3011  | Administrative services managers | 246930 | 79500   |
> +--+--++-+
> 10 rows selected (0.838 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> --outputFormat is being honored.
> [root@localhost ~]# beeline --outputFormat=csv -u 
> 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -e "select * from sample_07 limit 10;"
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 'code','description','total_emp','salary'
> '00-','All Occupations','135185230','42270'
> '11-','Management occupations','6152650','100310'
> '11-1011','Chief executives','301930','160440'
> '11-1021','General and operations managers','1697690','107970'
> '11-1031','Legislators','64650','37980'
> '11-2011','Advertising and promotions managers','36100','94720'
> '11-2021','Marketing managers','166790','118160'
> '11-2022','Sales managers','333910','110390'
> '11-2031','Public relations managers','51730','101220'
> '11-3011','Administrative services managers','246930','79500'
> 10 rows selected (0.664 seconds)
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> Closing: org.apache.hive.jdbc.HiveConnection
> both --color & --headerInterval are being honored when executing using "-f" 
> option (reads query from a file rather than the commandline) (cannot really 
> see the color here but use the terminal colors)
> [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
> -u 'jdbc:hive2://localhost:1/default' -n hive -d 
> org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
> Connecting to jdbc:hive2://localhost:1/default
> Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
> Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 0.12.0-cdh5.1.0 by Apache Hive
> 0: jdbc:hive2://localhost> select * from sample_07 limit 8;
> +--+--++-+
> |   code   | description  | total_emp  | salary  |
> +--+--++-+
> | 00-  | All Occupations  | 135185230  | 42270   |
> | 11-  | Management occupations   | 6152650| 100310  |
> | 11-1011  | Chief executives | 301930 | 160440  |
> | 11-1021  | General and operations managers  | 1697690| 107970  |
> | 11-1031  | Legislators  | 64650  | 37980   |
> +--+--++-+
> |   code   | description

[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140707#comment-14140707
 ] 

Alan Gates commented on HIVE-7689:
--

All these calls to getEscape make the code hard to read.  If postgres requires 
lower case table and column names I'd prefer to change the postgres version of 
hive-txn-schema.sql to create the tables and columns with lower case names.  
Wouldn't that be easier?

> Enable Postgres as METASTORE back-end
> -
>
> Key: HIVE-7689
> URL: https://issues.apache.org/jira/browse/HIVE-7689
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Damien Carol
>Assignee: Damien Carol
>Priority: Minor
>  Labels: metastore, postgres
> Fix For: 0.14.0
>
> Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
> HIVE-7689.8.patch, HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, 
> HIVE-7889.4.patch
>
>
> I maintain a few patches to make the Metastore work with a Postgres back end 
> in our production environment.
> The main goal of this JIRA is to push these patches upstream.
> The patch enables LOCKS and COMPACTION and fixes a STATS error on a Postgres 
> metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7359) Stats based compute query replies fail to do simple column transforms

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7359:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

> Stats based compute query replies fail to do simple column transforms
> -
>
> Key: HIVE-7359
> URL: https://issues.apache.org/jira/browse/HIVE-7359
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Fix For: 0.14.0
>
> Attachments: HIVE-7359.patch
>
>
> The following two queries return the same answer (the second one is incorrect)
> {code}
> hive> set hive.compute.query.using.stats=true;
> hive> select count(1) from trips;
> OK
> 187271461
> Time taken: 0.173 seconds, Fetched: 1 row(s)
> hive> select count(1)/5109828 from trips;
> OK
> 187271461
> Time taken: 0.125 seconds, Fetched: 1 row(s)
> {code}
> The second query should have output 36.649 instead of returning the value 
> of count(1).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
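The expected value in the report above can be checked with a quick calculation, using only the numbers quoted in the issue (row count 187271461 and divisor 5109828):

```python
# Values quoted in the HIVE-7359 report.
row_count = 187271461
divisor = 5109828

correct = row_count / divisor  # what count(1)/5109828 should evaluate to
buggy = row_count              # what the stats-based rewrite returned instead

print(round(correct, 3))  # 36.649
print(buggy)              # 187271461
```

This confirms that the stats-based answer dropped the division entirely rather than computing it incorrectly.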


[jira] [Updated] (HIVE-7359) Stats based compute query replies fail to do simple column transforms

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7359:
---
Component/s: Logical Optimizer

> Stats based compute query replies fail to do simple column transforms
> -
>
> Key: HIVE-7359
> URL: https://issues.apache.org/jira/browse/HIVE-7359
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Gopal V
>Assignee: Ashutosh Chauhan
> Fix For: 0.14.0
>
> Attachments: HIVE-7359.patch
>
>
> The following two queries return the same answer (the second one is incorrect)
> {code}
> hive> set hive.compute.query.using.stats=true;
> hive> select count(1) from trips;
> OK
> 187271461
> Time taken: 0.173 seconds, Fetched: 1 row(s)
> hive> select count(1)/5109828 from trips;
> OK
> 187271461
> Time taken: 0.125 seconds, Fetched: 1 row(s)
> {code}
> The second query should have output 36.649 instead of returning the value 
> of count(1).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7980) Hive on spark issue..

2014-09-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140778#comment-14140778
 ] 

Xuefu Zhang commented on HIVE-7980:
---

[~alton.jung] For Hive, you need the latest from the Spark branch. For Spark, you 
can also use the latest from their master branch. Since both are under 
development, issues can arise. Could you describe what you are trying to do 
and how to reproduce your issue(s)? Thanks.

> Hive on spark issue..
> -
>
> Key: HIVE-7980
> URL: https://issues.apache.org/jira/browse/HIVE-7980
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Spark
>Affects Versions: spark-branch
> Environment: Test Environment is..
> . hive 0.14.0(spark branch version)
> . spark 
> (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
> . hadoop 2.4.0 (yarn)
>Reporter: alton.jung
>Assignee: Chao
> Fix For: spark-branch
>
>
> I followed this 
> guide(https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
>  and compiled Hive from the Spark branch. In the next step I hit the error 
> below.
> (*I typed the Hive query in Beeline, using a simple query with "order 
> by" to invoke the parallel work 
>ex) select * from test where id = 1 order by id;
> )
> [Error list is]
> 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
> (SparkClient.java:execute(158)) - Error generating Spark Plan
> java.lang.NullPointerException
>   at 
> org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
>   at 
> org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
>   at 
> org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824527954 end=1409824691182 duration=163228 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824691223 end=1409824691224 duration=1 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,306 ERROR operation.Operation 
> (SQLOperation.java:run(199)) - Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at ja

[jira] [Created] (HIVE-8189) A select statement with a subquery is failing.

2014-09-19 Thread Yongzhi Chen (JIRA)
Yongzhi Chen created HIVE-8189:
--

 Summary: A select statement with a subquery is failing. 
 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.12.0
Reporter: Yongzhi Chen


The Hive tables in the query are HBase tables, and the subquery is a join statement.
With
set hive.optimize.ppd=true;
  and
set hive.auto.convert.join=false;
the query returns no data, while
hive.optimize.ppd=true and hive.auto.convert.join=true returns values.
See the attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8094) add LIKE keyword support for SHOW FUNCTIONS

2014-09-19 Thread peter liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peter liu updated HIVE-8094:

Attachment: HIVE-8094.2.patch

> add LIKE keyword support for SHOW FUNCTIONS
> ---
>
> Key: HIVE-8094
> URL: https://issues.apache.org/jira/browse/HIVE-8094
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 0.13.1
>Reporter: peter liu
>Assignee: peter liu
> Fix For: 0.14.0
>
> Attachments: HIVE-8094.1.patch, HIVE-8094.2.patch
>
>
> It would be nice to add LIKE keyword support for SHOW FUNCTIONS as shown below, 
> keeping the patterns consistent with those of SHOW DATABASES and SHOW TABLES.
> bq. SHOW FUNCTIONS LIKE 'foo*';
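The proposed matching can be sketched in Python. This is an illustrative sketch, not Hive's implementation: the function name is hypothetical, and it assumes the pattern rules Hive already uses for SHOW DATABASES, where `*` matches any character sequence and `|` separates alternative patterns.

```python
import re

def show_functions_like(names, pattern):
    # Hive-style SHOW pattern (assumed to mirror SHOW DATABASES behavior):
    # '*' matches any sequence of characters, '|' separates alternatives.
    alternatives = [re.escape(p).replace(r"\*", ".*") for p in pattern.split("|")]
    regex = re.compile("(" + "|".join(alternatives) + r")\Z")
    return sorted(n for n in names if regex.match(n))

funcs = ["concat", "concat_ws", "count", "floor", "xpath_float"]
print(show_functions_like(funcs, "concat*"))    # ['concat', 'concat_ws']
print(show_functions_like(funcs, "co*|floor"))  # ['concat', 'concat_ws', 'count', 'floor']
```

Escaping the pattern before substituting `*` keeps regex metacharacters in function names from being interpreted.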





[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing.

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Attachment: hbase_ppd_join.q

The query can reproduce the issue. 

> A select statement with a subquery is failing. 
> ---
>
> Key: HIVE-8189
> URL: https://issues.apache.org/jira/browse/HIVE-8189
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.1
>Reporter: Yongzhi Chen
> Attachments: hbase_ppd_join.q
>
>
> The Hive tables in the query are HBase tables, and the subquery is a join 
> statement.
> With
> set hive.optimize.ppd=true;
>   and
> set hive.auto.convert.join=false;
> the query returns no data, while with hive.optimize.ppd=true and 
> hive.auto.convert.join=true it returns values. See the attached query file. 





[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-09-19 Thread david serafini (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140786#comment-14140786
 ] 

david serafini commented on HIVE-7100:
--

No.  It hasn't.  I've looked at the code a little, but I haven't found an
answer yet. I'm a novice with the hive code, and I didn't do the original
work on this ticket - I'm just trying to get it finished.   I'll probably
need to find time to find or write a test case to verify the behavior.

On Wed, Sep 17, 2014 at 10:27 PM, Lefty Leverenz (JIRA) 



> Users of hive should be able to specify skipTrash when dropping tables.
> ---
>
> Key: HIVE-7100
> URL: https://issues.apache.org/jira/browse/HIVE-7100
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Ravi Prakash
>Assignee: david serafini
> Attachments: HIVE-7100.1.patch, HIVE-7100.10.patch, 
> HIVE-7100.2.patch, HIVE-7100.3.patch, HIVE-7100.4.patch, HIVE-7100.5.patch, 
> HIVE-7100.8.patch, HIVE-7100.9.patch, HIVE-7100.patch
>
>
> Users of our clusters are often running up against their quota limits because 
> of Hive tables. When they drop tables, they have to then manually delete the 
> files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
> should enable users to skipTrash directly when dropping tables.
> We should also be able to provide this functionality without polluting SQL 
> syntax.
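The manual workaround the description refers to can be sketched as follows; the helper is hypothetical, though `-skipTrash` is the real HDFS shell flag that bypasses the trash directory.

```python
def build_drop_command(table_location, skip_trash=False):
    # Build the 'hdfs dfs' deletion command users currently run by hand
    # after DROP TABLE; -skipTrash deletes immediately instead of moving
    # the files to the HDFS trash directory.
    cmd = ["hdfs", "dfs", "-rm", "-r"]
    if skip_trash:
        cmd.append("-skipTrash")
    cmd.append(table_location)
    return cmd

print(build_drop_command("/user/hive/warehouse/t1", skip_trash=True))
# ['hdfs', 'dfs', '-rm', '-r', '-skipTrash', '/user/hive/warehouse/t1']
```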





Edit access to the Hive Wiki

2014-09-19 Thread Pavan Lanka
Hi,

I have registered myself as pavibhai on https://cwiki.apache.org and
would like edit privileges so that I can contribute to the content.


Regards,
Pavan


Re: Edit access to the Hive Wiki

2014-09-19 Thread Xuefu Zhang
Done!

On Fri, Sep 19, 2014 at 8:55 AM, Pavan Lanka  wrote:

> Hi,
>
> I have registered myself as pavibhai on https://cwiki.apache.org and
> would like edit privileges so that I can contribute to the content.
>
>
> Regards,
> Pavan
>


[jira] [Commented] (HIVE-8162) hive.optimize.sort.dynamic.partition causes RuntimeException for inserting into dynamic partitioned table when map function is used in the subquery

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140814#comment-14140814
 ] 

Hive QA commented on HIVE-8162:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669863/HIVE-8162.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/877/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/877/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-877/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669863

> hive.optimize.sort.dynamic.partition causes RuntimeException for inserting 
> into dynamic partitioned table when map function is used in the subquery 
> 
>
> Key: HIVE-8162
> URL: https://issues.apache.org/jira/browse/HIVE-8162
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Na Yang
>Assignee: Prasanth J
> Attachments: 47rows.txt, HIVE-8162.1.patch
>
>
> Exception:
> Diagnostic Messages for this Task:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error: Unable to deserialize reduce input key from 
> x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
>  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
> serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
>  serialization.sort.order=+++, columns.types=int,map,int}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:462)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
>   at org.apache.hadoop.mapred.Child.main(Child.java:271)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error: Unable to deserialize reduce input key from 
> x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
>  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
> serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
>  serialization.sort.order=+++, columns.types=int,map,int}
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:222)
>   ... 7 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:189)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:220)
>   ... 7 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:533)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:236)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:185)
>   ... 8 more
> Step to reproduce the exception:
> -
> CREATE TABLE associateddata(creative_id int,creative_group_id int,placement_id
> int,sm_campaign_id int,browser_id string, trans_type_p string,trans_time_p
> string,group_name string,event_name string,order_id string,revenue
> float,currency string, trans_type_ci string,trans_time_ci string,f16
> map,campaign_id int,user_agent_cat string,geo_country
> string,geo_city string,geo_state string,geo_zip string,geo_dma string,geo_area
> 

[jira] [Created] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)
LINTE created HIVE-8190:
---

 Summary: LDAP user match for authentication on hiveserver2
 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE


Some LDAP directories use CN rather than UID as the user component.

When a user tries to authenticate, Hive's LDAP authentication module always binds 
with the following string:

uid=$login,basedn

Some Active Directory servers have user objects keyed by cn rather than uid, so it 
is important to be able to customize the kind of object the authentication module 
looks for in LDAP.

Knox's LDAP module gives an example: its main.ldapRealm.userDnTemplate parameter 
can be configured to look for either:
1/ uid: 
 - uid={0},basedn

2/ or cn:
- cn={0},basedn
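The Knox-style template substitution can be sketched as follows; the helper name and the example base DNs are hypothetical, and only the `{0}` placeholder convention comes from the Knox parameter mentioned above.

```python
def build_bind_dn(user_dn_template, login):
    # Substitute the login into a Knox-style userDnTemplate,
    # e.g. "uid={0},dc=example,dc=com" or "cn={0},dc=example,dc=com",
    # instead of hard-coding the uid= form.
    return user_dn_template.replace("{0}", login)

# With a uid-based directory:
print(build_bind_dn("uid={0},dc=example,dc=com", "jdoe"))  # uid=jdoe,dc=example,dc=com
# With a cn-based Active Directory:
print(build_bind_dn("cn={0},dc=example,dc=com", "jdoe"))   # cn=jdoe,dc=example,dc=com
```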








[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0},basedn

or cn : cn={0},basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : uid={0}, basedn
> or cn : cn={0}, basedn





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0},basedn

or cn : cn={0},basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : 
-uid={0},basedn

or cn :
-cn={0},basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : uid={0},basedn
> or cn : cn={0},basedn





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 
uid={0}, basedn

or cn :
cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid ==> uid= {0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : 
> uid={0}, basedn
> or cn :
> cn={0}, basedn





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid ==> uid= {0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid ==> uid={0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid ==> uid= {0}, basedn
> uid : uid={0}, basedn
> or cn : cn={0}, basedn





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : 
-uid={0},basedn

or cn :
-cn={0},basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :
1/ uid : 
 - uid={0},basedn

2/ or cn :
- cn={0},basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : 
> -uid={0},basedn
> or cn :
> -cn={0},basedn





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 'uid={0}, basedn'

or cn : 'cn={0}, basedn'




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : "uid={0}, basedn"

or cn : "cn={0}, basedn"





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : 'uid={0}, basedn'
> or cn : 'cn={0}, basedn'





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : "uid={0}, basedn"

or cn : "cn={0}, basedn"




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 
uid={0}, basedn

or cn :
cn={0}, basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid : "uid={0}, basedn"
> or cn : "cn={0}, basedn"





[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid ==> uid={0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0}, basedn

or cn : cn={0}, basedn





> LDAP user match for authentication on hiveserver2
> -
>
> Key: HIVE-8190
> URL: https://issues.apache.org/jira/browse/HIVE-8190
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization, Clients
>Affects Versions: 0.13.1
> Environment: Centos 6.5
>Reporter: LINTE
>
> Some LDAP has the user composant as CN and not UID.
> SO when you try to authenticate the LDAP authentication module of hive try to 
> authenticate with the following string :  
> uid=$login,basedn
> Some AD have user objects that are not uid but cn, so it is be important to 
> personalize the kind of objects that the authentication moduel look for in 
> ldap.
> We can see an exemple in knox LDAP module configuration the parameter 
> main.ldapRealm.userDnTemplate can be configured to look for :
> uid ==> uid={0}, basedn
> uid : uid={0}, basedn
> or cn : cn={0}, basedn





Apply for Hive contributor

2014-09-19 Thread Yongzhi Chen
Hi,
I'd like to be a Hive contributor; my JIRA account ID is ychena

Thanks

Yongzhi


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8189:
---
Summary: A select statement with a subquery is failing with HBaseSerde  
(was: A select statement with a subquery is failing. )

> A select statement with a subquery is failing with HBaseSerde
> -
>
> Key: HIVE-8189
> URL: https://issues.apache.org/jira/browse/HIVE-8189
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.1
>Reporter: Yongzhi Chen
> Attachments: hbase_ppd_join.q
>
>
> The Hive tables in the query are HBase tables, and the subquery is a join 
> statement.
> With
> set hive.optimize.ppd=true;
>   and
> set hive.auto.convert.join=false;
> the query returns no data, while with hive.optimize.ppd=true and 
> hive.auto.convert.join=true it returns values. See the attached query file. 





[jira] [Resolved] (HIVE-5201) Create new initial rev of new hive site in staging

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-5201.

Resolution: Fixed

> Create new initial rev of new hive site in staging
> --
>
> Key: HIVE-5201
> URL: https://issues.apache.org/jira/browse/HIVE-5201
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Brock Noland
>






[jira] [Resolved] (HIVE-4938) Update website to use Apache CMS

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-4938.

Resolution: Fixed

> Update website to use Apache CMS
> 
>
> Key: HIVE-4938
> URL: https://issues.apache.org/jira/browse/HIVE-4938
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>
> A 
> [vote|http://mail-archives.apache.org/mod_mbox/hive-dev/201307.mbox/%3CCAENxBwx47KQsFRBbBB-i3y1VovBwA8E2dymsfcenkb7X5vhVnQ%40mail.gmail.com%3E]
>  was held and we decided to move from Apache Forrest to Apache CMS for the 
> website. This is an uber ticket to track this effort.





[jira] [Updated] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8100:
---
Description: Basically the idea here is to have an option to not delete the 
warehouse directory. I am using an env variable so it's always passed to all 
sub-processes.

> Add QTEST_LEAVE_FILES to QTestUtil
> --
>
> Key: HIVE-8100
> URL: https://issues.apache.org/jira/browse/HIVE-8100
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8100.patch, HIVE-8100.patch
>
>
> Basically the idea here is to have an option to not delete the warehouse 
> directory. I am using an env variable so it's always passed to all 
> sub-processes.
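The env-variable approach described above can be sketched as follows; the function name is hypothetical, and treating any non-empty value as "on" is an assumption.

```python
import os

def should_leave_files():
    # QTEST_LEAVE_FILES is an environment variable, so it propagates to
    # all sub-processes automatically; any non-empty value (an assumed
    # convention) tells the test harness to skip deleting the warehouse
    # directory after a run.
    return bool(os.environ.get("QTEST_LEAVE_FILES"))
```

A build could then export `QTEST_LEAVE_FILES=1` once and have every forked test JVM or script observe it without extra plumbing.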





[jira] [Updated] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8138:
---
Attachment: HIVE-8138.patch

> Global Init file should allow specifying file name  not only directory
> --
>
> Key: HIVE-8138
> URL: https://issues.apache.org/jira/browse/HIVE-8138
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8138.patch, HIVE-8138.patch
>
>
> HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
> However since .hiverc is a hidden file this can be confusing. The property 
> should allow a path to a file or a directory.
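The file-or-directory resolution the issue asks for might look roughly like this (a minimal sketch with hypothetical names, not the actual HiveConf/HiveServer2 code):

```java
import java.io.File;

public class GlobalInitFileResolver {
    // If the configured path is a directory, fall back to the hidden
    // .hiverc file inside it; otherwise treat the path itself as the
    // global init file.
    public static File resolve(String configuredPath) {
        File f = new File(configuredPath);
        if (f.isDirectory()) {
            return new File(f, ".hiverc");
        }
        return f;
    }
}
```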



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25834: HIVE-8138 - Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25834/
---

Review request for hive.


Repository: hive-git


Description
---

Allows either a file or dir.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3a045b7 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
5231d5e 
  
service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java
 5b1cbc0 

Diff: https://reviews.apache.org/r/25834/diff/


Testing
---


Thanks,

Brock Noland



[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140924#comment-14140924
 ] 

Brock Noland commented on HIVE-8138:


https://reviews.apache.org/r/25834/

> Global Init file should allow specifying file name  not only directory
> --
>
> Key: HIVE-8138
> URL: https://issues.apache.org/jira/browse/HIVE-8138
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8138.patch, HIVE-8138.patch
>
>
> HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
> However since .hiverc is a hidden file this can be confusing. The property 
> should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8115) Hive select query hang when fields contain map

2014-09-19 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HIVE-8115:

Description: 
Attached is a repro of the issue. After creating a table and loading the 
attached data, every Hive query against the table hangs, even a simple select * from the table.

repro steps:
1. run createTable.hql
2. hadoop fs -put data /data
3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
4. SELECT * FROM testtable;

  was:
Attached is a repro of the issue. After creating a table and loading the 
attached data, every Hive query against the table hangs, even a simple select * from the table.

repro steps:
1. run createTable.hql
2. hadoop fs ls -put data /data
3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
4. SELECT * FROM testtable;


> Hive select query hang when fields contain map
> --
>
> Key: HIVE-8115
> URL: https://issues.apache.org/jira/browse/HIVE-8115
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HIVE-8115.1.patch, HIVE-8115.2.patch, createTable.hql, 
> data
>
>
> Attached is a repro of the issue. After creating a table and loading the 
> attached data, every Hive query against the table hangs, even a simple select * from the table.
> repro steps:
> 1. run createTable.hql
> 2. hadoop fs -put data /data
> 3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
> 4. SELECT * FROM testtable;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140929#comment-14140929
 ] 

Sergey Shelukhin commented on HIVE-8111:


ping? [~ashutoshc] [~jpullokkaran]

> CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
> 
>
> Key: HIVE-8111
> URL: https://issues.apache.org/jira/browse/HIVE-8111
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8111.patch
>
>
> Original test failure: looks like column type changes to different decimals 
> in most cases. In one case it causes the integer part to be too big to fit, 
> so the result becomes null it seems.
> What happens is that CBO adds casts to arithmetic expressions to make them 
> type compatible; these casts become part of new AST, and then Hive adds casts 
> on top of these casts. This (the first part) also causes lots of out file 
> changes. It's not clear how to best fix it so far, in addition to incorrect 
> decimal width and sometimes nulls when width is larger than allowed in Hive.
> Option one - don't add those for numeric ops - cannot be done if numeric op 
> is a part of compare, for which CBO needs correct types.
> Option two - unwrap casts when determining type in Hive - hard or impossible 
> to tell apart CBO-added casts and user casts. 
> Option three - don't change types in Hive if CBO has run - seems hacky and 
> hard to ensure it's applied everywhere.
> Option four - map all expressions precisely between two trees and remove 
> casts again after optimization, will be pretty difficult.
> Option five - somehow mark those casts. Not sure about how yet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8187) CBO: Change Optiq Type System Precision/scale to use Hive Type System Precision/Scale

2014-09-19 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8187:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> CBO: Change Optiq Type System Precision/scale to use Hive Type System 
> Precision/Scale
> -
>
> Key: HIVE-8187
> URL: https://issues.apache.org/jira/browse/HIVE-8187
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-8187.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8105) booleans and nulls not handled properly in insert/values

2014-09-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140938#comment-14140938
 ] 

Eugene Koifman commented on HIVE-8105:
--

It would be useful to add some comments in unparseExprForValuesClause() about 
NULL and FALSE handling.  I don't think it would be clear why this works.
Otherwise
+1 pending tests


> booleans and nulls not handled properly in insert/values
> 
>
> Key: HIVE-8105
> URL: https://issues.apache.org/jira/browse/HIVE-8105
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Attachments: HIVE-8105.2.patch, HIVE-8105.2.patch, HIVE-8105.patch
>
>
> Doing an insert/values with a boolean always results in a value of true, 
> regardless of whether true or false is given in the query.
> Doing an insert/values with a null for a column value results in a semantic 
> error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8149) hive.optimize.reducededuplication should be set to false for IUD ops

2014-09-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140952#comment-14140952
 ] 

Eugene Koifman commented on HIVE-8149:
--

+1 pending tests

> hive.optimize.reducededuplication should be set to false for IUD ops
> 
>
> Key: HIVE-8149
> URL: https://issues.apache.org/jira/browse/HIVE-8149
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Alan Gates
> Attachments: HIVE-8149.patch
>
>
> this optimizer causes both old and new rows to show up in a select after 
> update (for tables involving few rows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


Nice work.

Besides the comments below, I think there are some improvements that can be 
done, either here or in a different patch:

1. If we have a module that can compile an op tree (given by top ops) into a 
spark task, then we can reuse it after the original op tree is broken into 
several trees. From each tree, we compile a spark task. In the end, we 
hook up the parent-child relationship. The current logic is a little complicated 
and hard to understand.
2. Tests 
3. Optimizations


ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java


maybe we can call it opToParentMap?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java


Comment here?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java


We should be able to reuse the hash map by emptying the previous one.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


Let's use meaningful variable names even though they are local.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


I feel that the logic here can be simplified. Could we just pop all paths 
and then check if the root is the same and keep doing so until the common 
parent is found?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


This seems to cover only the case where all FSs have a common FORWARD 
parent. What if only some of them share a FORWARD parent, but the other FSs and 
the FORWARD operator share some common parent?

I think the rule for whether to break the plan goes like this:

A plan needs to be broken if and only if there is more than one 
FileSinkOperator that can be traced back to a common parent and the tracing has 
to pass a ReduceSinkOperator on the way.
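The break rule stated above can be sketched as a predicate over a simplified operator tree (plain strings stand in for Hive operators here; this is an illustration of the rule, not the patch's code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PlanBreakRule {
    // The plan must be broken iff at least two FileSinks share an ancestor
    // and at least one of the upward traces crosses a ReduceSink.
    // parent maps child -> parent (the root has no entry).
    public static boolean mustBreak(Map<String, String> parent,
                                    Set<String> reduceSinks,
                                    List<String> fileSinks) {
        Map<String, Integer> seen = new HashMap<>();   // ancestor -> #FSs reaching it
        Map<String, Boolean> viaRs = new HashMap<>();  // ancestor -> some trace crossed an RS
        for (String fs : fileSinks) {
            boolean crossedRs = false;
            for (String cur = parent.get(fs); cur != null; cur = parent.get(cur)) {
                crossedRs = crossedRs || reduceSinks.contains(cur);
                seen.merge(cur, 1, Integer::sum);
                viaRs.merge(cur, crossedRs, Boolean::logicalOr);
            }
        }
        for (Map.Entry<String, Integer> e : seen.entrySet()) {
            if (e.getValue() > 1 && viaRs.get(e.getKey())) {
                return true;
            }
        }
        return false;
    }
}
```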



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


Here we are mapping the children of lca to lca itself. Why is this 
necessary, as you can find the children of lca later without the map? Can't we 
just store lca here?


- Xuefu Zhang


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25394/
> ---
> 
> (Updated Sept. 18, 2014, 6:38 p.m.)
> 
> 
> Review request for hive, Brock Noland and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7503
> https://issues.apache.org/jira/browse/HIVE-7503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For Hive's multi insert query 
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
> may be an MR job for each insert. When we achieve this with Spark, it would 
> be nice if all the inserts can happen concurrently.
> It seems that this functionality isn't available in Spark. To make things 
> worse, the source of the insert may be re-computed unless it's staged. Even 
> with this, the inserts will happen sequentially, making the performance 
> suffer.
> This task is to find out what it takes in Spark to enable this without requiring 
> staging the source and sequential insertion. If this has to be solved in 
> Hive, find out an optimal way to do this.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
> 864965e03a3f9d665e21e1c1b10b19dc286b842f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
>  5fcaf643a0e90fc4acc21187f6d78cefdb1b691a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/25394/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Chao Sun
> 
>



[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Attachment: HIVE-7482.6.patch

Fix for failing tests in map reduce.

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
> HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Status: Patch Available  (was: Open)

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
> HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140956#comment-14140956
 ] 

Hive QA commented on HIVE-8185:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669892/HIVE-8185.2.patch

{color:green}SUCCESS:{color} +1 6292 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/878/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/878/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669892

> hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
> build
> ---
>
> Key: HIVE-8185
> URL: https://issues.apache.org/jira/browse/HIVE-8185
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Priority: Critical
> Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch
>
>
> In the current build, running
> {code}
> jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
> Jar verification failed.
> {code}
> unless that jar is removed from the lib dir, all hive queries throw the 
> following error 
> {code}
> Exception in thread "main" java.lang.SecurityException: Invalid signature 
> file digest for Manifest main attributes
>   at 
> sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
>   at 
> sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
>   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
>   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
>   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
>   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
>   at 
> sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Status: Open  (was: Patch Available)

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
> HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140959#comment-14140959
 ] 

Szehon Ho commented on HIVE-8138:
-

+1

> Global Init file should allow specifying file name  not only directory
> --
>
> Key: HIVE-8138
> URL: https://issues.apache.org/jira/browse/HIVE-8138
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8138.patch, HIVE-8138.patch
>
>
> HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
> However since .hiverc is a hidden file this can be confusing. The property 
> should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8096) Fix a few small nits in TestExtendedAcls

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140966#comment-14140966
 ] 

Szehon Ho commented on HIVE-8096:
-

+1, thanks for the cleanup.

> Fix a few small nits in TestExtendedAcls
> 
>
> Key: HIVE-8096
> URL: https://issues.apache.org/jira/browse/HIVE-8096
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8096.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140971#comment-14140971
 ] 

Szehon Ho commented on HIVE-8138:
-

(pending tests)

> Global Init file should allow specifying file name  not only directory
> --
>
> Key: HIVE-8138
> URL: https://issues.apache.org/jira/browse/HIVE-8138
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8138.patch, HIVE-8138.patch
>
>
> HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
> However since .hiverc is a hidden file this can be confusing. The property 
> should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Attachment: HIVE-8189.1.patch

Need code review.
The patch is for trunk.
Problem:
 1) Predicates are used to filter out regions in HBase which do not need to be scanned.
 2) The predicates stick around in the jobConf from a table with predicates.
Solution:
 By removing the predicates before we reset them, we remove this bad state.


> A select statement with a subquery is failing with HBaseSerde
> -
>
> Key: HIVE-8189
> URL: https://issues.apache.org/jira/browse/HIVE-8189
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.1
>Reporter: Yongzhi Chen
> Attachments: HIVE-8189.1.patch, hbase_ppd_join.q
>
>
> Hive tables in the query are hbase tables, and the subquery is a join 
> statement.
> When
> set hive.optimize.ppd=true;
>   and
> set hive.auto.convert.join=false;
> The query does not return data. 
> While hive.optimize.ppd=true and hive.auto.convert.join=true return values 
> back. See attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Status: Patch Available  (was: Open)

need code review   

> A select statement with a subquery is failing with HBaseSerde
> -
>
> Key: HIVE-8189
> URL: https://issues.apache.org/jira/browse/HIVE-8189
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1, 0.12.0
>Reporter: Yongzhi Chen
> Attachments: HIVE-8189.1.patch, hbase_ppd_join.q
>
>
> Hive tables in the query are hbase tables, and the subquery is a join 
> statement.
> When
> set hive.optimize.ppd=true;
>   and
> set hive.auto.convert.join=false;
> The query does not return data. 
> While hive.optimize.ppd=true and hive.auto.convert.join=true return values 
> back. See attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25716: Type coercion for union queries.

2014-09-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25716/
---

(Updated Sept. 19, 2014, 5:55 p.m.)


Review request for hive and John Pullokkaran.


Changes
---

updated per feedback


Bugs: HIVE-8150
https://issues.apache.org/jira/browse/HIVE-8150


Repository: hive-git


Description
---

Type coercion for union queries.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0d934ef 

Diff: https://reviews.apache.org/r/25716/diff/


Testing
---

union32.q


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-6799) HiveServer2 needs to map kerberos name to local name before proxy check

2014-09-19 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140993#comment-14140993
 ] 

Thejas M Nair commented on HIVE-6799:
-

bq.  and local derby database.
You should be able to use remote rdbms as well.


> HiveServer2 needs to map kerberos name to local name before proxy check
> ---
>
> Key: HIVE-6799
> URL: https://issues.apache.org/jira/browse/HIVE-6799
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 0.13.1
>Reporter: Dilli Arumugam
>Assignee: Dilli Arumugam
> Fix For: 0.14.0
>
> Attachments: HIVE-6799.1.patch, HIVE-6799.2.patch, HIVE-6799.patch
>
>
> HiveServer2 does not map kerberos name of authenticated principal to local 
> name.
> Due to this, I get error like the following in HiveServer log:
> Failed to validate proxy privilage of knox/hdps.example.com for sam
> I have KINITED as knox/hdps.example@example.com
> I do have the following in core-site.xml
>   
> hadoop.proxyuser.knox.groups
> users
>   
>   
> hadoop.proxyuser.knox.hosts
> *
>   
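The mapping the issue asks for (e.g. knox/hdps.example.com@EXAMPLE.COM -> knox) follows the default Kerberos short-name rule, sketched below. This is a simplified illustration only; real deployments apply hadoop.security.auth_to_local rules via Hadoop's KerberosName, not this code:

```java
public class KerberosNames {
    // Default-rule sketch: the local "short" name of a Kerberos principal
    // is its first component, i.e. everything before the first '/' or '@'.
    public static String shortName(String principal) {
        int cut = principal.length();
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash >= 0) cut = Math.min(cut, slash);
        if (at >= 0) cut = Math.min(cut, at);
        return principal.substring(0, cut);
    }
}
```

The proxy check should then compare the short name (knox) against the hadoop.proxyuser.knox.* settings, rather than the full principal.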



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Chao Sun


> On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
> > Nice work.
> > 
> > Besides the comments below, I think there are some improvements that can be 
> > done, either here or in a different patch:
> > 
> > 1. If we have a module that can compile an op tree (given by top ops) into 
> > a spark task, then we can reuse it after the original op tree is broken 
> > into several trees. From each tree, we compile a spark task. In the 
> > end, we hook up the parent-child relationship. The current logic is a little 
> > complicated and hard to understand.
> > 2. Tests 
> > 3. Optimizations

I agree. I can do these in separate following patches.


> On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
> >  line 142
> > 
> >
> > Here we are mapping the children of lca to lca itself. Why is this 
> > necessary, as you can find the children of lca later without the map? Can't 
> > we just store lca here?

The problem is that we are only generating one FS but multiple TSs. After 
the FS and the first TS are generated, the child-parent relation is lost 
(since the op tree is modified), and hence we need to store this information 
somewhere else, to be used when processing the remaining TSs.


> On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
> >  line 140
> > 
> >
> > This seems to cover only the case where all FSs have a common FORWARD 
> > parent. What if only some of them share a FORWARD parent, but the other FSs 
> > and the FORWARD operator share some common parent?
> > 
> > I think the rule for whether to break the plan goes like this:
> > 
> > A plan needs to be broken if and only if there is more than one 
> > FileSinkOperator that can be traced back to a common parent and the tracing 
> > has to pass a ReduceSinkOperator on the way.

In this case the LCA is not a FORWARD, so breaking at this point is safe (though 
maybe not optimal), is that right?
Personally, after so many attempts, I'm a bit inclined to just do what MR does: 
go top-down and keep the first RS in the same SparkWork. For the rest of the RSs, 
just break the plan.


> On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
> >  line 120
> > 
> >
> > I feel that the logic here can be simplified. Could we just pop all 
> > paths and then check if the root is the same and keep doing so until the 
> > common parent is found?

I'm not quite sure. I would happily accept it if you have a better algorithm :) 
(the one I'm using is just a standard algorithm for finding the LCA).
The LCA could be at a different place in each path. How do you proceed to pop all 
paths? Also, there could be multiple common parents, but we need to identify 
the lowest one.
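The standard path-based LCA search referred to here can be sketched like this (plain strings stand in for operators; a simplified single-parent illustration, not the patch's code):

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class LcaFinder {
    // Find the lowest common ancestor of several nodes: walk each node's
    // path up to the root, then scan the paths root-to-leaf in lockstep
    // until they diverge. parent maps child -> parent (the root has no entry).
    public static <T> T lca(Map<T, T> parent, List<T> nodes) {
        if (nodes.isEmpty()) {
            return null;
        }
        List<List<T>> paths = new ArrayList<>();
        for (T n : nodes) {
            LinkedList<T> path = new LinkedList<>();
            for (T cur = n; cur != null; cur = parent.get(cur)) {
                path.addFirst(cur);  // build in root -> node order
            }
            paths.add(path);
        }
        T lca = null;
        for (int depth = 0; ; depth++) {
            T candidate = null;
            for (List<T> path : paths) {
                if (depth >= path.size()) {
                    return lca;  // one path ended: last agreement is the LCA
                }
                if (candidate == null) {
                    candidate = path.get(depth);
                } else if (!candidate.equals(path.get(depth))) {
                    return lca;  // paths diverge here
                }
            }
            lca = candidate;  // all paths still agree at this depth
        }
    }
}
```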


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25394/
> ---
> 
> (Updated Sept. 18, 2014, 6:38 p.m.)
> 
> 
> Review request for hive, Brock Noland and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7503
> https://issues.apache.org/jira/browse/HIVE-7503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For Hive's multi insert query 
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
> may be an MR job for each insert. When we achieve this with Spark, it would 
> be nice if all the inserts can happen concurrently.
> It seems that this functionality isn't available in Spark. To make things 
> worse, the source of the insert may be re-computed unless it's staged. Even 
> with this, the inserts will happen sequentially, making the performance 
> suffer.
> This task is to find out what it takes in Spark to enable this without requiring 
> staging the source and sequential insertion. If this has to be solved in 
> Hive, find out an optimal way to do this.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
> 864965e03a3f9d665e21e1c1b10b19dc286b842f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/s

[jira] [Updated] (HIVE-7883) DBTxnManager trying to close already closed metastore client connection

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-7883:
-
Attachment: HIVE-7883.patch

The real question is why DbTxnManager is creating its own HiveMetastoreClient 
rather than using the existing one.  The attached patch fixes that and removes 
the close.

> DBTxnManager trying to close already closed metastore client connection
> ---
>
> Key: HIVE-7883
> URL: https://issues.apache.org/jira/browse/HIVE-7883
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Alan Gates
> Attachments: HIVE-7883.patch
>
>
> You will find following log message :
> {code}
> ERROR hive.metastore: Unable to shutdown local metastore client
> org.apache.thrift.transport.TTransportException: Cannot write to null 
> outputStream
>at 
> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
>at 
> org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
>at 
> org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
>at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
>at 
> com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:431)
>at 
> com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:425)
>at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:435)
>at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:304)
>at 
> org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.finalize(HiveTxnManagerImpl.java:44)
>at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
>at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
>at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
>at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7883) DBTxnManager trying to close already closed metastore client connection

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-7883:
-
Status: Patch Available  (was: Open)

> DBTxnManager trying to close already closed metastore client connection
> ---
>
> Key: HIVE-7883
> URL: https://issues.apache.org/jira/browse/HIVE-7883
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Alan Gates
> Attachments: HIVE-7883.patch
>
>
> You will find the following log message:
> {code}
> ERROR hive.metastore: Unable to shutdown local metastore client
> org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
>   at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
>   at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
>   at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
>   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
>   at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:431)
>   at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:425)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:435)
>   at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:304)
>   at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.finalize(HiveTxnManagerImpl.java:44)
>   at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
>   at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
>   at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
>   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
> {code}
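The exception above comes from a finalizer calling shutdown on a client whose transport is already gone — essentially a missing idempotence guard. A minimal sketch of the guard pattern, with illustrative names only (this is not the actual HIVE-7883 patch):

```java
// Sketch of an idempotent close(): a second call (e.g. from a finalizer)
// becomes a no-op instead of writing to an already-torn-down transport.
public class SafeClient {
    private boolean closed = false;

    public synchronized void close() {
        if (closed) {
            return; // already shut down; nothing to do
        }
        closed = true;
        // ... a real client would send the Thrift shutdown message here ...
    }

    public static void main(String[] args) {
        SafeClient c = new SafeClient();
        c.close();
        c.close(); // safe: no "Cannot write to null outputStream"
        System.out.println("double close ok");
    }
}
```

The same effect can be had by tracking whether the client reference has been released and skipping the shutdown call in the destructor path.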





Re: Review Request 25716: Type coercion for union queries.

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25716/#review53979
---

Ship it!


Ship It!

- John Pullokkaran


On Sept. 19, 2014, 5:55 p.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25716/
> ---
> 
> (Updated Sept. 19, 2014, 5:55 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Bugs: HIVE-8150
> https://issues.apache.org/jira/browse/HIVE-8150
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Type coercion for union queries.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0d934ef 
> 
> Diff: https://reviews.apache.org/r/25716/diff/
> 
> 
> Testing
> ---
> 
> union32.q
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



Re: Review Request 25754: HIVE-8111 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25754/#review53982
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java


Avoid the CBO name in the function. It's a generic function; the current consumer just happens to be CBO.


- John Pullokkaran


On Sept. 17, 2014, 9:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25754/
> ---
> 
> (Updated Sept. 17, 2014, 9:25 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/RexNodeConverter.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 6131d3d 
>   ql/src/test/queries/clientpositive/decimal_udf.q 591c210 
>   ql/src/test/results/clientpositive/decimal_udf.q.out c5c2031 
> 
> Diff: https://reviews.apache.org/r/25754/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Chao Sun


> On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
> >  line 142
> > 
> >
> > Here we are mapping the children of lca to lca itself. Why is this 
> > necessary, as you can find the chidren of lca later without the map. Cannot 
> > we just store lca here?
> 
> Chao Sun wrote:
> The problem is that we are only generating one FS but multiple TSs. After 
> the FS and the first TS are generated, the child-parent relation is lost 
> (since the op tree is modified), and hence we need to store this 
> information somewhere else, to be used when processing the remaining TSs.

It might be tricky to just store the LCA. When the graph walker reaches a node, it 
needs to check whether that node is a child of the LCA, and if so, break the plan.
You could say that since we have the LCA, we have all of its children's info. However, 
after the first child is processed, the children of the LCA are changed, so we need to 
store this info somewhere, IMHO.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25394/
> ---
> 
> (Updated Sept. 18, 2014, 6:38 p.m.)
> 
> 
> Review request for hive, Brock Noland and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7503
> https://issues.apache.org/jira/browse/HIVE-7503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For Hive's multi-insert query 
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
> may be an MR job for each insert. When we achieve this with Spark, it would 
> be nice if all the inserts could happen concurrently.
> It seems that this functionality isn't available in Spark. To make things 
> worse, the source of the insert may be re-computed unless it's staged. Even 
> with staging, the inserts will happen sequentially, hurting performance.
> This task is to find out what it takes in Spark to enable this without 
> requiring staging the source or sequential insertion. If this has to be 
> solved in Hive, find an optimal way to do it.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 864965e03a3f9d665e21e1c1b10b19dc286b842f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java 5fcaf643a0e90fc4acc21187f6d78cefdb1b691a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/25394/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Chao Sun
> 
>



[jira] [Updated] (HIVE-7812) Disable CombineHiveInputFormat when ACID format is used

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-7812:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this. Thanks for the review, Ashutosh.

> Disable CombineHiveInputFormat when ACID format is used
> ---
>
> Key: HIVE-7812
> URL: https://issues.apache.org/jira/browse/HIVE-7812
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.0
>
> Attachments: HIVE-7812.patch, HIVE-7812.patch, HIVE-7812.patch, 
> HIVE-7812.patch
>
>
> Currently the HiveCombineInputFormat complains when called on an ACID 
> directory. Modify HiveCombineInputFormat so that HiveInputFormat is used 
> instead if the directory is in ACID format.





[jira] [Updated] (HIVE-7145) Remove dependence on apache commons-lang

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-7145:

Assignee: (was: Owen O'Malley)

> Remove dependence on apache commons-lang
> 
>
> Key: HIVE-7145
> URL: https://issues.apache.org/jira/browse/HIVE-7145
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>
> We currently depend on both Apache commons-lang and commons-lang3. They are 
> the same project, just at version 2.x vs 3.x. I propose that we move all of 
> the references in Hive to commons-lang3 and remove the v2 usage.





Re: Review Request 25754: HIVE-8111 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25754/#review53986
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java


Instead of these changes, why don't you use 
FunctionRegistry.getTypeInfoForPrimitiveCategory(a, b)?


- John Pullokkaran


On Sept. 17, 2014, 9:25 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25754/
> ---
> 
> (Updated Sept. 17, 2014, 9:25 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/RexNodeConverter.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 6131d3d 
>   ql/src/test/queries/clientpositive/decimal_udf.q 591c210 
>   ql/src/test/results/clientpositive/decimal_udf.q.out c5c2031 
> 
> Diff: https://reviews.apache.org/r/25754/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Commented] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141065#comment-14141065
 ] 

Laljo John Pullokkaran commented on HIVE-8111:
--

Why not use FunctionRegistry.getTypeInfoForPrimitiveCategory() to decide the 
common type? Maybe add a utility to find the common type among n args.

> CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
> 
>
> Key: HIVE-8111
> URL: https://issues.apache.org/jira/browse/HIVE-8111
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8111.patch
>
>
> Original test failure: it looks like the column type changes to different 
> decimals in most cases. In one case the integer part becomes too big to fit, 
> so the result seems to become null.
> What happens is that CBO adds casts to arithmetic expressions to make them 
> type compatible; these casts become part of the new AST, and then Hive adds 
> casts on top of them. The first part also causes lots of out-file changes. 
> It's not yet clear how best to fix this, on top of the incorrect decimal 
> width and the occasional nulls when the width is larger than Hive allows.
> Option one - don't add those for numeric ops - cannot be done if numeric op 
> is a part of compare, for which CBO needs correct types.
> Option two - unwrap casts when determining type in Hive - hard or impossible 
> to tell apart CBO-added casts and user casts. 
> Option three - don't change types in Hive if CBO has run - seems hacky and 
> hard to ensure it's applied everywhere.
> Option four - map all expressions precisely between two trees and remove 
> casts again after optimization, will be pretty difficult.
> Option five - somehow mark those casts. Not sure about how yet.





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: HIVE-8184.2.patch

fix null pointer exception

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Open  (was: Patch Available)

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Patch Available  (was: Open)

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: (was: HIVE-8184.2.patch)

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Updated] (HIVE-6936) Provide table properties to InputFormats

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6936:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I messed up and committed both HIVE-7812 and HIVE-6936 (with one change) in 
r1626292. The last part of HIVE-6936 is r1626294.

> Provide table properties to InputFormats
> 
>
> Key: HIVE-6936
> URL: https://issues.apache.org/jira/browse/HIVE-6936
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.0
>
> Attachments: HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, 
> HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, 
> HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch
>
>
> Some advanced file formats need the table properties made available to them. 
> Additionally, it would be convenient to provide a unique id for fetch 
> operators and the complete list of directories.





[jira] [Created] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error

2014-09-19 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8191:


 Summary: Update and delete on tables with non Acid output formats 
gives runtime error
 Key: HIVE-8191
 URL: https://issues.apache.org/jira/browse/HIVE-8191
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical


{code}
create table not_an_acid_table(a int, b varchar(128));
insert into table not_an_acid_table select cint, cast(cstring1 as varchar(128)) 
from alltypesorc where cint is not null order by cint limit 10;
delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
{code}

This generates a runtime error.  It should get a compile error instead.
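Catching this at compile time means validating the target table during semantic analysis. A hypothetical sketch of such a guard (the class and method names are illustrative, not the actual Hive patch): reject UPDATE/DELETE unless the table's output format implements an ACID-capable interface.

```java
// Hypothetical sketch: fail UPDATE/DELETE at analysis time when the target
// table's output format does not implement an ACID-capable marker interface.
public class AcidCheck {
    interface OutputFormat {}
    interface AcidOutputFormat {}                       // marker for ACID support
    static class OrcAcidOutputFormat implements OutputFormat, AcidOutputFormat {}
    static class TextOutputFormat implements OutputFormat {}

    static void validateAcidTarget(OutputFormat of, String table) {
        if (!(of instanceof AcidOutputFormat)) {
            // surfaces as a compile (analysis) error, not a failed MR task
            throw new IllegalArgumentException(
                "Attempt to update or delete table " + table
                + " that does not use an AcidOutputFormat");
        }
    }

    public static void main(String[] args) {
        validateAcidTarget(new OrcAcidOutputFormat(), "acid_table"); // passes
        try {
            validateAcidTarget(new TextOutputFormat(), "not_an_acid_table");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at compile time: " + e.getMessage());
        }
    }
}
```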





[jira] [Updated] (HIVE-8150) [CBO] Type coercion in union queries

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8150:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to cbo branch.

> [CBO] Type coercion in union queries
> 
>
> Key: HIVE-8150
> URL: https://issues.apache.org/jira/browse/HIVE-8150
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-8150.cbo.patch, HIVE-8150.cbo.patch
>
>
> If we can't get common type from Optiq, bail out for now.





[jira] [Updated] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8100:
---
Description: Basically the idea here is to have an option to not delete the 
warehouse directory. I am using an env variable so it's always passed to all 
sub-processes. This is useful when you want to see the table structure of the 
warehouse directory after a test.  (was: Basically the idea here is to have an 
option to not delete the warehouse directory. I am using an env variable so 
it's always passed to all sub-processes.)
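The described behavior can be sketched as a small guard around the cleanup step (names are illustrative; the real change lives in QTestUtil). An environment variable is used precisely because, unlike a system property, it propagates to all sub-processes:

```java
// Hypothetical sketch: skip warehouse-directory cleanup when the
// QTEST_LEAVE_FILES environment variable is set, so the table structure
// survives for inspection after a test run.
public class CleanupGuard {
    static boolean shouldCleanUp(java.util.Map<String, String> env) {
        // env vars propagate to all sub-processes, unlike -D system properties
        return !env.containsKey("QTEST_LEAVE_FILES");
    }

    public static void main(String[] args) {
        java.util.Map<String, String> env = new java.util.HashMap<>();
        System.out.println(shouldCleanUp(env));   // variable unset: clean as usual
        env.put("QTEST_LEAVE_FILES", "1");
        System.out.println(shouldCleanUp(env));   // variable set: leave the files
    }
}
```

In real use the map would be `System.getenv()`; it is a parameter here only to keep the sketch testable.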

> Add QTEST_LEAVE_FILES to QTestUtil
> --
>
> Key: HIVE-8100
> URL: https://issues.apache.org/jira/browse/HIVE-8100
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8100.patch, HIVE-8100.patch
>
>
> Basically the idea here is to have an option to not delete the warehouse 
> directory. I am using an env variable so it's always passed to all 
> sub-processes. This is useful when you want to see the table structure of the 
> warehouse directory after a test.





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Open  (was: Patch Available)

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Commented] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141085#comment-14141085
 ] 

Szehon Ho commented on HIVE-8100:
-

+1

> Add QTEST_LEAVE_FILES to QTestUtil
> --
>
> Key: HIVE-8100
> URL: https://issues.apache.org/jira/browse/HIVE-8100
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-8100.patch, HIVE-8100.patch
>
>
> Basically the idea here is to have an option to not delete the warehouse 
> directory. I am using an env variable so it's always passed to all 
> sub-processes. This is useful when you want to see the table structure of the 
> warehouse directory after a test.





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Patch Available  (was: Open)

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: HIVE-8184.2.patch

>  inconsistence between colList and columnExprMap when ConstantPropagate is 
> applied to subquery
> --
>
> Key: HIVE-8184
> URL: https://issues.apache.org/jira/browse/HIVE-8184
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch
>
>
> Query like 
>  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
> from src a join src1 b where a.key = '428' ) c;
> will fail as
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask





Re: Review Request 25800: inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25800/
---

(Updated Sept. 19, 2014, 6:55 p.m.)


Review request for hive.


Changes
---

address null pointer exception


Repository: hive-git


Description
---

Query like
select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
from src a join src1 b where a.key = '428' ) c;
will fail as
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 790a92e 
  ql/src/test/queries/clientpositive/constantPropagateForSubQuery.q PRE-CREATION 
  ql/src/test/results/clientpositive/constantPropagateForSubQuery.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/25800/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Assigned] (HIVE-7856) Enable parallelism in Reduce Side Join [Spark Branch]

2014-09-19 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-7856:
---

Assignee: Szehon Ho

> Enable parallelism in Reduce Side Join [Spark Branch]
> -
>
> Key: HIVE-7856
> URL: https://issues.apache.org/jira/browse/HIVE-7856
> Project: Hive
>  Issue Type: New Feature
>  Components: Spark
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> This is dependent on new transformation to be provided by SPARK-2978, see 
> parent JIRA for details.





[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141095#comment-14141095
 ] 

Gopal V commented on HIVE-8188:
---

[~prasanth_j]: that is a pretty neat speedup.

But that's not where I found the fix; it was in isDeterministic(), within 
the Constant codepath in the ExprNode evaluator.
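The per-row annotation lookup can be hoisted out of the loop by resolving it once per evaluator and caching the boolean. A minimal illustration of the pattern, with hypothetical names (not the actual Hive code):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical sketch: resolve the class annotation once at construction
// instead of calling Class.getAnnotation() for every row.
public class AnnotationCache {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Deterministic {}

    @Deterministic
    static class MyUdf {}

    private final boolean deterministic; // cached once at construction

    AnnotationCache(Class<?> udfClass) {
        // the only reflective call; never repeated in the row loop
        this.deterministic = udfClass.getAnnotation(Deterministic.class) != null;
    }

    boolean isDeterministic() {
        return deterministic;
    }

    public static void main(String[] args) {
        AnnotationCache cache = new AnnotationCache(MyUdf.class);
        long hits = 0;
        for (int row = 0; row < 1_000_000; row++) {
            if (cache.isDeterministic()) { // plain field read, no reflection
                hits++;
            }
        }
        System.out.println(hits);
    }
}
```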

> ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
> loop
> -
>
> Key: HIVE-8188
> URL: https://issues.apache.org/jira/browse/HIVE-8188
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.14.0
>Reporter: Gopal V
> Attachments: udf-deterministic.png
>
>
> When running a near-constant UDF, most of the CPU is burnt within the VM 
> trying to read the class annotations for every row.
> !udf-deterministic.png!





[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141099#comment-14141099
 ] 

Prasanth J commented on HIVE-8188:
--

Looking at the attached PNG (GBY + Reflection), I thought it was the UDAF that 
uses reflection in the inner loop. Looks like many places need improvement, then.

> ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
> loop
> -
>
> Key: HIVE-8188
> URL: https://issues.apache.org/jira/browse/HIVE-8188
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 0.14.0
>Reporter: Gopal V
> Attachments: udf-deterministic.png
>
>
> When running a near-constant UDF, most of the CPU is burnt within the VM 
> trying to read the class annotations for every row.
> !udf-deterministic.png!





[jira] [Commented] (HIVE-8186) CBO Trunk Merge: join_vc fails

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141113#comment-14141113
 ] 

Hive QA commented on HIVE-8186:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669886/HIVE-8186.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_vc
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/879/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/879/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-879/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669886

> CBO Trunk Merge: join_vc fails
> --
>
> Key: HIVE-8186
> URL: https://issues.apache.org/jira/browse/HIVE-8186
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8186.patch
>
>
> A simplified query appears to fail in the CBO branch even with CBO disabled. 
> I'm looking...





[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141130#comment-14141130
 ] 

Gopal V commented on HIVE-8185:
---

+1 - LGTM.

> hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
> build
> ---
>
> Key: HIVE-8185
> URL: https://issues.apache.org/jira/browse/HIVE-8185
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Gopal V
>Priority: Critical
> Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch
>
>
> In the current build, running
> {code}
> jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
> Jar verification failed.
> {code}
> unless that jar is removed from the lib dir, all hive queries throw the 
> following error 
> {code}
> Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
>   at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
>   at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
>   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
>   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
>   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
>   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
>   at sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
> {code}





[jira] [Updated] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8191:
-
Attachment: HIVE-8191.patch

Added a check when updating and deleting that the table is ACID-compliant.

> Update and delete on tables with non Acid output formats gives runtime error
> 
>
> Key: HIVE-8191
> URL: https://issues.apache.org/jira/browse/HIVE-8191
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Attachments: HIVE-8191.patch
>
>
> {code}
> create table not_an_acid_table(a int, b varchar(128));
> insert into table not_an_acid_table select cint, cast(cstring1 as 
> varchar(128)) from alltypesorc where cint is not null order by cint limit 10;
> delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
> {code}
> This generates a runtime error.  It should get a compile error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8191:
-
Status: Patch Available  (was: Open)

> Update and delete on tables with non Acid output formats gives runtime error
> 
>
> Key: HIVE-8191
> URL: https://issues.apache.org/jira/browse/HIVE-8191
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Attachments: HIVE-8191.patch
>
>
> {code}
> create table not_an_acid_table(a int, b varchar(128));
> insert into table not_an_acid_table select cint, cast(cstring1 as 
> varchar(128)) from alltypesorc where cint is not null order by cint limit 10;
> delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
> {code}
> This generates a runtime error.  It should get a compile error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review54004
---



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


I was thinking that you have all the paths at hand, and you just keep moving 
all branches up together, checking whether the LCA has been hit. I didn't 
realize we do this while we are still traversing the tree.

I have to admit that I don't quite get the whole logic here. Does the LCA 
change before all the TSes have been visited?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


Correct me if I'm wrong. For the whole graph, we find at most one LCA to 
split the plan, right? Also, the LCA can in no way be a FORWARD, right? But 
there can be multiple FORWARDs, which can have a common ancestor, which might 
be a point to split.

Again, I don't quite understand how the LCA is identified while we are still 
visiting the tree. But I'm sure that we don't want to create more Spark jobs 
than needed. If we don't do better than MR when we could, the point of the 
project would be greatly compromised.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java


Okay. Fair enough.


- Xuefu Zhang


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25394/
> ---
> 
> (Updated Sept. 18, 2014, 6:38 p.m.)
> 
> 
> Review request for hive, Brock Noland and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7503
> https://issues.apache.org/jira/browse/HIVE-7503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> For Hive's multi insert query 
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
> may be an MR job for each insert. When we achieve this with Spark, it would 
> be nice if all the inserts can happen concurrently.
> It seems that this functionality isn't available in Spark. To make things 
> worse, the source of the insert may be re-computed unless it's staged. Even 
> with this, the inserts will happen sequentially, making the performance 
> suffer.
> This task is to find out what takes in Spark to enable this without requiring 
> staging the source and sequential insertion. If this has to be solved in 
> Hive, find out an optimum way to do this.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
> 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
> 864965e03a3f9d665e21e1c1b10b19dc286b842f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
> 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
>  5fcaf643a0e90fc4acc21187f6d78cefdb1b691a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/25394/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Chao Sun
> 
>



[jira] [Created] (HIVE-8192) Check DDL's writetype in DummyTxnManager

2014-09-19 Thread cw (JIRA)
cw created HIVE-8192:


 Summary: Check DDL's writetype in DummyTxnManager
 Key: HIVE-8192
 URL: https://issues.apache.org/jira/browse/HIVE-8192
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.1, 0.13.0
 Environment: hive0.13.1
Reporter: cw
Priority: Minor
 Fix For: 0.14.0


The patch for HIVE-6734 added some DDL write types and checked the DDL write type in 
DbTxnManager.java.
We use DummyTxnManager as the default value of hive.txn.manager in 
hive-site.xml. We noticed that the CREATE TEMPORARY FUNCTION operation has a 
DDL_NO_LOCK write type but it requires an EXCLUSIVE lock. If we try to create a 
temporary function while a SELECT is running on the same database, 
the console prints 'conflicting lock present for default mode 
EXCLUSIVE' and the CREATE TEMPORARY FUNCTION operation won't get the lock until 
the SELECT is done. Maybe it's a good idea to check the DDL's write type in 
DummyTxnManager too.
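The proposed fix amounts to letting the lock manager honor the DDL write type, as DbTxnManager already does. A sketch of the dispatch — the enum names mirror Hive's WriteEntity.WriteType (added by HIVE-6734), but the mapping here is an illustrative assumption, not the attached patch:

```java
public class DdlLockSketch {
    // Names mirror a subset of Hive's WriteEntity.WriteType values.
    public enum WriteType { DDL_NO_LOCK, DDL_SHARED, DDL_EXCLUSIVE, INSERT }

    // If DummyTxnManager consulted the write type, a DDL_NO_LOCK operation
    // such as CREATE TEMPORARY FUNCTION would acquire no lock instead of
    // defaulting to EXCLUSIVE and blocking behind a running SELECT.
    public static String lockModeFor(WriteType wt) {
        switch (wt) {
            case DDL_NO_LOCK:   return "NONE";
            case DDL_EXCLUSIVE: return "EXCLUSIVE";
            default:            return "SHARED";
        }
    }
}
```

Under this mapping the CREATE TEMPORARY FUNCTION case from the description would proceed without waiting for the concurrent SELECT to finish.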



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8192) Check DDL's writetype in DummyTxnManager

2014-09-19 Thread cw (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

cw updated HIVE-8192:
-
Attachment: HIVE-8192.patch.txt

> Check DDL's writetype in DummyTxnManager
> 
>
> Key: HIVE-8192
> URL: https://issues.apache.org/jira/browse/HIVE-8192
> Project: Hive
>  Issue Type: Bug
>  Components: Locking
>Affects Versions: 0.13.0, 0.13.1
> Environment: hive0.13.1
>Reporter: cw
>Priority: Minor
>  Labels: patch
> Fix For: 0.14.0
>
> Attachments: HIVE-8192.patch.txt
>
>
> The patch for HIVE-6734 added some DDL write types and checked the DDL write type in 
> DbTxnManager.java.
> We use DummyTxnManager as the default value of hive.txn.manager in 
> hive-site.xml. We noticed that the CREATE TEMPORARY FUNCTION operation has 
> a DDL_NO_LOCK write type but it requires an EXCLUSIVE lock. If we try to create 
> a temporary function while a SELECT is running on the same 
> database, the console prints 'conflicting lock present for default 
> mode EXCLUSIVE' and the CREATE TEMPORARY FUNCTION operation won't get the 
> lock until the SELECT is done. Maybe it's a good idea to check the DDL's 
> write type in DummyTxnManager too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   3   >