[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160055#comment-14160055
 ] 

Hive QA commented on HIVE-8352:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673055/HIVE-8352.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6739 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-196/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673055

> Enable windowing.q for spark
> 
>
> Key: HIVE-8352
> URL: https://issues.apache.org/jira/browse/HIVE-8352
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
> hive-8385.patch
>
>
> We should enable windowing.q for basic windowing coverage. After checking out 
> the spark branch, we would build:
> {noformat}
> $ mvn clean install -DskipTests -Phadoop-2
> $ cd itests/
> $ mvn clean install -DskipTests -Phadoop-2
> {noformat}
> Then generate the windowing.q.out file:
> {noformat}
> $ cd qtest-spark/
> $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
> -Dtest.output.overwrite=true
> {noformat}
> Compare the output against MapReduce:
> {noformat}
> $ diff -y -W 150 
> ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
> ../../ql/src/test/results/clientpositive/windowing.q.out| less
> {noformat}
> And if everything looks good, add it to {{spark.query.files}} in 
> {{./itests/src/test/resources/testconfiguration.properties}}
> then submit the patch including the .q file
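
For reference, here is a sketch of the kind of query windowing.q exercises. It assumes a TPC-H style {{part}} table; the table and column names are illustrative, not copied from the actual q-file:

{code:sql}
-- Illustrative windowing query (assumed schema, not taken from windowing.q)
SELECT p_mfgr,
       p_name,
       rank() OVER (PARTITION BY p_mfgr ORDER BY p_name) AS r,
       sum(p_retailprice) OVER (PARTITION BY p_mfgr
                                ORDER BY p_name
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM part;
{code}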



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160067#comment-14160067
 ] 

Hive QA commented on HIVE-7205:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673048/HIVE-7205.4.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6525 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1128/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673048

> Wrong results when union all of grouping followed by group by with 
> correlation optimization
> ---
>
> Key: HIVE-7205
> URL: https://issues.apache.org/jira/browse/HIVE-7205
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0, 0.13.0, 0.13.1
>Reporter: dima machlin
>Assignee: Navis
>Priority: Critical
> Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, 
> HIVE-7205.3.patch.txt, HIVE-7205.4.patch.txt
>
>
> Use case:
> table TBL (a string, b string) contains a single row: 'a','a'
> the following query:
> {code:sql}
> select b, sum(cc) from (
> select b,count(1) as cc from TBL group by b
> union all
> select a as b,count(1) as cc from TBL group by a
> ) z
> group by b
> {code}
> returns
> a 1
> a 1
> with set hive.optimize.correlation=true;
> if we change to set hive.optimize.correlation=false;
> it returns the correct result: a 2
> The plan with correlation optimization :
> {code:sql}
> ABSTRACT SYNTAX TREE:
>   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM 
> (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
> TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR 
> (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY 
> (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION 
> (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) 
> (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL 
> a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
> (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum 
> (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Alias -> Map Operator Tree:
> null-subquery1:z-subquery1:TBL 
>   TableScan
> alias: TBL
> Select Operator
>   expressions:
> expr: b
> type: string
>   outputColumnNames: b
>   Group By Operator
> aggregations:
>   expr: count(1)
> bucketGroup: false
> keys:
>   expr: b
>   type: string
> mode: hash
> outputColumnNames: _col0, _col1
> Reduce Output Operator
>   key expressions:
> expr: _col0
> type: string
>   sort order: +
>   Map-reduce partition columns:
> expr: _col0
> type: string
>   tag: 0
>   value expressions:
> expr: _col1
> type: bigint
> null-subquery2:z-subquery2:TBL 
>   TableScan
> alias: TBL
> Select Operator
>   expressions:
> expr: a
> type: string
>   outputColumnNames: a
>   Group By Operator
> aggregations:
>   expr: count(1)
> bucketGroup: false
> keys:
>   expr: a
>   type: string
> mode: hash
> outpu

[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160068#comment-14160068
 ] 

Vaibhav Gumashta commented on HIVE-8193:


[~thejas] None of the failures are related. Thanks!

> Hook HiveServer2 dynamic service discovery with session time out
> 
>
> Key: HIVE-8193
> URL: https://issues.apache.org/jira/browse/HIVE-8193
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8193.1.patch
>
>
> For dynamic service discovery, if the HiveServer2 instance is removed from 
> ZooKeeper, currently, on the last client close, the server shuts down. 
> However, we need to ensure that this also happens when a session is closed on 
> timeout and no current sessions exist on this instance of HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160070#comment-14160070
 ] 

Vaibhav Gumashta commented on HIVE-8172:


cc [~thejas]

> HiveServer2 dynamic service discovery should let the JDBC client use default 
> ZooKeeper namespace
> 
>
> Key: HIVE-8172
> URL: https://issues.apache.org/jira/browse/HIVE-8172
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Critical
>  Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-8172.1.patch
>
>
> Currently the client provides a URL like:
>  
> jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
>  
> When the zooKeeperNamespace param is not provided, the default value should be used.
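
For illustration, a URL of the form below (the same hosts as above, with the zooKeeperNamespace parameter simply omitted) should then resolve against the default namespace:

{noformat}
jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper
{noformat}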



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8324) Shim KerberosName (causes build failure on hadoop-1)

2014-10-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8324:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

The test failure is not related.

Patch committed to trunk and the 0.14 branch. Thanks for reviewing, [~szehon]!

> Shim KerberosName (causes build failure on hadoop-1)
> 
>
> Key: HIVE-8324
> URL: https://issues.apache.org/jira/browse/HIVE-8324
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Szehon Ho
>Assignee: Vaibhav Gumashta
>Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: HIVE-8324.1.patch, HIVE-8324.2.patch
>
>
> Unfortunately even after HIVE-8265, there are still more compile failures.
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-service: Compilation failure: Compilation failure:
> [ERROR] 
> /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[35,54]
>  cannot find symbol
> [ERROR] symbol:   class KerberosName
> [ERROR] location: package org.apache.hadoop.security.authentication.util
> [ERROR] 
> /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,7]
>  cannot find symbol
> [ERROR] symbol:   class KerberosName
> [ERROR] location: class 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
> [ERROR] 
> /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,43]
>  cannot find symbol
> [ERROR] symbol:   class KerberosName
> [ERROR] location: class 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
> [ERROR] 
> /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,7]
>  cannot find symbol
> [ERROR] symbol:   class KerberosName
> [ERROR] location: class 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
> [ERROR] 
> /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,43]
>  cannot find symbol
> [ERROR] symbol:   class KerberosName
> [ERROR] location: class 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.

2014-10-06 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24136/
---

(Updated Oct. 6, 2014, 8:03 a.m.)


Review request for hive.


Changes
---

Rebase on trunk. Disable specific test methods for storage formats.


Bugs: HIVE-4329
https://issues.apache.org/jira/browse/HIVE-4329


Repository: hive-git


Description
---

HIVE-4329: HCatalog should use getHiveRecordWriter.


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
4fdb5c985108bb3225cf945024ae679745e5f3bc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java
 3a07b0ca7c1956d45e611005cbc5ba2464596471 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java
 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java
 4df912a935221e527c106c754ff233d212df9246 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1a7595fd6dd0a5ffbe529bc24015c482068233bf 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java
 2a883d6517bfe732b6a6dffa647d9d44e4145b38 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657cd1b16aec664aab3e22b430b304a3698d 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java
 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java
 b651cb323771843da43667016a7dd2c9d9a1ddac 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
 9b979395e47e54aac87487cb990824e3c3a2ee19 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java
 d83b003f9c16e78a39b3cc7ce810ff19f70848c2 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java
 5905b46178b510b3a43311739fea2b95f47b4ed7 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java
 b3ea76e6a79f94e09972bc060c06105f60087b71 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
 ee57f3fd126af2e36039f84686a4169ef6267593 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalDynamicPartitioned.java
 58764a5d093524a0a3566e6db817fdb4b2364ac8 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalNonPartitioned.java
 6e060c08ce03b71a4f2216f5137d73b468e5be46 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalPartitioned.java
 9f16b3b9811c2020adfb6a2da7eb76ac1bc8cfb9 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableDynamicPartitioned.java
 5b18739d0e9a92b94a6cc2647bc37d1aa0c0e5ca 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableNonPartitioned.java
 354ae109adbec93363a5f3813413dcc50bd8ffa3 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutablePartitioned.java
 a22a993c8f154fcbf2faaaea2ab1ce69c4f13717 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatNonPartitioned.java
 174a92f443cb5deeb4972f4016109ecedae8bd3e 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java
 a386415fb406bb0cda18f7913650874d6a236e21 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 36221b77d52474393668284d12877fd6b43c88d6 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoader.java
 5eabba151b6b39b8e251fbbce2ffd4b9f7b503c6 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderComplexSchema.java
 447f39fade0b5d562dd30915377a3ddf8dd422cd 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorer.java
 a380f619493c12c440679f501a401d0a61788838 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java
 0c3ec8bd93f2a50d2d44c2d892180142613dc68d 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestUtil.java
 8a652f0bb9323497bbcc7fd4a76f616ee8917c1e 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 2ad7330365b8327e6f1b78ad5b9760e252d1339b 

[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-10-06 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-4329:
-
Attachment: HIVE-4329.4.patch

Attaching a new patch rebased on master, incorporating the test utils from 
HIVE-7286 to disable specific test methods for given storage formats.

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
> HIVE-4329.3.patch, HIVE-4329.4.patch
>
>
> Attempting to write to an HCatalog-defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutputFormat to mandate a WritableComparable.
> It looks like accepting WritableComparable is what's done in the other Hive 
> OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
> be changed, since it's ignoring the key. That way fixing things so 
> FileRecordWriterContainer can always use NullWritable could get spun into a 
> different issue?
> The underlying cause for failure to write to AvroSerde tables is that 
> AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
> fixing the above will just push the failure into the placeholder RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-6692) Location for new table or partition should be a write entity

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-6692.
-
Resolution: Won't Fix

Because locations for a new table or partition are, by policy, treated as read 
entities (which still feels strange to me), the remaining part of the patch is 
whether to use a qualified path or a simple string for path-type entities. I'll 
close this and open a new issue for that.

> Location for new table or partition should be a write entity
> 
>
> Key: HIVE-6692
> URL: https://issues.apache.org/jira/browse/HIVE-6692
> Project: Hive
>  Issue Type: Task
>  Components: Authorization
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6692.1.patch.txt
>
>
> Locations for "create table" and "alter table add partition" should be write 
> entities.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)
Navis created HIVE-8357:
---

 Summary: Path type entities should use qualified path rather than 
string
 Key: HIVE-8357
 URL: https://issues.apache.org/jira/browse/HIVE-8357
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8357:

Attachment: HIVE-8357.1.patch.txt

Running a preliminary test; expecting many test failures.

> Path type entities should use qualified path rather than string
> ---
>
> Key: HIVE-8357
> URL: https://issues.apache.org/jira/browse/HIVE-8357
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8357.1.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8357:

Status: Patch Available  (was: Open)

> Path type entities should use qualified path rather than string
> ---
>
> Key: HIVE-8357
> URL: https://issues.apache.org/jira/browse/HIVE-8357
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8357.1.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8186) Self join may fail if one side has VCs and other doesn't

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160159#comment-14160159
 ] 

Hive QA commented on HIVE-8186:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673047/HIVE-8186.3.patch.txt

{color:green}SUCCESS:{color} +1 6525 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1129/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673047

> Self join may fail if one side has VCs and other doesn't
> 
>
> Key: HIVE-8186
> URL: https://issues.apache.org/jira/browse/HIVE-8186
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-8186.1.patch.txt, HIVE-8186.2.patch.txt, 
> HIVE-8186.3.patch.txt
>
>
> See comments. This also fails on trunk, although not on the original join_vc query



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160271#comment-14160271
 ] 

Hive QA commented on HIVE-7733:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673054/HIVE-7733.5.patch.txt

{color:red}ERROR:{color} -1 due to 54 failed/errored test(s), 6526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned_json
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_field_garbage
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_explain_rewrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_explain_rewrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_subquery1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_compare_java_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_view_inputs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_as_select_with_partition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_failure6
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col0
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view7
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_select_column_with_subquery
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalidate_view1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_recursive_view
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1130/testReport
Console output: 
http://ec2-174-129-184-35.comp

[jira] [Commented] (HIVE-7641) INSERT ... SELECT with no source table leads to NPE

2014-10-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160302#comment-14160302
 ] 

Xuefu Zhang commented on HIVE-7641:
---

Looking at the patch, it seems to make more sense to return an error in this 
case, to be consistent with a regular "select x from table" query, where an 
error is given if "from table" is missing.
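
A minimal sketch of that argument, assuming the {{test_tbl}} table from the issue description below; the comments describe the behavior being argued for, not actual Hive output:

{code:sql}
-- A regular SELECT with the FROM clause missing is rejected at compile time,
SELECT x;
-- so an INSERT ... SELECT with no source table should fail with the same kind of
-- semantic error rather than an NPE:
INSERT INTO TABLE test_tbl SELECT 1;
{code}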

> INSERT ... SELECT with no source table leads to NPE
> ---
>
> Key: HIVE-7641
> URL: https://issues.apache.org/jira/browse/HIVE-7641
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.1
>Reporter: Lenni Kuff
>Assignee: Navis
> Attachments: HIVE-7641.1.patch.txt
>
>
> When no source table is provided for an INSERT statement, Hive fails with an NPE. 
> {code}
> 0: jdbc:hive2://localhost:11050/default> create table test_tbl(i int);
> No rows affected (0.333 seconds)
> 0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1;
> Error: Error while compiling statement: FAILED: NullPointerException null 
> (state=42000,code=4)
> -- Get a NPE even when using incorrect syntax (no TABLE keyword)
> 0: jdbc:hive2://localhost:11050/default> insert into test_tbl select 1;
> Error: Error while compiling statement: FAILED: NullPointerException null 
> (state=42000,code=4)
> -- Works when a source table is provided
> 0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1 
> from foo;
> No rows affected (5.751 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160352#comment-14160352
 ] 

Hive QA commented on HIVE-4329:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673066/HIVE-4329.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6563 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.testPigPopulation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1131/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673066

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
> HIVE-4329.3.patch, HIVE-4329.4.patch
>
>
> Attempting to write to an HCatalog-defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutputFormat to mandate a WritableComparable.
> It looks like accepting WritableComparable is what's done in the other Hive 
> OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
> be changed, since it's ignoring the key. That way fixing things so 
> FileRecordWriterContainer can always use NullWritable could get spun into a 
> different issue?
> The underlying cause for failure to write to AvroSerde tables is that 
> AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
> fixing the above will just push the failure into the placeholder RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2014-10-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160428#comment-14160428
 ] 

Ashutosh Chauhan commented on HIVE-8319:


cc: [~thejas] , [~vgumashta]

> Add configuration for custom services in hiveserver2
> 
>
> Key: HIVE-8319
> URL: https://issues.apache.org/jira/browse/HIVE-8319
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8319.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> Register services to hiveserver2, for example, 
> {noformat}
> <property>
>   <name>hive.server2.service.classes</name>
>   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
> </property>
> <property>
>   <name>azkaban.ssl.port</name>
>   <value>...</value>
> </property>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8137) Empty ORC file handling

2014-10-06 Thread Pankit Thapar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160456#comment-14160456
 ] 

Pankit Thapar commented on HIVE-8137:
-

[~gopalv], could you please comment on the failures? I don't think they are due 
to my patch.
Also, could you please review the patch as well?


> Empty ORC file handling
> ---
>
> Key: HIVE-8137
> URL: https://issues.apache.org/jira/browse/HIVE-8137
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Pankit Thapar
> Fix For: 0.14.0
>
> Attachments: HIVE-8137.patch
>
>
> Hive 13 does not handle reading of a zero-size ORC file properly. An ORC file 
> is supposed to have a PostScript, 
> which the ReaderImpl class tries to read and initialize the footer with. 
> But if the file is empty 
> or of zero size, it runs into an IndexOutOfBoundsException because 
> ReaderImpl tries to read it in its constructor.
> Code snippet: 
> //get length of PostScript
> int psLen = buffer.get(readSize - 1) & 0xff; 
> In the above code, readSize for an empty file is zero.
> I see that the ensureOrcFooter() method performs some sanity checks for the footer, 
> so either we can move the above code snippet to ensureOrcFooter() and throw 
> a "Malformed ORC file" exception, or we can create a dummy Reader that does 
> not initialize the footer and basically has hasNext() set to false so that it 
> returns false on the first call.
> Basically, I would like to know what the correct way to handle an 
> empty ORC file in a mapred job would be.
> Should we ignore it and not throw an exception, or should we throw an exception 
> saying the ORC file is malformed?
> Please let me know your thoughts on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken

2014-10-06 Thread Ken Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160480#comment-14160480
 ] 

Ken Williams commented on HIVE-6050:


I'm also looking for a workaround to this - I'm seeing the error when trying to 
connect to a 0.13 Hive.

> JDBC backward compatibility is broken
> -
>
> Key: HIVE-6050
> URL: https://issues.apache.org/jira/browse/HIVE-6050
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0
>Reporter: Szehon Ho
>Assignee: Carl Steinbach
>Priority: Blocker
>
> Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to a HiveServer2 of 
> Hive 0.10 (TProtocolVersion=v1) returns the following exception:
> {noformat}
> java.sql.SQLException: Could not establish connection to 
> jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
> unset! Struct:TOpenSessionReq(client_protocol:null)
>   at 
> org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
>   at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:158)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
>   at java.sql.DriverManager.getConnection(DriverManager.java:571)
>   at java.sql.DriverManager.getConnection(DriverManager.java:187)
>   at 
> org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
>   at 
> org.apache.hive.jdbc.MyTestJdbcDriver2.(MyTestJdbcDriver2.java:49)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
>   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
> Caused by: org.apache.thrift.TApplicationException: Required field 
> 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
>   at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
>   at 
> org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
>   ... 37 more
> {noformat}
> On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
> which doesn't seem to be backward-compatible.  Look at the code path in the 
> generated file 'TOpenSessionReq.java', method 
> TOpenSessionReqStandardScheme.read():
> 1. The method will call 'TProtocolVersion.findValue()' on the thrift 
> protocol's byte stream, which returns null if the client is sending an enum 
> value unknown to the server.  (v4 is unknown to server)
> 2. The method will then call struct.validate(), which will throw the above 
> exception because of null version.  
> So it doesn't look like the current backward-compatibility scheme will work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8358:
--

 Summary: Constant folding should happen before predicate pushdown
 Key: HIVE-8358
 URL: https://issues.apache.org/jira/browse/HIVE-8358
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


So that partition pruning and transitive predicate propagation may take 
advantage of constant folding.
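
A hypothetical sketch of the benefit (the table, partition column, and literal values below are made up): if constant folding runs first, the filter becomes ds = '2014-10-06', which the partition pruner can match directly against the ds partitions and which transitive predicate propagation can then carry across joins on ds.

{code:sql}
-- Assumed table: sales, partitioned by ds (string).
EXPLAIN
SELECT count(*)
FROM sales
WHERE ds = concat('2014-', '10-06');   -- should fold to ds = '2014-10-06'
{code}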



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8358:
---
Status: Patch Available  (was: Open)

> Constant folding should happen before predicate pushdown
> 
>
> Key: HIVE-8358
> URL: https://issues.apache.org/jira/browse/HIVE-8358
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-8358.patch
>
>
> So that partition pruning and transitive predicate propagation may take 
> advantage of constant folding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8358:
---
Attachment: HIVE-8358.patch

Running tests to see if there are any failures. Not ready for review yet.

> Constant folding should happen before predicate pushdown
> 
>
> Key: HIVE-8358
> URL: https://issues.apache.org/jira/browse/HIVE-8358
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-8358.patch
>
>
> So that partition pruning and transitive predicate propagation may take 
> advantage of constant folding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160519#comment-14160519
 ] 

Brock Noland commented on HIVE-6050:


AFAIK there is no workaround at present. The server version must be greater than 
or equal to the client version.

> JDBC backward compatibility is broken
> -
>
> Key: HIVE-6050
> URL: https://issues.apache.org/jira/browse/HIVE-6050
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.13.0
>Reporter: Szehon Ho
>Assignee: Carl Steinbach
>Priority: Blocker
>
> Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to a HiveServer2 of 
> Hive 0.10 (TProtocolVersion=v1) returns the following exception:
> {noformat}
> java.sql.SQLException: Could not establish connection to 
> jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
> unset! Struct:TOpenSessionReq(client_protocol:null)
>   at 
> org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
>   at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:158)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
>   at java.sql.DriverManager.getConnection(DriverManager.java:571)
>   at java.sql.DriverManager.getConnection(DriverManager.java:187)
>   at 
> org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
>   at 
> org.apache.hive.jdbc.MyTestJdbcDriver2.(MyTestJdbcDriver2.java:49)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
>   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
>   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
> Caused by: org.apache.thrift.TApplicationException: Required field 
> 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
>   at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
>   at 
> org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
>   ... 37 more
> {noformat}
> On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
> which doesn't seem to be backward-compatible.  Look at the code path in the 
> generated file 'TOpenSessionReq.java', method 
> TOpenSessionReqStandardScheme.read():
> 1. The method will call 'TProtocolVersion.findValue()' on the thrift 
> protocol's byte stream, which returns null if the client is sending an enum 
> value unknown to the server.  (v4 is unknown to server)
> 2. The method will then call struct.validate(), which will throw the above 
> exception because of null version.  
> So it doesn't look like the current backward-compatibility scheme will work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160524#comment-14160524
 ] 

Brock Noland commented on HIVE-8352:


[~jxiang] does parallel.q pass for you locally, without 
-Dtest.output.overwrite=true? If it does pass, can you open a subtask of 
HIVE-7292 to investigate the flakiness? 

+1 pending resolution of parallel.q

> Enable windowing.q for spark
> 
>
> Key: HIVE-8352
> URL: https://issues.apache.org/jira/browse/HIVE-8352
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
> hive-8385.patch
>
>
> We should enable windowing.q for basic windowing coverage. After checking out 
> the spark branch, we would build:
> {noformat}
> $ mvn clean install -DskipTests -Phadoop-2
> $ cd itests/
> $ mvn clean install -DskipTests -Phadoop-2
> {noformat}
> Then generate the windowing.q.out file:
> {noformat}
> $ cd qtest-spark/
> $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
> -Dtest.output.overwrite=true
> {noformat}
> Compare the output against MapReduce:
> {noformat}
> $ diff -y -W 150 
> ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
> ../../ql/src/test/results/clientpositive/windowing.q.out| less
> {noformat}
> And if everything looks good, add it to {{spark.query.files}} in 
> {{./itests/src/test/resources/testconfiguration.properties}}
> then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA
Frédéric TERRAZZONI created HIVE-8359:
-

 Summary: Map containing null values are not correctly written in 
Parquet files
 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI


Tried to write a map column to a Parquet file. The table should 
contain:
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
we can see that the table is corrupted:
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 
Don't know how to attach it here though ... :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric TERRAZZONI updated HIVE-8359:
--
Description: 
Tried to write a map column to a Parquet file. The table should 
contain:
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
we can see that the table is corrupted:
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 

  was:
Tried write a map column in a Parquet file. The table should 
contain :
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
We can see that the table is corrupted :
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 
Don't know how to attach it here though ... :)


> Map containing null values are not correctly written in Parquet files
> -
>
> Key: HIVE-8359
> URL: https://issues.apache.org/jira/browse/HIVE-8359
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Frédéric TERRAZZONI
>
> Tried to write a map column to a Parquet file. The table should 
> contain:
> {code}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {"key1":null,"key2":"val2"}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {code}
> ... and when you do a query like {code}SELECT * from mytable{code}
> we can see that the table is corrupted:
> {code}
> {"key3":"val3"}
> {"key4":"val3"}
> {"key3":"val2"}
> {"key4":"val3"}
> {"key1":"val3"}
> {code}
> I've not been able to read the Parquet file in our software afterwards, and 
> consequently I suspect it to be corrupted. 
> For those who are interested, I generated this Parquet table from an Avro 
> file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric TERRAZZONI updated HIVE-8359:
--
Attachment: map_null_val.avro

Avro file containing the sample data. To reproduce the issue, just create a 
Hive table from this file and issue a 
{code}
CREATE TABLE broken_parquet_table STORED AS PARQUET
AS SELECT * FROM the_avro_table;

SELECT * FROM broken_parquet_table;
{code}

> Map containing null values are not correctly written in Parquet files
> -
>
> Key: HIVE-8359
> URL: https://issues.apache.org/jira/browse/HIVE-8359
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Frédéric TERRAZZONI
> Attachments: map_null_val.avro
>
>
> Tried to write a map column to a Parquet file. The table should 
> contain:
> {code}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {"key1":null,"key2":"val2"}
> {"key3":"val3","key4":null}
> {"key3":"val3","key4":null}
> {code}
> ... and when you do a query like {code}SELECT * from mytable{code}
> we can see that the table is corrupted:
> {code}
> {"key3":"val3"}
> {"key4":"val3"}
> {"key3":"val2"}
> {"key4":"val3"}
> {"key1":"val3"}
> {code}
> I've not been able to read the Parquet file in our software afterwards, and 
> consequently I suspect it to be corrupted. 
> For those who are interested, I generated this Parquet table from an Avro 
> file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160536#comment-14160536
 ] 

Thejas M Nair commented on HIVE-8319:
-

This patch makes the Service interface public. In that case we should mark it with 
the @public annotation, and probably @unstable (or at least @evolving) as well.
The interface also needs some cleanup so that unused functions (such as 
register/unregister) are removed. We should also clarify the public/private API 
status of the classes within the org.apache.hive.service package, as users might 
also end up using classes like CompositeService. (I think they should be marked 
@private unless it is clear that users would benefit from them and they can be 
kept stable.)



> Add configuration for custom services in hiveserver2
> 
>
> Key: HIVE-8319
> URL: https://issues.apache.org/jira/browse/HIVE-8319
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8319.1.patch.txt
>
>
> NO PRECOMMIT TESTS
> Register services to hiveserver2, for example, 
> {noformat}
> <property>
>   <name>hive.server2.service.classes</name>
>   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
> </property>
> <property>
>   <name>azkaban.ssl.port</name>
>   <value>...</value>
> </property>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160549#comment-14160549
 ] 

Hive QA commented on HIVE-8357:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673067/HIVE-8357.1.patch.txt

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 6524 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_udf_local_resource
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithDfsResource
org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithLocalResource
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1132/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673067

> Path type entities should use qualified path rather than string
> ---
>
> Key: HIVE-8357
> URL: https://issues.apache.org/jira/browse/HIVE-8357
> Project: Hive
>  Issue Type: Improvement
>  Components: Authorization
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-8357.1.patch.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Status: Patch Available  (was: Open)

> CBO trunk merge: union11 test fails due to incorrect plan
> -
>
> Key: HIVE-8225
> URL: https://issues.apache.org/jira/browse/HIVE-8225
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
> HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
> HIVE-8225.inprogress.patch, HIVE-8225.patch
>
>
> The result changes as if the union didn't have count() inside. The issue 
> can be fixed by using srcunion.value outside the subquery in count (replace 
> count(1) with count(srcunion.value)). Otherwise, it looks like count(1) node 
> from union-ed queries is not present in AST at all, which might cause this 
> result.
> -Interestingly, adding group by to each query in a union produces completely 
> weird result (count(1) is 309 for each key, whereas it should be 1 and the 
> "logical" incorrect value if internal count is lost is 500)- Nm, that groups 
> by table column called key, which is weird but is what Hive does
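
To make the workaround concrete, here is a hedged sketch of the replacement described above; the query shape is only illustrative and is not the actual union11.q test:
{code}
-- Shape similar to the one described: count(1) over a union inside a subquery.
SELECT srcunion.key, count(1)
FROM (SELECT s1.key, s1.value FROM src s1
      UNION ALL
      SELECT s2.key, s2.value FROM src s2) srcunion
GROUP BY srcunion.key;

-- Workaround from the description: count a column of the union instead of count(1).
SELECT srcunion.key, count(srcunion.value)
FROM (SELECT s1.key, s1.value FROM src s1
      UNION ALL
      SELECT s2.key, s2.value FROM src s2) srcunion
GROUP BY srcunion.key;
{code}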



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26209: CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26209/
---

(Updated Oct. 6, 2014, 5:39 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

Create a derived table with a new projection and aggregation to address it.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/PlanModifierForASTConv.java
 3d90ae7 
  ql/src/test/queries/clientpositive/cbo_correctness.q f7f0722 
  ql/src/test/results/clientpositive/cbo_correctness.q.out 3335d4d 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 5920612 

Diff: https://reviews.apache.org/r/26209/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Status: Open  (was: Patch Available)

> CBO trunk merge: union11 test fails due to incorrect plan
> -
>
> Key: HIVE-8225
> URL: https://issues.apache.org/jira/browse/HIVE-8225
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
> HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
> HIVE-8225.inprogress.patch, HIVE-8225.patch
>
>
> The result changes as if the union didn't have count() inside. The issue 
> can be fixed by using srcunion.value outside the subquery in count (replace 
> count(1) with count(srcunion.value)). Otherwise, it looks like count(1) node 
> from union-ed queries is not present in AST at all, which might cause this 
> result.
> -Interestingly, adding group by to each query in a union produces completely 
> weird result (count(1) is 309 for each key, whereas it should be 1 and the 
> "logical" incorrect value if internal count is lost is 500)- Nm, that groups 
> by table column called key, which is weird but is what Hive does



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Attachment: HIVE-8225.5.patch

> CBO trunk merge: union11 test fails due to incorrect plan
> -
>
> Key: HIVE-8225
> URL: https://issues.apache.org/jira/browse/HIVE-8225
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
> HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
> HIVE-8225.inprogress.patch, HIVE-8225.patch
>
>
> The result changes as if the union didn't have count() inside. The issue 
> can be fixed by using srcunion.value outside the subquery in count (replace 
> count(1) with count(srcunion.value)). Otherwise, it looks like count(1) node 
> from union-ed queries is not present in AST at all, which might cause this 
> result.
> -Interestingly, adding group by to each query in a union produces completely 
> weird result (count(1) is 309 for each key, whereas it should be 1 and the 
> "logical" incorrect value if internal count is lost is 500)- Nm, that groups 
> by table column called key, which is weird but is what Hive does



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8340:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and branch 0.14. Thanks for the patch, [~xiaobingo]!

> HiveServer2 service doesn't stop backend jvm process, which prevents 
> follow-up service start.
> -
>
> Key: HIVE-8340
> URL: https://issues.apache.org/jira/browse/HIVE-8340
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
> HIVE-8340.4.patch
>
>
> On stopping the HS2 service from the services tab, it only kills the root 
> process and does not kill the child java process. As a result resources are 
> not freed and this throws an error on restarting from command line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160567#comment-14160567
 ] 

Vaibhav Gumashta commented on HIVE-8340:


Thanks for reviewing the configs [~leftylev]

> HiveServer2 service doesn't stop backend jvm process, which prevents 
> follow-up service start.
> -
>
> Key: HIVE-8340
> URL: https://issues.apache.org/jira/browse/HIVE-8340
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
> HIVE-8340.4.patch
>
>
> On stopping the HS2 service from the services tab, it only kills the root 
> process and does not kill the child java process. As a result resources are 
> not freed and this throws an error on restarting from command line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Aswathy Chellammal Sreekumar (JIRA)
Aswathy Chellammal Sreekumar created HIVE-8360:
--

 Summary: Add cross cluster support for webhcat E2E tests
 Key: HIVE-8360
 URL: https://issues.apache.org/jira/browse/HIVE-8360
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
 Environment: Secure cluster
Reporter: Aswathy Chellammal Sreekumar


In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
since the realm name for user principals is not included in the kinit command. 
This patch concatenates the realm name to the user principal, thereby resulting 
in a successful kinit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-8360:
---
Attachment: AD-MIT.patch

Attaching the patch that implements cross-domain support in a secure cluster for 
the E2E tests. Please review.

> Add cross cluster support for webhcat E2E tests
> ---
>
> Key: HIVE-8360
> URL: https://issues.apache.org/jira/browse/HIVE-8360
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
> Environment: Secure cluster
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: AD-MIT.patch
>
>
> In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
> since the realm name for user principals is not included in the kinit 
> command. This patch concatenates the realm name to the user principal, thereby 
> resulting in a successful kinit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-8340:
-
Labels: TODOC14  (was: )

> HiveServer2 service doesn't stop backend jvm process, which prevents 
> follow-up service start.
> -
>
> Key: HIVE-8340
> URL: https://issues.apache.org/jira/browse/HIVE-8340
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
>  Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
> HIVE-8340.4.patch
>
>
> On stopping the HS2 service from the services tab, it only kills the root 
> process and does not kill the child java process. As a result resources are 
> not freed and this throws an error on restarting from command line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160612#comment-14160612
 ] 

Lefty Leverenz commented on HIVE-8340:
--

Doc note:  This adds *hive.hadoop.classpath* to HiveConf.java, so it needs to 
be documented in the wiki. Although the parameter doesn't start with 
"hive.server2...", it belongs in the HiveServer2 section:

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]

> HiveServer2 service doesn't stop backend jvm process, which prevents 
> follow-up service start.
> -
>
> Key: HIVE-8340
> URL: https://issues.apache.org/jira/browse/HIVE-8340
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
> Environment: Windows
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Critical
>  Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
> HIVE-8340.4.patch
>
>
> On stopping the HS2 service from the services tab, it only kills the root 
> process and does not kill the child java process. As a result resources are 
> not freed and this throws an error on restarting from command line.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8336:
-
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch 0.14.

> Update pom, now that Optiq is renamed to Calcite
> 
>
> Key: HIVE-8336
> URL: https://issues.apache.org/jira/browse/HIVE-8336
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Gunther Hagleitner
> Fix For: 0.14.0
>
> Attachments: HIVE-8336.1.patch
>
>
> Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
> and OPTIQ-430.
> There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
> 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160619#comment-14160619
 ] 

Gunther Hagleitner commented on HIVE-8336:
--

[~leftylev] I've changed the name in HiveConf on commit.

> Update pom, now that Optiq is renamed to Calcite
> 
>
> Key: HIVE-8336
> URL: https://issues.apache.org/jira/browse/HIVE-8336
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Gunther Hagleitner
> Fix For: 0.14.0
>
> Attachments: HIVE-8336.1.patch
>
>
> Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
> and OPTIQ-430.
> There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
> 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160622#comment-14160622
 ] 

Vikram Dixit K commented on HIVE-8336:
--

+1 for 0.14

> Update pom, now that Optiq is renamed to Calcite
> 
>
> Key: HIVE-8336
> URL: https://issues.apache.org/jira/browse/HIVE-8336
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Gunther Hagleitner
> Fix For: 0.14.0
>
> Attachments: HIVE-8336.1.patch
>
>
> Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
> and OPTIQ-430.
> There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
> 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Open  (was: Patch Available)

The unit test is failing due to timing issues.

> Compactor cleaners can be starved on a busy table or partition.
> ---
>
> Key: HIVE-8258
> URL: https://issues.apache.org/jira/browse/HIVE-8258
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
> HIVE-8258.patch
>
>
> Currently the cleaning thread in the compactor does not run on a table or 
> partition while any locks are held on this partition.  This leaves it open to 
> starvation in the case of a busy table or partition.  It only needs to wait 
> until all locks on the table/partition at the time of the compaction have 
> expired.  Any jobs initiated after that (and thus any locks obtained) will be 
> for the new versions of the files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Attachment: HIVE-8258.4.patch

A new version of the patch that actually makes sure the cleaner goes through 
the loop rather than relying on timing and hoping it works out.

> Compactor cleaners can be starved on a busy table or partition.
> ---
>
> Key: HIVE-8258
> URL: https://issues.apache.org/jira/browse/HIVE-8258
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
> HIVE-8258.patch
>
>
> Currently the cleaning thread in the compactor does not run on a table or 
> partition while any locks are held on this partition.  This leaves it open to 
> starvation in the case of a busy table or partition.  It only needs to wait 
> until all locks on the table/partition at the time of the compaction have 
> expired.  Any jobs initiated after that (and thus any locks obtained) will be 
> for the new versions of the files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Patch Available  (was: Open)

> Compactor cleaners can be starved on a busy table or partition.
> ---
>
> Key: HIVE-8258
> URL: https://issues.apache.org/jira/browse/HIVE-8258
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
> HIVE-8258.patch
>
>
> Currently the cleaning thread in the compactor does not run on a table or 
> partition while any locks are held on this partition.  This leaves it open to 
> starvation in the case of a busy table or partition.  It only needs to wait 
> until all locks on the table/partition at the time of the compaction have 
> expired.  Any jobs initiated after that (and thus any locks obtained) will be 
> for the new versions of the files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Status: Open  (was: Patch Available)

> Hive on Tez sets mapreduce.framework.name to yarn-tez
> -
>
> Key: HIVE-8344
> URL: https://issues.apache.org/jira/browse/HIVE-8344
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch
>
>
> This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
> we don't switch back when the user specifies MR as exec engine.
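
To illustrate the configuration interplay described above, a sketch of session commands only; the pre-fix behavior noted in the comment is taken from this report, not re-verified here:
{code}
set hive.execution.engine=tez;  -- Hive sets mapreduce.framework.name to yarn-tez
set hive.execution.engine=mr;   -- before this fix, mapreduce.framework.name is not switched back
set mapreduce.framework.name;   -- prints the current value so the behavior can be checked
{code}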



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Status: Patch Available  (was: Open)

> Hive on Tez sets mapreduce.framework.name to yarn-tez
> -
>
> Key: HIVE-8344
> URL: https://issues.apache.org/jira/browse/HIVE-8344
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch
>
>
> This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
> we don't switch back when the user specifies MR as exec engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Attachment: HIVE-8344.2.patch

> Hive on Tez sets mapreduce.framework.name to yarn-tez
> -
>
> Key: HIVE-8344
> URL: https://issues.apache.org/jira/browse/HIVE-8344
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch
>
>
> This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
> we don't switch back when the user specifies MR as exec engine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160650#comment-14160650
 ] 

Szehon Ho commented on HIVE-7375:
-

[~brocknoland] I filed this some time back to try to catch hadoop-1 compile 
errors in precommit (at the time, to avoid having to fund an additional 
precommit machine cluster for hadoop-1). Are you thinking we can get funding 
for one more cluster for hadoop-1 in the near future, as HIVE-8351 suggests? 
If so, I can resolve this JIRA in favor of that one.

> Add option in test infra to compile in other profiles (like hadoop-1)
> -
>
> Key: HIVE-7375
> URL: https://issues.apache.org/jira/browse/HIVE-7375
> Project: Hive
>  Issue Type: Test
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> As we are seeing some commits breaking hadoop-1 compilation due to lack of 
> pre-commit coverage, it might be nice to add an option in the test infra to 
> compile on optional profiles as a pre-step before testing on the main profile.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160667#comment-14160667
 ] 

Brock Noland commented on HIVE-7375:


Yes, I think we can resolve this one in favor of HIVE-8351.

> Add option in test infra to compile in other profiles (like hadoop-1)
> -
>
> Key: HIVE-7375
> URL: https://issues.apache.org/jira/browse/HIVE-7375
> Project: Hive
>  Issue Type: Test
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> As we are seeing some commits breaking hadoop-1 compilation due to lack of 
> pre-commit coverage, it might be nice to add an option in the test infra to 
> compile on optional profiles as a pre-step before testing on the main profile.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)
Harish Butani created HIVE-8361:
---

 Summary: NPE in PTFOperator when there are empty partitions
 Key: HIVE-8361
 URL: https://issues.apache.org/jira/browse/HIVE-8361
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani


Here is a simple query to reproduce this:
{code}
select sum(p_size) over (partition by p_mfgr )
from part where p_mfgr = 'some non existent mfgr';
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-8361:

Status: Patch Available  (was: Open)

> NPE in PTFOperator when there are empty partitions
> --
>
> Key: HIVE-8361
> URL: https://issues.apache.org/jira/browse/HIVE-8361
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-8361.1.patch
>
>
> Here is a simple query to reproduce this:
> {code}
> select sum(p_size) over (partition by p_mfgr )
> from part where p_mfgr = 'some non existent mfgr';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-8361:

Attachment: HIVE-8361.1.patch

> NPE in PTFOperator when there are empty partitions
> --
>
> Key: HIVE-8361
> URL: https://issues.apache.org/jira/browse/HIVE-8361
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-8361.1.patch
>
>
> Here is a simple query to reproduce this:
> {code}
> select sum(p_size) over (partition by p_mfgr )
> from part where p_mfgr = 'some non existent mfgr';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-8292:
--
Attachment: HIVE-8292.1.patch

This patch addresses the regression but doesn't handle multiple inputs for SMB 
join.

> Reading from partitioned bucketed tables has high overhead in 
> MapOperator.cleanUpInputFileChangedOp
> ---
>
> Key: HIVE-8292
> URL: https://issues.apache.org/jira/browse/HIVE-8292
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: cn105
>Reporter: Mostafa Mokhtar
>Assignee: Vikram Dixit K
> Fix For: 0.14.0
>
> Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch
>
>
> Reading from bucketed partitioned tables has significantly higher overhead 
> compared to non-bucketed non-partitioned files.
> 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
> 5% of the CPU is in 
> {code}
>  Path onepath = normalizePath(onefile);
> {code}
> And 45% of the CPU is in 
> {code}
>  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
> {code}
> From the profiler 
> {code}
> Stack Trace   Sample CountPercentage(%)
> hive.ql.exec.tez.MapRecordSource.processRow(Object)   5,327   62.348
>hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326   62.336
>   hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851   56.777
>  hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849   56.753
>  java.net.URI.relativize(URI) 3,903   45.681
> java.net.URI.relativize(URI, URI) 3,903   
> 45.681
>java.net.URI.normalize(String) 2,169   
> 25.386
>java.net.URI.equal(String, String) 
> 526 6.156
>java.net.URI.equalIgnoringCase(String, 
> String) 1   0.012
>java.lang.String.substring(int)
> 1   0.012
> hive.ql.exec.MapOperator.normalizePath(String)506 5.922
> org.apache.commons.logging.impl.Log4JLogger.info(Object)  32  
> 0.375
>  java.net.URI.equals(Object)  12  0.14
>  java.util.HashMap$KeySet.iterator()  5   
> 0.059
>  java.util.HashMap.get(Object)4   
> 0.047
>  java.util.LinkedHashMap.get(Object)  3   
> 0.035
>  hive.ql.exec.Operator.cleanUpInputFileChanged()  1   0.012
>   hive.ql.exec.Operator.forward(Object, ObjectInspector)  473 5.536
>   hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1   0.012
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160683#comment-14160683
 ] 

Mostafa Mokhtar commented on HIVE-8292:
---

[~vikram.dixit]
Patch which addresses the regression attached.

> Reading from partitioned bucketed tables has high overhead in 
> MapOperator.cleanUpInputFileChangedOp
> ---
>
> Key: HIVE-8292
> URL: https://issues.apache.org/jira/browse/HIVE-8292
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: cn105
>Reporter: Mostafa Mokhtar
>Assignee: Vikram Dixit K
> Fix For: 0.14.0
>
> Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch
>
>
> Reading from bucketed partitioned tables has significantly higher overhead 
> compared to non-bucketed non-partitioned files.
> 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
> 5% of the CPU is in 
> {code}
>  Path onepath = normalizePath(onefile);
> {code}
> And 45% of the CPU is in 
> {code}
>  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
> {code}
> From the profiler 
> {code}
> Stack Trace   Sample CountPercentage(%)
> hive.ql.exec.tez.MapRecordSource.processRow(Object)   5,327   62.348
>hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326   62.336
>   hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851   56.777
>  hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849   56.753
>  java.net.URI.relativize(URI) 3,903   45.681
> java.net.URI.relativize(URI, URI) 3,903   
> 45.681
>java.net.URI.normalize(String) 2,169   
> 25.386
>java.net.URI.equal(String, String) 
> 526 6.156
>java.net.URI.equalIgnoringCase(String, 
> String) 1   0.012
>java.lang.String.substring(int)
> 1   0.012
> hive.ql.exec.MapOperator.normalizePath(String)506 5.922
> org.apache.commons.logging.impl.Log4JLogger.info(Object)  32  
> 0.375
>  java.net.URI.equals(Object)  12  0.14
>  java.util.HashMap$KeySet.iterator()  5   
> 0.059
>  java.util.HashMap.get(Object)4   
> 0.047
>  java.util.LinkedHashMap.get(Object)  3   
> 0.035
>  hive.ql.exec.Operator.cleanUpInputFileChanged()  1   0.012
>   hive.ql.exec.Operator.forward(Object, ObjectInspector)  473 5.536
>   hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1   0.012
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160690#comment-14160690
 ] 

Szehon Ho commented on HIVE-6500:
-

Hi [~leftylev], I had a question about docs. I came across an outdated wiki 
page that still mentions db as the only option; should that page be maintained, 
since FS is now supported? 
[https://cwiki.apache.org/confluence/display/Hive/StatsDev|https://cwiki.apache.org/confluence/display/Hive/StatsDev]
It is actually not linked from the top, but it does seem useful. What is 
the policy for these pages?

> Stats collection via filesystem
> ---
>
> Key: HIVE-6500
> URL: https://issues.apache.org/jira/browse/HIVE-6500
> Project: Hive
>  Issue Type: New Feature
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC14
> Fix For: 0.13.0
>
> Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
>
>
> Recently, support for stats gathering via counters was [added | 
> https://issues.apache.org/jira/browse/HIVE-4632]. Although it is useful, it has 
> the following issues:
> * [Length of counter group name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
> * [Length of counter name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
> * [Number of distinct counter groups are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
> * [Number of distinct counters are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
> Although these limits are configurable, setting them to a higher value implies 
> an increased memory load on the AM and the job history server.
> Now, whether these limits make sense is [debatable | 
> https://issues.apache.org/jira/browse/MAPREDUCE-5680]; regardless, it is 
> desirable that Hive not rely on the counter features of the framework, so that 
> this feature can evolve without depending on framework support. Filesystem-based 
> stats collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]

2014-10-06 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-8362:
-

 Summary: Investigate flaky test parallel.q [Spark Branch]
 Key: HIVE-8362
 URL: https://issues.apache.org/jira/browse/HIVE-8362
 Project: Hive
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


Test parallel.q is flaky. It sometimes fails with an error like:

{noformat}
Failed tests: 
  TestSparkCliDriver.testCliDriver_parallel:120->runTest:146 Unexpected 
exception junit.framework.AssertionFailedError: Client Execution results failed 
with error code = 1
See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or 
check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ 
for specific test cases logs.
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8168) With dynamic partition enabled fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality using physical plan generation)

2014-10-06 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-8168:
-
Attachment: HIVE-8168.4.patch

Addressed [~mmokhtar]'s review comments.

> With dynamic partition enabled fact table selectivity is not taken into 
> account when generating the physical plan (Use CBO cardinality using physical 
> plan generation)
> --
>
> Key: HIVE-8168
> URL: https://issues.apache.org/jira/browse/HIVE-8168
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Mostafa Mokhtar
>Assignee: Prasanth J
>Priority: Critical
>  Labels: performance
> Fix For: vectorization-branch, 0.14.0
>
> Attachments: HIVE-8168.1.patch, HIVE-8168.2.patch, HIVE-8168.3.patch, 
> HIVE-8168.4.patch
>
>
> When estimating row counts and data sizes during physical plan generation, 
> StatsRulesProcFactory does not know that dynamic partition pruning will happen, 
> and it is hard to know how many partitions will qualify at runtime. As a result, 
> with dynamic partition pruning enabled, query 32 can run with 570 tasks, compared 
> to 70 tasks with dynamic partition pruning disabled and explicit partition 
> filters on the fact table.
> The long-term solution is to use the cardinality estimates from CBO, since they 
> take join selectivity into account. The CBO estimates won't address the number 
> of tasks used for the partitioned table, but they will address the incorrect 
> number of tasks used for the subsequent reducers, which is where the majority of 
> the slowdown comes from.
> Plan dynamic partition pruning on 
> {code}
>Map 5 
> Map Operator Tree:
> TableScan
>   alias: ss
>   filterExpr: ss_store_sk is not null (type: boolean)
>   Statistics: Num rows: 550076554 Data size: 47370018896 
> Basic stats: COMPLETE Column stats: NONE
>   Filter Operator
> predicate: ss_store_sk is not null (type: boolean)
> Statistics: Num rows: 275038277 Data size: 23685009448 
> Basic stats: COMPLETE Column stats: NONE
> Map Join Operator
>   condition map:
>Inner Join 0 to 1
>   condition expressions:
> 0 {ss_store_sk} {ss_net_profit}
> 1 
>   keys:
> 0 ss_sold_date_sk (type: int)
> 1 d_date_sk (type: int)
>   outputColumnNames: _col6, _col21
>   input vertices:
> 1 Map 1
>   Statistics: Num rows: 302542112 Data size: 26053511168 
> Basic stats: COMPLETE Column stats: NONE
>   Map Join Operator
> condition map:
>  Inner Join 0 to 1
> condition expressions:
>   0 {_col21}
>   1 {s_county} {s_state}
> keys:
>   0 _col6 (type: int)
>   1 s_store_sk (type: int)
> outputColumnNames: _col21, _col80, _col81
> input vertices:
>   1 Map 2
> Statistics: Num rows: 332796320 Data size: 
> 28658862080 Basic stats: COMPLETE Column stats: NONE
> Map Join Operator
>   condition map:
>Left Semi Join 0 to 1
>   condition expressions:
> 0 {_col21} {_col80} {_col81}
> 1 
>   keys:
> 0 _col81 (type: string)
> 1 _col0 (type: string)
>   outputColumnNames: _col21, _col80, _col81
>   input vertices:
> 1 Reducer 11
>   Statistics: Num rows: 366075968 Data size: 
> 31524749312 Basic stats: COMPLETE Column stats: NONE
>   Select Operator
> expressions: _col81 (type: string), _col80 (type: 
> string), _col21 (type: float)
> outputColumnNames: _col81, _col80, _col21
> Statistics: Num rows: 366075968 Data size: 
> 31524749312 Basic stats: COMPLETE Column stats: NONE
> Group By Operator
>   aggregations: sum(_col

[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160716#comment-14160716
 ] 

Jimmy Xiang commented on HIVE-8352:
---

Parallel.q is ok for me locally sometimes. Filed HIVE-8362 to look into the 
failure.

> Enable windowing.q for spark
> 
>
> Key: HIVE-8352
> URL: https://issues.apache.org/jira/browse/HIVE-8352
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
> hive-8385.patch
>
>
> We should enable windowing.q for basic windowing coverage. After checking out 
> the spark branch, we would build:
> {noformat}
> $ mvn clean install -DskipTests -Phadoop-2
> $ cd itests/
> $ mvn clean install -DskipTests -Phadoop-2
> {noformat}
> Then generate the windowing.q.out file:
> {noformat}
> $ cd qtest-spark/
> $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
> -Dtest.output.overwrite=true
> {noformat}
> Compare the output against MapReduce:
> {noformat}
> $ diff -y -W 150 
> ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
> ../../ql/src/test/results/clientpositive/windowing.q.out| less
> {noformat}
> And if everything looks good, add it to {{spark.query.files}} in 
> {{./itests/src/test/resources/testconfiguration.properties}}
> then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1

2014-10-06 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-8363:
---

 Summary: AccumuloStorageHandler compile failure hadoop-1
 Key: HIVE-8363
 URL: https://issues.apache.org/jira/browse/HIVE-8363
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.14.0
Reporter: Szehon Ho
Priority: Blocker


There's an error when compiling AccumuloStorageHandler on hadoop-1. It seems 
the signature of split() is not the same. Looks like we should use another 
utility method to fix this.

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-accumulo-handler: Compilation failure
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52]
 no suitable method found for split(java.lang.String,char)
[ERROR] method 
org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not 
applicable
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-8364:


 Summary: We're not waiting for all inputs in MapRecordProcessor on 
Tez
 Key: HIVE-8364
 URL: https://issues.apache.org/jira/browse/HIVE-8364
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: 0.14.0


Seems like this could be a race condition: We're blocking for some inputs to 
become available, but the main MR input is just assumed ready...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8364:
-
Attachment: HIVE-8364.1.patch

Proposed patch.

> We're not waiting for all inputs in MapRecordProcessor on Tez
> -
>
> Key: HIVE-8364
> URL: https://issues.apache.org/jira/browse/HIVE-8364
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Vikram Dixit K
> Fix For: 0.14.0
>
> Attachments: HIVE-8364.1.patch
>
>
> Seems like this could be a race condition: We're blocking for some inputs to 
> become available, but the main MR input is just assumed ready...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160729#comment-14160729
 ] 

Gopal V commented on HIVE-8292:
---

[~mmokhtar]: Probably better to just read exec context off 
mapOp.getExecContext().

> Reading from partitioned bucketed tables has high overhead in 
> MapOperator.cleanUpInputFileChangedOp
> ---
>
> Key: HIVE-8292
> URL: https://issues.apache.org/jira/browse/HIVE-8292
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.14.0
> Environment: cn105
>Reporter: Mostafa Mokhtar
>Assignee: Vikram Dixit K
> Fix For: 0.14.0
>
> Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch
>
>
> Reading from bucketed partitioned tables has significantly higher overhead 
> compared to non-bucketed non-partitioned files.
> 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
> 5% of the CPU is in 
> {code}
>  Path onepath = normalizePath(onefile);
> {code}
> And 45% of the CPU is in 
> {code}
>  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
> {code}
> From the profiler 
> {code}
> Stack Trace   Sample CountPercentage(%)
> hive.ql.exec.tez.MapRecordSource.processRow(Object)   5,327   62.348
>hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326   62.336
>   hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851   56.777
>  hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849   56.753
>  java.net.URI.relativize(URI) 3,903   45.681
> java.net.URI.relativize(URI, URI) 3,903   
> 45.681
>java.net.URI.normalize(String) 2,169   
> 25.386
>java.net.URI.equal(String, String) 
> 526 6.156
>java.net.URI.equalIgnoringCase(String, 
> String) 1   0.012
>java.lang.String.substring(int)
> 1   0.012
> hive.ql.exec.MapOperator.normalizePath(String)506 5.922
> org.apache.commons.logging.impl.Log4JLogger.info(Object)  32  
> 0.375
>  java.net.URI.equals(Object)  12  0.14
>  java.util.HashMap$KeySet.iterator()  5   
> 0.059
>  java.util.HashMap.get(Object)4   
> 0.047
>  java.util.LinkedHashMap.get(Object)  3   
> 0.035
>  hive.ql.exec.Operator.cleanUpInputFileChanged()  1   0.012
>   hive.ql.exec.Operator.forward(Object, ObjectInspector)  473 5.536
>   hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1   0.012
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8364:
-
Status: Patch Available  (was: Open)

> We're not waiting for all inputs in MapRecordProcessor on Tez
> -
>
> Key: HIVE-8364
> URL: https://issues.apache.org/jira/browse/HIVE-8364
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Vikram Dixit K
> Fix For: 0.14.0
>
> Attachments: HIVE-8364.1.patch
>
>
> Seems like this could be a race condition: We're blocking for some inputs to 
> become available, but the main MR input is just assumed ready...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8227) NPE w/ hive on tez when doing unions on empty tables

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8227:
-
Fix Version/s: 0.14.0

> NPE w/ hive on tez when doing unions on empty tables
> 
>
> Key: HIVE-8227
> URL: https://issues.apache.org/jira/browse/HIVE-8227
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.14.0
>
> Attachments: HIVE-8227.1.patch, HIVE-8227.2.patch
>
>
> We're looking at aliasToWork.values() to determine input paths etc. This can 
> contain nulls when we're scanning empty tables.
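
A hedged sketch of the kind of query the description implies; the table names are illustrative and this is not a verified reproduction:
{code}
-- empty_t has no data files; per the description, a union over an empty table hits the NPE on Tez.
CREATE TABLE empty_t (key STRING, value STRING);

SELECT key, value FROM src
UNION ALL
SELECT key, value FROM empty_t;
{code}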



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8272) Query with particular decimal expression causes NPE during execution initialization

2014-10-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160762#comment-14160762
 ] 

Ashutosh Chauhan commented on HIVE-8272:


+1

> Query with particular decimal expression causes NPE during execution 
> initialization
> ---
>
> Key: HIVE-8272
> URL: https://issues.apache.org/jira/browse/HIVE-8272
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Physical Optimizer
>Reporter: Matt McCline
>Assignee: Jason Dere
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8272.1.patch
>
>
> Query:
> {code}
> select 
>   cast(sum(dc)*100 as decimal(11,3)) as c1
>   from somedecimaltable
>   order by c1
>   limit 100;
> {code}
> Fails during execution initialization due to *null* ExprNodeDesc.
> Noticed while trying to simplify a Vectorization issue and realized it was a 
> more general issue.
> {code}
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
>   ... 22 more
> Caused by: java.lang.RuntimeException: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:215)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:427)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
>   ... 22 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:148)
>   ... 38 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Open  (was: Patch Available)

Found an issue where this patch prevents the initiator from starting properly.

> Compactor cleaners can be starved on a busy table or partition.
> ---
>
> Key: HIVE-8258
> URL: https://issues.apache.org/jira/browse/HIVE-8258
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
> HIVE-8258.patch
>
>
> Currently the cleaning thread in the compactor does not run on a table or 
> partition while any locks are held on this partition.  This leaves it open to 
> starvation in the case of a busy table or partition.  It only needs to wait 
> until all locks on the table/partition at the time of the compaction have 
> expired.  Any jobs initiated after that (and thus any locks obtained) will be 
> for the new versions of the files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8365) TPCDS query #7 fails with IndexOutOfBoundsException [Spark Branch]

2014-10-06 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-8365:
-

 Summary: TPCDS query #7 fails with IndexOutOfBoundsException 
[Spark Branch]
 Key: HIVE-8365
 URL: https://issues.apache.org/jira/browse/HIVE-8365
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang


Running TPCDS query #17, given below, results in an IndexOutOfBoundsException: 
{code}
14/10/06 12:24:05 ERROR executor.Executor: Exception in task 0.0 in stage 7.0 
(TID 2)
java.lang.IndexOutOfBoundsException: Index: 1902425, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at 
org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:820)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
at 
org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:51)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:114)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:139)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:92)
at 
scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:210)
at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
{code}

The query is:
{code}
select
  i_item_id,
  avg(ss_quantity) agg1,
  avg(ss_list_price) agg2,
  avg(ss_coupon_amt) agg3,
  avg(ss_sales_price) agg4
from
  store_sales,
  customer_demographics,
  date_dim,
  item,
  promotion
where
  ss_sold_date_sk = d_date_sk
  and ss_item_sk = i_item_sk
  and ss_cdemo_sk = cd_demo_sk
  and ss_promo_sk = p_promo_sk
  and cd_gender = 'F'
  and cd_marital_status = 'W'
  and cd_education_status = 'Primary'
  and (p_channel_email = 'N'
or p_channel_event = 'N')
  and d_year = 1998
  and ss_sold_date_sk between 2450815 and 2451179 -- partition key filter
group by
  i_item_id
order by
  i_item_id
limit 100;
{code}
Many other TPCDS queries give the same exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8366:
--

 Summary: CBO fails if there is a table sample in subquery
 Key: HIVE-8366
 URL: https://issues.apache.org/jira/browse/HIVE-8366
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8366.patch

Bail out from cbo in such cases.
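The bail-out can be pictured with a small sketch using hypothetical types (the actual change is in HiveOptiqUtil.java): walk the query blocks, including subqueries, and skip CBO as soon as any table reference carries a TABLESAMPLE.

{code}
// Hypothetical types for illustration only -- not HiveOptiqUtil's real signature.
class CboSupportCheck {
  interface QueryBlockView {
    boolean hasTableSample();                 // any TABLESAMPLE on a table reference
    Iterable<QueryBlockView> subQueries();    // nested query blocks
  }

  /** Returns false when CBO should bail out for this query. */
  static boolean canRunCbo(QueryBlockView qb) {
    if (qb.hasTableSample()) {
      return false;
    }
    for (QueryBlockView sub : qb.subQueries()) {
      if (!canRunCbo(sub)) {
        return false;
      }
    }
    return true;
  }
}
{code}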



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8366:
---
Attachment: HIVE-8366.patch

> CBO fails if there is a table sample in subquery
> 
>
> Key: HIVE-8366
> URL: https://issues.apache.org/jira/browse/HIVE-8366
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-8366.patch
>
>
> Bail out from cbo in such cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8366:
---
Status: Patch Available  (was: Open)

> CBO fails if there is a table sample in subquery
> 
>
> Key: HIVE-8366
> URL: https://issues.apache.org/jira/browse/HIVE-8366
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 0.14.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-8366.patch
>
>
> Bail out from cbo in such cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160813#comment-14160813
 ] 

Lefty Leverenz commented on HIVE-6500:
--

Good catch, [~szehon].  Yes, the "Newly Created Tables" section of the StatsDev 
wikidoc needs to be updated, keeping in mind that releases 0.7 through 0.12 have 
"jdbc:derby" as the default for *hive.stats.dbclass*, so we can't just swap in 
the new default value.  Linking to/from *hive.stats.dbclass* in the 
Configuration Properties doc will help with future maintenance.

* [StatsDev -- Newly Created Tables | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-NewlyCreatedTables]
* [Configuration Properties -- hive.stats.dbclass | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass]

Also, the HiveConf.java description of *hive.stats.dbclass* omits the "fs" 
value.  I can correct that in the next patch for HIVE-6586, perhaps using the 
wiki description or a variant of it:

{quote}
The storage that stores temporary Hive statistics. In FS based statistics 
collection, each task writes statistics it has collected in a file on the 
filesystem, which will be aggregated after the job has finished. Supported 
values are fs (filesystem), jdbc(:.*), hbase, counter and custom (HIVE-6500).
{quote}

Suggested changes to that description:  (1) change "FS" to "filesystem (fs)", 
(2) remove or move "(HIVE-6500)" so it doesn't imply that HIVE-6500 added 
"custom", (3) change "jdbc(:.*)" to "jdbc:<dbtype>" and explain that 
<dbtype> can be derby, mysql, ... and what others -- is there a complete list 
anywhere?

P.S.  What do you mean by "It is actually not linked from the top"?  Top of 
what?  Maybe you mean it belongs on the Home page.  Currently it's listed on 
the LanguageManual page, but that's easy to change -- we can even list it both 
places.
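For readers who haven't seen the fs mechanism that description refers to, here is a toy sketch (made-up directory layout and file format, not Hive's actual publisher/aggregator classes) of per-task stats files being written and then summed once the job finishes:

{code}
// Toy illustration of fs-based stats collection; paths and format are invented.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsStatsSketch {
  /** Each task writes the row count it collected to its own small file. */
  static void publish(Configuration conf, Path statsDir, String taskId, long rowCount)
      throws Exception {
    FileSystem fs = statsDir.getFileSystem(conf);
    try (FSDataOutputStream out = fs.create(new Path(statsDir, taskId))) {
      out.writeBytes(Long.toString(rowCount));
    }
  }

  /** After the job finishes, the per-task files are read back and aggregated. */
  static long aggregate(Configuration conf, Path statsDir) throws Exception {
    FileSystem fs = statsDir.getFileSystem(conf);
    long total = 0;
    for (FileStatus st : fs.listStatus(statsDir)) {
      try (BufferedReader r =
               new BufferedReader(new InputStreamReader(fs.open(st.getPath())))) {
        String line = r.readLine();
        if (line != null) {
          total += Long.parseLong(line.trim());
        }
      }
    }
    return total;
  }
}
{code}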

> Stats collection via filesystem
> ---
>
> Key: HIVE-6500
> URL: https://issues.apache.org/jira/browse/HIVE-6500
> Project: Hive
>  Issue Type: New Feature
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC14
> Fix For: 0.13.0
>
> Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
>
>
> Recently, support for stats gathering via counters was [added | 
> https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the 
> following issues:
> * [Length of counter group name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
> * [Length of counter name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
> * [Number of distinct counter groups are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
> * [Number of distinct counters are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
> Although these limits are configurable, setting them to a higher value 
> implies increased memory load on the AM and the job history server.
> Now, whether these limits make sense or not is [debatable | 
> https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
> Hive doesn't make use of the framework's counter features, so that we can 
> evolve this feature without relying on support from the framework. Filesystem-based 
> counter collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 26379: Disable cbo for tablesample

2014-10-06 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26379/
---

Review request for hive and John Pullokkaran.


Bugs: HIVE-8366
https://issues.apache.org/jira/browse/HIVE-8366


Repository: hive-git


Description
---

Disable cbo for tablesample


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
7c2b0cd 

Diff: https://reviews.apache.org/r/26379/diff/


Testing
---

udf_substr.q


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-8120) Umbrella JIRA tracking Parquet improvements

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160823#comment-14160823
 ] 

Brock Noland commented on HIVE-8120:


Linking to HIVE-4329

> Umbrella JIRA tracking Parquet improvements
> ---
>
> Key: HIVE-8120
> URL: https://issues.apache.org/jira/browse/HIVE-8120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-6500:
-
Labels: TODOC13 TODOC14  (was: TODOC14)

> Stats collection via filesystem
> ---
>
> Key: HIVE-6500
> URL: https://issues.apache.org/jira/browse/HIVE-6500
> Project: Hive
>  Issue Type: New Feature
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC13, TODOC14
> Fix For: 0.13.0
>
> Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
>
>
> Recently, support for stats gathering via counters was [added | 
> https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the 
> following issues:
> * [Length of counter group name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
> * [Length of counter name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
> * [Number of distinct counter groups are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
> * [Number of distinct counters are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
> Although these limits are configurable, setting them to a higher value 
> implies increased memory load on the AM and the job history server.
> Now, whether these limits make sense or not is [debatable | 
> https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
> Hive doesn't make use of the framework's counter features, so that we can 
> evolve this feature without relying on support from the framework. Filesystem-based 
> counter collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7800) Parquet Column Index Access Schema Size Checking

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7800:
---
   Resolution: Fixed
Fix Version/s: (was: 0.14.0)
   0.15.0
   Status: Resolved  (was: Patch Available)

Thank you so much Daniel! I have committed this to trunk.

[~vikram.dixit] could we get this into 0.14?

> Parquet Column Index Access Schema Size Checking
> 
>
> Key: HIVE-7800
> URL: https://issues.apache.org/jira/browse/HIVE-7800
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Daniel Weeks
>Assignee: Daniel Weeks
>Priority: Critical
> Fix For: 0.15.0
>
> Attachments: HIVE-7800.1.patch, HIVE-7800.2.patch, HIVE-7800.3.patch
>
>
> In the case that a parquet formatted table has partitions where the files 
> have different size schema, using column index access can result in an index 
> out of bounds exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003107#comment-14003107
 ] 

Lefty Leverenz edited comment on HIVE-6500 at 10/6/14 8:02 PM:
---

Unfortunately my review board advice not to patch hive-default.xml.template led 
to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in 
the template file.  But it's updated in the most recent patch for HIVE-6037, so 
presumably it will be corrected by release 0.14.0.

Sorry about that.

Edit:  The updated parameter description didn't make it into the new version of 
HiveConf.java, so it needs to be fixed in another patch.  (I suggest HIVE-6586.)


was (Author: le...@hortonworks.com):
Unfortunately my review board advice not to patch hive-default.xml.template led 
to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in 
the template file.  But it's updated in the most recent patch for HIVE-6037, so 
presumably it will be corrected by release 0.14.0.

Sorry about that.

> Stats collection via filesystem
> ---
>
> Key: HIVE-6500
> URL: https://issues.apache.org/jira/browse/HIVE-6500
> Project: Hive
>  Issue Type: New Feature
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC13, TODOC14
> Fix For: 0.13.0
>
> Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch
>
>
> Recently, support for stats gathering via counters was [added | 
> https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has the 
> following issues:
> * [Length of counter group name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
> * [Length of counter name is limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
> * [Number of distinct counter groups are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
> * [Number of distinct counters are limited | 
> https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
> Although these limits are configurable, setting them to a higher value 
> implies increased memory load on the AM and the job history server.
> Now, whether these limits make sense or not is [debatable | 
> https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
> Hive doesn't make use of the framework's counter features, so that we can 
> evolve this feature without relying on support from the framework. Filesystem-based 
> counter collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160851#comment-14160851
 ] 

Mostafa Mokhtar commented on HIVE-8361:
---

[~rhbutani]
Validated the fix on query98 and it ran fine.

> NPE in PTFOperator when there are empty partitions
> --
>
> Key: HIVE-8361
> URL: https://issues.apache.org/jira/browse/HIVE-8361
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: HIVE-8361.1.patch
>
>
> Here is a simple query to reproduce this:
> {code}
> select sum(p_size) over (partition by p_mfgr )
> from part where p_mfgr = 'some non existent mfgr';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8352:
---
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

> Enable windowing.q for spark
> 
>
> Key: HIVE-8352
> URL: https://issues.apache.org/jira/browse/HIVE-8352
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: spark-branch
>
> Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
> hive-8385.patch
>
>
> We should enable windowing.q for basic windowing coverage. After checking out 
> the spark branch, we would build:
> {noformat}
> $ mvn clean install -DskipTests -Phadoop-2
> $ cd itests/
> $ mvn clean install -DskipTests -Phadoop-2
> {noformat}
> Then generate the windowing.q.out file:
> {noformat}
> $ cd qtest-spark/
> $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
> -Dtest.output.overwrite=true
> {noformat}
> Compare the output against MapReduce:
> {noformat}
> $ diff -y -W 150 
> ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
> ../../ql/src/test/results/clientpositive/windowing.q.out| less
> {noformat}
> And if everything looks good, add it to {{spark.query.files}} in 
> {{./itests/src/test/resources/testconfiguration.properties}}
> then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26325: HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26325/#review55571
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 3, 2014, 7:13 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26325/
> ---
> 
> (Updated Oct. 3, 2014, 7:13 p.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-8172
> https://issues.apache.org/jira/browse/HIVE-8172
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-8172
> 
> 
> Diffs
> -
> 
>   jdbc/src/java/org/apache/hive/jdbc/Utils.java e6b1a36 
>   jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java 06795a5 
> 
> Diff: https://reviews.apache.org/r/26325/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160863#comment-14160863
 ] 

Thejas M Nair commented on HIVE-8172:
-

+1

> HiveServer2 dynamic service discovery should let the JDBC client use default 
> ZooKeeper namespace
> 
>
> Key: HIVE-8172
> URL: https://issues.apache.org/jira/browse/HIVE-8172
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Critical
>  Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-8172.1.patch
>
>
> Currently the client provides a url like:
>  
> jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
>  
> The zooKeeperNamespace param when not provided should use the default value.
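For illustration, a minimal JDBC sketch of the URL form this enables, with placeholder ZooKeeper hosts and the zooKeeperNamespace parameter left out so the default applies (the driver class name is assumed for this sketch):

{code}
// Placeholder hosts; the driver class name is assumed for this sketch.
import java.sql.Connection;
import java.sql.DriverManager;

public class DiscoveryJdbcExample {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // No zooKeeperNamespace=... -- the default namespace should be used.
    String url = "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/"
        + ";serviceDiscoveryMode=zooKeeper";
    try (Connection conn = DriverManager.getConnection(url, "user", "")) {
      System.out.println("connected = " + !conn.isClosed());
    }
  }
}
{code}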



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8335) TestHCatLoader/TestHCatStorer failures on pre-commit tests

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-8335.
--
   Resolution: Fixed
Fix Version/s: 0.14.0
 Assignee: Gopal V

Issue was resolved by Gopal reverting HIVE-8271.

> TestHCatLoader/TestHCatStorer failures on pre-commit tests
> --
>
> Key: HIVE-8335
> URL: https://issues.apache.org/jira/browse/HIVE-8335
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Tests
>Reporter: Jason Dere
>Assignee: Gopal V
> Fix For: 0.14.0
>
>
> Looks like a number of Hive pre-commit tests have been failing with the 
> following failures:
> {noformat}
> org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes[5]
> org.apache.hive.hcatalog.pig.TestHCatStorer.testNoAlias[5]
> org.apache.hive.hcatalog.pig.TestHCatStorer.testEmptyStore[5]
> org.apache.hive.hcatalog.pig.TestHCatStorer.testDynamicPartitioningMultiPartColsNoDataInDataNoSpec[5]
> org.apache.hive.hcatalog.pig.TestHCatStorer.testPartitionPublish[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic[5]
> org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic[5]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26277: Shim KerberosName (causes build failure on hadoop-1)

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26277/#review55572
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 3, 2014, 6:39 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26277/
> ---
> 
> (Updated Oct. 3, 2014, 6:39 p.m.)
> 
> 
> Review request for hive, dilli dorai, Szehon Ho, and Thejas Nair.
> 
> 
> Bugs: HIVE-8324
> https://issues.apache.org/jira/browse/HIVE-8324
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-8324
> 
> 
> Diffs
> -
> 
>   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 83dd2e6 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
> 312d05e 
>   shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 
> a353a46 
>   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
> 030cb75 
>   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
> 0731108 
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
> 4fcaa1e 
> 
> Diff: https://reviews.apache.org/r/26277/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8321:
-
Attachment: HIVE-8321.3.patch

Looks like the HCat tests were failing due to HIVE-8335.  Re-attaching the same patch.

> Fix serialization of TypeInfo for qualified types
> -
>
> Key: HIVE-8321
> URL: https://issues.apache.org/jira/browse/HIVE-8321
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch
>
>
> TypeInfos for decimal/char/varchar don't appear to be serializing properly 
> with javaXML.
> Decimal needed proper getters/setters for precision/scale.
> Also disabling setTypeInfo since for decimal/char/varchar the proper type 
> name should already be set by the constructor.
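As a toy illustration of why the getters/setters matter (assuming the javaXML path rides on java.beans.XMLEncoder, and using a made-up class rather than Hive's DecimalTypeInfo): without bean accessors, precision and scale silently fall back to their defaults on a round trip.

{code}
// Made-up class for illustration; not Hive's DecimalTypeInfo.
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class ToyDecimalTypeInfo {
  private int precision = 10;
  private int scale = 0;

  public ToyDecimalTypeInfo() {}                        // XMLEncoder needs a no-arg constructor

  public int getPrecision() { return precision; }       // XMLEncoder only serializes
  public void setPrecision(int p) { precision = p; }    // bean properties, so these
  public int getScale() { return scale; }               // accessors are what make
  public void setScale(int s) { scale = s; }            // precision/scale survive

  public static void main(String[] args) {
    ToyDecimalTypeInfo in = new ToyDecimalTypeInfo();
    in.setPrecision(38);
    in.setScale(18);

    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (XMLEncoder enc = new XMLEncoder(bos)) {
      enc.writeObject(in);
    }
    try (XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(bos.toByteArray()))) {
      ToyDecimalTypeInfo out = (ToyDecimalTypeInfo) dec.readObject();
      System.out.println(out.getPrecision() + "," + out.getScale());  // prints 38,18
    }
  }
}
{code}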



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8352) Enable windowing.q for spark [Spark Branch]

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8352:
---
Summary: Enable windowing.q for spark [Spark Branch]  (was: Enable 
windowing.q for spark)

> Enable windowing.q for spark [Spark Branch]
> ---
>
> Key: HIVE-8352
> URL: https://issues.apache.org/jira/browse/HIVE-8352
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: spark-branch
>
> Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
> hive-8385.patch
>
>
> We should enable windowing.q for basic windowing coverage. After checking out 
> the spark branch, we would build:
> {noformat}
> $ mvn clean install -DskipTests -Phadoop-2
> $ cd itests/
> $ mvn clean install -DskipTests -Phadoop-2
> {noformat}
> Then generate the windowing.q.out file:
> {noformat}
> $ cd qtest-spark/
> $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
> -Dtest.output.overwrite=true
> {noformat}
> Compare the output against MapReduce:
> {noformat}
> $ diff -y -W 150 
> ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
> ../../ql/src/test/results/clientpositive/windowing.q.out| less
> {noformat}
> And if everything looks good, add it to {{spark.query.files}} in 
> {{./itests/src/test/resources/testconfiguration.properties}}
> then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8321:
-
Status: Patch Available  (was: Open)

> Fix serialization of TypeInfo for qualified types
> -
>
> Key: HIVE-8321
> URL: https://issues.apache.org/jira/browse/HIVE-8321
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch
>
>
> TypeInfos for decimal/char/varchar don't appear to be serializing properly 
> with javaXML.
> Decimal needed proper getters/setters for precision/scale.
> Also disabling setTypeInfo since for decimal/char/varchar the proper type 
> name should already be set by the constructor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160884#comment-14160884
 ] 

Hive QA commented on HIVE-8358:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673119/HIVE-8358.patch

{color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 6525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch_threshold
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1133/

Messages:
{noformat}
Executi

[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160899#comment-14160899
 ] 

Szehon Ho commented on HIVE-7068:
-

This breaks the hadoop-1 compilation. [~elserj], would you have a chance to look at 
this?  See HIVE-8363: a reference to a StringUtils method whose signature changed.

> Integrate AccumuloStorageHandler
> 
>
> Key: HIVE-7068
> URL: https://issues.apache.org/jira/browse/HIVE-7068
> Project: Hive
>  Issue Type: New Feature
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 0.14.0
>
> Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, 
> HIVE-7068.4.patch
>
>
> [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to 
> HBase. Some [initial 
> work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done 
> to support querying an Accumulo table using Hive already. It is not a 
> complete solution as, most notably, the current implementation presently 
> lacks support for INSERTs.
> I would like to polish up the AccumuloStorageHandler (presently based on 
> 0.10), implement missing basic functionality and compare it to the 
> HBaseStorageHandler (to ensure that we follow the same general usage 
> patterns).
> I've also been in communication with [~bfem] (the initial author) who 
> expressed interest in working on this again. I hope to coordinate efforts 
> with him.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1

2014-10-06 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser reassigned HIVE-8363:


Assignee: Josh Elser

> AccumuloStorageHandler compile failure hadoop-1
> ---
>
> Key: HIVE-8363
> URL: https://issues.apache.org/jira/browse/HIVE-8363
> Project: Hive
>  Issue Type: Bug
>  Components: StorageHandler
>Affects Versions: 0.14.0
>Reporter: Szehon Ho
>Assignee: Josh Elser
>Priority: Blocker
>
> There's a compilation error in AccumuloStorageHandler on hadoop-1.  It 
> seems the signature of split() is not the same.  Looks like we should use 
> another utility to fix this.
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
> on project hive-accumulo-handler: Compilation failure
> [ERROR] 
> /data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52]
>  no suitable method found for split(java.lang.String,char)
> [ERROR] method 
> org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not 
> applicable
> {code}
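One hadoop-1-friendly direction, sketched here as an illustration rather than the committed fix (it assumes the three-argument StringUtils.split overload is available on both Hadoop lines), is to pass an explicit escape character:

{code}
// Illustration only; not necessarily the fix that lands in ColumnMapper.
import org.apache.hadoop.util.StringUtils;

class ColumnMappingSplit {
  static String[] splitMapping(String columnMapping, char separator) {
    // split(str, escapeChar, separator) exists on hadoop-1, unlike split(str, separator).
    return StringUtils.split(columnMapping, StringUtils.ESCAPE_CHAR, separator);
  }
}
{code}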



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler

2014-10-06 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160903#comment-14160903
 ] 

Josh Elser commented on HIVE-7068:
--

[~szehon], yeah, I can get a patch up there today.

> Integrate AccumuloStorageHandler
> 
>
> Key: HIVE-7068
> URL: https://issues.apache.org/jira/browse/HIVE-7068
> Project: Hive
>  Issue Type: New Feature
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 0.14.0
>
> Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, 
> HIVE-7068.4.patch
>
>
> [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to 
> HBase. Some [initial 
> work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done 
> to support querying an Accumulo table using Hive already. It is not a 
> complete solution as, most notably, the current implementation presently 
> lacks support for INSERTs.
> I would like to polish up the AccumuloStorageHandler (presently based on 
> 0.10), implement missing basic functionality and compare it to the 
> HBaseStorageHandler (to ensure that we follow the same general usage 
> patterns).
> I've also been in communication with [~bfem] (the initial author) who 
> expressed interest in working on this again. I hope to coordinate efforts 
> with him.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26379: Disable cbo for tablesample

2014-10-06 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26379/#review55577
---

Ship it!


Ship It!

- John Pullokkaran


On Oct. 6, 2014, 7:50 p.m., Ashutosh Chauhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26379/
> ---
> 
> (Updated Oct. 6, 2014, 7:50 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Bugs: HIVE-8366
> https://issues.apache.org/jira/browse/HIVE-8366
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Disable cbo for tablesample
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
> 7c2b0cd 
> 
> Diff: https://reviews.apache.org/r/26379/diff/
> 
> 
> Testing
> ---
> 
> udf_substr.q
> 
> 
> Thanks,
> 
> Ashutosh Chauhan
> 
>



[jira] [Created] (HIVE-8367) delete writes records in wrong order in some cases

2014-10-06 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8367:


 Summary: delete writes records in wrong order in some cases
 Key: HIVE-8367
 URL: https://issues.apache.org/jira/browse/HIVE-8367
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.14.0


I have found a case with 10k records where you do:
create table
insert into table -- 10k records
delete from table -- just some records

The records in the delete delta are not ordered properly by rowid.

I assume this applies to updates as well, but I haven't tested it yet.
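For reference, a sketch of the ordering the delete events are expected to follow, using hypothetical record-identifier fields rather than the actual ACID writer types (the exact sort key beyond rowid is an assumption of this sketch):

{code}
// Hypothetical record identifier; the JIRA only calls out rowid ordering, and
// the (originalTransaction, bucket) grouping is an assumption of this sketch.
import java.util.Comparator;

class DeleteRecordOrder {
  static class RecordId {
    final long originalTransaction;
    final int bucket;
    final long rowId;

    RecordId(long originalTransaction, int bucket, long rowId) {
      this.originalTransaction = originalTransaction;
      this.bucket = bucket;
      this.rowId = rowId;
    }
  }

  // Buffered delete events would be sorted with this before being written out.
  static final Comparator<RecordId> DELETE_ORDER =
      Comparator.comparingLong((RecordId r) -> r.originalTransaction)
                .thenComparingInt(r -> r.bucket)
                .thenComparingLong(r -> r.rowId);
}
{code}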



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26282: Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26282/#review55581
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 2, 2014, 9 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26282/
> ---
> 
> (Updated Oct. 2, 2014, 9 p.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-8193
> https://issues.apache.org/jira/browse/HIVE-8193
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-8193
> 
> 
> Diffs
> -
> 
>   service/src/java/org/apache/hive/service/cli/CLIService.java b46c5b4 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> ecc9b96 
>   
> service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java
>  9ee9785 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
> 4a1e004 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java c667533 
>   service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java 
> fb784aa 
>   
> service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java
>  47d3a56 
> 
> Diff: https://reviews.apache.org/r/26282/diff/
> 
> 
> Testing
> ---
> 
> Manually with ZooKeeper.
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160935#comment-14160935
 ] 

Thejas M Nair commented on HIVE-8193:
-

+1

> Hook HiveServer2 dynamic service discovery with session time out
> 
>
> Key: HIVE-8193
> URL: https://issues.apache.org/jira/browse/HIVE-8193
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.14.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8193.1.patch
>
>
> For dynamic service discovery, if the HiveServer2 instance is removed from 
> ZooKeeper, currently, on the last client close, the server shuts down. 
> However, we need to ensure that this also happens when a session is closed on 
> timeout and no current sessions exist on this instance of HiveServer2.
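A rough sketch of the desired behaviour, with hypothetical method names rather than the real SessionManager/HiveServer2 API: the same last-session check that runs on client close should also run after the timeout sweeper closes sessions.

{code}
// Hypothetical names throughout; illustration of the intended behaviour only.
class SessionTimeoutSweeper implements Runnable {
  interface HypotheticalSessionManager {
    void closeExpiredSessions();
    int getOpenSessionCount();
  }

  interface HypotheticalServer {
    boolean isDeregisteredFromZooKeeper();
    void stop();
  }

  private final HypotheticalSessionManager sessions;
  private final HypotheticalServer server;

  SessionTimeoutSweeper(HypotheticalSessionManager sessions, HypotheticalServer server) {
    this.sessions = sessions;
    this.server = server;
  }

  @Override
  public void run() {
    sessions.closeExpiredSessions();
    // Mirror the client-close path: once the instance is deregistered from
    // ZooKeeper and no sessions remain, the server should shut itself down.
    if (server.isDeregisteredFromZooKeeper() && sessions.getOpenSessionCount() == 0) {
      server.stop();
    }
  }
}
{code}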



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8368) compactor is improperly writing delete records in base file

2014-10-06 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8368:


 Summary: compactor is improperly writing delete records in base 
file
 Key: HIVE-8368
 URL: https://issues.apache.org/jira/browse/HIVE-8368
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0


When the compactor reads records from the base and deltas, it is not properly 
dropping delete records.  This leads to oversized base files, and possibly to 
wrong query results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160957#comment-14160957
 ] 

Thejas M Nair commented on HIVE-8360:
-

+1

> Add cross cluster support for webhcat E2E tests
> ---
>
> Key: HIVE-8360
> URL: https://issues.apache.org/jira/browse/HIVE-8360
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
> Environment: Secure cluster
>Reporter: Aswathy Chellammal Sreekumar
> Attachments: AD-MIT.patch
>
>
> In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
> since the realm name for user principals is not included in the kinit 
> command. This patch appends the realm name to the user principal, thereby 
> resulting in a successful kinit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]

2014-10-06 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160959#comment-14160959
 ] 

Chao commented on HIVE-8362:


Ran it several times - sometimes I got this diff:

{noformat}
--- a/ql/src/test/results/clientpositive/spark/parallel.q.out
+++ b/ql/src/test/results/clientpositive/spark/parallel.q.out
@@ -149,6 +149,7 @@ POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src
 POSTHOOK: Output: default@src_a
 POSTHOOK: Output: default@src_b
+POSTHOOK: Lineage: src_a.key SIMPLE [(src)src.FieldSchema(name:key, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_a.value SIMPLE [(src)src.FieldSchema(name:value, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_b.key SIMPLE [(src)src.FieldSchema(name:key, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_b.value SIMPLE [(src)src.FieldSchema(name:value, 
type:string, comment:default), ]
{noformat}

> Investigate flaky test parallel.q [Spark Branch]
> 
>
> Key: HIVE-8362
> URL: https://issues.apache.org/jira/browse/HIVE-8362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>  Labels: spark
>
> Test parallel.q is flaky. It fails sometimes with error like:
> {noformat}
> Failed tests: 
>   TestSparkCliDriver.testCliDriver_parallel:120->runTest:146 Unexpected 
> exception junit.framework.AssertionFailedError: Client Execution results 
> failed with error code = 1
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2828) make timestamp accessible in the hbase KeyValue

2014-10-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160981#comment-14160981
 ] 

Sushanth Sowmyan commented on HIVE-2828:


Sure, I'll try to look into this tonight.

> make timestamp accessible in the hbase KeyValue 
> 
>
> Key: HIVE-2828
> URL: https://issues.apache.org/jira/browse/HIVE-2828
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.2.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.3.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.4.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.5.patch, HIVE-2828.6.patch.txt, 
> HIVE-2828.7.patch.txt, HIVE-2828.8.patch.txt
>
>
> This originated from HIVE-2781 and was not accepted, but I think it could be helpful 
> to someone.
> By using the special column notation ':timestamp' in HBASE_COLUMNS_MAPPING, a user 
> can access the timestamp value of the HBase KeyValue.
> {code}
> CREATE TABLE hbase_table (key int, value string, time timestamp)
>   STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>   WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:string,:timestamp")
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8261) CBO : Predicate pushdown is removed by Optiq

2014-10-06 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160999#comment-14160999
 ] 

Harish Butani commented on HIVE-8261:
-

[~vikram.dixit]  can we add this to the 0.14 branch?

> CBO : Predicate pushdown is removed by Optiq 
> -
>
> Key: HIVE-8261
> URL: https://issues.apache.org/jira/browse/HIVE-8261
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Mostafa Mokhtar
>Assignee: Harish Butani
> Fix For: 0.14.0
>
> Attachments: HIVE-8261.1.patch
>
>
> The plan for TPC-DS Q64 wasn't optimal. Upon looking at the logical plan, I 
> realized that predicate pushdown is not applied on date_dim d1.
> Interestingly, before Optiq we have the predicate pushed:
> {code}
> HiveFilterRel(condition=[<=($5, $1)])
> HiveJoinRel(condition=[=($3, $6)], joinType=[inner])
>   HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col2=[$3], 
> _o__col3=[$1])
> HiveFilterRel(condition=[=($0, 2000)])
>   HiveAggregateRel(group=[{0, 1}], agg#0=[count()], agg#1=[sum($2)])
> HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2])
>   HiveJoinRel(condition=[=($1, $8)], joinType=[inner])
> HiveJoinRel(condition=[=($1, $5)], joinType=[inner])
>   HiveJoinRel(condition=[=($0, $3)], joinType=[inner])
> HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], 
> ss_wholesale_cost=[$11])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]])
> HiveProjectRel(d_date_sk=[$0], d_year=[$6])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]])
>   HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 
> 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), 
> between(false, $1, +(35, 1), +(35, 15)))])
> HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], 
> i_color=[$17])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]])
> HiveProjectRel(_o__col0=[$0])
>   HiveAggregateRel(group=[{0}])
> HiveProjectRel($f0=[$0])
>   HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], 
> joinType=[inner])
> HiveProjectRel(cs_item_sk=[$15], 
> cs_order_number=[$17])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]])
> HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]])
>   HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col3=[$1])
> HiveFilterRel(condition=[=($0, +(2000, 1))])
>   HiveAggregateRel(group=[{0, 1}], agg#0=[count()])
> HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2])
>   HiveJoinRel(condition=[=($1, $8)], joinType=[inner])
> HiveJoinRel(condition=[=($1, $5)], joinType=[inner])
>   HiveJoinRel(condition=[=($0, $3)], joinType=[inner])
> HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], 
> ss_wholesale_cost=[$11])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]])
> HiveProjectRel(d_date_sk=[$0], d_year=[$6])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]])
>   HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 
> 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), 
> between(false, $1, +(35, 1), +(35, 15)))])
> HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], 
> i_color=[$17])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]])
> HiveProjectRel(_o__col0=[$0])
>   HiveAggregateRel(group=[{0}])
> HiveProjectRel($f0=[$0])
>   HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], 
> joinType=[inner])
> HiveProjectRel(cs_item_sk=[$15], 
> cs_order_number=[$17])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]])
> HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16])
>   
> HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]])
> {code}
> After Optiq, however, the filter on date_dim gets pulled up in the plan:
> {code}
>   HiveFilterRel(condition=[<=($5, $1)]): rowcount = 1.0, cumulative cost = 
> {5.50188454E8 rows, 0.0 cpu, 0.0 io}, id = 6895
> HiveProjectRel(_o__col0=[$0], 

[jira] [Commented] (HIVE-7914) Simplify join predicates for CBO to avoid cross products

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161000#comment-14161000
 ] 

Mostafa Mokhtar commented on HIVE-7914:
---

Issue still exists 
{code}
hive> explain select avg(ss_quantity) ,avg(ss_ext_sales_price) 
,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store 
,customer_demographics ,household_demographics ,customer_address ,date_dim 
where store.s_store_sk = store_sales.ss_store_sk and 
store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 2001 
and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'M' and 
customer_demographics.cd_education_status = '4 yr Degree' and 
store_sales.ss_sales_price between 100.00 and 150.00 and 
household_demographics.hd_dep_count = 3 )or 
(store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'D' and 
customer_demographics.cd_education_status = 'Primary' and 
store_sales.ss_sales_price between 50.00 and 100.00 and 
household_demographics.hd_dep_count = 1 ) or 
(store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'U' and 
customer_demographics.cd_education_status = 'Advanced Degree' and 
store_sales.ss_sales_price between 150.00 and 200.00 and 
household_demographics.hd_dep_count = 1 )) and((store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('KY', 'GA', 'NM') and 
store_sales.ss_net_profit between 100 and 200 ) or (store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('MT', 'OR', 'IN') and 
store_sales.ss_net_profit between 150 and 300 ) or (store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('WI', 'MO', 'WV') and 
store_sales.ss_net_profit between 50 and 250 )) ;
Warning: Map Join MAPJOIN[49][bigTable=?] in task 'Map 4' is a cross product
Warning: Map Join MAPJOIN[48][bigTable=?] in task 'Map 4' is a cross product
Warning: Map Join MAPJOIN[47][bigTable=?] in task 'Map 4' is a cross product
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Tez
  Edges:
Map 4 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 
(BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE)
Reducer 5 <- Map 4 (SIMPLE_EDGE)
  DagName: mmokhtar_20141006173232_992a372b-cc0e-40d5-b51f-7098561df464:3
  Vertices:
Map 1
Map Operator Tree:
TableScan
  alias: household_demographics
  Statistics: Num rows: 7200 Data size: 770400 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 7200 Data size: 770400 Basic stats: 
COMPLETE Column stats: NONE
value expressions: hd_demo_sk (type: int), hd_dep_count 
(type: int)
Execution mode: vectorized
Map 2
Map Operator Tree:
TableScan
  alias: store
  filterExpr: s_store_sk is not null (type: boolean)
  Statistics: Num rows: 212 Data size: 405680 Basic stats: 
COMPLETE Column stats: NONE
  Filter Operator
predicate: s_store_sk is not null (type: boolean)
Statistics: Num rows: 106 Data size: 202840 Basic stats: 
COMPLETE Column stats: NONE
Reduce Output Operator
  key expressions: s_store_sk (type: int)
  sort order: +
  Map-reduce partition columns: s_store_sk (type: int)
  Statistics: Num rows: 106 Data size: 202840 Basic stats: 
COMPLETE Column stats: NONE
Execution mode: vectorized
Map 3
Map Operator Tree:
TableScan
  alias: customer_address
  Statistics: Num rows: 80 Data size: 811903688 Basic 
stats: COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 80 Data size: 811903688 Basic 
stats: COMPLETE Column stats: NONE
value expressions: ca_address_sk (type: int), ca_state 
(type: string), ca_country (type: string)
Execution mode: vectorized
Map 4
Map Operator Tree:
TableScan
  alias: stor

[jira] [Created] (HIVE-8369) SimpleFetchOptimizer needs to re-enable FS caching before scanning dirs

2014-10-06 Thread Gopal V (JIRA)
Gopal V created HIVE-8369:
-

 Summary: SimpleFetchOptimizer needs to re-enable FS caching before 
scanning dirs
 Key: HIVE-8369
 URL: https://issues.apache.org/jira/browse/HIVE-8369
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Gopal V


SimpleFetchOptimizer spends a lot of CPU within itself because Hive disables 
HDFS FileSystem caching (fs.hdfs.impl.disable.cache).

SimpleFetchOptimizer's optimization rules need a revisit, along with a 
fix for this case.
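A minimal sketch of the fix direction (not the actual SimpleFetchOptimizer code): re-enable FileSystem caching on a private copy of the conf before walking input directories, so repeated getFileSystem() calls stop constructing new clients.

{code}
// Sketch only; SimpleFetchOptimizer's real logic does more than sum file lengths.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class FetchDirScan {
  static long totalLength(Configuration jobConf, Path dir) throws Exception {
    Configuration conf = new Configuration(jobConf);        // leave the job conf untouched
    conf.setBoolean("fs.hdfs.impl.disable.cache", false);   // re-enable FS caching for the scan
    FileSystem fs = dir.getFileSystem(conf);
    long total = 0;
    for (FileStatus st : fs.listStatus(dir)) {
      total += st.getLen();
    }
    return total;
  }
}
{code}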



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

