[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192975#comment-14192975 ]

Hive QA commented on HIVE-8689:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678641/HIVE-8689.01.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1586/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1586/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1586/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678641 - PreCommit-HIVE-TRUNK-Build

handle overflows in statistics better
Key: HIVE-8689
URL: https://issues.apache.org/jira/browse/HIVE-8689
Project: Hive
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.14.0
Attachments: HIVE-8689.01.patch, HIVE-8689.patch

Improve overflow checks in StatsAnnotation optimizer.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
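[Editor's note] A minimal sketch of the kind of saturating arithmetic such an overflow check typically uses; this is a hypothetical illustration, not the contents of HIVE-8689.01.patch, and the class and method names are made up.

{code}
// Hypothetical helper: combine row-count / data-size estimates without
// wrapping past Long.MAX_VALUE. Overflow is clamped to Long.MAX_VALUE
// instead of silently producing a wrapped (possibly negative) value.
public final class StatsMathSketch {
  private StatsMathSketch() {}

  // Estimates are assumed non-negative.
  public static long safeMult(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    // If a * b would exceed Long.MAX_VALUE, saturate rather than overflow.
    if (a > Long.MAX_VALUE / b) {
      return Long.MAX_VALUE;
    }
    return a * b;
  }

  public static long safeAdd(long a, long b) {
    long sum = a + b;
    // Two non-negative operands yielding a negative sum means the add wrapped.
    return (sum < 0) ? Long.MAX_VALUE : sum;
  }
}
{code}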
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192984#comment-14192984 ]

Thejas M Nair commented on HIVE-8688:
--------------------------------------

I found this issue while trying to find the cause of the failure below. I am not sure if this fixes the following issue, because it's hard to reproduce.

{code}
Error: java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:422)
    at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:285)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:263)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:475)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:468)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:169)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Buffer underflow.
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.require(Input.java:181)
    at org.apache.hive.com.esotericsoftware.kryo.io.Input.readVarInt(Input.java:355)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:809)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1023)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:931)
    at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:945)
    at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:389)
    ... 13 more
{code}

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
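[Editor's note] A minimal sketch of the resource-handling pattern this issue is about, assuming a hypothetical plan-writing method; it is not the actual Utilities code from HIVE-8688.1.patch.

{code}
// Illustration: if the stream the serialized plan is written to is never
// closed (and therefore possibly never flushed), readers can later hit
// "Buffer underflow"-style errors on a truncated plan file.
import java.io.IOException;
import java.io.OutputStream;

class PlanWriterSketch {
  void writePlan(byte[] serializedPlan, OutputStream out) throws IOException {
    // try-with-resources guarantees close() even on error paths,
    // so the plan bytes are fully flushed to the destination.
    try (OutputStream os = out) {
      os.write(serializedPlan);
    }
  }
}
{code}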
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192985#comment-14192985 ]

Thejas M Nair commented on HIVE-8688:
--------------------------------------

[~hagleitn] This is a simple bug fix. It might help avoid issues like the stack trace above. I think it's useful for 0.14.

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192997#comment-14192997 ]

Hive QA commented on HIVE-8656:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678609/HIVE-8656.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1587/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1587/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1587/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678609 - PreCommit-HIVE-TRUNK-Build

CBO: auto_join_filters fails
Key: HIVE-8656
URL: https://issues.apache.org/jira/browse/HIVE-8656
Project: Hive
Issue Type: Sub-task
Components: CBO
Reporter: Sergey Shelukhin
Assignee: Ashutosh Chauhan
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8656.patch

Haven't looked into why yet.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192998#comment-14192998 ] Hive QA commented on HIVE-8461: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678616/HIVE-8461.04.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1588/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1588/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1588/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1588/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/TypeConverter.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1635895. At revision 1635895. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12678616 - PreCommit-HIVE-TRUNK-Build Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193022#comment-14193022 ]

Hive QA commented on HIVE-8395:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678618/HIVE-8395.20.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 6608 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1589/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1589/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1589/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678618 - PreCommit-HIVE-TRUNK-Build

CBO: enable by default
Key: HIVE-8395
URL: https://issues.apache.org/jira/browse/HIVE-8395
Project: Hive
Issue Type: Improvement
Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.15.0
Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, HIVE-8395.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8690) Move Avro dependency to 1.7.7
[ https://issues.apache.org/jira/browse/HIVE-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193029#comment-14193029 ]

Hive QA commented on HIVE-8690:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678631/HIVE-8690.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1590/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1590/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1590/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678631 - PreCommit-HIVE-TRUNK-Build

Move Avro dependency to 1.7.7
Key: HIVE-8690
URL: https://issues.apache.org/jira/browse/HIVE-8690
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.1
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor
Fix For: 0.14.0, 0.15.0
Attachments: HIVE-8690.1.patch

Move Avro dependency from 1.7.5 to current release 1.7.7.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193049#comment-14193049 ]

Hive QA commented on HIVE-8687:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678649/HIVE-8687.3.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 6634 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_charvarchar
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1591/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1591/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1591/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12678649 - PreCommit-HIVE-TRUNK-Build

Support Avro through HCatalog
Key: HIVE-8687
URL: https://issues.apache.org/jira/browse/HIVE-8687
Project: Hive
Issue Type: Bug
Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.14.0
Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch

Attempting to write to an HCatalog-defined table backed by the AvroSerde fails with the following stack trace:
{code}
java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
    at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
    at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
    at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
    at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
{code}
The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable.

It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way, fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue?

The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
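[Editor's note] An illustrative-only sketch of the key-type mismatch described above; class and method names are invented and do not come from the Hive or HCatalog source.

{code}
// A writer whose signature demands LongWritable keys throws ClassCastException
// at runtime when a container layer hands it a NullWritable, even though the
// writer never actually uses the key -- analogous to the stack trace above.
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;

class KeyMismatchSketch {
  // Mirrors a RecordWriter<LongWritable, V>.write(...) that ignores its key.
  static void writeIgnoringKey(LongWritable key, Writable value) {
    // key is unused; only the value would be serialized
  }

  public static void main(String[] args) {
    Writable key = NullWritable.get();
    // Compiles, but fails at runtime:
    // java.lang.ClassCastException: NullWritable cannot be cast to LongWritable
    writeIgnoringKey((LongWritable) key, null);
  }
}
{code}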
[jira] [Commented] (HIVE-4490) HS2 - 'select null ..' fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193052#comment-14193052 ]

qiaohaijun commented on HIVE-4490:
----------------------------------

+1

HS2 - 'select null ..' fails with NPE
Key: HIVE-4490
URL: https://issues.apache.org/jira/browse/HIVE-4490
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Thejas M Nair

Eg, from beeline
{code}
select null, i from t1 ;
Error: Error running query: java.lang.NullPointerException (state=,code=0)
Error: Error running query: java.lang.NullPointerException (state=,code=0)
{code}
In HS2 log:
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:113)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:169)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57)
    at $Proxy8.executeStatement(Unknown Source)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4172) JDBC2 does not support VOID type
[ https://issues.apache.org/jira/browse/HIVE-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193053#comment-14193053 ]

qiaohaijun commented on HIVE-4172:
----------------------------------

+1

JDBC2 does not support VOID type
Key: HIVE-4172
URL: https://issues.apache.org/jira/browse/HIVE-4172
Project: Hive
Issue Type: Improvement
Components: HiveServer2, JDBC
Affects Versions: 0.11.0
Reporter: Navis
Assignee: Navis
Priority: Minor
Labels: HiveServer2
Fix For: 0.12.0
Attachments: HIVE-4172.D9555.1.patch, HIVE-4172.D9555.2.patch, HIVE-4172.D9555.3.patch, HIVE-4172.D9555.4.patch, HIVE-4172.D9555.5.patch

In beeline, select key, null from src fails with exception:
{noformat}
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:112)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:166)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:183)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5683) JDBC support for char
[ https://issues.apache.org/jira/browse/HIVE-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193061#comment-14193061 ]

qiaohaijun commented on HIVE-5683:
----------------------------------

+1

JDBC support for char
Key: HIVE-5683
URL: https://issues.apache.org/jira/browse/HIVE-5683
Project: Hive
Issue Type: Bug
Components: JDBC, Types
Reporter: Jason Dere
Assignee: Jason Dere
Fix For: 0.13.0
Attachments: HIVE-5683.1.patch, HIVE-5683.2.patch, HIVE-5683.3.patch

Support char type in JDBC, including char length in result set metadata.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193060#comment-14193060 ]

qiaohaijun commented on HIVE-5230:
----------------------------------

+1

Better error reporting by async threads in HiveServer2
Key: HIVE-5230
URL: https://issues.apache.org/jira/browse/HIVE-5230
Project: Hive
Issue Type: Sub-task
Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, HIVE-5230.10.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch, HIVE-5230.9.patch

[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. When a background thread gets an error, currently the client can only poll for the operation state, and the error with its stack trace is only logged. However, it would be useful to provide a richer error response like the thrift API does with TStatus (which is constructed while building a Thrift response object).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8529) HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
[ https://issues.apache.org/jira/browse/HIVE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193070#comment-14193070 ]

qiaohaijun commented on HIVE-8529:
----------------------------------

+1

HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
Key: HIVE-8529
URL: https://issues.apache.org/jira/browse/HIVE-8529
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Reporter: Vaibhav Gumashta
Fix For: 0.15.0

Throws this even when it is disabled:
{code}
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG security.UserGroupInformation: PrivilegedActionException as:vgumashta (auth:SIMPLE) cause:org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5]
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: WARN thrift.ThriftCLIService: Error fetching results:
org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5]
    at org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:240)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:665)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:394)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
    at com.sun.proxy.$Proxy20.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:427)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:582)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:695)
14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG transport.TSaslTransport: writing data length: 2525
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
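[Editor's note] A hypothetical sketch of the guard the issue title asks for; the class, method names, and return types below are illustrative only and are not the HiveSessionImpl code.

{code}
// Skip the operation-log fetch path entirely when operation logging is
// disabled (hive.server2.logging.operation.enabled=false), instead of letting
// it fail with "Couldn't find log associated with operation handle".
class FetchResultsSketch {
  private final boolean operationLoggingEnabled; // value of the config flag

  FetchResultsSketch(boolean operationLoggingEnabled) {
    this.operationLoggingEnabled = operationLoggingEnabled;
  }

  Object fetchResults(Object opHandle, boolean wantOperationLog) {
    if (wantOperationLog) {
      if (!operationLoggingEnabled) {
        // Logging is off: return an empty result (or a clear error) rather
        // than probing for a log file that was never created.
        return emptyRowSet();
      }
      return fetchOperationLog(opHandle);
    }
    return fetchQueryResults(opHandle);
  }

  private Object emptyRowSet() { return new Object(); }
  private Object fetchOperationLog(Object h) { return new Object(); }
  private Object fetchQueryResults(Object h) { return new Object(); }
}
{code}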
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193079#comment-14193079 ]

qiaohaijun commented on HIVE-6050:
----------------------------------

+1

JDBC backward compatibility is broken
Key: HIVE-6050
URL: https://issues.apache.org/jira/browse/HIVE-6050
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1), will return the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
    ... 37 more
{noformat}
On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read():
1. The method will call 'TProtocolVersion.findValue()' on the thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server. (v4 is unknown to the server.)
2. The method will then call struct.validate(), which will throw the above exception because of the null version.
So it doesn't look like the current backward-compatibility scheme will work.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
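[Editor's note] A small sketch of the enum behavior described in the analysis above. Thrift-generated enums expose a findValue(int) that returns null for unknown values; the enum constants below are illustrative, not the actual TProtocolVersion source.

{code}
// An "old server" enum that predates V4: findValue(3) returns null, which is
// what later trips the "Required field 'client_protocol' is unset!" check.
enum ProtocolVersionSketch {
  V1(0), V2(1), V3(2);

  private final int value;
  ProtocolVersionSketch(int value) { this.value = value; }

  static ProtocolVersionSketch findValue(int value) {
    for (ProtocolVersionSketch v : values()) {
      if (v.value == value) {
        return v;
      }
    }
    return null; // unknown (newer) value -> null, as thrift-generated code does
  }

  public static void main(String[] args) {
    // A v4 client sends 3; the old server cannot map it and gets null.
    System.out.println(findValue(3)); // prints "null"
  }
}
{code}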
[jira] [Commented] (HIVE-6160) Follow-on to HS2 ResultSet Serialization Performance Regression
[ https://issues.apache.org/jira/browse/HIVE-6160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193080#comment-14193080 ]

qiaohaijun commented on HIVE-6160:
----------------------------------

+1

Follow-on to HS2 ResultSet Serialization Performance Regression
Key: HIVE-6160
URL: https://issues.apache.org/jira/browse/HIVE-6160
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.0
Reporter: George Chow
Assignee: Xiao Meng
Priority: Minor

As suggested by Brock, this is a follow-on to HIVE-3746 to address:
1) test backwards compatibility with the older driver and fix any outstanding issues
2) remove the debug stuff that is included (printStackTrace and System.out)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193081#comment-14193081 ]

Hive QA commented on HIVE-8693:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678653/HIVE-8693.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1592/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1592/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1592/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678653 - PreCommit-HIVE-TRUNK-Build

Separate out fair scheduler dependency from hadoop 0.23 shim
Key: HIVE-8693
URL: https://issues.apache.org/jira/browse/HIVE-8693
Project: Hive
Issue Type: Bug
Components: HiveServer2, Shims
Affects Versions: 0.14.0, 0.15.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-8693.1.patch

As part of HIVE-8424, HiveServer2 uses Fair Scheduler APIs to determine resource queue allocation for the non-impersonation case. This adds a hard dependency on Yarn server jars for Hive.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193085#comment-14193085 ]

qiaohaijun commented on HIVE-6050:
----------------------------------

14/11/01 19:12:44 ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:415)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:193)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
    at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:186)
    at org.apache.hive.beeline.Commands.connect(Commands.java:959)
    at org.apache.hive.beeline.Commands.connect(Commands.java:880)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:44)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:801)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
Error: Invalid URL: jdbc:hive2://10.134.34.181:1 (state=08S01,code=0)
---
spark 1.1.1
hive 0.12-probuf-2.5

JDBC backward compatibility is broken
Key: HIVE-6050
URL: https://issues.apache.org/jira/browse/HIVE-6050
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1), will return the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193117#comment-14193117 ]

Hive QA commented on HIVE-8688:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678590/HIVE-8688.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1593/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1593/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1593/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678590 - PreCommit-HIVE-TRUNK-Build

serialized plan OutputStream is not being closed
Key: HIVE-8688
URL: https://issues.apache.org/jira/browse/HIVE-8688
Project: Hive
Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8688.1.patch

The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193142#comment-14193142 ]

Hive QA commented on HIVE-8671:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12678665/HIVE-8671.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1594/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12678665 - PreCommit-HIVE-TRUNK-Build

Overflow in estimate row count and data size with fetch column stats
Key: HIVE-8671
URL: https://issues.apache.org/jira/browse/HIVE-8671
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch

Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2.
{code}
Reducer 2
  Reduce Operator Tree:
    Group By Operator
      aggregations: sum(VALUE._col0)
      keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float)
      mode: mergepartial
      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
      Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
      Reduce Output Operator
        key expressions: _col3 (type: string), _col3 (type: string)
        sort order: ++
        Map-reduce partition columns: _col3 (type: string)
        Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE
        value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double)
  Execution mode: vectorized
{code}

{code}
VERTEX     TOTAL_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
Map 1      62           26.41             1,779,510        211,978,502    60,628,390
Map 5      1            4.28              6,950            138,098        138,098
Map 6      1            2.44              3,910            31             31
Reducer 2  2            22.69             61,320           60,628,390     69,182
Reducer 3  1            2.63              3,910            69,182         100
Reducer 4  1            1.01              1,180            100            100
{code}

Query
{code}
explain
select i_item_desc
      ,i_category
      ,i_class
      ,i_current_price
      ,i_item_id
      ,sum(ws_ext_sales_price) as itemrevenue
      ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio
from web_sales
    ,item
    ,date_dim
where web_sales.ws_item_sk = item.i_item_sk
  and item.i_category in ('Jewelry', 'Sports', 'Books')
  and web_sales.ws_sold_date_sk = date_dim.d_date_sk
  and date_dim.d_date between '2001-01-12' and '2001-02-11'
group by i_item_id
        ,i_item_desc
        ,i_category
        ,i_class
        ,i_current_price
order by i_category
        ,i_class
        ,i_item_id
        ,i_item_desc
        ,revenueratio
limit 100
{code}

Explain
{code}
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 1
[jira] [Commented] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193176#comment-14193176 ] Hive QA commented on HIVE-8435: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678666/HIVE-8435.06.patch {color:red}ERROR:{color} -1 due to 891 failed/errored test(s), 6610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_predicate_pushdown org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_queries org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_single_sourced_multi_insert org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_create_temp_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join21 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-8685:
---------------------------------

Description:
This makes DDL commands fail
This was stupidly broken in HIVE-8643

was:
This makes DDL commands fail
This was stupidly broken in HIVE-8643
NO PRECOMMIT TESTS

DDL operations in WebHCat set proxy user to null in unsecure mode
Key: HIVE-8685
URL: https://issues.apache.org/jira/browse/HIVE-8685
Project: Hive
Issue Type: Bug
Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical
Attachments: HIVE-8685.2.patch, HIVE-8685.patch

This makes DDL commands fail
This was stupidly broken in HIVE-8643

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.3.patch patch 2 and 3 are the same - just trying to kick off build bot DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Status: Open (was: Patch Available) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
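To make the signature discussion above concrete, here is a minimal, hypothetical sketch (not the actual AvroContainerOutputFormat or FileRecordWriterContainer code) of why widening the key parameter to WritableComparable lets both callers coexist when the key is ignored anyway:
{code}
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical stand-in for a record writer: because the key is never used,
// accepting any WritableComparable lets it take the LongWritable that Hive
// passes as well as the NullWritable that HCatalog's container passes.
class KeyAgnosticWriter<V> {
  public void write(WritableComparable<?> ignoredKey, V value) {
    // serialize only 'value'; the key is intentionally ignored
    System.out.println("wrote: " + value);
  }
}

public class KeyCompatDemo {
  public static void main(String[] args) {
    KeyAgnosticWriter<Text> writer = new KeyAgnosticWriter<Text>();
    writer.write(new LongWritable(1L), new Text("row from Hive"));   // Hive-style key
    writer.write(NullWritable.get(), new Text("row from HCatalog")); // HCatalog-style key
  }
}
{code}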
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Attachment: HIVE-8687.4.patch Attaching updated version of trunk patch to fix above issue (branch-0.14 version 3 of the patch was good for branch-0.14) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8687: --- Status: Patch Available (was: Open) Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Status: In Progress (was: Patch Available) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Attachment: HIVE-8461.05.patch Merge conflicts from recent commit (HIVE-8632) that touched VectorHashKeyWrapper. Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193297#comment-14193297 ] Sergey Shelukhin commented on HIVE-8395: [~ashutoshc] after fixing a bug usually some out files would need to be updated (assuming they have acceptable changes) as a followup... this might be such a case CBO: enable by default -- Key: HIVE-8395 URL: https://issues.apache.org/jira/browse/HIVE-8395 Project: Hive Issue Type: Improvement Components: CBO Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.15.0 Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, HIVE-8395.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-8461: --- Status: Patch Available (was: In Progress) Try again. Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-8594: - Attachment: hive-8594.txt Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
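For illustration, a standalone sketch of the corrected guard written with ||; the class name, method body, and exception are placeholders rather than the actual SettableConfigUpdater code:
{code}
public class WhiteListGuardDemo {
  // With ||, trim() is only reached once the string is known to be non-null,
  // so neither a null nor an empty/blank whitelist can slip through.
  static void setHiveConfWhiteList(String whiteListParamsStr) {
    if (whiteListParamsStr == null || whiteListParamsStr.trim().isEmpty()) {
      throw new IllegalStateException("whitelist parameter is not set");
    }
    // ... apply the whitelist ...
  }

  public static void main(String[] args) {
    setHiveConfWhiteList("hive.exec.compress.output|hive.exec.parallel");
  }
}
{code}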
[jira] [Assigned] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HIVE-8594: Assignee: Ted Yu Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8594) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList()
[ https://issues.apache.org/jira/browse/HIVE-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-8594: - Status: Patch Available (was: Open) Wrong condition in SettableConfigUpdater#setHiveConfWhiteList() --- Key: HIVE-8594 URL: https://issues.apache.org/jira/browse/HIVE-8594 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: hive-8594.txt {code} if(whiteListParamsStr == null && whiteListParamsStr.trim().isEmpty()) { {code} If whiteListParamsStr is null, the call to trim() would result in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193339#comment-14193339 ] Hive QA commented on HIVE-8687: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678690/HIVE-8687.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6637 tests executed *Failed tests:* {noformat} org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1596/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1596/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1596/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678690 - PreCommit-HIVE-TRUNK-Build Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? 
The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193351#comment-14193351 ] Prasanth J commented on HIVE-8671: -- [~hagleitn] Can we have this is 0.14? Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446
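As background for the overflow handling discussed in this issue (and in HIVE-8689), here is a generic saturating-multiply sketch, not the actual patch: the idea is to clamp at Long.MAX_VALUE instead of silently wrapping when row counts and per-row sizes are multiplied.
{code}
public class SaturatingStatsDemo {
  // Saturating multiply for non-negative statistics values: clamp to
  // Long.MAX_VALUE instead of wrapping around on overflow.
  static long multiplyClamped(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    long result = a * b;
    if (result / b != a) {   // overflow happened
      return Long.MAX_VALUE;
    }
    return result;
  }

  public static void main(String[] args) {
    // 21,594,638,446 rows times a large per-row size overflows a signed long.
    System.out.println(multiplyClamped(21_594_638_446L, 1_000_000_000L)); // 9223372036854775807
  }
}
{code}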
[jira] [Commented] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193366#comment-14193366 ] Gunther Hagleitner commented on HIVE-8671: -- +1 for 0.14 Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446 Data size:
[jira] [Resolved] (HIVE-8424) Support fair scheduler user queue mapping in non-impersonation mode
[ https://issues.apache.org/jira/browse/HIVE-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved HIVE-8424. --- Resolution: Fixed Sure, [~brocknoland] - I see that HIVE-8693 has been opened for this. Support fair scheduler user queue mapping in non-impersonation mode --- Key: HIVE-8424 URL: https://issues.apache.org/jira/browse/HIVE-8424 Project: Hive Issue Type: Improvement Components: Shims Reporter: Mohit Sabharwal Assignee: Mohit Sabharwal Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8424.1.patch, HIVE-8424.2.patch, HIVE-8424.3.patch, HIVE-8424.patch Under non-impersonation mode, all MR jobs run as the hive system user. The default scheduler queue mapping is one queue per user. This is problematic for users who use the queues to regulate and track their MR resource usage. Yarn exposes an API to retrieve the fair scheduler queue mapping, which we can use to set the appropriate MR queue for the current user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
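A rough sketch of the non-impersonation idea described above, with the queue-resolution step stubbed out; the exact YARN fair-scheduler call is not shown in this thread, so the helper below is an assumption made purely for illustration:
{code}
import org.apache.hadoop.conf.Configuration;

public class UserQueueMappingDemo {
  // Placeholder: the real feature resolves this from the fair scheduler's
  // queue placement rules; here we just fake a per-user queue name.
  static String resolveQueueForUser(String endUser) {
    return "root." + endUser;
  }

  public static void main(String[] args) {
    Configuration jobConf = new Configuration(false);
    String endUser = "alice"; // the connected end user, not the shared hive system user
    // Standard MapReduce property the scheduler reads to pick the queue.
    jobConf.set("mapreduce.job.queuename", resolveQueueForUser(endUser));
    System.out.println(jobConf.get("mapreduce.job.queuename")); // root.alice
  }
}
{code}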
[jira] [Commented] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193383#comment-14193383 ] Gunther Hagleitner commented on HIVE-8687: -- +1 for hive .14 Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-8671: - Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Patch committed to trunk and branch-0.14. Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193394#comment-14193394 ] Prasanth J commented on HIVE-8689: -- [~sershe] HIVE-8671 committed now. Can you rebase this patch now? Also can you fix Mostafa's change to reducer estimation. It will estimate one reducer less than the previous code. For example: if totalInputFileSize is 140 and bytesPerReducer is 100 then current change will just say 1 reducer. We should either have Math.ceil or Math.max(totalInputFileSize, totalInputFileSize + bytesPerReducer - 1)/bytesPerReducer.. handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
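A small worked example of the ceiling-style estimate the comment above asks for; the method and variable names are illustrative, not the actual Hive code:
{code}
public class ReducerEstimateDemo {
  // Truncating division (140 / 100) yields 1; rounding up yields 2, which is
  // the behavior the comment is asking for.
  static long estimateReducers(long totalInputFileSize, long bytesPerReducer) {
    return Math.max(1, (totalInputFileSize + bytesPerReducer - 1) / bytesPerReducer);
  }

  public static void main(String[] args) {
    System.out.println(estimateReducers(140, 100)); // 2
    System.out.println(estimateReducers(100, 100)); // 1
  }
}
{code}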
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193395#comment-14193395 ] Hive QA commented on HIVE-8685: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678688/HIVE-8685.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6609 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1597/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1597/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1597/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678688 - PreCommit-HIVE-TRUNK-Build DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8694) every WebHCat e2e test should specify statusdir parameter
Eugene Koifman created HIVE-8694: Summary: every WebHCat e2e test should specify statusdir parameter Key: HIVE-8694 URL: https://issues.apache.org/jira/browse/HIVE-8694 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman e.g. 'statusdir=TestSqoop_:TNUM:' This captures stdout/stderr for job submission and helps diagnosing failures. See if it's easy to add something to the test harness to collect all the info in these dirs to make it available after cluster shutdown. NO _PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
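For illustration only, a hedged example of passing statusdir when submitting a job through WebHCat; the host, port, user, and query are placeholders, and the e2e harness does this through its own drivers rather than raw HTTP:
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class StatusDirDemo {
  public static void main(String[] args) throws Exception {
    // statusdir tells WebHCat where on HDFS to write the submitted job's
    // stdout/stderr/exit value, which is what makes failures diagnosable later.
    URL url = new URL("http://webhcat-host:50111/templeton/v1/hive");
    String body = "user.name=ekoifman&execute=show%20tables%3B&statusdir=TestSqoop_1";
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}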
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4329: - Priority: Major (was: Critical) HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch, HIVE-4329.5.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193398#comment-14193398 ] Gunther Hagleitner commented on HIVE-4329: -- Setting Major because with HIVE-8687 it's not critical for hive .14 anymore. HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch, HIVE-4329.5.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193402#comment-14193402 ] Ashutosh Chauhan commented on HIVE-8656: +1 [~julianhyde] Please update CALCITE-448 with correct description. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193409#comment-14193409 ] Eugene Koifman commented on HIVE-8685: -- the 2 test failures are not related testNegativeTokenAuth has been failing for many builds now org.apache.hive.hcatalog.streaming.TestStreaming is failing intermittently, for example, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/testReport/junit/org.apache.hive.hcatalog.streaming/TestStreaming/testRemainingTransactions/ has exactly the same stack trace DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8693: --- Attachment: HIVE-8693.1.patch Separate out fair scheduler dependency from hadoop 0.23 shim Key: HIVE-8693 URL: https://issues.apache.org/jira/browse/HIVE-8693 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.14.0, 0.15.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-8693.1.patch, HIVE-8693.1.patch As part of HIVE-8424 HiveServer2 uses Fair scheduler APIs to determine resource queue allocation for the non-impersonation case. This adds a hard dependency on Yarn server jars for Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193423#comment-14193423 ] Ashutosh Chauhan commented on HIVE-8656: [~hagleitn] ok for 0.14 ? CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8675) Increase thrift server protocol test coverage
[ https://issues.apache.org/jira/browse/HIVE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8675: --- Attachment: HIVE-8675.patch Increase thrift server protocol test coverage - Key: HIVE-8675 URL: https://issues.apache.org/jira/browse/HIVE-8675 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-8675.patch, HIVE-8675.patch, HIVE-8675.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193427#comment-14193427 ] Gunther Hagleitner commented on HIVE-8688: -- +1 for hive.14 serialized plan OutputStream is not being closed Key: HIVE-8688 URL: https://issues.apache.org/jira/browse/HIVE-8688 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8688.1.patch The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8688) serialized plan OutputStream is not being closed
[ https://issues.apache.org/jira/browse/HIVE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-8688: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk and 0.14 branch. Thanks for the review Jason and Gunther! serialized plan OutputStream is not being closed Key: HIVE-8688 URL: https://issues.apache.org/jira/browse/HIVE-8688 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8688.1.patch The OutputStream to which the serialized plan is written is not being closed in several places. This can result in the plan not getting written correctly. I have seen intermittent issues in deserializing the plan, and I think this could be the/a cause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
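As a generic illustration of the fix pattern (not the actual Hive/Kryo code), the serialization output stream should be closed, and therefore flushed, even when serialization throws:
{code}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClosePlanStreamDemo {
  // Placeholder for Hive's plan serialization; the detail that matters is the
  // try-with-resources block around it.
  static void serializePlan(Object plan, OutputStream out) throws IOException {
    out.write(plan.toString().getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) throws IOException {
    try (OutputStream out = Files.newOutputStream(Paths.get("plan.bin"))) { // placeholder path
      serializePlan("the-plan", out);
    } // the stream is always closed here, so a partially written plan is not left behind
  }
}
{code}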
[jira] [Updated] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7576: --- Attachment: HIVE-7576.2.patch After verbal confirmation from Mithun that he's okay with me adding InterfaceAudience.LimitedPrivate(Hive) and InterfaceStability.Evolving on all the new methods using PartitionSpec, I updated his patch with them. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
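For reference, a hedged illustration of the annotation usage described above; the method shown is a made-up stand-in, not an actual HCatClient signature, and it assumes Hive's own copy of the audience/stability annotations:
{code}
import org.apache.hadoop.hive.common.classification.InterfaceAudience;
import org.apache.hadoop.hive.common.classification.InterfaceStability;

public abstract class PartitionSpecApiSketch {
  // Limited-private + evolving: usable from Hive's own code, but the
  // PartitionSpec-based signature may still change between releases.
  @InterfaceAudience.LimitedPrivate({"Hive"})
  @InterfaceStability.Evolving
  public abstract int addPartitionSpec(Object partitionSpec) throws Exception;
}
{code}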
[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8680: --- Attachment: HIVE-8680.patch Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration option to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
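A sketch of the kind of knob being referred to, assuming a libthrift version whose TBinaryProtocol.Factory accepts string/container length limits (present in the 0.9.x line, but check the exact version in use); this is not necessarily what the attached patch does:
{code}
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocolFactory;

public class MaxMessageSizeDemo {
  public static void main(String[] args) {
    long maxMessageSize = 100L * 1024 * 1024; // placeholder 100MB cap
    // Bounding string/container lengths makes the server reject garbage or
    // oversized frames (e.g. a stray HTTP request hitting the binary port)
    // instead of trying to allocate huge buffers and OOM'ing.
    TProtocolFactory factory = new TBinaryProtocol.Factory(maxMessageSize, maxMessageSize);
    System.out.println("configured protocol factory: " + factory);
  }
}
{code}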
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193433#comment-14193433 ] Sushanth Sowmyan commented on HIVE-7576: (Submitted to pre-commit queue manually - http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601 will test it.) Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8666) hive.metastore.server.max.threads default is too high
[ https://issues.apache.org/jira/browse/HIVE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8666: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) hive.metastore.server.max.threads default is too high - Key: HIVE-8666 URL: https://issues.apache.org/jira/browse/HIVE-8666 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.15.0 Attachments: HIVE-8666.patch {{hive.metastore.server.max.threads}} defaults to 100K. Each thread requires a 1024KB stack which is 100GB. We should move the default to something more sensible like 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
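The arithmetic behind that estimate, as a tiny worked example (thread stack size is taken as roughly 1MB, per the description):
{code}
public class ThreadStackBudgetDemo {
  public static void main(String[] args) {
    long maxThreads = 100_000;               // old default of hive.metastore.server.max.threads
    long stackBytesPerThread = 1024L * 1024; // ~1MB of stack per thread
    long totalGiB = (maxThreads * stackBytesPerThread) / (1024L * 1024 * 1024);
    System.out.println(totalGiB + " GiB of stack if all threads are created"); // ~97 GiB, i.e. ~100GB
  }
}
{code}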
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Resolution: Fixed Fix Version/s: 0.15.0 0.14.0 Status: Resolved (was: Patch Available) Committed to 0.14 and 0.15. Thanks [~thejas] for review DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8687) Support Avro through HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8687: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch and trunk. Thanks [~sushanth] Support Avro through HCatalog - Key: HIVE-8687 URL: https://issues.apache.org/jira/browse/HIVE-8687 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8687.2.patch, HIVE-8687.3.patch, HIVE-8687.4.patch, HIVE-8687.branch-0.14.2.patch, HIVE-8687.branch-0.14.3.patch, HIVE-8687.branch-0.14.patch, HIVE-8687.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26968/#review59500 --- Hi, This approach looks great! I think we should try to avoid creating FilterPredicateType, which duplicates Type. We can update Type and make the associated changes in ORC as needed. Additionally, the latest Parquet supports Timestamp and Decimal. Thanks!! serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java https://reviews.apache.org/r/26968/#comment100778 Perhaps we can change the Type enum to separate out the types we need and then alter ORC to perform a check like type == STRING || type == CHAR || type == VARCHAR? - Brock Noland On Oct. 21, 2014, 8:13 a.m., cheng xu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26968/ --- (Updated Oct. 21, 2014, 8:13 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-8122: convert ExprNode to Parquet supported FilterPredict Diffs - pom.xml c69498004cdf93d3955c863031858a2dde2d8ccc ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab Diff: https://reviews.apache.org/r/26968/diff/ Testing --- local UT passed Thanks, cheng xu
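For readers following the review, the end goal is to turn SearchArgument leaves into Parquet filter2 predicates. Below is only a rough sketch of that translation: the helper class is hypothetical, the package name depends on the Parquet version in use (pre-Apache releases used the parquet.* namespace), and this is not the API introduced by the patch.
{code:java}
import org.apache.parquet.filter2.predicate.FilterApi;
import org.apache.parquet.filter2.predicate.FilterPredicate;

/** Hypothetical helper: maps simple SARG-style leaves onto Parquet predicates. */
public final class LeafToParquetFilterSketch {

  private LeafToParquetFilterSketch() {}

  /** An "equals" leaf on a long column, e.g. id = 42. */
  public static FilterPredicate longEquals(String columnName, long literal) {
    return FilterApi.eq(FilterApi.longColumn(columnName), literal);
  }

  /** Two leaves joined by AND, mirroring SearchArgument's AND node. */
  public static FilterPredicate and(FilterPredicate left, FilterPredicate right) {
    return FilterApi.and(left, right);
  }
}
{code}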
[jira] [Commented] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193452#comment-14193452 ] Hive QA commented on HIVE-8461: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678693/HIVE-8461.05.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1598/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1598/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1598/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678693 - PreCommit-HIVE-TRUNK-Build Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193474#comment-14193474 ] Gunther Hagleitner commented on HIVE-8656: -- Yes, please. +1 for 0.14. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Ashutosh Chauhan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8461) Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... .0000
[ https://issues.apache.org/jira/browse/HIVE-8461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8461: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to .14 and trunk. (Test failures are unrelated). Make Vectorized Decimal query results match Non-Vectorized query results with respect to trailing zeroes... . - Key: HIVE-8461 URL: https://issues.apache.org/jira/browse/HIVE-8461 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8461.01.patch, HIVE-8461.02.patch, HIVE-8461.03.patch, HIVE-8461.04.patch, HIVE-8461.05.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7111) Extend join transitivity PPD to non-column expressions
[ https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193481#comment-14193481 ] Ashutosh Chauhan commented on HIVE-7111: +1 The new logic is much cleaner than before, as it doesn't refer to parse info and ASTs. Good work, Navis! Extend join transitivity PPD to non-column expressions -- Key: HIVE-7111 URL: https://issues.apache.org/jira/browse/HIVE-7111 Project: Hive Issue Type: Task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.3.patch.txt, HIVE-7111.4.patch.txt Join transitivity in PPD only supports column expressions, but it's possible to extend this to generic expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193484#comment-14193484 ] Mithun Radhakrishnan commented on HIVE-7576: Thanks for adding the Interface annotations. That's a good idea. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8656: --- Resolution: Fixed Status: Resolved (was: Patch Available) CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8656: --- Assignee: Julian Hyde (was: Ashutosh Chauhan) CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193490#comment-14193490 ] Ashutosh Chauhan commented on HIVE-8656: Committed to trunk and 0.14. Thanks, Julian! CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)
[ https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7803: --- Resolution: Fixed Status: Resolved (was: Patch Available) (Closing as duplicate without committing, since this functionality is subsumed and improved by HIVE-8394) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition) -- Key: HIVE-7803 URL: https://issues.apache.org/jira/browse/HIVE-7803 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1 Environment: Reporter: Selina Zhang Assignee: Selina Zhang Priority: Critical Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch One of our users reports they see intermittent failures due to attempt directories in the input paths. We found with speculative execution turned on, two mappers tried to commit task at the same time using the same committed task path, which cause the corrupt output directory. The original Pig script: {code} STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME' USING org.apache.hcatalog.pig.HCatStorer(); {code} Two mappers attempt_1405021984947_5394024_m_000523_0: KILLED attempt_1405021984947_5394024_m_000523_1: SUCCEEDED attempt_1405021984947_5394024_m_000523_0 was killed right after the commit. As a result, it created corrupt directory as /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/ containing part-m-00523 (from attempt_1405021984947_5394024_m_000523_0) and attempt_1405021984947_5394024_m_000523_1/part-m-00523 Namenode Audit log == 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 cmd=create src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523 dst=null perm=user:group:rw-r- 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2 cmd=create src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523 dst=null perm=user:group:rw-r- 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 cmd=rename src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0 dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523 perm=user:group:rwxr-x--- 4. 
2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2 cmd=rename src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1 dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523 perm=user:group:rwxr-x--- After consulting our Hadoop core team, we were told that some HCat code does not participate in the two-phase commit protocol, for example in FileRecordWriterContainer.close():
{code}
for (Map.Entry<String, org.apache.hadoop.mapred.OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
  org.apache.hadoop.mapred.TaskAttemptContext currContext = dynamicContexts.get(entry.getKey());
  OutputCommitter baseOutputCommitter = entry.getValue();
  if (baseOutputCommitter.needsTaskCommit(currContext)) {
    baseOutputCommitter.commitTask(currContext);
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
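To make the race above concrete: the speculative-execution-safe pattern is to leave all promotion of attempt output to the OutputCommitter that the MapReduce framework drives, because the framework asks the application master for permission before invoking commitTask, so at most one attempt ever publishes into the task path. The class below is only an illustrative, hypothetical sketch of that shape, not the HCatalog fix.
{code:java}
import java.io.IOException;
import org.apache.hadoop.mapred.JobContext;
import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapred.TaskAttemptContext;

/** Hypothetical committer showing where task output should be promoted. */
public class SpeculationSafeCommitterSketch extends OutputCommitter {

  @Override
  public void setupJob(JobContext jobContext) throws IOException {
    // create the job-level temporary directory
  }

  @Override
  public void setupTask(TaskAttemptContext taskContext) throws IOException {
    // nothing to do: the attempt directory is created lazily on first write
  }

  @Override
  public boolean needsTaskCommit(TaskAttemptContext taskContext) throws IOException {
    return true; // this attempt produced output that still needs promotion
  }

  @Override
  public void commitTask(TaskAttemptContext taskContext) throws IOException {
    // Rename the attempt directory into the task directory here, and only
    // here. Because the framework asks the AM for permission before calling
    // this, two speculative attempts can never both publish to the task path.
  }

  @Override
  public void abortTask(TaskAttemptContext taskContext) throws IOException {
    // delete the attempt directory
  }
}
{code}
A RecordWriter that renames its own output in close() bypasses this handshake, which is exactly how the two attempts above ended up sharing one committed task path.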
[jira] [Commented] (HIVE-8693) Separate out fair scheduler dependency from hadoop 0.23 shim
[ https://issues.apache.org/jira/browse/HIVE-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193500#comment-14193500 ] Hive QA commented on HIVE-8693: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678709/HIVE-8693.1.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6637 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1599/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1599/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1599/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678709 - PreCommit-HIVE-TRUNK-Build Separate out fair scheduler dependency from hadoop 0.23 shim Key: HIVE-8693 URL: https://issues.apache.org/jira/browse/HIVE-8693 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.14.0, 0.15.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-8693.1.patch, HIVE-8693.1.patch As part of HIVE-8424 HiveServer2 uses Fair scheduler APIs to determine resource queue allocation for non-impersonation case. This adds a hard dependency of Yarn server jars for Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193503#comment-14193503 ] Sergey Shelukhin commented on HIVE-8689: [~hagleitn] 14? I will address the latest comment handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193513#comment-14193513 ] Mithun Radhakrishnan commented on HIVE-8313: FWIW, the test failure doesn't look related to this change. Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator --- Key: HIVE-8313 URL: https://issues.apache.org/jira/browse/HIVE-8313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch Consider the following query: {code:sql} SELECT foo, bar, goo, id FROM myTable WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' ); {code} One finds that when the IN clause has several thousand elements (and the table has several million rows), the query above takes orders-of-magnitude longer to run on Hive 0.12 than say Hive 0.10. I have a possibly incomplete fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
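The usual remedy for this kind of slowdown is to stop re-evaluating constant expressions once per row and instead materialize the constant's writable form once. The class below is only a minimal, hypothetical sketch of that caching idea, not the attached patch (which works through Hive's ObjectInspector machinery).
{code:java}
/** Hypothetical evaluator: converts a constant once, returns the cached object per row. */
public class CachedConstantEvaluatorSketch {

  private final Object constantValue;  // e.g. the literal from the IN clause
  private Object cachedWritable;

  public CachedConstantEvaluatorSketch(Object constantValue) {
    this.constantValue = constantValue;
  }

  /** Called once per query, not once per row. */
  public void initialize() {
    cachedWritable = toWritable(constantValue);
  }

  /** Called once per row: no per-row conversion work remains. */
  public Object evaluate(Object row) {
    return cachedWritable;
  }

  private Object toWritable(Object value) {
    // Placeholder for the ObjectInspector-based conversion Hive performs.
    return value;
  }
}
{code}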
[jira] [Updated] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-8689: --- Attachment: HIVE-8689.02.patch rebased, added ceil handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193521#comment-14193521 ] Prasanth J commented on HIVE-8689: -- +1 handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193522#comment-14193522 ] Prasanth J commented on HIVE-8689: -- [~sershe] minor nit: Can you remove the getMaxIfOverflow() method? Since we are using safeAdd, safeMultiply methods we don't need that anymore. handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
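For context, safeAdd/safeMultiply-style helpers are just saturating long arithmetic: instead of wrapping around, the result is clamped to Long.MAX_VALUE, which is why overflowed operators show row counts of 9223372036854775807 in explain plans. The class and method names below are assumptions for illustration, not Hive's actual stats utilities.
{code:java}
/** Sketch of saturating arithmetic for statistics (assumed helper shapes). */
public final class SaturatingMath {

  private SaturatingMath() {}

  /** Adds two non-negative counts, clamping to Long.MAX_VALUE on overflow. */
  public static long safeAdd(long a, long b) {
    long sum = a + b;
    // Overflow of non-negative operands shows up as a negative wrap-around.
    return (sum < 0) ? Long.MAX_VALUE : sum;
  }

  /** Multiplies two non-negative counts, clamping to Long.MAX_VALUE on overflow. */
  public static long safeMultiply(long a, long b) {
    if (a == 0 || b == 0) {
      return 0;
    }
    long product = a * b;
    // Division round-trip check catches wrap-around without relying on sign.
    return (product / a != b) ? Long.MAX_VALUE : product;
  }
}
{code}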
[jira] [Commented] (HIVE-8675) Increase thrift server protocol test coverage
[ https://issues.apache.org/jira/browse/HIVE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193531#comment-14193531 ] Hive QA commented on HIVE-8675: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678711/HIVE-8675.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6669 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1600/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1600/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1600/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678711 - PreCommit-HIVE-TRUNK-Build Increase thrift server protocol test coverage - Key: HIVE-8675 URL: https://issues.apache.org/jira/browse/HIVE-8675 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-8675.patch, HIVE-8675.patch, HIVE-8675.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8671) Overflow in estimate row count and data size with fetch column stats
[ https://issues.apache.org/jira/browse/HIVE-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8671: - Fix Version/s: (was: 0.15.0) Overflow in estimate row count and data size with fetch column stats Key: HIVE-8671 URL: https://issues.apache.org/jira/browse/HIVE-8671 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8671.1.patch, HIVE-8671.2.patch, HIVE-8671.3.patch, HIVE-8671.4.patch, HIVE-8671.5.patch Overflow in row counts and data size for several TPC-DS queries. Interestingly the operators which have overflow end up running with a small parallelism. For instance Reducer 2 has an overflow but it only runs with parallelism of 2. {code} Reducer 2 Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0) keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: float) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col3 (type: string), _col3 (type: string) sort order: ++ Map-reduce partition columns: _col3 (type: string) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775341 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: float), _col5 (type: double) Execution mode: vectorized {code} {code} VERTEX TOTAL_TASKSDURATION_SECONDS CPU_TIME_MILLIS INPUT_RECORDS OUTPUT_RECORDS Map 1 62 26.41 1,779,510 211,978,502 60,628,390 Map 5 14.28 6,950 138,098 138,098 Map 6 12.44 3,910 31 31 Reducer 2 2 22.69 61,320 60,628,390 69,182 Reducer 3 12.63 3,910 69,182 100 Reducer 4 11.01 1,180 100 100 {code} Query {code} explain select i_item_desc ,i_category ,i_class ,i_current_price ,i_item_id ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratio from web_sales ,item ,date_dim where web_sales.ws_item_sk = item.i_item_sk and item.i_category in ('Jewelry', 'Sports', 'Books') and web_sales.ws_sold_date_sk = date_dim.d_date_sk and date_dim.d_date between '2001-01-12' and '2001-02-11' group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100 {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Reducer 4 - Reducer 3 (SIMPLE_EDGE) DagName: mmokhtar_20141019164343_854cb757-01bd-40cb-843e-9ada7c5e6f38:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_sales filterExpr: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ws_item_sk is not null (type: boolean) Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_item_sk (type: int), ws_ext_sales_price (type: float), ws_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 21594638446 Data size: 172746300152 Basic stats:
[jira] [Commented] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193534#comment-14193534 ] Hive QA commented on HIVE-7576: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678712/HIVE-7576.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1601/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1601/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1601/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestSSL.java' Reverted 'itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java' Reverted 'service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java' Reverted 'service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1636066. At revision 1636066. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12678712 - PreCommit-HIVE-TRUNK-Build Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc.
[jira] [Commented] (HIVE-8689) handle overflows in statistics better
[ https://issues.apache.org/jira/browse/HIVE-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193537#comment-14193537 ] Gunther Hagleitner commented on HIVE-8689: -- +1 for .14 handle overflows in statistics better - Key: HIVE-8689 URL: https://issues.apache.org/jira/browse/HIVE-8689 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.14.0 Attachments: HIVE-8689.01.patch, HIVE-8689.02.patch, HIVE-8689.patch Improve overflow checks in StatsAnnotation optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8313) Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-8313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193540#comment-14193540 ] Gopal V commented on HIVE-8313: --- [~mithun]: are you planning to include this for 0.14? This would be a good addition. Optimize evaluation for ExprNodeConstantEvaluator and ExprNodeNullEvaluator --- Key: HIVE-8313 URL: https://issues.apache.org/jira/browse/HIVE-8313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-8313.1.patch, HIVE-8313.2.patch Consider the following query: {code:sql} SELECT foo, bar, goo, id FROM myTable WHERE id IN ( 'A', 'B', 'C', 'D', ... , 'ZZ' ); {code} One finds that when the IN clause has several thousand elements (and the table has several million rows), the query above takes orders-of-magnitude longer to run on Hive 0.12 than say Hive 0.10. I have a possibly incomplete fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
Xiaobing Zhou created HIVE-8695: --- Summary: TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HIVE-8695: Description: repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message, but '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are returned instead. TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message, but '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are returned instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Status: Open (was: Patch Available) Ok, HIVE-8394.2.patch assumes FileOutputCommitters. Must switch to using the {{baseDynamicCommitters}} list instead. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1, 0.12.0, 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
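For what it's worth, the singleton option floated in the description above would look roughly like the sketch below: a process-wide registry keyed by task attempt, where removal doubles as the teardown needed so container reuse (e.g. under Tez) does not leak state. The class is hypothetical and is not the committed fix.
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry sharing writer state with the committer across cloned configs. */
public final class TaskCommitterStateRegistry {

  private static final TaskCommitterStateRegistry INSTANCE = new TaskCommitterStateRegistry();

  // attempt id -> serialized OutputJobInfo strings produced by the writer
  private final Map<String, List<String>> jobInfoByAttempt =
      new ConcurrentHashMap<String, List<String>>();

  private TaskCommitterStateRegistry() {}

  public static TaskCommitterStateRegistry get() {
    return INSTANCE;
  }

  /** Called from the RecordWriter when it closes. */
  public void register(String attemptId, List<String> serializedJobInfos) {
    jobInfoByAttempt.put(attemptId, serializedJobInfos);
  }

  /** Called from the OutputCommitter; removing the entry doubles as teardown. */
  public List<String> consume(String attemptId) {
    return jobInfoByAttempt.remove(attemptId);
  }
}
{code}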
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.3.patch Updated patch. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HIVE-8695: Attachment: HIVE-8695.1.patch After check, this is a result of HIVE-8557. Made a patch. Can anyone please review it? Thanks! TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch repo steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} , it fails since '*Failed to validate proxy privilege*' is expected error message and cause message, however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are the returned exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193572#comment-14193572 ] Xiaobing Zhou commented on HIVE-8695: - [~thejas] is it safe to do this change in this patch, since you were working on HIVE-8557? Thanks! TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch repo steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} , it fails since '*Failed to validate proxy privilege*' is expected error message and cause message, however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are the returned exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7576) Add PartitionSpec support in HCatClient API
[ https://issues.apache.org/jira/browse/HIVE-7576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7576: --- Attachment: HIVE-7576.3.patch Looks like there were a couple more changes that went in to TestHCatClient yesterday that made a rebase necessary. Rebased and reuploaded .3.patch. Add PartitionSpec support in HCatClient API --- Key: HIVE-7576 URL: https://issues.apache.org/jira/browse/HIVE-7576 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Fix For: 0.14.0 Attachments: HIVE-7576.1.patch, HIVE-7576.2.patch, HIVE-7576.3.patch HIVE-7223 adds support for PartitionSpecs in Hive Metastore. The HCatClient API must add support to fetch partitions, add partitions, etc. using PartitionSpec semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8656) CBO: auto_join_filters fails
[ https://issues.apache.org/jira/browse/HIVE-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193577#comment-14193577 ] Julian Hyde commented on HIVE-8656: --- I have updated CALCITE-448's description, and have (finally) found a repro case in pure Calcite. I think it is a minor issue, now that TypeConverter has been fixed in Hive. CBO: auto_join_filters fails Key: HIVE-8656 URL: https://issues.apache.org/jira/browse/HIVE-8656 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Sergey Shelukhin Assignee: Julian Hyde Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8656.patch Haven't looked why yet -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8584) Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
[ https://issues.apache.org/jira/browse/HIVE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou resolved HIVE-8584. - Resolution: Invalid Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux - Key: HIVE-8584 URL: https://issues.apache.org/jira/browse/HIVE-8584 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Minor Attachments: HIVE-8584.1.patch, orc-win-none-1.dump, orc-win-none-2.dump, orc-win-snappy-1.dump, orc-win-snappy-2.dump, orc-win-zlib-1.dump, orc-win-zlib-2.dump, orc_analyze.q repo steps: 1. run query orc_analyze.q 2. hive --orcfiledump target_orc_file_generated run 1 and 2 on PST timezone on Linux, and one more time on other timezone e.g. CST on Windows. Compare two target orc file dumping. Windows orc file is 1 byte shorter than Linux one. That's the case even if running 1 and 2 on Windows for different timezones, however, no problem on Linux. The issue only exists by using ZLIB mode, eventually OS native compression lib is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8584) Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux
[ https://issues.apache.org/jira/browse/HIVE-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193581#comment-14193581 ] Xiaobing Zhou commented on HIVE-8584: - Thanks all for the comments. After deeper investigation, ZLIB mode actually works fine on both platforms; the qtest output is exactly the same on both. Other issues led to the output diff, and those will be tracked in a separate JIRA. I'll mark this as invalid. Setting hive.exec.orc.default.compress to ZLIB will lead to orc file size delta byte(s) shorter on Windows than Linux - Key: HIVE-8584 URL: https://issues.apache.org/jira/browse/HIVE-8584 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Minor Attachments: HIVE-8584.1.patch, orc-win-none-1.dump, orc-win-none-2.dump, orc-win-snappy-1.dump, orc-win-snappy-2.dump, orc-win-zlib-1.dump, orc-win-zlib-2.dump, orc_analyze.q repro steps: 1. run query orc_analyze.q 2. hive --orcfiledump target_orc_file_generated Run 1 and 2 in the PST timezone on Linux, and once more in another timezone, e.g. CST, on Windows. Compare the two ORC file dumps: the Windows ORC file is 1 byte shorter than the Linux one. The same happens when running 1 and 2 on Windows across different timezones; there is no such problem on Linux. The issue only occurs in ZLIB mode, where the OS-native compression library is eventually used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193582#comment-14193582 ] Hive QA commented on HIVE-8680: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12678713/HIVE-8680.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6668 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1602/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1602/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1602/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12678713 - PreCommit-HIVE-TRUNK-Build Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration open to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
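To illustrate why an unbounded binary endpoint can OOM on a stray HTTP request: the first bytes of the request get interpreted as a length prefix, and the server then tries to allocate a buffer of that (huge) size. The guard below is only a generic, hypothetical sketch of the "cap the declared frame size before allocating" idea; it is not Thrift's API or the attached patch.
{code:java}
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical guard: reject length-prefixed frames above a configured cap. */
public final class FrameSizeGuard {

  private FrameSizeGuard() {}

  public static byte[] readFrame(DataInputStream in, int maxBytes) throws IOException {
    int declared = in.readInt();  // length prefix sent by the client
    if (declared < 0 || declared > maxBytes) {
      // Refuse before allocating, instead of OOM'ing on a huge or garbage
      // length (e.g. the leading bytes of an HTTP request read as an int).
      throw new IOException("Frame of " + declared + " bytes exceeds limit " + maxBytes);
    }
    byte[] frame = new byte[declared];
    in.readFully(frame);
    return frame;
  }
}
{code}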
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: (was: HIVE-8394.3.patch) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7276) BaseSemanticAnalyzer.unescapeSQLString fails to parse Windows like path
[ https://issues.apache.org/jira/browse/HIVE-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou resolved HIVE-7276. - Resolution: Cannot Reproduce Resolved it since it's not reproducible any more. BaseSemanticAnalyzer.unescapeSQLString fails to parse Windows like path --- Key: HIVE-7276 URL: https://issues.apache.org/jira/browse/HIVE-7276 Project: Hive Issue Type: Bug Components: Query Processor, Windows Affects Versions: 0.13.0 Environment: Windows Server 2008 R2 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical BaseSemanticAnalyzer.unescapeSQLString fails to parse windows-like path, e.g. C:\Users\xzhou\hworks. This will cause a large quantity of queries on windows to fail. For example, 'C:\Users\xzhou\hworks\workspace\hwx-hive-ws\hive\hcatalog\core\target\tmp\hive-junit-960740885870900' will be parsed as 'C:Usersxzhouhworksworkspacehwx-hive-wshivehcatalogcore arget mphive-junit-960740885870900', since \ is interpreted as start char in unicode string, e.g. \002 for delimiter, and thus swallowed. \0, \b, \n, \r, \t, \Z, and so on within normal Windows like path will also be swallowed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.3.patch Minor logging adjustment. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} that produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803: {code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE} @Override public void commitTask(TaskAttemptContext context) throws IOException { String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO); if (!dynamicPartitioningUsed) { //See HCATALOG-499 FileOutputFormatContainer.setWorkOutputPath(context); getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context)); } else if (jobInfoStr != null) { ArrayListString jobInfoList = (ArrayListString)HCatUtil.deserialize(jobInfoStr); org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context); for (String jobStr : jobInfoList) { OutputJobInfo localJobInfo = (OutputJobInfo)HCatUtil.deserialize(jobStr); FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext); committer.commitTask(currTaskContext); } } } {code} The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193588#comment-14193588 ] Szehon Ho commented on HIVE-8680: - +1 thanks Brock Set Max Message for Binary Thrift endpoints --- Key: HIVE-8680 URL: https://issues.apache.org/jira/browse/HIVE-8680 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-8680.patch, HIVE-8680.patch Thrift has a configuration option to restrict incoming message size. If we configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7511) Hive: output is incorrect if there are UTF-8 characters in where clause of a hive select query.
[ https://issues.apache.org/jira/browse/HIVE-7511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193589#comment-14193589 ] Xiaobing Zhou commented on HIVE-7511: - This can be resolved by applying java options, like -Dfile.encoding=UTF-8. Setting it as env variable(_JAVA_OPTIONS=-Dfile.encoding=UTF-8) or passing as java start argument both work fine. Hive: output is incorrect if there are UTF-8 characters in where clause of a hive select query. --- Key: HIVE-7511 URL: https://issues.apache.org/jira/browse/HIVE-7511 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Environment: Windows Server 2008 R2 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Attachments: HIVE-7511.1.patch When we put UTF-8 characters in where clause of a hive query the results are empty for where content like '%丄%' and results contain all rows for where content not like '%丄%'; even when few rows contain this character. Steps to reproduce: 1. Save a file called data.txt in the root container. The contents of the files are as follows. 190 丄f齄啊c狛䶴h䶴c狝 899 d狜狜㐁geg阿狚ea䶴eead狜e 137 齄鼾h狝ge㐀狛g狚阿 21﨩﨩e㐀c狛鼾d䶴﨨 767 﨩c﨩g狜㐁狜狛齄阿﨩狚齄﨨䶵狝﨨 281 﨨㐀啊aga啊c狝e鼾鼾 573 㐁䶴hc﨨b狝㐁﨩䶴狜丄hc齄 966 䶴丄狜﨨e狝eb狜㐁c㐀鼾﨩丄ga狚丄 565 䶵㐀﨩㐀bb狛ehd丄ea丄㐀 778 﨩㐁阿﨨狚bbea丄䶵丄狚鼾狚a䶵 363 gd齄a鼾a䶴b㐁㐁fg鼾 822 a阿狜䶵h䶵e狛h﨩gac狜阿㐀啊b 338 b齄㐁ff阿e狜e㐀ba齄 2. Execute the following queries to setup the table. a. CREATE TABLE hivetable(row INT, content STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' t' LOCATION '/hivetable'; b. LOAD DATA INPATH 'wasb:///data.txt' OVERWRITE INTO TABLE hivetable; 3. create a query file query.hql with following contents INSERT OVERWRITE DIRECTORY 'wasb:///hiveoutput' select * from hivetable where content like '%丄%'; 4. even though few rows contains this character the output is empty. 5. change the contents of query.hql to INSERT OVERWRITE DIRECTORY 'wasb:///hiveoutput' select * from hivetable where content not like '%丄%'; 6. The output contains all rows including those containing the given character. 7. Similar results are observed when using where content = '丄f齄啊c狛䶴h䶴c狝'; 8. We get expected results when using where content like '%a%'; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
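The workaround above works because Java APIs that omit an explicit charset fall back to the file.encoding default. The small, self-contained demonstration below (not Hive code) shows the effect: the same UTF-8 bytes only match a multi-byte literal when decoded as UTF-8.
{code:java}
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
  public static void main(String[] args) {
    byte[] utf8Bytes = "丄f齄啊c".getBytes(StandardCharsets.UTF_8);

    String decodedWithDefault = new String(utf8Bytes);  // uses file.encoding
    String decodedAsUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);

    System.out.println("default charset: " + Charset.defaultCharset());
    // On a non-UTF-8 default (common on Windows), the first check prints false.
    System.out.println("matches with default charset: " + decodedWithDefault.contains("丄"));
    System.out.println("matches as UTF-8: " + decodedAsUtf8.contains("丄"));
  }
}
{code}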
[jira] [Updated] (HIVE-8695) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages
[ https://issues.apache.org/jira/browse/HIVE-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-8695: Status: Patch Available (was: Open) TestJdbcWithMiniKdc.testNegativeTokenAuth fails on non-expected error messages -- Key: HIVE-8695 URL: https://issues.apache.org/jira/browse/HIVE-8695 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Xiaobing Zhou Attachments: HIVE-8695.1.patch Repro steps: {noformat} run mvn test -Phadoop-2 -Dtest=TestJdbcWithMiniKdc#testNegativeTokenAuth {noformat} It fails because '*Failed to validate proxy privilege*' is the expected error and cause message; however, '*Error retrieving delegation token for user*' and '*is not allowed to impersonate*' are what the thrown exception actually contains. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8394) HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss.
[ https://issues.apache.org/jira/browse/HIVE-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-8394: --- Attachment: HIVE-8394.4.patch Now with more logging, and ASF header. HIVE-7803 doesn't handle Pig MultiQuery, can cause data-loss. - Key: HIVE-8394 URL: https://issues.apache.org/jira/browse/HIVE-8394 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.14.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Priority: Critical Attachments: HIVE-8394.1.patch, HIVE-8394.2.patch, HIVE-8394.3.patch, HIVE-8394.4.patch We've found situations in production where Pig queries using {{HCatStorer}}, dynamic partitioning and {{opt.multiquery=true}} produce partitions in the output table, but the corresponding directories have no data files (in spite of Pig reporting non-zero records written to HDFS). I don't yet have a distilled test-case for this. Here's the code from FileOutputCommitterContainer after HIVE-7803:
{code:java|title=FileOutputCommitterContainer.java|borderStyle=dashed|titleBGColor=#F7D6C1|bgColor=#CE}
@Override
public void commitTask(TaskAttemptContext context) throws IOException {
    String jobInfoStr = context.getConfiguration().get(FileRecordWriterContainer.DYN_JOBINFO);
    if (!dynamicPartitioningUsed) {
        // See HCATALOG-499
        FileOutputFormatContainer.setWorkOutputPath(context);
        getBaseOutputCommitter().commitTask(HCatMapRedUtil.createTaskAttemptContext(context));
    } else if (jobInfoStr != null) {
        ArrayList<String> jobInfoList = (ArrayList<String>) HCatUtil.deserialize(jobInfoStr);
        org.apache.hadoop.mapred.TaskAttemptContext currTaskContext = HCatMapRedUtil.createTaskAttemptContext(context);
        for (String jobStr : jobInfoList) {
            OutputJobInfo localJobInfo = (OutputJobInfo) HCatUtil.deserialize(jobStr);
            FileOutputCommitter committer = new FileOutputCommitter(new Path(localJobInfo.getLocation()), currTaskContext);
            committer.commitTask(currTaskContext);
        }
    }
}
{code}
The serialized jobInfoList can't be retrieved, and hence the commit never completes. This is because Pig's MapReducePOStoreImpl deliberately clones both the TaskAttemptContext and the contained Configuration instance, thus separating the Configuration instances passed to {{FileOutputCommitterContainer::commitTask()}} and {{FileRecordWriterContainer::close()}}. Anything set by the RecordWriter is unavailable to the Committer. One approach would have been to store state in the FileOutputFormatContainer. But that won't work, since this is constructed via reflection in HCatOutputFormat (itself constructed via reflection by PigOutputFormat via HCatStorer). There's no guarantee that the instance is preserved. My only recourse seems to be to use a Singleton to store shared state. I'm loath to indulge in this brand of shenanigans. (Statics and container-reuse in Tez might not play well together, for instance.) It might work if we're careful about tearing down the singleton. Any other ideas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
Mithun Radhakrishnan created HIVE-8696: -- Summary: HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient. Key: HIVE-8696 URL: https://issues.apache.org/jira/browse/HIVE-8696 Project: Hive Issue Type: Bug Components: HCatalog, Metastore Affects Versions: 0.13.1, 0.12.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API who log in through keytabs will fail without retry when their TGTs expire. The fix is inbound. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
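A rough sketch of the direction the description implies, presumably via org.apache.hadoop.hive.metastore.RetryingMetaStoreClient; the exact getProxy(...) overload varies across Hive versions, so treat the call below as approximate rather than as the actual HIVE-8696 patch:
{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaHookLoader;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;

public class RetryingClientSketch {
  // Builds the metastore client through the retrying proxy instead of
  // instantiating HiveMetaStoreClient directly, so transient failures
  // (e.g. while a keytab login re-acquires an expired TGT) are retried.
  public static IMetaStoreClient createClient(HiveConf conf) throws Exception {
    // A null HiveMetaHookLoader is used here purely for brevity.
    return RetryingMetaStoreClient.getProxy(
        conf, (HiveMetaHookLoader) null, HiveMetaStoreClient.class.getName());
  }
}
{code}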