[jira] [Commented] (HIVE-12536) Cannot handle dash (-) on the metastore database name

2015-11-27 Thread Marco Ceppi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030081#comment-15030081
 ] 

Marco Ceppi commented on HIVE-12536:


It seems that table and database names should be wrapped in backticks (`). 
All queries should do this.
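
A minimal sketch of the quoting suggested above, written in Python purely for illustration. It is not the metastore's or DataNucleus's code, and the helper names quote_identifier and qualified_table are hypothetical. It assumes MySQL's rule that a backtick inside a quoted identifier is escaped by doubling it.

```python
def quote_identifier(name: str) -> str:
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def qualified_table(database: str, table: str) -> str:
    """Build a database-qualified table reference with both parts quoted."""
    return quote_identifier(database) + "." + quote_identifier(table)


# Left unquoted, the parser stops at the dash in apache-hive (the error
# in this issue is reported near '-hive'); quoted, the name is one token.
print(qualified_table("apache-hive", "SEQUENCE_TABLE"))
```

Quoting every identifier unconditionally, rather than only names that contain special characters, keeps the generated SQL uniform and avoids having to maintain a list of characters that need quoting.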

> Cannot handle dash (-) on the metastore database name
> -
>
> Key: HIVE-12536
> URL: https://issues.apache.org/jira/browse/HIVE-12536
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.1
>Reporter: Konstantinos Tsakalozos
>Priority: Minor
>
> If you set up a database for the metastore with a dash in its name (e.g., 
> apache-hive), the hive client fails.
> Here is the db connection string. The database is apache-hive.
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>jdbc:mysql://10.0.3.166/apache-hive</value>
>   <description>JDBC connect string for a JDBC metastore</description>
> </property>
> Here is the exception you get when starting hive:
> root@jackal-local-machine-4:/home/ubuntu/resources/hive-x86_64# su hive -c 
> hive
> Logging initialized using configuration in 
> jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
>   ... 7 more
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
>   ... 13 more
> Caused by: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) 
> : You have an error in your SQL syntax; check the manual that corresponds to 
> your MySQL server version for the right syntax to use near 
> '-hive.`SEQUENCE_TABLE` WHERE 
> `SEQUENCE_NAME`='org.apache.hadoop.hive.metastore.m' at line 1
> NestedThrowables:
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near '-hive.`SEQUENCE_TABLE` WHERE 
> `SEQUENCE_NAME`='org.apache.hadoop.hive.metastore.m' at line 1
>   at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:521)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
>   at 

[jira] [Commented] (HIVE-11890) Create ORC module

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030190#comment-15030190
 ] 

Hive QA commented on HIVE-11890:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774204/HIVE-11890.patch

{color:green}SUCCESS:{color} +1 due to 27 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9866 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6146/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6146/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6146/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774204 - PreCommit-HIVE-TRUNK-Build

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12512) Include driver logs in execution-level Operation logs

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030034#comment-15030034
 ] 

Hive QA commented on HIVE-12512:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774194/HIVE-12512.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9865 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6144/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6144/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6144/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774194 - PreCommit-HIVE-TRUNK-Build

> Include driver logs in execution-level Operation logs
> -
>
> Key: HIVE-12512
> URL: https://issues.apache.org/jira/browse/HIVE-12512
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Minor
> Attachments: HIVE-12512.patch
>
>
> When {{hive.server2.logging.operation.level}} is set to {{EXECUTION}} 
> (default), operation logs do not include Driver logs, which contain useful 
> info like total number of jobs launched, stage getting executed, etc. that 
> help track high-level progress. It only adds a few more lines to the output.
> {code}
> 15/11/24 14:09:12 INFO ql.Driver: Semantic Analysis Completed
> 15/11/24 14:09:12 INFO ql.Driver: Starting 
> command(queryId=hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1): 
> select count(*) from sample_08
> 15/11/24 14:09:12 INFO ql.Driver: Query ID = 
> hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1
> 15/11/24 14:09:12 INFO ql.Driver: Total jobs = 1
> ...
> 15/11/24 14:09:40 INFO ql.Driver: MapReduce Jobs Launched:
> 15/11/24 14:09:40 INFO ql.Driver: Stage-Stage-1: Map: 1  Reduce: 1   
> Cumulative CPU: 3.58 sec   HDFS Read: 52956 HDFS Write: 4 SUCCESS
> 15/11/24 14:09:40 INFO ql.Driver: Total MapReduce CPU Time Spent: 3 seconds 
> 580 msec
> 15/11/24 14:09:40 INFO ql.Driver: OK
> {code}





[jira] [Updated] (HIVE-12505) Insert overwrite in same encrypted zone silently fails to remove some existing files

2015-11-27 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-12505:
---
Attachment: HIVE-12505.2.patch

Fix the failed tests.

> Insert overwrite in same encrypted zone silently fails to remove some 
> existing files
> 
>
> Key: HIVE-12505
> URL: https://issues.apache.org/jira/browse/HIVE-12505
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-12505.1.patch, HIVE-12505.2.patch, HIVE-12505.patch
>
>
> With HDFS Trash enabled but its encryption zone lower than Hive data 
> directory, insert overwrite command silently fails to trash the existing 
> files during overwrite, which could lead to unexpected incorrect results 
> (more rows returned than expected)





[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030036#comment-15030036
 ] 

Hive QA commented on HIVE-11878:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774196/HIVE-11878.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6145/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6145/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6145/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6145/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when 
(outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when 
(outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774196 - PreCommit-HIVE-TRUNK-Build

> ClassNotFoundException can possibly occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878 ClassLoader Issues when Registering 
> Jars.pptx, HIVE-11878.2.patch, HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_approach3_per_session_clasloader.patch, 
> HIVE-11878_approach3_with_review_comments.patch, 
> HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console, Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classloader is the 
> current ThreadContextClassLoader. Once the URLClassLoader is created, Hive 
> sets that as the current ThreadContextClassLoader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> 

[jira] [Commented] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029863#comment-15029863
 ] 

Hive QA commented on HIVE-12341:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774455/HIVE-12341.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9865 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6143/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6143/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6143/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774455 - PreCommit-HIVE-TRUNK-Build

> LLAP: add security to daemon protocol endpoint (excluding shuffle)
> --
>
> Key: HIVE-12341
> URL: https://issues.apache.org/jira/browse/HIVE-12341
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, 
> HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.04.patch, 
> HIVE-12341.05.patch, HIVE-12341.patch
>
>






[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work

2015-11-27 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HIVE-12537:
---
Attachment: orcdump.txt

> RLEv2 doesn't seem to work
> --
>
> Key: HIVE-12537
> URL: https://issues.apache.org/jira/browse/HIVE-12537
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.1
>Reporter: Bogdan Raducanu
>  Labels: orc, orcfile
> Attachments: Main.java, orcdump.txt
>
>
> Perhaps I'm doing something wrong, or it is actually working as expected.
> Putting 1 million constant int32 values produces an ORC file of 1MB. 
> Surprisingly, 1 million consecutive ints produces a much smaller file.
> Code and FileDump attached.
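
To illustrate why the reported size is surprising, here is a minimal sketch of delta-based run grouping in Python. It is a simplified stand-in and NOT ORC's actual RLEv2 encoding; the function name delta_runs and the (first value, delta, length) run layout are assumptions made for this example. Under this scheme both a constant column and a consecutive column collapse into a single run.

```python
from typing import Iterable, List, Tuple

Run = Tuple[int, int, int]  # (first value, constant delta, run length)


def delta_runs(values: Iterable[int]) -> List[Run]:
    """Greedily group integers into runs that share one constant delta."""
    runs: List[Run] = []
    start = delta = prev = None
    length = 0
    for v in values:
        if length == 0:
            # First value of a new run.
            start, prev, length = v, v, 1
        elif length == 1:
            # Second value fixes the delta for this run.
            delta, prev, length = v - prev, v, 2
        elif v - prev == delta:
            # Value continues the current run.
            prev, length = v, length + 1
        else:
            # Delta changed: close the run and start a new one at v.
            runs.append((start, delta, length))
            start, prev, delta, length = v, v, None, 1
    if length:
        # A trailing single-value run has no delta yet; record it as 0.
        runs.append((start, delta or 0, length))
    return runs


print(delta_runs([123] * 1_000_000))   # one run with delta 0
print(delta_runs(range(1_000_000)))    # one run with delta 1
```

If an encoder behaved like this sketch, one million constant values would take a handful of bytes, which is why a 1MB DATA stream for a constant column looks wrong.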





[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work

2015-11-27 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HIVE-12537:
---
Description: 
Perhaps I'm doing something wrong, or it is actually working as expected.

Putting 1 million constant int32 values produces an ORC file of 1MB. 
Surprisingly, 1 million consecutive ints produces a much smaller file.
Code and FileDump attached.






  was:
Putting 1 million constant int32 values produces an ORC file of 1MB.
Perhaps I'm doing something wrong or is actually working as expected.
Will attach code.
Output from FileDump:



Rows: 100
Compression: NONE
Type: int

Stripe Statistics:
  Stripe 1:
Column 0: count: 100 hasNull: false min: 123 max: 123 sum: 12300

File Statistics:
  Column 0: count: 100 hasNull: false min: 123 max: 123 sum: 12300

Stripes:
  Stripe: offset: 3 data: 1003847 rows: 100 tail: 41 index: 2871
Stream: column 0 section ROW_INDEX start: 3 length 2871
Stream: column 0 section DATA start: 2874 length 1003847
Encoding column 0: DIRECT_V2

File length: 1006860 bytes
Padding length: 0 bytes
Padding ratio: 0%



> RLEv2 doesn't seem to work
> --
>
> Key: HIVE-12537
> URL: https://issues.apache.org/jira/browse/HIVE-12537
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.1
>Reporter: Bogdan Raducanu
>  Labels: orc, orcfile
> Attachments: Main.java, orcdump.txt
>
>
> Perhaps I'm doing something wrong, or it is actually working as expected.
> Putting 1 million constant int32 values produces an ORC file of 1MB. 
> Surprisingly, 1 million consecutive ints produces a much smaller file.
> Code and FileDump attached.





[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030341#comment-15030341
 ] 

Hive QA commented on HIVE-11531:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774231/HIVE-11531.02.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 9869 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_offset_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_nvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_string_concat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_decimal_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_offset_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_date_funcs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_binary_join_groupby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6148/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6148/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6148/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774231 - PreCommit-HIVE-TRUNK-Build

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -
>
> Key: HIVE-11531
> URL: https://issues.apache.org/jira/browse/HIVE-11531
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Hui Zheng
> Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, 
> HIVE-11531.WIP.2.patch, HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first-class support for 
> "skip" to the existing limit, or improve ROW_NUMBER for better performance.
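
A rough sketch, in Python, of why a top-N style optimization can also serve LIMIT X,Y on ordered output: only the X + Y smallest rows can appear in the requested page. This is not Hive's ReduceSink implementation; the function name limit_with_offset and the use of plain integers as rows are assumptions for illustration.

```python
import heapq
from typing import Iterable, List


def limit_with_offset(rows: Iterable[int], offset: int, count: int) -> List[int]:
    """Return rows [offset, offset + count) of the ascending order of `rows`."""
    # Only the offset + count smallest rows can appear in the page, so a
    # bounded top-N pass is enough; the full input is never sorted.
    top = heapq.nsmallest(offset + count, rows)
    return top[offset:]


# Equivalent in spirit to: SELECT ... ORDER BY v LIMIT 2,3
print(limit_with_offset([9, 3, 7, 1, 8, 2], offset=2, count=3))
```

The bounded pass keeps memory proportional to X + Y rather than to the input size, though deep pages (large X) still grow the bound.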





[jira] [Commented] (HIVE-12413) Default mode for hive.mapred.mode should be strict

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030384#comment-15030384
 ] 

Hive QA commented on HIVE-12413:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774277/HIVE-12413.3.patch

{color:green}SUCCESS:{color} +1 due to 897 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 144 failed/errored test(s), 9865 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_having2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join41
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join43
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_grp_diff_keys
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_hive_626
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_star
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_subquery2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoins_mixed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_join_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptfgroupbyjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_runtime_skewjoin_mapjoin_spark
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_fetchwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_nonvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_folder_constants
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin

[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030343#comment-15030343
 ] 

Hive QA commented on HIVE-12184:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774254/HIVE-12184.9.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6149/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6149/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6149/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6149/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when (outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/offset_limit.q
Removing ql/src/test/queries/clientpositive/offset_limit_global_optimizer.q
Removing ql/src/test/queries/clientpositive/offset_limit_ppd_optimizer.q
Removing ql/src/test/queries/clientpositive/vectorization_offset_limit.q
Removing ql/src/test/results/clientpositive/offset_limit.q.out
Removing ql/src/test/results/clientpositive/offset_limit_global_optimizer.q.out
Removing ql/src/test/results/clientpositive/offset_limit_ppd_optimizer.q.out
Removing ql/src/test/results/clientpositive/vectorization_offset_limit.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when (outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774254 - PreCommit-HIVE-TRUNK-Build

> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use
> ---
>
> Key: HIVE-12184
> URL: https://issues.apache.org/jira/browse/HIVE-12184
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.2.1
>Reporter: Lenni Kuff
>Assignee: Naveen Gangam
> Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, 
> HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, 
> HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.9.patch, HIVE-12184.patch
>
>
> DESCRIBE of fully qualified table fails when db and table name match and 
> non-default database is in use.
> Repro:
> {code}
> 0: jdbc:hive2://localhost:10000/default> create database foo;
> No rows affected (0.116 seconds)
> 0: 

[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030344#comment-15030344
 ] 

Hive QA commented on HIVE-12055:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774265/HIVE-12055.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6150/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6150/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6150/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6150/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when (outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 7984738 HIVE-12465: Hive might produce wrong results when (outer) joins are merged (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774265 - PreCommit-HIVE-TRUNK-Build

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030291#comment-15030291
 ] 

Hive QA commented on HIVE-11107:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774221/HIVE-11107.3.patch

{color:green}SUCCESS:{color} +1 due to 50 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 9916 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query12
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query15
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query18
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query19
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query20
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query22
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query25
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query26
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query27
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query29
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query3
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query40
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query42
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query43
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query45
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query46
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query50
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query52
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query54
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query55
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query68
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query7
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query70
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query72
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query75
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query76
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query79
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query80
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query82
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query84
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query90
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query93
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query94
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query96
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query97
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6147/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6147/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6147/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 42 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774221 - PreCommit-HIVE-TRUNK-Build

> Support for Performance regression test suite with TPCDS
> 
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
>  Issue Type: Task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, 
> HIVE-11107.3.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the 
> optimizer 

[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work

2015-11-27 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HIVE-12537:
---
Description: 
Putting 1 million constant int32 values produces an ORC file of 1MB.
Perhaps I'm doing something wrong, or it is actually working as expected.
Will attach code.
Output from FileDump:



Rows: 1000000
Compression: NONE
Type: int

Stripe Statistics:
  Stripe 1:
    Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

File Statistics:
  Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

Stripes:
  Stripe: offset: 3 data: 1003847 rows: 1000000 tail: 41 index: 2871
    Stream: column 0 section ROW_INDEX start: 3 length 2871
    Stream: column 0 section DATA start: 2874 length 1003847
    Encoding column 0: DIRECT_V2

File length: 1006860 bytes
Padding length: 0 bytes
Padding ratio: 0%


  was:
Putting 1 million constant int32 values produces an ORC file of 1MB.
Perhaps I'm doing something wrong or is actually working as expect.
Will attach code.
Output from FileDump:



Rows: 1000000
Compression: NONE
Type: int

Stripe Statistics:
  Stripe 1:
    Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

File Statistics:
  Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

Stripes:
  Stripe: offset: 3 data: 1003847 rows: 1000000 tail: 41 index: 2871
    Stream: column 0 section ROW_INDEX start: 3 length 2871
    Stream: column 0 section DATA start: 2874 length 1003847
    Encoding column 0: DIRECT_V2

File length: 1006860 bytes
Padding length: 0 bytes
Padding ratio: 0%



> RLEv2 doesn't seem to work
> --
>
> Key: HIVE-12537
> URL: https://issues.apache.org/jira/browse/HIVE-12537
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.1
>Reporter: Bogdan Raducanu
>
> Putting 1 million constant int32 values produces an ORC file of 1MB.
> Perhaps I'm doing something wrong, or it is actually working as expected.
> Will attach code.
> Output from FileDump:
> Rows: 1000000
> Compression: NONE
> Type: int
> Stripe Statistics:
>   Stripe 1:
>     Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000
> File Statistics:
>   Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000
> Stripes:
>   Stripe: offset: 3 data: 1003847 rows: 1000000 tail: 41 index: 2871
>     Stream: column 0 section ROW_INDEX start: 3 length 2871
>     Stream: column 0 section DATA start: 2874 length 1003847
>     Encoding column 0: DIRECT_V2
> File length: 1006860 bytes
> Padding length: 0 bytes
> Padding ratio: 0%
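
As a rough sanity check on the numbers in the FileDump output, the arithmetic can be sketched as below. This is an illustration only, not part of the report and not ORC code: the class and method names are hypothetical, the row count is taken as the 1 million values stated in the description, and the cap of 512 values per run is an assumption made for the toy estimate. The toy run count is not ORC's RLEv2 algorithm; it only shows how few runs a constant column needs under any simple run-length scheme.

```java
final class OrcSizeSketch {
    // Length of the DATA stream, copied from the FileDump output.
    static final long DATA_STREAM_BYTES = 1_003_847L;
    // "1 million constant int32 values", as stated in the description.
    static final long ROW_COUNT = 1_000_000L;

    // Average number of DATA-stream bytes spent per stored value.
    static double bytesPerValue(long streamBytes, long rows) {
        return (double) streamBytes / rows;
    }

    // Toy estimate: how many (value, count) runs a constant column needs
    // when a single run may cover at most maxRun values (ceiling division).
    static long runsNeeded(long rows, long maxRun) {
        return (rows + maxRun - 1) / maxRun;
    }

    public static void main(String[] args) {
        // Roughly one byte per value: no run-length saving is visible.
        System.out.println("bytes per value: "
                + bytesPerValue(DATA_STREAM_BYTES, ROW_COUNT));
        // Assumed cap of 512 values per run, for illustration only.
        System.out.println("runs needed: " + runsNeeded(ROW_COUNT, 512L)); // prints 1954
    }
}
```

About one byte per value for a column in which every value is 123 suggests the runs are not being collapsed. Even a few bytes per run over 1954 runs would be on the order of kilobytes rather than 1MB.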





[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work

2015-11-27 Thread Bogdan Raducanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bogdan Raducanu updated HIVE-12537:
---
Attachment: Main.java

> RLEv2 doesn't seem to work
> --
>
> Key: HIVE-12537
> URL: https://issues.apache.org/jira/browse/HIVE-12537
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.2.1
>Reporter: Bogdan Raducanu
> Attachments: Main.java
>
>
> Putting 1 million constant int32 values produces an ORC file of 1MB.
> Perhaps I'm doing something wrong, or it is actually working as expected.
> Will attach code.
> Output from FileDump:
> Rows: 1000000
> Compression: NONE
> Type: int
> Stripe Statistics:
>   Stripe 1:
>     Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000
> File Statistics:
>   Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000
> Stripes:
>   Stripe: offset: 3 data: 1003847 rows: 1000000 tail: 41 index: 2871
>     Stream: column 0 section ROW_INDEX start: 3 length 2871
>     Stream: column 0 section DATA start: 2874 length 1003847
>     Encoding column 0: DIRECT_V2
> File length: 1006860 bytes
> Padding length: 0 bytes
> Padding ratio: 0%





[jira] [Commented] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-11-27 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030405#comment-15030405
 ] 

Nemon Lou commented on HIVE-12538:
--

After debugging, I found that the operation conf object that SparkUtilities
uses to detect a configuration change is a different object from the session
conf.
And the session conf object's getSparkConfigUpdated method always returns true
after a Spark-related config is set.
The code path where SQLOperation copies a new conf object from the session conf:
https://github.com/apache/hive/blob/spark/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L467
{code}
  /**
   * If there are query specific settings to overlay, then create a copy of config
   * There are two cases we need to clone the session config that's being
   * passed to hive driver
   * 1. Async query -
   *    If the client changes a config setting, that shouldn't reflect in the
   *    execution already underway
   * 2. confOverlay -
   *    The query specific settings should only be applied to the query config
   *    and not session
   * @return new configuration
   * @throws HiveSQLException
   */
  private HiveConf getConfigForOperation() throws HiveSQLException {
    HiveConf sqlOperationConf = getParentSession().getHiveConf();
    if (!getConfOverlay().isEmpty() || shouldRunAsync()) {
      // clone the parent session config for this query
      sqlOperationConf = new HiveConf(sqlOperationConf);

      // apply overlay query specific settings, if any
      for (Map.Entry<String, String> confEntry : getConfOverlay().entrySet()) {
        try {
          sqlOperationConf.verifyAndSet(confEntry.getKey(), confEntry.getValue());
        } catch (IllegalArgumentException e) {
          throw new HiveSQLException("Error applying statement specific settings", e);
        }
      }
    }
    return sqlOperationConf;
  }
{code}
The code path where SparkUtilities detects the change and closes the Spark
session:
https://github.com/apache/hive/blob/spark/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java#L122
{code}
  public static SparkSession getSparkSession(HiveConf conf,
      SparkSessionManager sparkSessionManager) throws HiveException {
    SparkSession sparkSession = SessionState.get().getSparkSession();

    // Spark configurations are updated; close the existing session
    if (conf.getSparkConfigUpdated()) {
      sparkSessionManager.closeSession(sparkSession);
      sparkSession = null;
      conf.setSparkConfigUpdated(false);
    }
    sparkSession = sparkSessionManager.getSession(sparkSession, conf, true);
    SessionState.get().setSparkSession(sparkSession);
    return sparkSession;
  }
{code}

It should be easy to reproduce; I will dig into it further.
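
The suspected interaction between the two code paths can be modelled in isolation. The sketch below is a minimal stand-alone illustration, not Hive code: `Conf`, `getSparkSession` and `runQueries` are hypothetical stand-ins for HiveConf, SparkUtilities.getSparkSession and a sequence of queries. It assumes that each operation receives a cloned conf, which in the SQLOperation code quoted earlier in this comment happens only for async queries or a non-empty confOverlay. Under that assumption the "updated" flag is cleared on the clone but never on the session conf, so every query sees it as still set.

```java
final class SparkSessionReuseSketch {
    // Hypothetical stand-in for HiveConf; models only the "updated" flag.
    static final class Conf {
        private boolean sparkConfigUpdated;

        Conf() { }

        // Copy constructor, playing the role of new HiveConf(sessionConf).
        Conf(Conf other) { this.sparkConfigUpdated = other.sparkConfigUpdated; }

        void setSparkProperty() { sparkConfigUpdated = true; }
        boolean getSparkConfigUpdated() { return sparkConfigUpdated; }
        void setSparkConfigUpdated(boolean updated) { sparkConfigUpdated = updated; }
    }

    private static Object currentSession;
    private static int sessionsOpened;

    // Mirrors the shape of the quoted getSparkSession: close on update, then reuse.
    static Object getSparkSession(Conf conf) {
        if (conf.getSparkConfigUpdated()) {
            currentSession = null;             // "close" the existing session
            conf.setSparkConfigUpdated(false); // clears only the object passed in
        }
        if (currentSession == null) {
            currentSession = new Object();     // "open" a new session
            sessionsOpened++;
        }
        return currentSession;
    }

    // Runs n queries after one Spark property change; returns sessions opened.
    static int runQueries(int n, boolean cloneConfPerOperation) {
        currentSession = null;
        sessionsOpened = 0;
        Conf sessionConf = new Conf();
        sessionConf.setSparkProperty(); // e.g. set spark.yarn.queue=QueueA
        for (int i = 0; i < n; i++) {
            Conf operationConf = cloneConfPerOperation ? new Conf(sessionConf) : sessionConf;
            getSparkSession(operationConf);
        }
        return sessionsOpened;
    }

    public static void main(String[] args) {
        System.out.println(runQueries(3, true));  // prints 3: a new session per query
        System.out.println(runQueries(3, false)); // prints 1: the session is reused
    }
}
```

In this model, resetting the flag on the session conf rather than on the per-operation copy is what restores reuse. Whether that is the right fix in Hive itself is not established here.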



> After set spark related config, SparkSession never get reused
> -
>
> Key: HIVE-12538
> URL: https://issues.apache.org/jira/browse/HIVE-12538
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>
> Hive on Spark yarn-cluster mode.
> After setting "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3
> different yarn applications.





[jira] [Updated] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-11-27 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-12538:
-
Description: 
Hive on Spark yarn-cluster mode.
After setting "set spark.yarn.queue=QueueA;",
run the query "select count(*) from test" 3 times and you will find 3
different yarn applications.
Two of the yarn applications are in FINISHED & SUCCEEDED state, and one is in
RUNNING & UNDEFINED state, waiting for the next work.
If you submit one more "select count(*) from test", the third one will move to
FINISHED & SUCCEEDED state and a new yarn application will start up.


  was:
Hive on Spark yarn-cluster mode.
After setting "set spark.yarn.queue=QueueA;",
run the query "select count(*) from test" 3 times and you will find 3
different yarn applications.



> After set spark related config, SparkSession never get reused
> -
>
> Key: HIVE-12538
> URL: https://issues.apache.org/jira/browse/HIVE-12538
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>
> Hive on Spark yarn-cluster mode.
> After setting "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3
> different yarn applications.
> Two of the yarn applications are in FINISHED & SUCCEEDED state, and one is in
> RUNNING & UNDEFINED state, waiting for the next work.
> If you submit one more "select count(*) from test", the third one will move to
> FINISHED & SUCCEEDED state and a new yarn application will start up.





[jira] [Updated] (HIVE-11107) Support for Performance regression test suite with TPCDS

2015-11-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11107:
-
Attachment: HIVE-11107.4.patch

> Support for Performance regression test suite with TPCDS
> 
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
>  Issue Type: Task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, 
> HIVE-11107.3.patch, HIVE-11107.4.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the
> optimizer or any hive code do not yield any unexpected plan changes, i.e.
> the intention is to not run the entire TPCDS query set, but just "explain
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that expected hive 
> optimizations kick in for the queries (for given stats/dataset). If there is 
> a difference in plan within this test suite due to a future commit, it needs 
> to be analyzed and we need to make sure that it is not a regression.
> The test suite can be run in master branch from itests by 
> {code}
> mvn test -Dtest=TestPerfCliDriver -Phadoop-2
> {code}





[jira] [Commented] (HIVE-12538) After set spark related config, SparkSession never get reused

2015-11-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030396#comment-15030396
 ] 

Xuefu Zhang commented on HIVE-12538:


I tried but wasn't able to reproduce.

> After set spark related config, SparkSession never get reused
> -
>
> Key: HIVE-12538
> URL: https://issues.apache.org/jira/browse/HIVE-12538
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>
> Hive on Spark yarn-cluster mode.
> After setting "set spark.yarn.queue=QueueA;",
> run the query "select count(*) from test" 3 times and you will find 3
> different yarn applications.





[jira] [Commented] (HIVE-12515) Clean the SparkCounters related code after remove counter based stats collection[Spark Branch]

2015-11-27 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029680#comment-15029680
 ] 

Chengxiang Li commented on HIVE-12515:
--

{{SparkCounters}} is referenced in many classes in HoS. I'm not sure how much
code has changed since the last merge with master, and we may get many
conflicts during merging if we remove {{SparkCounters}} in master. I think we
can just do this in the spark branch; although
{{org.apache.hadoop.hive.ql.stats.CounterStatsAggregatorSpark}} has been
removed, it should be quite a simple conflict to resolve during the merge.

> Clean the SparkCounters related code after remove counter based stats 
> collection[Spark Branch]
> --
>
> Key: HIVE-12515
> URL: https://issues.apache.org/jira/browse/HIVE-12515
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Xuefu Zhang
>
> As SparkCounters is only used to collect stats, after HIVE-12411 we do
> not need it anymore.





[jira] [Commented] (HIVE-12515) Clean the SparkCounters related code after remove counter based stats collection[Spark Branch]

2015-11-27 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029694#comment-15029694
 ] 

Rui Li commented on HIVE-12515:
---

[~chengxiang li] - If we want to do this in the spark branch, how about first
merging master into spark, so that we can base the patch on HIVE-12411?

I quickly looked through the spark counter code. My understanding is that we
cannot completely remove it, because we still need it to collect operators'
stats with SparkReporter. Is that correct?

> Clean the SparkCounters related code after remove counter based stats 
> collection[Spark Branch]
> --
>
> Key: HIVE-12515
> URL: https://issues.apache.org/jira/browse/HIVE-12515
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Xuefu Zhang
>
> As SparkCounters is only used to collect stats, after HIVE-12411 we do
> not need it anymore.





[jira] [Commented] (HIVE-3937) Hive Profiler

2015-11-27 Thread Vishesh Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029635#comment-15029635
 ] 

Vishesh Garg commented on HIVE-3937:


I'm working in Hive 1.2.1 where I do not see the profiler code base mentioned 
in this issue. Is it still present in Hive?

> Hive Profiler
> -
>
> Key: HIVE-3937
> URL: https://issues.apache.org/jira/browse/HIVE-3937
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, 
> HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt
>
>
> Adding a Hive Profiler implementation which tracks inclusive wall times and 
> call counts of the operators





[jira] [Updated] (HIVE-12535) Dynamic Hash Join: Key references are cyclic

2015-11-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12535:
---
Attachment: philz_26.txt

> Dynamic Hash Join: Key references are cyclic
> 
>
> Key: HIVE-12535
> URL: https://issues.apache.org/jira/browse/HIVE-12535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Gopal V
> Attachments: philz_26.txt
>
>
> MAPJOIN_4227 is inside "Reducer 2", but refers back to "Reducer 2" in its 
> keys.
> {code}
> ||<-Reducer 2 [SIMPLE_EDGE] vectorized, llap
> |   Reduce Output Operator [RS_4189]
> |      key expressions:_col0 (type: string), _col1 (type: int)
> |      Map-reduce partition columns:_col0 (type: string), _col1 (type: int)
> |      sort order:++
> |      Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
> |      value expressions:_col2 (type: double)
> |      Group By Operator [OP_4229]
> |         aggregations:["sum(_col2)"]
> |         keys:_col0 (type: string), _col1 (type: int)
> |         outputColumnNames:["_col0","_col1","_col2"]
> |         Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
> |         Select Operator [OP_4228]
> |            outputColumnNames:["_col0","_col1","_col2"]
> |            Statistics:Num rows: 166 Data size: 26394 Basic stats: COMPLETE Column stats: COMPLETE
> |            Map Join Operator [MAPJOIN_4227]

[jira] [Updated] (HIVE-12535) Dynamic Hash Join: Key references are cyclic

2015-11-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12535:
---
Description: 
MAPJOIN_4227 is inside "Reducer 2", but refers back to "Reducer 2" in its keys. 
It should say "Map 1" there.

{code}
||<-Reducer 2 [SIMPLE_EDGE] vectorized, llap
|   Reduce Output Operator [RS_4189]
|      key expressions:_col0 (type: string), _col1 (type: int)
|      Map-reduce partition columns:_col0 (type: string), _col1 (type: int)
|      sort order:++
|      Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
|      value expressions:_col2 (type: double)
|      Group By Operator [OP_4229]
|         aggregations:["sum(_col2)"]
|         keys:_col0 (type: string), _col1 (type: int)
|         outputColumnNames:["_col0","_col1","_col2"]
|         Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
|         Select Operator [OP_4228]
|            outputColumnNames:["_col0","_col1","_col2"]
|            Statistics:Num rows: 166 Data size: 26394 Basic stats: COMPLETE Column stats: COMPLETE
|            Map Join Operator [MAPJOIN_4227]
|            |  condition map:[{"":"Inner Join 0 to 1"}]
|            |  keys:{"Reducer 2":"KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: int)","Map 

[jira] [Updated] (HIVE-12535) Dynamic Hash Join: Key references are cyclic

2015-11-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12535:
---
Assignee: Jason Dere

> Dynamic Hash Join: Key references are cyclic
> 
>
> Key: HIVE-12535
> URL: https://issues.apache.org/jira/browse/HIVE-12535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Jason Dere
> Attachments: philz_26.txt
>
>
> MAPJOIN_4227 is inside "Reducer 2", but refers back to "Reducer 2" in its 
> keys. It should say "Map 1" there.
> {code}
> ||<-Reducer 2 [SIMPLE_EDGE] vectorized, llap
> |   Reduce Output Operator [RS_4189]
> |      key expressions:_col0 (type: string), _col1 (type: int)
> |      Map-reduce partition columns:_col0 (type: string), _col1 (type: int)
> |      sort order:++
> |      Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
> |      value expressions:_col2 (type: double)
> |      Group By Operator [OP_4229]
> |         aggregations:["sum(_col2)"]
> |         keys:_col0 (type: string), _col1 (type: int)
> |         outputColumnNames:["_col0","_col1","_col2"]
> |         Statistics:Num rows: 83 Data size: 9213 Basic stats: COMPLETE Column stats: COMPLETE
> |         Select Operator [OP_4228]
> |            outputColumnNames:["_col0","_col1","_col2"]
> |            Statistics:Num rows: 166 Data size: 26394 Basic stats: COMPLETE Column stats: COMPLETE
> |            Map Join Operator [MAPJOIN_4227]

[jira] [Updated] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged

2015-11-27 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12465:
---
Attachment: HIVE-12465-branch1.patch

> Hive might produce wrong results when (outer) joins are merged
> --
>
> Key: HIVE-12465
> URL: https://issues.apache.org/jira/browse/HIVE-12465
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12465-branch1.patch, HIVE-12465.01.patch, 
> HIVE-12465.02.patch, HIVE-12465.patch
>
>
> Consider the following query:
> {noformat}
> select * from
>   (select * from tab where tab.key = 0)a
> full outer join
>   (select * from tab_part where tab_part.key = 98)b
> join
>   tab_part c
> on a.key = b.key and b.key = c.key;
> {noformat}
> Hive should execute the full outer join operation (without ON clause) and 
> then the join operation (ON a.key = b.key and b.key = c.key). Instead, it 
> merges both joins, generating the following plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: tab
>             filterExpr: (key = 0) (type: boolean)
>             Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (key = 0) (type: boolean)
>               Statistics: Num rows: 121 Data size: 11374 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: 0 (type: int), value (type: string), ds (type: string)
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 121 Data size: 11374 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col0 (type: int)
>                   sort order: +
>                   Map-reduce partition columns: _col0 (type: int)
>                   Statistics: Num rows: 121 Data size: 11374 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col1 (type: string), _col2 (type: string)
>           TableScan
>             alias: tab_part
>             filterExpr: (key = 98) (type: boolean)
>             Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (key = 98) (type: boolean)
>               Statistics: Num rows: 250 Data size: 23500 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: 98 (type: int), value (type: string), ds (type: string)
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 250 Data size: 23500 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col0 (type: int)
>                   sort order: +
>                   Map-reduce partition columns: _col0 (type: int)
>                   Statistics: Num rows: 250 Data size: 23500 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col1 (type: string), _col2 (type: string)
>           TableScan
>             alias: c
>             Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE Column stats: NONE
>             Reduce Output Operator
>               key expressions: key (type: int)
>               sort order: +
>               Map-reduce partition columns: key (type: int)
>               Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE Column stats: NONE
>               value expressions: value (type: string), ds (type: string)
>       Reduce Operator Tree:
>         Join Operator
>           condition map:
>                Outer Join 0 to 1
>                Inner Join 1 to 2
>           keys:
>             0 _col0 (type: int)
>             1 _col0 (type: int)
>             2 key (type: int)
>           outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
>           Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.mapred.TextInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                 serde: 
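
For the HIVE-12465 query quoted above, the difference between the two evaluation orders can be sketched with a small toy model. This is only an illustration under stated assumptions: the table contents are hypothetical, the helper names (full_outer_join, expected_plan, merged_plan) are made up for this sketch, and the merged case is a simplified reading of the "Outer Join 0 to 1 / Inner Join 1 to 2" condition map, not Hive's actual join operator.

```python
# Toy model of the HIVE-12465 query. Table contents are hypothetical.
tab = [{"key": 0, "value": "t0"}, {"key": 98, "value": "t98"}]
tab_part = [{"key": 0, "value": "p0"}, {"key": 98, "value": "p98"}]

a = [r for r in tab if r["key"] == 0]        # subquery a
b = [r for r in tab_part if r["key"] == 98]  # subquery b
c = tab_part                                 # tab_part c


def full_outer_join(left, right, match):
    """Matched pairs, plus unmatched rows from either side padded with None."""
    out = [(l, r) for l in left for r in right if match(l, r)]
    matched_left = {id(l) for l, _ in out}
    matched_right = {id(r) for _, r in out}
    out += [(l, None) for l in left if id(l) not in matched_left]
    out += [(None, r) for r in right if id(r) not in matched_right]
    return out


def key_of(row):
    """Join key of a row, or None for a NULL-padded side."""
    return None if row is None else row["key"]


def expected_plan():
    # Step 1: full outer join without an ON clause, so every pair matches.
    ab = full_outer_join(a, b, lambda l, r: True)
    # Step 2: inner join with c on a.key = b.key AND b.key = c.key.
    # NULL keys never compare equal, hence the explicit None check.
    return [(l, r, x) for l, r in ab for x in c
            if key_of(l) is not None
            and key_of(l) == key_of(r) and key_of(r) == key_of(x)]


def merged_plan():
    # Simplified merged operator: outer join 0 to 1 keyed on a.key = b.key ...
    ab = full_outer_join(a, b, lambda l, r: l["key"] == r["key"])
    # ... followed by inner join 1 to 2 keyed on b.key = c.key only.
    return [(l, r, x) for l, r in ab for x in c
            if key_of(r) is not None and key_of(r) == key_of(x)]


print("expected:", expected_plan())
print("merged:  ", merged_plan())
```

In this model the expected order returns no rows, because a.key = 0 can never equal b.key = 98, while the merged form emits a row whose a side is NULL-padded.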

[jira] [Commented] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration

2015-11-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029728#comment-15029728
 ] 

Hive QA commented on HIVE-12020:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12774206/HIVE-12020.3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9865 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.log.TestLog4j2Appenders.testHiveEventCounterAppender
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6142/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6142/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6142/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12774206 - PreCommit-HIVE-TRUNK-Build

> Revert log4j2 xml configuration to properties based configuration
> -
>
> Key: HIVE-12020
> URL: https://issues.apache.org/jira/browse/HIVE-12020
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch, 
> HIVE-12020.3.patch
>
>
> The Log4j 2.4 release brought back properties-based configuration. We should 
> revert the XML-based configuration and use properties-based configuration 
> instead (it is less verbose and will be similar to the old log4j properties). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)