[jira] [Commented] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-10-13 Thread Jay Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956387#comment-14956387
 ] 

Jay Lee commented on HIVE-12156:


I don't get it.

I have been using two or more levels of "dot" in my daily queries, so how can 
that NOT be supported yet? If this issue has nothing to do with reserved 
keywords, then why does the query work when I change the reserved keyword 
`end` to a normal identifier?

hive> create table testreserved (data struct<end_key: string, id: string>);
OK
Time taken: 0.556 seconds
hive> create view testreservedview as select data.end_key as data_end, data.id 
as data_id from testreserved;
OK
Time taken: 0.678 seconds
hive> select data.end_key from testreserved;
OK
Time taken: 0.743 seconds
hive> select data_id from testreservedview;
OK
Time taken: 0.477 seconds
hive> select data_end from testreservedview;
OK
Time taken: 0.404 seconds

And the query example you suggested actually works:

hive> create table src (default struct<src: struct<key: string>, id: string>);
OK
Time taken: 0.27 seconds
hive> select default.src.key from src;
OK
Time taken: 0.274 seconds
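
A minimal sketch of the likely mechanism (an assumption based on the bug title, 
not verified against the metastore): the expanded view text that Hive stores 
presumably drops the backticks around the reserved keyword, so any later parse 
of the view trips over the bare `end`:

{code}
-- hypothetical expanded view text for the failing case, as the parser would see it:
select `testreserved`.`data`.end as `data_end`, `testreserved`.`data`.id as `data_id`
from `default`.`testreserved`;
{code}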

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
>   at 
> org.apac

[jira] [Commented] (HIVE-12086) ORC: Buffered float readers to remove vtable thunks

2015-10-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956379#comment-14956379
 ] 

Gopal V commented on HIVE-12086:


Share the micro-benchmark; I can modify it and run through it again.

> ORC: Buffered float readers to remove vtable thunks
> ---
>
> Key: HIVE-12086
> URL: https://issues.apache.org/jira/browse/HIVE-12086
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Attachments: HIVE-12086.1.patch, HIVE-12086.2.patch, 
> perf-top-floatreader.jpg
>
>
> The ORC float tree reader spends an inordinate amount of time doing vtable 
> thunks through the InputStream interface.
> The actual operation is not faster with this patch, but the interface lookup 
> goes down ~4x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12086) ORC: Buffered float readers to remove vtable thunks

2015-10-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956374#comment-14956374
 ] 

Gopal V commented on HIVE-12086:


The perf-top reader improvements disappear when the LLAP cache is not used. 
It looks like heap-backed byte buffers are not any faster for .get(byte[]) than 
for .get() loops.

> ORC: Buffered float readers to remove vtable thunks
> ---
>
> Key: HIVE-12086
> URL: https://issues.apache.org/jira/browse/HIVE-12086
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Attachments: HIVE-12086.1.patch, HIVE-12086.2.patch, 
> perf-top-floatreader.jpg
>
>
> The ORC float tree reader spends an inordinate amount of time doing vtable 
> thunks through the InputStream interface.
> The actual operation is not faster with this patch, but the interface lookup 
> goes down ~4x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7693) Invalid column ref error in order by when using column alias in select clause and using having

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956371#comment-14956371
 ] 

Hive QA commented on HIVE-7693:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766382/HIVE-7693.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9684 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5641/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5641/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5641/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766382 - PreCommit-HIVE-TRUNK-Build

> Invalid column ref error in order by when using column alias in select clause 
> and using having
> --
>
> Key: HIVE-7693
> URL: https://issues.apache.org/jira/browse/HIVE-7693
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.13.0
>Reporter: Deepesh Khandelwal
>Assignee: Pengcheng Xiong
> Attachments: HIVE-7693.01.patch
>
>
> Hive CLI session:
> {noformat}
> hive> create table abc(foo int, bar string);
> OK
> Time taken: 0.633 seconds
> hive> select foo as c0, count(*) as c1 from abc group by foo, bar having bar 
> like '%abc%' order by foo;
> FAILED: SemanticException [Error 10004]: Line 1:93 Invalid table alias or 
> column reference 'foo': (possible column names are: c0, c1)
> {noformat}
> Without having clause, the query runs fine, example:
> {code}
> select foo as c0, count(*) as c1 from abc group by foo, bar order by foo;
> {code}
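
A hedged workaround sketch (an inference from the error message, which lists c0 
and c1 as the only available column names; this is not taken from the patch): 
refer to the select-list alias in ORDER BY instead of the underlying column:

{code}
select foo as c0, count(*) as c1 from abc
group by foo, bar
having bar like '%abc%'
order by c0;
{code}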



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10104) LLAP: Generate consistent splits and locations for the same split across jobs

2015-10-13 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956370#comment-14956370
 ] 

Lefty Leverenz commented on HIVE-10104:
---

HIVE-12078 fixes the typo ("consisten") in the parameter description and 
adjusts the line breaks.
Thanks, Sergey.

> LLAP: Generate consistent splits and locations for the same split across jobs
> -
>
> Key: HIVE-10104
> URL: https://issues.apache.org/jira/browse/HIVE-10104
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: llap
>
> Attachments: HIVE-10104.1.txt, HIVE-10104.2.txt
>
>
> Locations for splits are currently randomized. Also, the order of splits is 
> random - depending on how threads end up generating the splits.
> Add an option to sort the splits, and generate repeatable locations - 
> assuming all other factors are the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-10-13 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11927:
---
Attachment: HIVE-11927.04.patch

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, 
> HIVE-11927.03.patch, HIVE-11927.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)

2015-10-13 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11981:

Attachment: HIVE-11981.03.patch

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> --
>
> Key: HIVE-11981
> URL: https://issues.apache.org/jira/browse/HIVE-11981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, 
> HIVE-11981.03.patch, ORC Schema Evolution Issues.docx
>
>
> High priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of 
> column type-widening (e.g. int to bigint).
> Renaming columns, deleting column, moving columns and other schema evolution 
> were not pursued due to lack of importance and lack of time.  Also, it 
> appears a much more sophisticated metadata would be needed to support them.
> The biggest issues for users have been adding new columns for ACID table 
> (HIVE-11421 Support Schema evolution for ACID tables) and vectorization 
> (HIVE-10598 Vectorization borks when column is added to table).
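
For context, a minimal sketch (hypothetical table and column names) of the two 
schema-evolution operations that are in scope, adding a new column and widening 
int to bigint:

{code}
alter table orc_tbl add columns (new_col string);
alter table orc_tbl change column cnt cnt bigint;
{code}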



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12088) a simple insert hql throws out NoClassFoundException of MetaException

2015-10-13 Thread Feng Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Yuan updated HIVE-12088:
-
Attachment: hive.log

> a simple insert hql throws out NoClassFoundException of MetaException
> -
>
> Key: HIVE-12088
> URL: https://issues.apache.org/jira/browse/HIVE-12088
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Feng Yuan
> Fix For: 1.2.2
>
> Attachments: hive.log
>
>
> example:
> from portrait.rec_feature_feedback a insert overwrite table portrait.test1 
> select iid, feedback_15day, feedback_7day, feedback_5day, feedback_3day, 
> feedback_1day where l_date = '2015-09-09' and bid in 
> ('949722CF_12F7_523A_EE21_E3D591B7E755');
> log shows:
> Query ID = hadoop_20151012153841_120bee59-56a7-4e53-9c45-76f97c0f50ad
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441881651073_95266, Tracking URL = 
> http://bjlg-44p12-rm01:8088/proxy/application_1441881651073_95266/
> Kill Command = /opt/hadoop/hadoop/bin/hadoop job  -kill 
> job_1441881651073_95266
> Hadoop job information for Stage-1: number of mappers: 21; number of 
> reducers: 0
> 2015-10-12 15:39:29,930 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:39,597 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:40,658 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:53,479 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:54,535 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:55,588 Stage-1 map = 10%,  reduce = 0%
> 2015-10-12 15:39:56,626 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:57,687 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:40:06,096 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1441881651073_95266 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1441881651073_95266_m_00 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_16 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_11 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_18 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_02 (and more) from job 
> job_1441881651073_95266
> Task with the most failures(4): 
> -
> Task ID:
>   task_1441881651073_95266_m_09
> URL:
>   
> http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1441881651073_95266&tipid=task_1441881651073_95266_m_09
> -
> Diagnostic Messages for this Task:
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.metastore.api.MetaException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2690)
>   at java.lang.Class.getMethods(Class.java:1467)
>   at com.sun.beans.finder.MethodFinder$1.create(MethodFinder.java:54)
>   at com.sun.beans.finder.MethodFinder$1.create(MethodFinder.java:49)
>   at com.sun.beans.util.Cache.get(Cache.java:127)
>   at com.sun.beans.finder.MethodFinder.findMethod(MethodFinder.java:81)
>   at java.beans.Statement.getMethod(Statement.java:357)
>   at java.beans.Statement.invokeInternal(Statement.java:261)
>   at java.beans.Statement.access$000(Statement.java:58)
>   at java.beans.Statement$2.run(Statement.java:185)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.beans.Statement.invoke(Statement.java:182)
>   at java.beans.Expression.getValue(Expression.java:153)
>   at 
> com.sun.beans.decoder.ObjectElementHandler.getValueObject(ObjectElementHandler.java:166)
>   at 
> com.sun.beans.decoder.NewElementHandler.getValueObject(NewElementHandler.java:123)
>   at 
> com.sun.beans.decoder.ElementHandler.getContextBean(ElementHandler.java:113)
>   at 
> com.sun.beans.decoder.NewElementHandler.getContextBean(NewElementHandler.java:109)
>   at 
> com.sun.beans.decoder.ObjectElementHandler.getValueObject(ObjectElementHandler.java:146)
>   at 
> com.sun.beans.decoder.NewElementHandler.getValueObject(NewElementHandler.java:123)
>   at 
> com.sun.b

[jira] [Commented] (HIVE-12088) a simple insert hql throws out NoClassFoundException of MetaException

2015-10-13 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956335#comment-14956335
 ] 

Feng Yuan commented on HIVE-12088:
--

@[~xuefuz] is there anything wrong with the metastore?
I copied a Hive 0.14 metastore MySQL DB when I upgraded to 1.2.1, and the DB's 
character encoding is latin1. When I run 'msck repair table abc;' the output is:
OK
Partitions not in metastore:
rec_feature_feedback:l_date=2015-09-09/cid=Cyiyaowang/bid=F7734668_CC49_8C4F_24C5_EA8B6728E394
rec_feature_feedback:l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
Time taken: 137.984 seconds, Fetched: 1 row(s)

A segment of hive.log reads:
"Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: 
Specified key was too long; max key length is 767 bytes
        at sun.reflect.GeneratedConstructorAccessor40.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:406)
        at com.mysql.jdbc.Util.getInstance(Util.java:381)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1030)"

I have uploaded hive.log as an attachment.
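
For reference, a hedged way to double-check the metastore DB's character set 
from MySQL (this assumes the metastore database is named 'hive'):

{code}
SELECT default_character_set_name
FROM information_schema.SCHEMATA
WHERE schema_name = 'hive';
{code}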

> a simple insert hql throws out NoClassFoundException of MetaException
> -
>
> Key: HIVE-12088
> URL: https://issues.apache.org/jira/browse/HIVE-12088
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Feng Yuan
> Fix For: 1.2.2
>
>
> example:
> from portrait.rec_feature_feedback a insert overwrite table portrait.test1 
> select iid, feedback_15day, feedback_7day, feedback_5day, feedback_3day, 
> feedback_1day where l_date = '2015-09-09' and bid in 
> ('949722CF_12F7_523A_EE21_E3D591B7E755');
> log shows:
> Query ID = hadoop_20151012153841_120bee59-56a7-4e53-9c45-76f97c0f50ad
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441881651073_95266, Tracking URL = 
> http://bjlg-44p12-rm01:8088/proxy/application_1441881651073_95266/
> Kill Command = /opt/hadoop/hadoop/bin/hadoop job  -kill 
> job_1441881651073_95266
> Hadoop job information for Stage-1: number of mappers: 21; number of 
> reducers: 0
> 2015-10-12 15:39:29,930 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:39,597 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:40,658 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:53,479 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:54,535 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:39:55,588 Stage-1 map = 10%,  reduce = 0%
> 2015-10-12 15:39:56,626 Stage-1 map = 5%,  reduce = 0%
> 2015-10-12 15:39:57,687 Stage-1 map = 0%,  reduce = 0%
> 2015-10-12 15:40:06,096 Stage-1 map = 100%,  reduce = 0%
> Ended Job = job_1441881651073_95266 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1441881651073_95266_m_00 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_16 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_11 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_18 (and more) from job 
> job_1441881651073_95266
> Examining task ID: task_1441881651073_95266_m_02 (and more) from job 
> job_1441881651073_95266
> Task with the most failures(4): 
> -
> Task ID:
>   task_1441881651073_95266_m_09
> URL:
>   
> http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1441881651073_95266&tipid=task_1441881651073_95266_m_09
> -
> Diagnostic Messages for this Task:
> Error: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.metastore.api.MetaException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
>   at java.lang.Class.privateGetPublicMethods(Class.java:2690)
>   at java.lang.Class.getMethods(Class.java:1467)
>   at com.sun.beans.finder.MethodFinder$1.create(MethodFinder.java:54)
>   at com.sun.beans.finder.MethodFinder$1.create(MethodFinder.java:49)
>   at com.sun.beans.util.Cache.get(Cache.java:127)
>  

[jira] [Commented] (HIVE-12053) Stats performance regression caused by HIVE-11786

2015-10-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956326#comment-14956326
 ] 

Siddharth Seth commented on HIVE-12053:
---

Should HIVE-11786 be reverted while this is fixed? The performance degradation 
is very noticeable.

> Stats performance regression caused by HIVE-11786
> -
>
> Key: HIVE-12053
> URL: https://issues.apache.org/jira/browse/HIVE-12053
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>
> HIVE-11786 tried to normalize table TAB_COL_STATS/PART_COL_STATS but caused 
> performance regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956275#comment-14956275
 ] 

Hive QA commented on HIVE-12017:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766361/HIVE-12017.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5640/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5640/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5640/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5640/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 072665b HIVE-12168 : Addendum to HIVE-12038 (Szehon, reviewed by 
Sergey)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 072665b HIVE-12168 : Addendum to HIVE-12038 (Szehon, reviewed by 
Sergey)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766361 - PreCommit-HIVE-TRUNK-Build

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, 
> HIVE-12017.03.patch
>
>
> Instead, we could disable the parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. The implementation should make it easy to define 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do so in the future).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956268#comment-14956268
 ] 

Hive QA commented on HIVE-11954:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766421/HIVE-11954.07.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9683 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5639/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5639/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5639/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766421 - PreCommit-HIVE-TRUNK-Build

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.03.patch, HIVE-11954.04.patch, HIVE-11954.05.patch, 
> HIVE-11954.06.patch, HIVE-11954.07.patch, HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin conversion 
> algorithm needs to be more sophisticated.
> In an N-way map join, Hive should pick as the side table (in-memory table) the 
> input stream that has the least cost in producing its relation (like 
> TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path it's 
> going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs (based on conditional task size).
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavy-weight relational op (Join, GB, PTF/Windowing, TF, etc.); the cost 
> for an input is the total number of heavy-weight ops in its branch.
> 3. Order the set from #1 by cost & memory requirement (ascending).
> 4. Pick the first element from #3 as the side table.
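
As an illustration of the cost notion in step 2 (hypothetical table names; a 
sketch of the idea, not Hive's selection code): in the 3-way join below, the 
branch over g contains a heavy-weight GROUP BY, so it costs more than the plain 
scans of b and d and, per step 4, should not be picked as the in-memory side 
table:

{code}
select b.k, d.name, agg.c
from big b
join (select k, count(*) as c from g group by k) agg on b.k = agg.k
join dim d on b.k = d.k;
{code}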



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12056) Branch 1.1.1: root pom and itest pom are not linked

2015-10-13 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956193#comment-14956193
 ] 

Chao Sun commented on HIVE-12056:
-

[~vgumashta], patch looks good. +1. Thanks for fixing this!
One minor thing: under the project root pom.xml, do we need to change 
{{hive.version.shortname}} to 1.1.1 too?

> Branch 1.1.1: root pom and itest pom are not linked
> ---
>
> Key: HIVE-12056
> URL: https://issues.apache.org/jira/browse/HIVE-12056
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 1.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12056.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12171) LLAP: BuddyAllocator failures when querying uncompressed data

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12171:
---
Description: 
{code}
hive> select sum(l_extendedprice * l_discount) as revenue from testing.lineitem 
where l_shipdate >= '1993-01-01' and l_shipdate < '1994-01-01' ;

Caused by: 
org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
Failed to allocate 492; at 0 out of 1
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:176)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:882)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:319)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{code}

{code}
Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl$UncompressedCacheChunk
 cannot be cast to org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$BufferChunk
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.copyAndReplaceUncompressedChunks(EncodedReaderImpl.java:962)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:890)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:319)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{code}

  was:
{code}
hive> select sum(l_extendedprice * l_discount) as revenue from testing.lineitem 
where l_shipdate >= '1993-01-01' and l_shipdate < '1994-01-01' ;

Caused by: 
org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
Failed to allocate 492; at 0 out of 1
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:176)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:882)
at 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:319)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
at 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
at 
org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
... 4 more
{code}


> LLAP: BuddyAllocator failures when querying uncompressed data
> -
>
> Key: HIVE-12171
> URL: https://issues.apache

[jira] [Commented] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956161#comment-14956161
 ] 

Hive QA commented on HIVE-11954:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766421/HIVE-11954.07.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to no tests executed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5635/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5635/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5635/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: InterruptedException: null
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766421 - PreCommit-HIVE-TRUNK-Build

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.03.patch, HIVE-11954.04.patch, HIVE-11954.05.patch, 
> HIVE-11954.06.patch, HIVE-11954.07.patch, HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin conversion 
> algorithm needs to be more sophisticated.
> In an N-way map join, Hive should pick as the side table (in-memory table) the 
> input stream that has the least cost in producing its relation (like 
> TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path it's 
> going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs (based on conditional task size).
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavy-weight relational op (Join, GB, PTF/Windowing, TF, etc.); the cost 
> for an input is the total number of heavy-weight ops in its branch.
> 3. Order the set from #1 by cost & memory requirement (ascending).
> 4. Pick the first element from #3 as the side table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12170) normalize HBase metastore connection configuration

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12170:

Fix Version/s: 2.0.0

> normalize HBase metastore connection configuration
> --
>
> Key: HIVE-12170
> URL: https://issues.apache.org/jira/browse/HIVE-12170
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.0.0
>
>
> Right now there are two ways to get HBaseReadWrite instance in metastore. 
> Both get a threadlocal instance (is there a good reason for that?).
> 1) One is w/o conf and only works if someone called the (2) before, from any 
> thread.
> 2) The other blindly sets a static conf and then gets an instance with that 
> conf, or if someone already happened to call (1) or (2) from this thread, it 
> returns the existing instance with whatever conf was set before (but still 
> resets the current conf to new conf).
> This doesn't make sense even in an already-threaded-safe case (like linear 
> CLI-based tests), and can easily lead to bugs as described; the config 
> propagation logic is not good (example - HIVE-12167); some calls just reset 
> config blindly, so there's no point in setting staticConf, other than for 
> those who don't have conf and would rely on the static (which is bad design).
> Having connections with different configs reliably is not possible, and 
> multi-threaded cases would also break - you could even set conf, have it 
> reset and get instance with somebody else's conf. 
> Static should definitely be removed, maybe threadlocal too (HConnection is 
> thread-safe).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12170) normalize HBase metastore connection configuration

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12170:

Description: 
Right now there are two ways to get HBaseReadWrite instance in metastore. Both 
get a threadlocal instance (is there a good reason for that?).
1) One is w/o conf and only works if someone called the (2) before, from any 
thread.
2) The other blindly sets a static conf and then gets an instance with that 
conf, or if someone already happened to call (1) or (2) from this thread, it 
returns the existing instance with whatever conf was set before (but still 
resets the current conf to new conf).

This doesn't make sense even in an already-threaded-safe case (like linear 
CLI-based tests), and can easily lead to bugs as described; the config 
propagation logic is not good (example - HIVE-12167); some calls just reset 
config blindly, so there's no point in setting staticConf, other than for those 
who don't have conf and would rely on the static (which is bad design).
Having connections with different configs reliably is not possible, and 
multi-threaded cases would also break - you could even set conf, have it reset 
and get instance with somebody else's conf. 

Static should definitely be removed, maybe threadlocal too (HConnection is 
thread-safe).

  was:
Right now there are two ways to get HBaseReadWrite instance in metastore. Both 
get a threadlocal instance (is there a good reason for that?).
1) One is w/o conf and only works if someone called the (2) before, from any 
thread.
2) The other blindly sets a static conf and then gets an instance with that 
conf, or if someone already happened to call (1) or (2) from this thread, it 
returns the existing instance with whatever conf was set before (but still 
resets the current conf to new conf).

This doesn't make sense even in single threaded case, and can easily lead to 
bugs as described; the config propagation logic is not good (example - 
HIVE-12167); some calls just reset config blindly, so there's no point in 
setting staticConf, other than for those who don't have conf and would rely on 
the static (which is bad design).
Having connections with different configs reliably is not possible, and 
multi-threaded cases would also break - you could even set conf, have it reset 
and get instance with somebody else's conf. 

Static should definitely be removed, maybe threadlocal too (HConnection is 
thread-safe).


> normalize HBase metastore connection configuration
> --
>
> Key: HIVE-12170
> URL: https://issues.apache.org/jira/browse/HIVE-12170
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
> Fix For: 2.0.0
>
>
> Right now there are two ways to get HBaseReadWrite instance in metastore. 
> Both get a threadlocal instance (is there a good reason for that?).
> 1) One is w/o conf and only works if someone called the (2) before, from any 
> thread.
> 2) The other blindly sets a static conf and then gets an instance with that 
> conf, or if someone already happened to call (1) or (2) from this thread, it 
> returns the existing instance with whatever conf was set before (but still 
> resets the current conf to new conf).
> This doesn't make sense even in an already-threaded-safe case (like linear 
> CLI-based tests), and can easily lead to bugs as described; the config 
> propagation logic is not good (example - HIVE-12167); some calls just reset 
> config blindly, so there's no point in setting staticConf, other than for 
> those who don't have conf and would rely on the static (which is bad design).
> Having connections with different configs reliably is not possible, and 
> multi-threaded cases would also break - you could even set conf, have it 
> reset and get instance with somebody else's conf. 
> Static should definitely be removed, maybe threadlocal too (HConnection is 
> thread-safe).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12170) normalize HBase metastore connection configuration

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12170:

Description: 
Right now there are two ways to get an HBaseReadWrite instance in the metastore. 
Both get a threadlocal instance (is there a good reason for that?).
1) One is w/o conf and only works if someone called (2) before, from any 
thread.
2) The other blindly sets a static conf and then gets an instance with that 
conf, or if someone already happened to call (1) or (2) from this thread, it 
returns the existing instance with whatever conf was set before (but still 
resets the current conf to the new conf).

This doesn't make sense even in the single-threaded case, and can easily lead to 
bugs as described; the config propagation logic is not good (example - 
HIVE-12167); some calls just reset the config blindly, so there's no point in 
setting staticConf, other than for those who don't have a conf and would rely on 
the static (which is bad design).
Reliably having connections with different configs is not possible, and 
multi-threaded cases would also break - you could even set a conf, have it 
reset, and get an instance with somebody else's conf.

Static should definitely be removed, maybe threadlocal too (HConnection is 
thread-safe).

  was:
Right now there are two ways to get an HBaseReadWrite instance in the metastore. 
Both get a threadlocal instance (is there a good reason for that?).
1) One is w/o conf and only works if someone called (2) before, from any 
thread.
2) The other blindly sets a static conf and then gets an instance with that 
conf, or if someone already happened to call (1) or (2) from this thread, it 
returns the existing instance with whatever conf was set before (but still 
resets the current conf to the new conf).

This doesn't make sense even in the single-threaded case, and can easily lead to 
bugs as described; the config propagation logic is not good (example - 
HIVE-12167), as some calls just reset the config blindly, so there's no point in 
setting staticConf, other than for those who don't have a conf and would rely on 
the static (which is bad design).
Reliably having connections with different configs is not possible, and 
multi-threaded cases would also break - you could even set a conf, have it 
reset, and get an instance with somebody else's conf.

Static should definitely be removed, maybe threadlocal too (HConnection is 
thread-safe).


> normalize HBase metastore connection configuration
> --
>
> Key: HIVE-12170
> URL: https://issues.apache.org/jira/browse/HIVE-12170
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> Right now there are two ways to get an HBaseReadWrite instance in the 
> metastore. Both get a threadlocal instance (is there a good reason for that?).
> 1) One is w/o conf and only works if someone called (2) before, from any 
> thread.
> 2) The other blindly sets a static conf and then gets an instance with that 
> conf, or if someone already happened to call (1) or (2) from this thread, it 
> returns the existing instance with whatever conf was set before (but still 
> resets the current conf to the new conf).
> This doesn't make sense even in the single-threaded case, and can easily lead 
> to bugs as described; the config propagation logic is not good (example - 
> HIVE-12167); some calls just reset the config blindly, so there's no point in 
> setting staticConf, other than for those who don't have a conf and would rely 
> on the static (which is bad design).
> Reliably having connections with different configs is not possible, and 
> multi-threaded cases would also break - you could even set a conf, have it 
> reset, and get an instance with somebody else's conf.
> Static should definitely be removed, maybe threadlocal too (HConnection is 
> thread-safe).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12170) normalize HBase metastore connection configuration

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956157#comment-14956157
 ] 

Sergey Shelukhin commented on HIVE-12170:
-

[~alangates] [~daijy] fyi

> normalize HBase metastore connection configuration
> --
>
> Key: HIVE-12170
> URL: https://issues.apache.org/jira/browse/HIVE-12170
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Blocker
>
> Right now there are two ways to get an HBaseReadWrite instance in the 
> metastore. Both get a threadlocal instance (is there a good reason for that?).
> 1) One is w/o conf and only works if someone called (2) before, from any 
> thread.
> 2) The other blindly sets a static conf and then gets an instance with that 
> conf, or if someone already happened to call (1) or (2) from this thread, it 
> returns the existing instance with whatever conf was set before (but still 
> resets the current conf to the new conf).
> This doesn't make sense even in the single-threaded case, and can easily lead 
> to bugs as described; the config propagation logic is not good (example - 
> HIVE-12167); some calls just reset the config blindly, so there's no point in 
> setting staticConf, other than for those who don't have a conf and would rely 
> on the static (which is bad design).
> Reliably having connections with different configs is not possible, and 
> multi-threaded cases would also break - you could even set a conf, have it 
> reset, and get an instance with somebody else's conf.
> Static should definitely be removed, maybe threadlocal too (HConnection is 
> thread-safe).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12091) Merge file doesn't work for ORC table when running on Spark. [Spark Branch]

2015-10-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-12091:
--
Affects Version/s: 1.2.1

> Merge file doesn't work for ORC table when running on Spark. [Spark Branch]
> ---
>
> Key: HIVE-12091
> URL: https://issues.apache.org/jira/browse/HIVE-12091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: Xin Hao
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-12091.1-spark.patch, HIVE-12091.2-spark.patch
>
>
> This issue occurs when hive.merge.sparkfiles is set to true, and can be 
> worked around by setting hive.merge.sparkfiles to false.
> BTW, we did a local experiment running the case with the MR engine (set 
> hive.merge.mapfiles=true; set hive.merge.mapredfiles=true;), and it passes.
> (1) Component Version:
> -- Hive Spark Branch 70eeadd2f019dcb2e301690290c8807731eab7a1  +  HIVE-11473 
> patch (HIVE-11473.3-spark.patch)  ---> This is to support Spark 1.5 for Hive 
> on Spark
> -- Spark 1.5.1
> (2) Case used:
> -- Big-Bench Data Load (load data from HDFS to the Hive warehouse, stored as 
> ORC format). The related HiveQL:
> {noformat}
> DROP TABLE IF EXISTS customer_temporary;
> CREATE EXTERNAL TABLE customer_temporary
>   ( c_customer_sk bigint  --not null
>   , c_customer_id string  --not null
>   , c_current_cdemo_sk bigint
>   , c_current_hdemo_sk bigint
>   , c_current_addr_sk bigint
>   , c_first_shipto_date_sk bigint
>   , c_first_sales_date_sk bigint
>   , c_salutation  string
>   , c_first_name  string
>   , c_last_name   string
>   , c_preferred_cust_flag string
>   , c_birth_day   int
>   , c_birth_month int
>   , c_birth_year  int
>   , c_birth_country   string
>   , c_login   string
>   , c_email_address   string
>   , c_last_review_date string
>   )
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
>   STORED AS TEXTFILE LOCATION 
> '/user/root/benchmarks/bigbench_n1t/data/customer'
> ;
> DROP TABLE IF EXISTS customer;
> CREATE TABLE customer
> STORED AS ORC
> AS
> SELECT * FROM customer_temporary
> ;
> {noformat}
> (3) Error/Exception Message:
> {noformat}
> 15/10/12 14:28:38 INFO exec.Utilities: PLAN PATH = 
> hdfs://bhx2:8020/tmp/hive/root/4e145415-d4ea-4751-9e16-ff31edb0c258/hive_2015-10-12_14-28-12_485_2093357701513622173-1/-mr-10005/d891fdec-eacc-4f66-8827-e2b650c24810/map.xml
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: ORC merge file input path: 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: Merged stripe from file 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
>  [ offset : 3 length: 10525754 row: 247500 ]
> 15/10/12 14:28:38 INFO spark.SparkMergeFileRecordHandler: Closing Merge 
> Operator OFM
> 15/10/12 14:28:38 ERROR executor.Executor: Exception in task 1.0 in stage 1.0 
> (TID 4)
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed to close AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(T

[jira] [Updated] (HIVE-12091) Merge file doesn't work for ORC table when running on Spark. [Spark Branch]

2015-10-13 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-12091:
--
Summary: Merge file doesn't work for ORC table when running on Spark. 
[Spark Branch]  (was: HiveException (Failed to close AbstractFileMergeOperator) 
occurs during loading data to ORC file, when hive.merge.sparkfiles is set to 
true. [Spark Branch])

> Merge file doesn't work for ORC table when running on Spark. [Spark Branch]
> ---
>
> Key: HIVE-12091
> URL: https://issues.apache.org/jira/browse/HIVE-12091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xin Hao
>Assignee: Rui Li
> Attachments: HIVE-12091.1-spark.patch, HIVE-12091.2-spark.patch
>
>
> This issue occurs when hive.merge.sparkfiles is set to true, and can be 
> worked around by setting hive.merge.sparkfiles to false.
> BTW, we did a local experiment running the case with the MR engine (set 
> hive.merge.mapfiles=true; set hive.merge.mapredfiles=true;), and it passes.
> (1) Component Version:
> -- Hive Spark Branch 70eeadd2f019dcb2e301690290c8807731eab7a1  +  HIVE-11473 
> patch (HIVE-11473.3-spark.patch)  ---> This is to support Spark 1.5 for Hive 
> on Spark
> -- Spark 1.5.1
> (2) Case used:
> -- Big-Bench Data Load (load data from HDFS to the Hive warehouse, stored as 
> ORC format). The related HiveQL:
> {noformat}
> DROP TABLE IF EXISTS customer_temporary;
> CREATE EXTERNAL TABLE customer_temporary
>   ( c_customer_sk bigint  --not null
>   , c_customer_id string  --not null
>   , c_current_cdemo_sk bigint
>   , c_current_hdemo_sk bigint
>   , c_current_addr_sk bigint
>   , c_first_shipto_date_sk bigint
>   , c_first_sales_date_sk bigint
>   , c_salutation  string
>   , c_first_name  string
>   , c_last_name   string
>   , c_preferred_cust_flag string
>   , c_birth_day   int
>   , c_birth_month int
>   , c_birth_year  int
>   , c_birth_country   string
>   , c_login   string
>   , c_email_address   string
>   , c_last_review_date string
>   )
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
>   STORED AS TEXTFILE LOCATION 
> '/user/root/benchmarks/bigbench_n1t/data/customer'
> ;
> DROP TABLE IF EXISTS customer;
> CREATE TABLE customer
> STORED AS ORC
> AS
> SELECT * FROM customer_temporary
> ;
> {noformat}
> (3) Error/Exception Message:
> {noformat}
> 15/10/12 14:28:38 INFO exec.Utilities: PLAN PATH = 
> hdfs://bhx2:8020/tmp/hive/root/4e145415-d4ea-4751-9e16-ff31edb0c258/hive_2015-10-12_14-28-12_485_2093357701513622173-1/-mr-10005/d891fdec-eacc-4f66-8827-e2b650c24810/map.xml
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: ORC merge file input path: 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: Merged stripe from file 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
>  [ offset : 3 length: 10525754 row: 247500 ]
> 15/10/12 14:28:38 INFO spark.SparkMergeFileRecordHandler: Closing Merge 
> Operator OFM
> 15/10/12 14:28:38 ERROR executor.Executor: Exception in task 1.0 in stage 1.0 
> (TID 4)
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed to close AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)

[jira] [Commented] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956149#comment-14956149
 ] 

Sergey Shelukhin commented on HIVE-12167:
-

[~alangates] can you please review?

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12167.patch
>
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> These log lines' count has increased by ~33% since merging the llap branch, 
> but it was still high before that (39/~700 for the same test). These lines 
> are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> that is retried for a few seconds and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
>  [hive-metastore-2.0.0-SNAPSHOT.jar

[jira] [Updated] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12167:

Attachment: HIVE-12167.patch

Here's a hacky fix for the tests for now. The root problem needs to be addressed 
in a separate JIRA.

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12167.patch
>
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> These log lines' count has increased by ~33% since merging the llap branch, 
> but it was still high before that (39/~700 for the same test). These lines 
> are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> that is retried for a few seconds and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
>  

[jira] [Commented] (HIVE-12028) An empty array is of type Array<String> and incompatible with other array types

2015-10-13 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956146#comment-14956146
 ] 

Navis commented on HIVE-12028:
--

looks like "cast null as int"

> An empty array is of type Array<String> and incompatible with other array 
> types
> ---
>
> Key: HIVE-12028
> URL: https://issues.apache.org/jira/browse/HIVE-12028
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.2.1
>Reporter: Furcy Pin
>
> How to reproduce:
> ```sql
> SELECT ARRAY(ARRAY(1),ARRAY()) ;
> FAILED: SemanticException [Error 10016]: Line 1:22 Argument type mismatch 
> 'ARRAY': Argument type "array<string>" is different from preceding arguments. 
> Previous type was "array<int>"
> SELECT COALESCE(ARRAY(1),ARRAY()) ;
> FAILED: SemanticException [Error 10016]: Line 1:25 Argument type mismatch 
> 'ARRAY': The expressions after COALESCE should all have the same type: 
> "array<int>" is expected but "array<string>" is found
> ```
> This is especially painful for COALESCE, as we cannot
> remove NULLs after doing a JOIN.
> The same problem holds with maps.
> The only workaround I could think of (except adding my own UDF)
> is quite ugly:
> ```sql
> SELECT ARRAY(ARRAY(1),empty.arr) FROM (SELECT collect_set(id) as arr FROM 
> (SELECT 1 as id) T WHERE id=0) empty ;
> ```



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956138#comment-14956138
 ] 

Sergey Shelukhin edited comment on HIVE-12162 at 10/14/15 2:13 AM:
---

That might be caused by the way the thing shuts down. In fact I'm starting to 
think that session shutdown is hurting us more than session startup w/o session 
reuse.
If you look for {noformat}
2015-10-13T18:52:31,532 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:close(412)) - Closing Tez Session
{noformat}
line and go from there, there are tons of exceptions from various components 
seemingly because they shut down in the wrong order. I think HBase metastore 
stuff makes it worse.
ZK appears to be one of the first to shut down; then there are tons of errors 
from HDFS and HBase - all kinds of mess that takes forever. I was actually 
looking at something else; just an observation.


was (Author: sershe):
That might be caused by the way the thing shuts down. In fact I'm starting to 
think that session shutdown is hurting us more than session startup w/o session 
reuse.
If you look for {noformat}
2015-10-13T18:52:31,532 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:close(412)) - Closing Tez Session
{noformat}
line and go from there, there are tons of exceptions from various components 
seemingly because they shut down in the wrong order. I think HBase metastore 
stuff makes it worse.
ZK appears to be one of the first to shut down; then there are tons of errors 
from HDFS related to that (leases), HBase errors related to HDFS - all kinds of 
mess that takes forever. I was actually looking at something else; just an 
observation.

> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Vikram Dixit K
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown  1 min 8 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956138#comment-14956138
 ] 

Sergey Shelukhin edited comment on HIVE-12162 at 10/14/15 2:07 AM:
---

That might be caused by the way the thing shuts down. In fact I'm starting to 
think that session shutdown is hurting us more than session startup w/o session 
reuse.
If you look for {noformat}
2015-10-13T18:52:31,532 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:close(412)) - Closing Tez Session
{noformat}
line and go from there, there are tons of exceptions from various components 
seemingly because they shut down in the wrong order. I think HBase metastore 
stuff makes it worse.
ZK appears to be one of the first to shut down; then there are tons of errors 
from HDFS related to that (leases), HBase errors related to HDFS - all kinds of 
mess that takes forever. I was actually looking at something else; just an 
observation.


was (Author: sershe):
That might be caused by the way the thing shuts down. In fact I'm starting to 
think that session shutdown is hurting us more than session startup w/o session 
reuse.
If you look for {noformat}
2015-10-13T18:52:31,532 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:close(412)) - Closing Tez Session
{noformat}
line and go from there, there are tons of exceptions from various components 
seemingly because they shut down in the wrong order. I think HBase metastore 
stuff makes it worse.
ZK appears to be one of the first to shut down; then there are tons of errors 
from everywhere related to that, HBase errors related to HDFS - all kinds of 
mess that takes forever. I was actually looking at something else; just an 
observation.

> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Vikram Dixit K
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown  1 min 8 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956138#comment-14956138
 ] 

Sergey Shelukhin commented on HIVE-12162:
-

That might be caused by the way the thing shuts down. In fact I'm starting to 
think that session shutdown is hurting us more than session startup w/o session 
reuse.
If you look for {noformat}
2015-10-13T18:52:31,532 INFO  [main]: tez.TezSessionState 
(TezSessionState.java:close(412)) - Closing Tez Session
{noformat}
line and go from there, there are tons of exceptions from various components 
seemingly because they shut down in the wrong order. I think HBase metastore 
stuff makes it worse.
ZK appears to be one of the first to shut down; then there are tons of errors 
from everywhere related to that, HBase errors related to HDFS - all kinds of 
mess that takes forever. I was actually looking at something else; just an 
observation.

> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Vikram Dixit K
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown  1 min 8 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 7 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 6 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 5 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 4 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> testCliDriver_shutdown  1 min 3 sec  Passed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956119#comment-14956119
 ] 

Gopal V commented on HIVE-12166:


[~sershe]: rolling it into the nightly build.

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12166.patch
>
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956113#comment-14956113
 ] 

Sergey Shelukhin commented on HIVE-12167:
-

That's because config management for the HBase metastore is terrible and 
involves a static and a threadlocal.
So first the test inits the static and one proper threadlocal.
Then some other random thread inits its own threadlocal with its own unrelated 
conf (resetting the static for everyone) and sets its threadlocal to an 
incorrect value.
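
To make the interleaving concrete, here is a small runnable sketch under the 
same simplifying assumptions (a plain String stands in for the conf; all names 
are hypothetical, not the actual metastore code):

{code}
// Hypothetical demo of the race described above; a String stands in for the conf.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConfRaceDemo {
  static volatile String staticConf;                    // one conf for ALL threads
  static final ThreadLocal<String> threadConf =
      ThreadLocal.withInitial(() -> staticConf);        // frozen per thread at first use

  static String getInstance(String conf) { staticConf = conf; return threadConf.get(); }
  static String getInstance()            { return threadConf.get(); }

  public static void main(String[] args) throws Exception {
    ExecutorService other = Executors.newSingleThreadExecutor();
    ExecutorService later = Executors.newSingleThreadExecutor();

    // the test thread sets the proper conf (e.g. the mini-cluster ZK port)
    System.out.println("test:  " + getInstance("zk.port=TEST"));
    // some other random thread resets the static conf for everyone
    other.submit(() -> getInstance("zk.port=2181")).get();
    // a thread using the no-conf path now silently picks up the unrelated conf
    System.out.println("later: " + later.submit(() -> getInstance()).get());
    // prints: test:  zk.port=TEST / later: zk.port=2181

    other.shutdown();
    later.shutdown();
  }
}
{code}

In the MiniTez runs the wrong conf means the wrong ZK quorum, which would 
account for the ConnectionLoss / Connection refused storm above.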

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> These log lines' count has increased by ~33% since merging the llap branch, 
> but it was still high before that (39/~700 for the same test). These lines 
> are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> that is retried for a few seconds and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.had

[jira] [Comment Edited] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956113#comment-14956113
 ] 

Sergey Shelukhin edited comment on HIVE-12167 at 10/14/15 1:47 AM:
---

That's because config management for the HBase metastore is terrible and 
involves a static and a threadlocal.
So first the test inits the static and one proper threadlocal.
Then some other random thread inits its own threadlocal with its own unrelated 
conf (resetting the static for everyone) and sets its threadlocal to an 
incorrect value.


was (Author: sershe):
That's because config management for the HBase metastore is terrible and 
involves a static and a threadlocal.
So first the test inits the static and one proper threadlocal.
Then some other random thread inits its own threadlocal with its own unrelated 
conf (resetting the static for everyone) and sets its threadlocal to an 
incorrect value.

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> These log lines' count has increased by ~33% since merging the llap branch, 
> but it was still high before that (39/~700 for the same test). These lines 
> are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> that is retried for a few seconds and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(C

[jira] [Commented] (HIVE-12056) Branch 1.1.1: root pom and itest pom are not linked

2015-10-13 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956111#comment-14956111
 ] 

Vaibhav Gumashta commented on HIVE-12056:
-

[~csun] This is good for review.

> Branch 1.1.1: root pom and itest pom are not linked
> ---
>
> Key: HIVE-12056
> URL: https://issues.apache.org/jira/browse/HIVE-12056
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Affects Versions: 1.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12056.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11499) Datanucleus leaks classloaders when used using embedded metastore with HiveServer2 with UDFs

2015-10-13 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956110#comment-14956110
 ] 

Vaibhav Gumashta commented on HIVE-11499:
-

TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching is failing because the test 
udf jar file is not yet in the code base. It passes locally. This looks good to 
commit.

> Datanucleus leaks classloaders when used using embedded metastore with 
> HiveServer2 with UDFs
> 
>
> Key: HIVE-11499
> URL: https://issues.apache.org/jira/browse/HIVE-11499
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.1.1, 1.2.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11499.1.patch, HIVE-11499.3.patch, 
> HIVE-11499.4.patch, HS2-NucleusCache-Leak.tiff
>
>
> When UDFs are used, we create a new classloader to add the UDF jar. Similar 
> to what hadoop's reflection utils does (HIVE-11408), datanucleus caches the 
> classloaders 
> (https://github.com/datanucleus/datanucleus-core/blob/3.2/src/java/org/datanucleus/NucleusContext.java#L161).
>  The JDOPersistenceManagerFactory (1 per JVM) holds on to a NucleusContext 
> reference 
> (https://github.com/datanucleus/datanucleus-api-jdo/blob/3.2/src/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L115).
>  Until we call NucleusContext#close, the classloader cache is not cleared. 
> In the case of UDFs this can lead to a permgen leak, as shown in the attached 
> screenshot, where NucleusContext holds on to several URLClassLoader objects.
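
For illustration, a minimal sketch of the leak shape being described - this is 
not the DataNucleus code, just the same pattern of a long-lived cache pinning 
per-UDF classloaders (all names here are hypothetical):

{code}
// Illustrative only: a JVM-lifetime static cache of classloaders.
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderLeakSketch {
  // stands in for the NucleusContext classloader cache, which lives as long as
  // the JDOPersistenceManagerFactory (i.e., the whole HiveServer2 JVM)
  static final List<ClassLoader> CACHE = new ArrayList<>();

  static void onAddJar(URL udfJar) {
    // each ADD JAR creates a fresh classloader for the UDF jar...
    URLClassLoader udfLoader = new URLClassLoader(new URL[] {udfJar},
        Thread.currentThread().getContextClassLoader());
    // ...and caching it pins the loader, the jar, and every class it defined
    // (permgen/metaspace) until the cache itself is cleared, e.g. by closing
    // the owning context
    CACHE.add(udfLoader);
  }
}
{code}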



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12062) enable HBase metastore file metadata cache for tez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956100#comment-14956100
 ] 

Sergey Shelukhin commented on HIVE-12062:
-

It looks like the ZK port needs to be propagated to the Tez AM.

> enable HBase metastore file metadata cache for tez tests
> 
>
> Key: HIVE-12062
> URL: https://issues.apache.org/jira/browse/HIVE-12062
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12062.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956101#comment-14956101
 ] 

Hive QA commented on HIVE-11822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766291/HIVE-11822.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9683 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5634/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5634/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5634/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766291 - PreCommit-HIVE-TRUNK-Build

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
> Attachments: HIVE-11822.1.patch, HIVE-11822.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12168) Addendum to HIVE-12038

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956098#comment-14956098
 ] 

Sergey Shelukhin commented on HIVE-12168:
-

+1

> Addendum to HIVE-12038
> --
>
> Key: HIVE-12168
> URL: https://issues.apache.org/jira/browse/HIVE-12168
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-12168.patch
>
>
> In HIVE-12038, a case of Error was missed. The original assumption was that 
> if error is true, then it is always a build error; apparently there is also a 
> TestFailedException.
> Currently, it incorrectly reports failed tests as build errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12091) HiveException (Failed to close AbstractFileMergeOperator) occurs during loading data to ORC file, when hive.merge.sparkfiles is set to true. [Spark Branch]

2015-10-13 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956095#comment-14956095
 ] 

Rui Li commented on HIVE-12091:
---

Latest failures are not related. I'll commit this shortly.

> HiveException (Failed to close AbstractFileMergeOperator) occurs during 
> loading data to ORC file, when hive.merge.sparkfiles is set to true. [Spark 
> Branch]
> ---
>
> Key: HIVE-12091
> URL: https://issues.apache.org/jira/browse/HIVE-12091
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xin Hao
>Assignee: Rui Li
> Attachments: HIVE-12091.1-spark.patch, HIVE-12091.2-spark.patch
>
>
> This issue occurs when hive.merge.sparkfiles is set to true, and can be 
> worked around by setting hive.merge.sparkfiles to false.
> BTW, we did a local experiment running the case with the MR engine (set 
> hive.merge.mapfiles=true; set hive.merge.mapredfiles=true;), and it passes.
> (1) Component Version:
> -- Hive Spark Branch 70eeadd2f019dcb2e301690290c8807731eab7a1  +  HIVE-11473 
> patch (HIVE-11473.3-spark.patch)  ---> This is to support Spark 1.5 for Hive 
> on Spark
> -- Spark 1.5.1
> (2) Case used:
> -- Big-Bench Data Load (load data from HDFS to the Hive warehouse, stored as 
> ORC format). The related HiveQL:
> {noformat}
> DROP TABLE IF EXISTS customer_temporary;
> CREATE EXTERNAL TABLE customer_temporary
>   ( c_customer_sk bigint  --not null
>   , c_customer_id string  --not null
>   , c_current_cdemo_sk bigint
>   , c_current_hdemo_sk bigint
>   , c_current_addr_sk bigint
>   , c_first_shipto_date_sk bigint
>   , c_first_sales_date_sk bigint
>   , c_salutation  string
>   , c_first_name  string
>   , c_last_name   string
>   , c_preferred_cust_flag string
>   , c_birth_day   int
>   , c_birth_month int
>   , c_birth_year  int
>   , c_birth_country   string
>   , c_login   string
>   , c_email_address   string
>   , c_last_review_date string
>   )
>   ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
>   STORED AS TEXTFILE LOCATION 
> '/user/root/benchmarks/bigbench_n1t/data/customer'
> ;
> DROP TABLE IF EXISTS customer;
> CREATE TABLE customer
> STORED AS ORC
> AS
> SELECT * FROM customer_temporary
> ;
> {noformat}
> (3) Error/Exception Message:
> {noformat}
> 15/10/12 14:28:38 INFO exec.Utilities: PLAN PATH = 
> hdfs://bhx2:8020/tmp/hive/root/4e145415-d4ea-4751-9e16-ff31edb0c258/hive_2015-10-12_14-28-12_485_2093357701513622173-1/-mr-10005/d891fdec-eacc-4f66-8827-e2b650c24810/map.xml
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: ORC merge file input path: 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
> 15/10/12 14:28:38 INFO OrcFileMergeOperator: Merged stripe from file 
> hdfs://bhx2:8020/user/hive/warehouse/bigbench_n100g.db/.hive-staging_hive_2015-10-12_14-28-12_485_2093357701513622173-1/-ext-10003/01_0
>  [ offset : 3 length: 10525754 row: 247500 ]
> 15/10/12 14:28:38 INFO spark.SparkMergeFileRecordHandler: Closing Merge 
> Operator OFM
> 15/10/12 14:28:38 ERROR executor.Executor: Exception in task 1.0 in stage 1.0 
> (TID 4)
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed to close AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:118)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1984)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunn

[jira] [Updated] (HIVE-12168) Addendum to HIVE-12038

2015-10-13 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-12168:
-
Attachment: HIVE-12168.patch

Making a fix.  I have already applied it locally to the build machine; 
otherwise all the reports mistakenly flag failed tests as build failures and 
skip the test failure report.

> Addendum to HIVE-12038
> --
>
> Key: HIVE-12168
> URL: https://issues.apache.org/jira/browse/HIVE-12168
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-12168.patch
>
>
> In HIVE-12038, we missed a case of Error.  The original assumption was that 
> if error is true, then it is always a build error.  Apparently there is also 
> a TestFailedException.
> Currently, it incorrectly reports failed tests as build errors.
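> A hypothetical sketch of the corrected classification (the helper names are 
> assumptions, not the actual ptest code):
> {code}
> // An "error" raised by failing tests should be reported as test failures,
> // not as a build error.
> if (error) {
>   if (exception instanceof TestsFailedException) {
>     reportFailedTests(testResults);   // hypothetical helper
>   } else {
>     reportBuildError(exception);      // hypothetical helper
>   }
> }
> {code}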





[jira] [Assigned] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-12167:
---

Assignee: Sergey Shelukhin  (was: Daniel Dai)

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> The count of these log lines has increased by ~33% since merging the llap 
> branch, but it was already high before that (39/~700 for the same test). 
> These lines are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> This is retried for some seconds, and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initi

[jira] [Commented] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956080#comment-14956080
 ] 

Sergey Shelukhin commented on HIVE-12167:
-

Looks like the ZK config is invalid somewhere... there are two different 
quorums logged. I wonder how it even works?

> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Daniel Dai
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> The count of these log lines has increased by ~33% since merging the llap 
> branch, but it was already high before that (39/~700 for the same test). 
> These lines are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> This is retried for some seconds, and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
>  [hiv

[jira] [Commented] (HIVE-11679) SemanticAnalysis of "a=1" can result in a new Configuration() object

2015-10-13 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956079#comment-14956079
 ] 

Navis commented on HIVE-11679:
--

Strange... I cannot reproduce the failure of udaf_histogram_numeric.

> SemanticAnalysis of "a=1" can result in a new Configuration() object
> 
>
> Key: HIVE-11679
> URL: https://issues.apache.org/jira/browse/HIVE-11679
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Navis
> Attachments: HIVE-11679.1.patch.txt, HIVE-11679.2.patch.txt, 
> HIVE-11679.3.patch.txt
>
>
> {code}
> public static ExprNodeGenericFuncDesc newInstance(GenericUDF genericUDF,
>   String funcText,
>   List<ExprNodeDesc> children) throws UDFArgumentException {
> ...
>  if (genericUDF instanceof GenericUDFBaseCompare && children.size() == 2) {
>   TypeInfo oiTypeInfo0 = children.get(0).getTypeInfo();
>   TypeInfo oiTypeInfo1 = children.get(1).getTypeInfo();
>   SessionState ss = SessionState.get();
>   Configuration conf = (ss != null) ? ss.getConf() : new Configuration();
> {code}
> This is, at best, a SessionState.get(), which is a thread-local lookup, or, 
> worse, a new Configuration(), which means XML parsing of multiple files for 
> each equality expression in the query.
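> A minimal sketch of the kind of fix this suggests (an assumption, not 
> necessarily the attached patch): cache one default Configuration so the XML 
> parsing happens at most once per process.
> {code}
> // Hypothetical: parse the config XML files once and reuse the instance
> // whenever there is no SessionState to borrow a Configuration from.
> private static final Configuration DEFAULT_CONF = new Configuration();
> ...
> SessionState ss = SessionState.get();
> Configuration conf = (ss != null) ? ss.getConf() : DEFAULT_CONF;
> {code}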





[jira] [Commented] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956076#comment-14956076
 ] 

Sergey Shelukhin commented on HIVE-12167:
-

[~daijy] [~alangates] can you please take a look?


> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> ---
>
> Key: HIVE-12167
> URL: https://issues.apache.org/jira/browse/HIVE-12167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Daniel Dai
>
> I ran some random test (vectorization_10) with the HBase metastore for an 
> unrelated reason, and I see a large number of exceptions in hive.log
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> The count of these log lines has increased by ~33% since merging the llap 
> branch, but it was already high before that (39/~700 for the same test). 
> These lines are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
> zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
> ~[?:1.8.0_45]
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
> ~[?:1.8.0_45]
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
> [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> This is retried for some seconds, and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
> (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
> quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
> (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss for /hbase/hbaseid
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
> ~[zookeeper-3.4.6.jar:3.4.6-1569965]
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
>  ~[hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
> [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_45]
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  [?:1.8.0_45]
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  [?:1.8.0_45]
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
> [?:1.8.0_45]
>   at 
> org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
>  [hbase-client-1.1.1.jar:1.1.1]
>   at 
> org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
>  [hive-metastore-2.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hado

[jira] [Updated] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12167:

Description: 
I ran some random test (vectorization_10) with the HBase metastore for an 
unrelated reason, and I see a large number of exceptions in hive.log
{noformat}
$ grep -c "ConnectionLoss" hive.log
52
$ grep -c "Connection refused" hive.log
1014
{noformat}
The count of these log lines has increased by ~33% since merging the llap 
branch, but it was already high before that (39/~700 for the same test). These 
lines are not present if I disable the HBase metastore.
The exceptions are:
{noformat}
2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
~[?:1.8.0_45]
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
~[?:1.8.0_45]
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
 ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
[zookeeper-3.4.6.jar:3.4.6-1569965]
{noformat}
This is retried for some seconds, and then
{noformat}
2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
(ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
(/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
 ~[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
 [hbase-client-1.1.1.jar:1.1.1]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method) ~[?:1.8.0_45]
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 [?:1.8.0_45]
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 [?:1.8.0_45]
at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
[?:1.8.0_45]
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:157)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:151)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180) 
[?:1.8.0_45]
at java.lang.ThreadLocal.get(ThreadLocal.java:170) [?:1.8.0_45]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.getInstance(HBaseReadWrite.java:205)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.St

[jira] [Updated] (HIVE-12082) Null comparison for greatest and least operator

2015-10-13 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-12082:
-
Attachment: HIVE-12082.2.patch

Addressed review comments.

> Null comparison for greatest and least operator
> ---
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-12082.2.patch, HIVE-12082.patch
>
>
> In MySQL comparisons, if any of the entries is null, then the result is null; 
> see https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html and 
> https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html.
> This can be demonstrated by the following MySQL queries:
> {noformat}
> mysql> select greatest(1, null) from test;
> +---+
> | greatest(1, null) |
> +---+
> |  NULL |
> +---+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> ++
> | greatest(-1, null) |
> ++
> |   NULL |
> ++
> 1 row in set (0.00 sec)
> {noformat}
> This is in contrast to Hive, where nulls are ignored in the comparisons.
> {noformat}
> hive> select greatest(null, 1) from test;
> OK
> 1
> {noformat}
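> A minimal sketch of the null-propagating evaluation this asks for (the names 
> are assumptions; the real UDF works on Hive ObjectInspectors):
> {code}
> // MySQL-style semantics: any null argument makes greatest() return null
> // instead of being skipped.
> Object result = null;
> for (DeferredObject arg : arguments) {
>   Object value = arg.get();
>   if (value == null) {
>     return null;                              // null propagates
>   }
>   if (result == null || comparator.compare(value, result) > 0) {
>     result = value;                           // keep the running maximum
>   }
> }
> return result;
> {code}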





[jira] [Commented] (HIVE-12157) select-clause doesn't support unicode alias

2015-10-13 Thread richard du (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956070#comment-14956070
 ] 

richard du commented on HIVE-12157:
---

OK. I'd like to submit the patch.

> select-clause doesn't support unicode alias
> ---
>
> Key: HIVE-12157
> URL: https://issues.apache.org/jira/browse/HIVE-12157
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 1.2.1
>Reporter: richard du
>Priority: Minor
>
> The parser throws an exception when I use a unicode alias:
> hive> desc test;
> OK
> a   int 
> b   string  
> Time taken: 0.135 seconds, Fetched: 2 row(s)
> hive> select a as 行1 from test limit 10;
> NoViableAltException(302@[134:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
> identifier ( COMMA identifier )* RPAREN ) )?])
> at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
> at org.antlr.runtime.DFA.predict(DFA.java:116)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2915)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:13 cannot recognize input near 'as' '行1' 'from' 
> in selection target
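> A possible workaround to try (an assumption, not verified here): quote the 
> unicode alias with backticks so it goes through the quoted-identifier rules 
> rather than the plain identifier rule.
> {noformat}
> hive> set hive.support.quoted.identifiers=column;
> hive> select a as `行1` from test limit 10;
> {noformat}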





[jira] [Updated] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12167:

Description: 
I ran some random test (vectorization_10) with the HBase metastore for an 
unrelated reason, and I see a large number of exceptions in hive.log
{noformat}
$ grep -c "ConnectionLoss" hive.log
52
$ grep -c "Connection refused" hive.log
1014
{noformat}
The count of these log lines has increased by ~33% since merging the llap 
branch, but it was already high before that (39/~700 for the same test). These 
lines are not present if I disable the HBase metastore.
The exceptions are:
{noformat}
2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: 
zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
~[?:1.8.0_45]
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
~[?:1.8.0_45]
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
 ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
[zookeeper-3.4.6.jar:3.4.6-1569965]
{noformat}
This is retried for some seconds, and then
{noformat}
2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil 
(ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, 
quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode 
(/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
 ~[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) 
[hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
 [hbase-client-1.1.1.jar:1.1.1]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
Method) ~[?:1.8.0_45]
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 [?:1.8.0_45]
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 [?:1.8.0_45]
at java.lang.reflect.Constructor.newInstance(Constructor.java:422) 
[?:1.8.0_45]
at 
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
 [hbase-client-1.1.1.jar:1.1.1]
at 
org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:157)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:151)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180) 
[?:1.8.0_45]
at java.lang.ThreadLocal.get(ThreadLocal.java:170) [?:1.8.0_45]
at 
org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.getInstance(HBaseReadWrite.java:205)
 [hive-metastore-2.0.0-SNAPSHOT.jar:?]
at 
org.apache.hadoop.hive.metastore.hbase.St

[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956063#comment-14956063
 ] 

Navis commented on HIVE-11768:
--

[~thejas] I've left the "FileSystem.deleteOnExit" problem for another issue, 
because "FileSystem.close" will never be called. Could it be removed from the 
code safely? I'm not sure about that.

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have a suffix of ".pipeout".
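> For context, a sketch of the leak pattern (illustrative, not the exact 
> HiveServer2 code path):
> {code}
> // Every path registered via deleteOnExit() is kept in the static
> // java.io.DeleteOnExitHook set until the JVM exits, so a long-lived
> // HiveServer2 accumulates one entry per session pipeout file.
> File pipeout = File.createTempFile("session", ".pipeout");
> pipeout.deleteOnExit();  // the path string is retained for the process lifetime
> {code}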





[jira] [Updated] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-11768:
-
Attachment: HIVE-11768.5.patch.txt

Addressed comments & fixed the test failure.

> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt, HIVE-11768.5.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB of heap.
>   Most of the paths have a suffix of ".pipeout".





[jira] [Commented] (HIVE-11892) UDTF run in local fetch task does not return rows forwarded during GenericUDTF.close()

2015-10-13 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955995#comment-14955995
 ] 

Ashutosh Chauhan commented on HIVE-11892:
-

+1

> UDTF run in local fetch task does not return rows forwarded during 
> GenericUDTF.close()
> --
>
> Key: HIVE-11892
> URL: https://issues.apache.org/jira/browse/HIVE-11892
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11892.1.patch, HIVE-11892.2.patch
>
>
> Using the example UDTF GenericUDTFCount2, which is part of hive-contrib:
> {noformat}
> create temporary function udtfCount2 as 
> 'org.apache.hadoop.hive.contrib.udtf.example.GenericUDTFCount2';
> set hive.fetch.task.conversion=minimal;
> -- Task created, correct output (2 rows)
> select udtfCount2() from src;
> set hive.fetch.task.conversion=more;
> -- Runs in local task, incorrect output (0 rows)
> select udtfCount2() from src;
> {noformat}
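> For context, a simplified sketch of the pattern GenericUDTFCount2 relies on 
> (condensed from hive-contrib): its rows are forwarded from close(), which is 
> exactly what the local fetch task drops.
> {code}
> // Nothing is emitted per input row; the aggregate is forwarded only at
> // end-of-input, from close().
> @Override
> public void process(Object[] record) throws HiveException {
>   count++;
> }
> @Override
> public void close() throws HiveException {
>   forward(new Object[] { Integer.valueOf(count) });  // lost in fetch-task mode
> }
> {code}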





[jira] [Commented] (HIVE-11892) UDTF run in local fetch task does not return rows forwarded during GenericUDTF.close()

2015-10-13 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955983#comment-14955983
 ] 

Jason Dere commented on HIVE-11892:
---

[~ashutoshc], can you review?

> UDTF run in local fetch task does not return rows forwarded during 
> GenericUDTF.close()
> --
>
> Key: HIVE-11892
> URL: https://issues.apache.org/jira/browse/HIVE-11892
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11892.1.patch, HIVE-11892.2.patch
>
>
> Using the example UDTF GenericUDTFCount2, which is part of hive-contrib:
> {noformat}
> create temporary function udtfCount2 as 
> 'org.apache.hadoop.hive.contrib.udtf.example.GenericUDTFCount2';
> set hive.fetch.task.conversion=minimal;
> -- Task created, correct output (2 rows)
> select udtfCount2() from src;
> set hive.fetch.task.conversion=more;
> -- Runs in local task, incorrect output (0 rows)
> select udtfCount2() from src;
> {noformat}





[jira] [Commented] (HIVE-11616) DelegationTokenSecretManager reuses the same ObjectStore, which has a concurrency issue

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955967#comment-14955967
 ] 

Hive QA commented on HIVE-11616:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762638/HIVE-11616.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9675 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestDBTokenStore.testDBTokenStore
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.minikdc.TestHiveAuthFactory.testStartTokenManagerForDBTokenStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5633/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5633/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5633/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762638 - PreCommit-HIVE-TRUNK-Build

> DelegationTokenSecretManager reuses the same ObjectStore, which has a 
> concurrency issue
> --
>
> Key: HIVE-11616
> URL: https://issues.apache.org/jira/browse/HIVE-11616
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Cody Fu
> Fix For: 0.12.1
>
> Attachments: HIVE-11616.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Sometimes the metastore log shows the exception below. After analysis, we 
> found that when HiveMetaStore starts, the DelegationTokenSecretManager keeps 
> using the same ObjectStore, see here:
> saslServer.startDelegationTokenSecretManager(conf, *baseHandler.getMS()*, 
> ServerMode.METASTORE);
> This leads to the concurrency issue.
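> In other words, a sketch of the failure mode (illustrative; the overlapping 
> calls are an assumption about the timing):
> {code}
> // One RawStore (ObjectStore) instance ends up shared by every worker thread.
> RawStore shared = baseHandler.getMS();
> saslServer.startDelegationTokenSecretManager(conf, shared, ServerMode.METASTORE);
> // Thread A: shared.openTransaction();  // begins a transaction
> // Thread B: shared.openTransaction();  // overlaps -> "Transaction has already started"
> {code}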
> 2015-08-18 20:59:10,520 | ERROR | pool-6-thread-200 | Error occurred during 
> processing of message. | 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:296)
> org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: 
> org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started
>   at 
> org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:154)
>   at 
> org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:88)
>   at 
> org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:112)
>   at 
> org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:565)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:596)
>   at 
> com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589)
>   at 
> com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244)
>   at 
> org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
>   at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
>   at 
> org.apache.thrift.transport.HiveTSaslServerTransport.open(HiveTSaslServerTransport.java:133)
>   at 
> org.apache.thrift.transport.HiveTSaslServerTransport$Factory.getTransport(HiveTSaslServerTransport.java:261)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1652)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthB

[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12166:

Attachment: (was: HIVE-12166.patch)

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12166.patch
>
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.





[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12166:

Attachment: HIVE-12166.patch

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12166.patch
>
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.





[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12166:

Attachment: HIVE-12166.patch

Can you try this patch? Looks like there's a spurious empty split.

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12166.patch
>
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.





[jira] [Commented] (HIVE-11565) LLAP: Some counters are incorrect

2015-10-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955907#comment-14955907
 ] 

Siddharth Seth commented on HIVE-11565:
---

Stops emitting system counters like FileSystem and resource utilization.

> LLAP: Some counters are incorrect
> -
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: HIVE-11565.1.txt
>
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 





[jira] [Assigned] (HIVE-12092) SARGS: UDFLike prefix cases needs to be translated into >= sargs

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-12092:
--

Assignee: Gopal V

> SARGS: UDFLike prefix cases needs to be translated into >= sargs
> 
>
> Key: HIVE-12092
> URL: https://issues.apache.org/jira/browse/HIVE-12092
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>
> A query of the following form
> {{select * from table where access_url like "https:%" ;}}
> needs its SARG rewritten as
> {{access_url >= 'https:'}}
> to get a significant hit rate from a simple expression (see the illustration 
> below).
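> An illustration of the intended rewrite (an assumption about the final form, 
> not committed behavior): bracket the prefix with range predicates a SARG can 
> push down, and keep the original LIKE as a residual filter.
> {noformat}
> -- ';' is the character after ':', giving an upper bound for the prefix range
> select * from table
> where access_url >= 'https:' and access_url < 'https;'
>   and access_url like 'https:%';
> {noformat}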





[jira] [Commented] (HIVE-11856) allow split strategies to run on threadpool

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955889#comment-14955889
 ] 

Sergey Shelukhin commented on HIVE-11856:
-

The MiniTez slowdown will be resolved by HIVE-11923. I'll see if 
groupby3_map_multi_distinct is spurious; it looks like it is.

> allow split strategies to run on threadpool
> ---
>
> Key: HIVE-11856
> URL: https://issues.apache.org/jira/browse/HIVE-11856
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11856.01.patch, HIVE-11856.02.patch, 
> HIVE-11856.patch
>
>
> If a split strategy makes metastore cache calls, it should probably run on 
> the threadpool.





[jira] [Updated] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-13 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11954:
---
Attachment: HIVE-11954.07.patch

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.03.patch, HIVE-11954.04.patch, HIVE-11954.05.patch, 
> HIVE-11954.06.patch, HIVE-11954.07.patch, HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin conversion 
> algorithm needs to be more sophisticated.
> In an N-way Map Join, Hive should pick as the side table (in-memory table) 
> the input stream that has the least cost in producing its relation (like 
> TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path 
> it's going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below (see the sketch after the list):
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs (based on the conditional task size).
> 2. For each of the inputs, identify its cost and memory requirement. Cost is 
> 1 for each heavy-weight relation op (Join, GB, PTF/Windowing, TF, etc.); the 
> cost for an input is the total number of heavy-weight ops in its branch.
> 3. Order the set from #1 on cost & memory requirement (ascending).
> 4. Pick the first element from #3 as the side table.
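> A schematic sketch of steps 1-4 (hypothetical types and accessors, not the 
> actual optimizer code):
> {code}
> // 1. keep only inputs small enough for the conditional task size
> candidates.removeIf(in -> in.estimatedSize() > conditionalTaskSize);
> // 2-3. order by heavy-weight-op count, then by memory requirement, ascending
> candidates.sort(Comparator.comparingInt(Input::heavyOpCount)
>                           .thenComparingLong(Input::estimatedSize));
> // 4. the cheapest candidate becomes the in-memory side table
> Input sideTable = candidates.get(0);
> {code}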





[jira] [Commented] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955871#comment-14955871
 ] 

Sergey Shelukhin commented on HIVE-12166:
-

Is it LRFU only, or any cache?

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.





[jira] [Commented] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955872#comment-14955872
 ] 

Gopal V commented on HIVE-12166:


[~sershe]: any cache actually - this particular case is triggered even for the 
1st query.

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.





[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-10-13 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955856#comment-14955856
 ] 

Xuefu Zhang commented on HIVE-10438:


Some additional comments on RB.

> Architecture for  ResultSet Compression via external plugin
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive, Thrift API
>Affects Versions: 1.2.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
>  Labels: patch
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, an experimental results 
> document, and a pdf explaining how to set up the docker container to observe 
> end-to-end functionality of ResultSet compression. 
> Review board link: https://reviews.apache.org/r/35792/
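To make the plugin architecture concrete, a compressor plugin could be as small as the sketch below. The interface and method names are assumptions for illustration; the attached patch defines the real contract:

{code}
// Hypothetical sketch of a pluggable ResultSet compressor. The server would
// look the plugin up by vendor name negotiated with the client, then run
// each serialized column batch through compress() before shipping it.
import java.nio.ByteBuffer;

interface ColumnCompressor {
  /** Short name negotiated between client and server, e.g. "snappy". */
  String getVendorName();

  /** Compress one serialized column of a result set batch. */
  ByteBuffer compress(ByteBuffer column);

  /** Inverse of compress(); the client side of the same plugin. */
  ByteBuffer decompress(ByteBuffer column, int uncompressedSize);
}
{code}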



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12166:
---
Component/s: Query Processor

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12166:
---
Affects Version/s: 2.0.0

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12166) LLAP: Cache read error at 1000 Gb scale tests

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-12166:
---
Assignee: Sergey Shelukhin

> LLAP: Cache read error at 1000 Gb scale tests
> -
>
> Key: HIVE-12166
> URL: https://issues.apache.org/jira/browse/HIVE-12166
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> {code}
> hive> select sum(l_extendedprice * l_discount) as revenue from 
> tpch_flat_orc_1000.lineitem  ;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.elementData(ArrayList.java:400)
> at java.util.ArrayList.get(ArrayList.java:413)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:687)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:264)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
> at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
> ... 4 more
> {code}
> Disabling the cache allows this to run through without error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11565) LLAP: Some counters are incorrect

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955846#comment-14955846
 ] 

Sergey Shelukhin commented on HIVE-11565:
-

what does that false do, stop emitting counters? +1


> LLAP: Some counters are incorrect
> -
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
> Attachments: HIVE-11565.1.txt
>
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12162:

Assignee: Vikram Dixit K

> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Vikram Dixit K
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown    1 min 8 sec    Passed
> testCliDriver_shutdown    1 min 7 sec    Passed
> testCliDriver_shutdown    1 min 7 sec    Passed
> testCliDriver_shutdown    1 min 6 sec    Passed
> testCliDriver_shutdown    1 min 6 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955844#comment-14955844
 ] 

Sergey Shelukhin commented on HIVE-12162:
-

Assigning as suggested by Sid :)


> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Vikram Dixit K
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown    1 min 8 sec    Passed
> testCliDriver_shutdown    1 min 7 sec    Passed
> testCliDriver_shutdown    1 min 7 sec    Passed
> testCliDriver_shutdown    1 min 6 sec    Passed
> testCliDriver_shutdown    1 min 6 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 5 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 4 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> testCliDriver_shutdown    1 min 3 sec    Passed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955837#comment-14955837
 ] 

Sergey Shelukhin commented on HIVE-12161:
-

Actually, never mind, it looks like the time difference locally is actually in the test...

> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12161:

Assignee: (was: Sergey Shelukhin)

> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955831#comment-14955831
 ] 

Sergey Shelukhin commented on HIVE-12161:
-

I am looking at local repro...

> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-12161:
---

Assignee: Sergey Shelukhin

> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11882) Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11882:
---
Affects Version/s: 2.0.0
   1.3.0
   1.2.1

> Fetch optimizer should stop source files traversal once it exceeds the 
> hive.fetch.task.conversion.threshold
> ---
>
> Key: HIVE-11882
> URL: https://issues.apache.org/jira/browse/HIVE-11882
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 1.0.0, 1.3.0, 1.2.1, 2.0.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11882.1.patch
>
>
> Hive 1.0's fetch optimizer tries to optimize queries of the form "select <expr> 
> from <table> where <condition> limit <n>" to a fetch task (see the 
> hive.fetch.task.conversion property). This optimization gets the lengths of 
> all the files in the specified partition and does some comparison against a 
> threshold value to determine whether it should use a fetch task or not (see 
> the hive.fetch.task.conversion.threshold property). This process of getting 
> the length of all files can be expensive. One of the main problems in this 
> optimization is that the fetch optimizer doesn't seem to stop once it exceeds 
> the hive.fetch.task.conversion.threshold. It works fine on HDFS, but could 
> cause a significant performance degradation on other supported file systems. 
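The early exit the report asks for amounts to abandoning the file-length scan as soon as the running total crosses the threshold. A minimal sketch under assumed names, not the attached patch:

{code}
// Hypothetical sketch of the early-exit traversal: stop summing file
// lengths the moment the running total exceeds the conversion threshold,
// instead of stat-ing every file in the partition first.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class FetchThresholdCheck {
  static boolean underThreshold(FileSystem fs, Path dir, long threshold)
      throws IOException {
    long total = 0;
    for (FileStatus f : fs.listStatus(dir)) {
      total += f.getLen();
      if (total > threshold) {
        return false; // no need to look at the remaining files
      }
    }
    return true;
  }
}
{code}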



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11882) Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold

2015-10-13 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11882:
---
Fix Version/s: 2.0.0
   1.3.0

> Fetch optimizer should stop source files traversal once it exceeds the 
> hive.fetch.task.conversion.threshold
> ---
>
> Key: HIVE-11882
> URL: https://issues.apache.org/jira/browse/HIVE-11882
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 1.0.0, 1.3.0, 1.2.1, 2.0.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11882.1.patch
>
>
> Hive 1.0's fetch optimizer tries to optimize queries of the form "select <expr> 
> from <table> where <condition> limit <n>" to a fetch task (see the 
> hive.fetch.task.conversion property). This optimization gets the lengths of 
> all the files in the specified partition and does some comparison against a 
> threshold value to determine whether it should use a fetch task or not (see 
> the hive.fetch.task.conversion.threshold property). This process of getting 
> the length of all files can be expensive. One of the main problems in this 
> optimization is that the fetch optimizer doesn't seem to stop once it exceeds 
> the hive.fetch.task.conversion.threshold. It works fine on HDFS, but could 
> cause a significant performance degradation on other supported file systems. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11149) Fix issue with sometimes HashMap in PerfLogger.java hangs

2015-10-13 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955809#comment-14955809
 ] 

Sushanth Sowmyan commented on HIVE-11149:
-

As an update, I do not think that we should be backporting HIVE-11891, since it 
refactors PerfLogger from hive-exec to hive-common, which is a cross-jar change 
that I don't think we should make on backport maintenance lines. However, this 
patch is simple enough that we could also create a 1.2 version of it, which would 
affect PerfLogger in hive-exec where it used to live in 1.2.

> Fix issue with sometimes HashMap in PerfLogger.java hangs 
> --
>
> Key: HIVE-11149
> URL: https://issues.apache.org/jira/browse/HIVE-11149
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Fix For: 2.0.0
>
> Attachments: HIVE-11149.01.patch, HIVE-11149.02.patch, 
> HIVE-11149.03.patch, HIVE-11149.04.patch
>
>
> In a multi-threaded environment, the HashMap in PerfLogger.java can sometimes 
> cause massive numbers of Java processes to hang and consume large amounts of 
> unnecessary CPU and memory.
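For context, the usual failure mode here is a shared, unsynchronized HashMap mutated by several threads: on pre-Java-8 JVMs a racing resize can even leave a bucket chain with a cycle, so get() spins forever at full CPU. A sketch of the standard remedy, which may or may not match the attached patches:

{code}
// Sketch of the usual fix (an assumption, not the committed patch): back
// the per-query timing map with a ConcurrentHashMap so concurrent begin/end
// calls cannot corrupt the map's internal structure.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerfTimes {
  // instead of: private final Map<String, Long> startTimes = new HashMap<>();
  private final Map<String, Long> startTimes = new ConcurrentHashMap<>();

  void begin(String method) {
    startTimes.put(method, System.currentTimeMillis());
  }

  long end(String method) {
    Long start = startTimes.remove(method);
    return start == null ? 0L : System.currentTimeMillis() - start;
  }
}
{code}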



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10890) Provide implementable engine selector

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955756#comment-14955756
 ] 

Hive QA commented on HIVE-10890:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766247/HIVE-10890.3.patch.txt

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to no tests executed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5632/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5632/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5632/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5632/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 49ff918 HIVE-12010 : Tests should use FileSystem based stats 
collection mechanism (Ashutosh Chauhan via Pengcheng Xiong)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 49ff918 HIVE-12010 : Tests should use FileSystem based stats 
collection mechanism (Ashutosh Chauhan via Pengcheng Xiong)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766247 - PreCommit-HIVE-TRUNK-Build

> Provide implementable engine selector
> -
>
> Key: HIVE-10890
> URL: https://issues.apache.org/jira/browse/HIVE-10890
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-10890.1.patch.txt, HIVE-10890.2.patch.txt, 
> HIVE-10890.3.patch.txt
>
>
> Hive now supports three kinds of execution engines. It would be good to have an 
> automatic engine selector instead of having to set the engine explicitly for execution.
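To make the proposal concrete, an implementable selector could be a small hook along these lines; all names below are hypothetical and nothing here is taken from the attached patches:

{code}
// Hypothetical sketch of an implementable engine selector. Hive's three
// engines at the time were mr, tez and spark; a real hook would be loaded
// from configuration and consulted at query compile time.
import org.apache.hadoop.hive.conf.HiveConf;

interface ExecutionEngineSelector {
  /** Return "mr", "tez" or "spark" for the given query. */
  String selectEngine(HiveConf conf, String query);
}

class SmallQueryLikesTez implements ExecutionEngineSelector {
  @Override
  public String selectEngine(HiveConf conf, String query) {
    // toy heuristic for illustration: short queries to tez, the rest to mr
    return query.length() < 200 ? "tez" : "mr";
  }
}
{code}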



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12083) HIVE-10965 introduces thrift error if partNames or colNames are empty

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955751#comment-14955751
 ] 

Hive QA commented on HIVE-12083:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766245/HIVE-12083.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9682 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5631/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5631/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766245 - PreCommit-HIVE-TRUNK-Build

> HIVE-10965 introduces thrift error if partNames or colNames are empty
> -
>
> Key: HIVE-12083
> URL: https://issues.apache.org/jira/browse/HIVE-12083
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 1.0.2
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-12083.2.patch, HIVE-12083.patch
>
>
> In the fix for HIVE-10965, there is a short-circuit path that causes an empty 
> AggrStats object to be returned if partNames is empty or colNames is empty:
> {code}
> diff --git 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> index 0a56bac..ed810d2 100644
> --- 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> +++ 
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> @@ -1100,6 +1100,7 @@ public ColumnStatistics getTableStats(
>    public AggrStats aggrColStatsForPartitions(String dbName, String tableName,
>        List<String> partNames, List<String> colNames, boolean 
> useDensityFunctionForNDVEstimation)
>throws MetaException {
> +if (colNames.isEmpty() || partNames.isEmpty()) return new AggrStats(); 
> // Nothing to aggregate.
>  long partsFound = partsFoundForPartitions(dbName, tableName, partNames, 
> colNames);
>      List<ColumnStatisticsObj> colStatsList;
>  // Try to read from the cache first
> {code}
> This runs afoul of thrift requirements that AggrStats have required fields:
> {code}
> struct AggrStats {
> 1: required list<ColumnStatisticsObj> colStats,
> 2: required i64 partsFound // number of partitions for which stats were found
> }
> {code}
> Thus, we get errors as follows:
> {noformat}
> 2015-10-08 00:00:25,413 ERROR server.TThreadPoolServer 
> (TThreadPoolServer.java:run(213)) - Thrift error occurred during processing 
> of message.
> org.apache.thrift.protocol.TProtocolException: Required field 'colStats' is 
> unset! Struct:AggrStats(colStats:null, partsFound:0)
> at 
> org.apache.hadoop.hive.metastore.api.AggrStats.validate(AggrStats.java:389)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.validate(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result$get_aggr_stats_for_resultStandardScheme.write(ThriftHiveMetastore.java)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_aggr_stats_for_result.write(ThriftHiveMetastore.java)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
>
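The failure above is structural: both AggrStats fields are declared required, so a default-constructed AggrStats (colStats unset) can never pass thrift validation. A sketch of the shape of the repair, which may differ from the committed patch:

{code}
// Sketch only: the short-circuit must return an AggrStats with both
// required fields populated, since new AggrStats() leaves colStats null
// and fails validate() during serialization.
import java.util.ArrayList;
import org.apache.hadoop.hive.metastore.api.AggrStats;
import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj;

class EmptyAggrStats {
  static AggrStats empty() {
    // empty list and partsFound = 0 satisfy the thrift contract
    return new AggrStats(new ArrayList<ColumnStatisticsObj>(), 0);
  }
}
{code}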

[jira] [Comment Edited] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955749#comment-14955749
 ] 

Sergey Shelukhin edited comment on HIVE-12161 at 10/13/15 9:43 PM:
---

[~vikram.dixit] [~sseth] fyi.
I tried to do some analysis on builds 5618 (before), 5629 (after) and 5626 (AM 
reuse patch). 
Then I realized we have 5596 build results which is AM reuse before merge and 
it also took 2 hours and correlation is 0.98 in runtimes. So I think this is 
entirely attributable to session setup.
The test time differences between 5618 and 5629 don't have any discernible 
pattern (e.g. a few tests even got faster: testCliDriver_create_merge_compressed 
41 / 36 / 17 (before / after / after + AM reuse); some tests got many 
minutes slower: testCliDriver_auto_sortmerge_join_12 23 / 299 / 5.3, or 
testCliDriver_load_dyn_part1 27 / 321 / 3.5).
Average benefit from session reuse "before" is 24 sec. with 8 sec. deviation; average 
benefit "after" is 155 sec. with 118 sec. deviation.
Should we investigate why session setup is so slow and random? Could it be due 
to some Tez changes in new version? Will it affect real clusters? 



was (Author: sershe):
[~vikram.dixit] [~sseth] fyi.
I tried to do some analysis on builds 5618 (before), 5629 (after) and 5626 (AM 
reuse patch). 
Then I realized we have 5596 build results which is AM reuse before merge and 
it also took 2 hours and correlation is 0.98 in runtimes. So I think this is 
entirely attributable to session setup.
The test time differences between 5618 and 5629 don't have any discernible 
pattern (e.g. a few tests even got faster testCliDriver_create_merge_compressed 
 41  36  17 (before-after-after+AM reuse)  , some tests got many 
minutes slower, testCliDriver_auto_sortmerge_join_1223  299 5.3, or 
testCliDriver_load_dyn_part127  321 3.5).
Average benefit from session reuse "before" is 24sec, 8sec. deviation, average 
benefit "after" is 155sec., with 118sec. deviation.
Should we investigate why session setup is so slow and random? Will it affect 
real clusters? 


> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955749#comment-14955749
 ] 

Sergey Shelukhin edited comment on HIVE-12161 at 10/13/15 9:43 PM:
---

[~vikram.dixit] [~sseth] fyi.
I tried to do some analysis on builds 5618 (before), 5629 (after) and 5626 (AM 
reuse patch). 
Then I realized we have 5596 build results which is AM reuse before merge and 
it also took 2 hours and correlation is 0.98 in runtimes. So I think this is 
entirely attributable to session setup.
The test time differences between 5618 and 5629 don't have any discernible 
pattern (e.g. a few tests even got faster: testCliDriver_create_merge_compressed 
41 / 36 / 17 (before / after / after + AM reuse); some tests got many 
minutes slower: testCliDriver_auto_sortmerge_join_12 23 / 299 / 5.3, or 
testCliDriver_load_dyn_part1 27 / 321 / 3.5).
Average benefit from session reuse "before" is 24 sec. with 8 sec. deviation; average 
benefit "after" is 155 sec. with 118 sec. deviation.
Should we investigate why session setup is so slow and random? Will it affect 
real clusters? 



was (Author: sershe):
[~vikram.dixit] [~sseth] fyi.
I tried to do some analysis on builds 5618 (before), 5629 (after) and 5626 (AM 
reuse patch). 
Then I realized we have 5596 build results which is AM reuse before merge and 
it also took 2 hours and correlation is 0.98 in runtimes. So I think this is 
entirely attributable to session setup.
The test time differences between 5618 and 5629 are don't have any discernible 
pattern (e.g. (before-after-after+AM reuse) a few tests even got faster 
testCliDriver_create_merge_compressed41  36  17 , some tests got 
many minutes slower, testCliDriver_auto_sortmerge_join_12   23  299 
5.3, or testCliDriver_load_dyn_part127  321 3.5 294).
Average benefit from session reuse "before" is 24sec, 8sec. deviation, average 
benefit "after" is 155sec., with 118sec. stdev.
Should we investigate why session setup is so slow and random? Will it affect 
real clusters? 


> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955749#comment-14955749
 ] 

Sergey Shelukhin commented on HIVE-12161:
-

[~vikram.dixit] [~sseth] fyi.
I tried to do some analysis on builds 5618 (before), 5629 (after) and 5626 (AM 
reuse patch). 
Then I realized we have 5596 build results which is AM reuse before merge and 
it also took 2 hours and correlation is 0.98 in runtimes. So I think this is 
entirely attributable to session setup.
The test time differences between 5618 and 5629 are don't have any discernible 
pattern (e.g. (before-after-after+AM reuse) a few tests even got faster 
testCliDriver_create_merge_compressed41  36  17 , some tests got 
many minutes slower, testCliDriver_auto_sortmerge_join_12   23  299 
5.3, or testCliDriver_load_dyn_part127  321 3.5 294).
Average benefit from session reuse "before" is 24sec, 8sec. deviation, average 
benefit "after" is 155sec., with 118sec. stdev.
Should we investigate why session setup is so slow and random? Will it affect 
real clusters? 


> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11768) java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances

2015-10-13 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955748#comment-14955748
 ] 

Thejas M Nair commented on HIVE-11768:
--

I agree HIVE-6091 will not fix the memory leak. Also, looks like that patch 
missed the deletion of tmpErrOutputFile.

Thanks for the patch [~navis]! I have added some minor comments in RB .

I also see that there is a call in Context.java and SessionState to 
FileSystem.deleteOnExit  and we don't call cancelDeleteOnExit on those. Looks 
like those Path objects can also consume excess memory in a long running HS2. 
It does not have to be addressed in this jira; we can create a follow-up jira 
for that.
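A minimal sketch of the pairing being discussed, using the existing Hadoop FileSystem API (the helper name is illustrative):

{code}
// If a scratch path is deleted during the session, it should also be
// unregistered; otherwise the Path object stays in the JVM-lifetime
// delete-on-exit set of a long-running HiveServer2.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class ScratchDirCleanup {
  static void removeScratchDir(FileSystem fs, Path scratch) throws IOException {
    fs.delete(scratch, true);        // recursive delete now
    fs.cancelDeleteOnExit(scratch);  // drop the shutdown-hook reference
  }
}
{code}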


> java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances
> 
>
> Key: HIVE-11768
> URL: https://issues.apache.org/jira/browse/HIVE-11768
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
>Reporter: Nemon Lou
>Assignee: Navis
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-11768.1.patch.txt, HIVE-11768.2.patch.txt, 
> HIVE-11768.3.patch.txt, HIVE-11768.4.patch.txt
>
>
>   More than 490,000 paths were added to java.io.DeleteOnExitHook on one of our 
> long-running HiveServer2 instances, taking up more than 100MB on the heap.
>   Most of the paths contain a suffix of ".pipeout".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12165) wrong result when hive.optimize.sampling.orderby=true with some aggregate functions

2015-10-13 Thread ErwanMAS (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ErwanMAS updated HIVE-12165:

Description: 
This simple query gives wrong results when I use the parallel order-by.

{noformat}
select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}

Current wrong result :

{noformat}
c0  c1  c2  c3
32740   32740   0   163695
113172  113172  163700  729555
54088   54088   729560  95
{noformat}

Right result :
{noformat}
c0  c1  c2  c3
100 100 0   99
{noformat}

The sql script for my test 
{noformat}
drop table foobar_1 ;
create table foobar_1 ( dummyint int  , dummystr string ) ;
insert into table foobar_1 select count(*),'dummy 0'  from foobar_1 ;

drop table foobar_1M ;
create table foobar_1M ( dummyint bigint  , dummystr string ) ;

insert overwrite table foobar_1M
   select val_int  , concat('dummy ',val_int) from
 ( select (((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6 as 
val_int from foobar_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_1 as d_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_2 as d_2
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_3 as d_3
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_4 as d_4
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_5 as d_5
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_6 as d_6  ) as f ;


set hive.optimize.sampling.orderby.number=1;
set hive.optimize.sampling.orderby.percent=0.1f;
set mapreduce.job.reduces=3 ;

set hive.optimize.sampling.orderby=false;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;

set hive.optimize.sampling.orderby=true;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}

  was:
This simple query gives wrong results when I use the parallel order-by.

{noformat}
select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}

Current wrong result :

{noformat}
c0  c1  c2  c3
32740   32740   0   163695
113172  113172  163700  729555
54088   54088   729560  95
{noformat}

Right result :
{noformat}
c0  c1  c2  c3
100 100 0   99
{noformat}

The sql script for my test 
{noformat}
drop table foobar_1M ;
create table foobar_1M ( dummyint bigint  , dummystr string ) ;

insert overwrite table foobar_1M
   select val_int  , concat('dummy ',val_int) from
 ( select (((((d_1*10)+d_2)*10+d_3)*10+d_4)*10+d_5)*10+d_6 as 
val_int from foobar_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_1 as d_1
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_2 as d_2
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_3 as d_3
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_4 as d_4
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_5 as d_5
 lateral view outer explode(split("0,1,2,3,4,5,6,7,8,9",",")) 
tbl_6 as d_6  ) as f ;


set hive.optimize.sampling.orderby.number=1;
set hive.optimize.sampling.orderby.percent=0.1f;
set mapreduce.job.reduces=3 ;

set hive.optimize.sampling.orderby=false;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;

set hive.optimize.sampling.orderby=true;

select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) from 
foobar_1M ;
{noformat}


> wrong result when hive.optimize.sampling.orderby=true with some aggregate 
> functions
> ---
>
> Key: HIVE-12165
> URL: https://issues.apache.org/jira/browse/HIVE-12165
> Project: Hive
>  Issue Type: Bug
> Environment: hortonworks  2.3
>Reporter: ErwanMAS
>Priority: Critical
>
> This simple query gives wrong results when I use the parallel order-by.
> {noformat}
> select count(*) , count(distinct dummyint ) , min(dummyint),max(dummyint) 
> from foobar_1M ;
> {noformat}
> Current wrong result :
> {noformat}
> c0      c1      c2      c3
> 32740   32740   0       163695
> 113172  113172  163700  729555
> 54088   54088   729560  95
> {noformat}
> Right result:
> {noformat}
> c0      c1      c2      c3
> 100     100     0       99
> {noformat}
> The sql script for my test 
> {noformat}
> drop table foobar_1 ;
> create table foobar_1 ( dummyint int  , dummystr string ) ;
> insert into table foobar_1 select count(*),'dummy 0'  from foob

[jira] [Updated] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12161:

Description: 
Before merge, the test took 4~hrs (total time parallelized, not wall clock 
time), after the merge it's taking 12-15hrs. First such build:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/

-Session reuse patch which used to make them super fast now makes them run in 
2hrs-
-http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
 which is still a lot.- This is an invalid statement


  was:
Before merge, the test took 4~hrs (total time parallelized, not wall clock 
time), after the merge it's taking 12-15hrs. First such build:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/

-Session reuse patch which used to make them super fast now makes them run in 
2hrs 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
 which is still a lot.- This is invalid

Need to investigate why tests are slow regardless of AM reuse.


> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs-
> -http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is an invalid statement



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12161) MiniTez test is very slow since LLAP branch merge

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12161:

Description: 
Before merge, the test took 4~hrs (total time parallelized, not wall clock 
time), after the merge it's taking 12-15hrs. First such build:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/

-Session reuse patch which used to make them super fast now makes them run in 
2hrs 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
 which is still a lot.- This is invalid

Need to investigate why tests are slow regardless of AM reuse.

  was:
Before merge, the test took 4~hrs (total time parallelized, not wall clock 
time), after the merge it's taking 12-15hrs. First such build:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/

Session reuse patch which used to make them super fast now makes them run in 
2hrs 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
 which is still a lot.

Need to investigate why tests are slow regardless of AM reuse.


> MiniTez test is very slow since LLAP branch merge
> -
>
> Key: HIVE-12161
> URL: https://issues.apache.org/jira/browse/HIVE-12161
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Before merge, the test took 4~hrs (total time parallelized, not wall clock 
> time), after the merge it's taking 12-15hrs. First such build:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5622/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
> -Session reuse patch which used to make them super fast now makes them run in 
> 2hrs 
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/
>  which is still a lot.- This is invalid
> Need to investigate why tests are slow regardless of AM reuse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12156) expanding view doesn't quote reserved keyword

2015-10-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955635#comment-14955635
 ] 

Pengcheng Xiong commented on HIVE-12156:


[~busyjay], this is a current limitation of Hive. The Hive parser does NOT support 2 
or more levels of "dot". For example, 
{code}
select default.src.key from src;
{code}
will fail. This has nothing to do with reserved keywords. If you are willing to 
work on a new feature to support 2 or more levels of dot, I would be happy to 
review it. Thanks.

> expanding view doesn't quote reserved keyword
> -
>
> Key: HIVE-12156
> URL: https://issues.apache.org/jira/browse/HIVE-12156
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.1
> Environment: hadoop 2.7
> hive 1.2.1
>Reporter: Jay Lee
>
> hive> create table testreserved (data struct<`end`:string, id: string>);
> OK
> Time taken: 0.274 seconds
> hive> create view testreservedview as select data.`end` as data_end, data.id 
> as data_id from testreserved;
> OK
> Time taken: 0.769 seconds
> hive> select data.`end` from testreserved;
> OK
> Time taken: 1.852 seconds
> hive> select data_id from testreservedview;
> NoViableAltException(98@[])
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10858)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6438)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:6768)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnarySuffixExpression(HiveParser_IdentifiersParser.java:6828)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseXorExpression(HiveParser_IdentifiersParser.java:7012)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceStarExpression(HiveParser_IdentifiersParser.java:7172)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedencePlusExpression(HiveParser_IdentifiersParser.java:7332)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAmpersandExpression(HiveParser_IdentifiersParser.java:7483)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceBitwiseOrExpression(HiveParser_IdentifiersParser.java:7634)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8164)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceAndExpression(HiveParser_IdentifiersParser.java:9296)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceOrExpression(HiveParser_IdentifiersParser.java:9455)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.expression(HiveParser_IdentifiersParser.java:6105)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.expression(HiveParser.java:45840)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2907)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> ...
> FAILED: SemanticException line 1:29 cannot recognize input near 'end' 'as' 
> 'data_end' in expression specification in definition of VIEW testreservedview 
> [
> select `testreserved`.`data`.end as `data_end`, `testreserved`.`data`.id as 
> `data_id` from `test`.`testreserved`
> ] used as testreservedview at Line 1:20
> When the view is expanded, the field should be quoted with backquotes.

[jira] [Commented] (HIVE-12157) select-clause doesn't support unicode alias

2015-10-13 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955630#comment-14955630
 ] 

Pengcheng Xiong commented on HIVE-12157:


[~richarddu], this is a current limitation of Hive. Hive does NOT support 
unicode identifiers or aliases. If you are willing to work on this new feature, 
I would be happy to review it. Thanks.

> select-clause doesn't support unicode alias
> ---
>
> Key: HIVE-12157
> URL: https://issues.apache.org/jira/browse/HIVE-12157
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 1.2.1
>Reporter: richard du
>Priority: Minor
>
> Parser will throw exception when I use alias:
> hive> desc test;
> OK
> a   int 
> b   string  
> Time taken: 0.135 seconds, Fetched: 2 row(s)
> hive> select a as 行1 from test limit 10;
> NoViableAltException(302@[134:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN 
> identifier ( COMMA identifier )* RPAREN ) )?])
> at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
> at org.antlr.runtime.DFA.predict(DFA.java:116)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2915)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1373)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:1128)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:45827)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:41495)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41402)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1590)
> at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> FAILED: ParseException line 1:13 cannot recognize input near 'as' '1' 'from' 
> in selection target





[jira] [Updated] (HIVE-11565) LLAP: Some counters are incorrect

2015-10-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-11565:
--
Summary: LLAP: Some counters are incorrect  (was: LLAP: Tez counters for 
LLAP)

> LLAP: Some counters are incorrect
> -
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Attachments: HIVE-11565.1.txt
>
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 





[jira] [Commented] (HIVE-11565) LLAP: Tez counters for LLAP

2015-10-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955629#comment-14955629
 ] 

Siddharth Seth commented on HIVE-11565:
---

Re-purposing this jira to disable the incorrect Tez counters. Created 
HIVE-12163 for the rest of the fixes.

> LLAP: Tez counters for LLAP
> ---
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Attachments: HIVE-11565.1.txt
>
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 





[jira] [Updated] (HIVE-12163) LLAP: Tez counters for LLAP 2

2015-10-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12163:
--
Description: 

Some counters, such as cache hit ratio for a fragment, are not propagated.

  was:
1) Tez counters for LLAP are incorrect.
2) Some counters, such as cache hit ratio for a fragment, are not propagated.

We need to make sure that Tez counters for LLAP are usable. 


> LLAP: Tez counters for LLAP 2
> -
>
> Key: HIVE-12163
> URL: https://issues.apache.org/jira/browse/HIVE-12163
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>
> Some counters, such as cache hit ratio for a fragment, are not propagated.





[jira] [Updated] (HIVE-11565) LLAP: Tez counters for LLAP

2015-10-13 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-11565:
--
Attachment: HIVE-11565.1.txt

Patch to remove the counters which are incorrect; it also upgrades to Tez 
0.8.1-alpha.

> LLAP: Tez counters for LLAP
> ---
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
> Attachments: HIVE-11565.1.txt
>
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 





[jira] [Commented] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955566#comment-14955566
 ] 

Sergey Shelukhin commented on HIVE-12162:
-

[~sseth] [~vikram.dixit] fyi

> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown 1 min 8 sec Passed
> testCliDriver_shutdown1 min 7 sec Passed
> testCliDriver_shutdown1 min 7 sec Passed
> testCliDriver_shutdown1 min 6 sec Passed
> testCliDriver_shutdown1 min 6 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> {noformat}





[jira] [Updated] (HIVE-12162) MiniTez tests take forever to shut down

2015-10-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12162:

Description: 
Even before LLAP branch merge 
(http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
 and with AM reuse 
(http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
 there's this:
{noformat}
testCliDriver_shutdown  1 min 8 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
{noformat}

  was:
Even before LLAP branch merge and with AM reuse, there's this:
{noformat}
testCliDriver_shutdown  1 min 8 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 7 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 6 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 5 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 4 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
testCliDriver_shutdown  1 min 3 sec Passed
{noformat}


> MiniTez tests take forever to shut down
> ---
>
> Key: HIVE-12162
> URL: https://issues.apache.org/jira/browse/HIVE-12162
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Even before LLAP branch merge 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5618/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/)
>  and with AM reuse 
> (http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5628/testReport/org.apache.hadoop.hive.cli/TestMiniTezCliDriver/),
>  there's this:
> {noformat}
> testCliDriver_shutdown 1 min 8 sec Passed
> testCliDriver_shutdown1 min 7 sec Passed
> testCliDriver_shutdown1 min 7 sec Passed
> testCliDriver_shutdown1 min 6 sec Passed
> testCliDriver_shutdown1 min 6 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 5 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 4 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> testCliDriver_shutdown1 min 3 sec Passed
> {noformat}





[jira] [Commented] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe

2015-10-13 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955540#comment-14955540
 ] 

Chao Sun commented on HIVE-11785:
-

+1

> Support escaping carriage return and new line for LazySimpleSerDe
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.0.0
>
> Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, 
> HIVE-11785.patch, test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. 
> The expected result should be:
> {noformat}
> 1 newline
> here
> 2 carriage return
> 3 both
> here
> {noformat}
> {noformat}
> hive> create table repo (lvalue int, charstring string) stored as parquet;
> OK
> Time taken: 0.34 seconds
> hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
> Loading data to table default.repo
> chgrp: changing ownership of 
> 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
> belong to hive
> Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, 
> rawDataSize=0]
> OK
> Time taken: 0.732 seconds
> hive> set hive.fetch.task.conversion=more;
> hive> select * from repo;
> OK
> 1 newline
> here
> here  carriage return
> 3 both
> here
> Time taken: 0.253 seconds, Fetched: 3 row(s)
> hive> set hive.fetch.task.conversion=none;
> hive> select * from repo;
> Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441752031022_0006, Tracking URL = 
> http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
> Kill Command = 
> /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
> -kill job_1441752031022_0006
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
> 2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 980 msec
> Ended Job = job_1441752031022_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS 
> Write: 51 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 980 msec
> OK
> 1 newline
> NULL  NULL
> 2 carriage return
> NULL  NULL
> 3 both
> NULL  NULL
> Time taken: 25.131 seconds, Fetched: 6 row(s)
> hive>
> {noformat}
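
For context, a minimal sketch of the table-level escaping that LazySimpleSerDe
already supports for text tables; the patch extends the escaping mechanism to
carriage returns and newlines (the exact configuration is in the attached
patch, not shown here):
{noformat}
hive> create table repo_txt (lvalue int, charstring string)
    > row format delimited fields terminated by '\t' escaped by '\\'
    > stored as textfile;
{noformat}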





[jira] [Updated] (HIVE-12084) Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space

2015-10-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12084:
-
Attachment: (was: HIVE-12084.2.patch)

> Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java 
> heap space
> --
>
> Key: HIVE-12084
> URL: https://issues.apache.org/jira/browse/HIVE-12084
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12084.1.patch, HIVE-12084.2.patch
>
>
> STEPS TO REPRODUCE:
> {code}
> CREATE TABLE `sample_07` ( `code` string , `description` string , `total_emp` 
> int , `salary` int ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS 
> TextFile;
> load data local inpath 'sample_07.csv'  into table sample_07;
> set hive.limit.pushdown.memory.usage=0.;
> select * from sample_07 order by salary LIMIT 9;
> {code}
> This will result in 
> {code}
> Caused by: java.lang.OutOfMemoryError: Java heap space
>   at org.apache.hadoop.hive.ql.exec.TopNHash.initialize(TopNHash.java:113)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:234)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:68)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
> {code}
> The basic issue lies with the top-n optimization: it needs a cap on its 
> allocation. Ideally we would detect that the bytes to be allocated will 
> exceed the "limit.pushdown.memory.usage" budget without actually trying to 
> allocate them.





[jira] [Updated] (HIVE-12084) Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java heap space

2015-10-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12084:
-
Attachment: HIVE-12084.2.patch

> Hive queries with ORDER BY and large LIMIT fails with OutOfMemoryError Java 
> heap space
> --
>
> Key: HIVE-12084
> URL: https://issues.apache.org/jira/browse/HIVE-12084
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12084.1.patch, HIVE-12084.2.patch
>
>
> STEPS TO REPRODUCE:
> {code}
> CREATE TABLE `sample_07` ( `code` string , `description` string , `total_emp` 
> int , `salary` int ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS 
> TextFile;
> load data local inpath 'sample_07.csv'  into table sample_07;
> set hive.limit.pushdown.memory.usage=0.;
> select * from sample_07 order by salary LIMIT 9;
> {code}
> This will result in 
> {code}
> Caused by: java.lang.OutOfMemoryError: Java heap space
>   at org.apache.hadoop.hive.ql.exec.TopNHash.initialize(TopNHash.java:113)
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:234)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:68)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
> {code}
> The basic issue lies with the top-n optimization: it needs a cap on its 
> allocation. Ideally we would detect that the bytes to be allocated will 
> exceed the "limit.pushdown.memory.usage" budget without actually trying to 
> allocate them.






[jira] [Commented] (HIVE-12076) WebHCat listing jobs after the given JobId even when templeton.jobs.listorder is set to lexicographicaldesc

2015-10-13 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955511#comment-14955511
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12076:
--

Thanks for the comment. +1 for the change.

> WebHCat listing jobs after the given JobId even when templeton.jobs.listorder 
> is set to lexicographicaldesc
> ---
>
> Key: HIVE-12076
> URL: https://issues.apache.org/jira/browse/HIVE-12076
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Fix For: 0.14.0
>
> Attachments: HIVE-12076.1.patch, HIVE-12076.2.patch, 
> HIVE-12076.3.patch
>
>
> HIVE-11724 introduced a new setting to change the order of jobs listed. 
> In cases where "templeton.jobs.listorder" is set to lexicographicaldesc, 
> filtering based on jobid still returns values greater than the given job id, 
> whereas values less than it are expected. It's a code bug.
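
The intended behavior can be illustrated with a small filtering sketch (purely
illustrative; these are not WebHCat's actual classes): with a descending list
order, the filter should keep job ids lexicographically less than the given
id, not greater.
{code}
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of the intended job-id filtering, not WebHCat's code.
public final class JobListFilterSketch {
  static List<String> page(List<String> jobIds, String fromJobId, boolean descending) {
    return jobIds.stream()
        // A descending listing pages backwards, so keep ids before the cursor.
        .filter(id -> descending ? id.compareTo(fromJobId) < 0
                                 : id.compareTo(fromJobId) > 0)
        .sorted(descending ? Comparator.<String>reverseOrder()
                           : Comparator.<String>naturalOrder())
        .collect(Collectors.toList());
  }
}
{code}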





[jira] [Commented] (HIVE-11882) Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold

2015-10-13 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955508#comment-14955508
 ] 

Gopal V commented on HIVE-11882:


[~yalovyyi]: Yes, I think the patch is ready to go in.

> Fetch optimizer should stop source files traversal once it exceeds the 
> hive.fetch.task.conversion.threshold
> ---
>
> Key: HIVE-11882
> URL: https://issues.apache.org/jira/browse/HIVE-11882
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 1.0.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-11882.1.patch
>
>
> Hive 1.0's fetch optimizer tries to optimize queries of the form "select 
> <columns> from <table> where <predicate> limit <n>" to a fetch task (see the 
> hive.fetch.task.conversion property). This optimization gets the lengths of 
> all the files in the specified partition and compares the total against a 
> threshold value to determine whether it should use a fetch task or not (see 
> the hive.fetch.task.conversion.threshold property). This process of getting 
> the length of all files can be expensive. One of the main problems in this 
> optimization is that the fetch optimizer doesn't stop once it exceeds the 
> hive.fetch.task.conversion.threshold. That works fine on HDFS, but could 
> cause a significant performance degradation on other supported file systems. 
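
A minimal sketch of the proposed early exit, assuming the optimizer already
iterates over file lengths (this is not the actual patch): stop accumulating
as soon as the running total crosses the threshold instead of stat-ing every
remaining file.
{code}
// Illustrative early-exit accumulation, not Hive's actual optimizer code.
public final class FetchThresholdSketch {
  /** Returns true as soon as the running total of lengths exceeds the threshold. */
  static boolean exceedsThreshold(Iterable<Long> fileLengths, long threshold) {
    long total = 0;
    for (long len : fileLengths) {
      total += len;
      if (total > threshold) {
        return true; // early exit: no need to look at the remaining files
      }
    }
    return false;
  }
}
{code}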





[jira] [Commented] (HIVE-11923) allow qtests to run via a single client session for tez and llap

2015-10-13 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955494#comment-14955494
 ] 

Prasanth Jayachandran commented on HIVE-11923:
--

+1 for the workaround patch. 

> allow qtests to run via a single client session for tez and llap
> 
>
> Key: HIVE-11923
> URL: https://issues.apache.org/jira/browse/HIVE-11923
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-11923.03.patch, HIVE-11923.04.patch, 
> HIVE-11923.1.txt, HIVE-11923.2.branchllap.txt, HIVE-11923.2.patch, 
> HIVE-11923.2.txt, HIVE-11923.2.txt, HIVE-11923.branch-1.txt
>
>
> Launching a new session (AM and containers) for each test adds unnecessary 
> overhead. Running via a single session should reduce the run time 
> significantly.





[jira] [Commented] (HIVE-11518) Provide interface to adjust required resource for tez tasks

2015-10-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955492#comment-14955492
 ] 

Hive QA commented on HIVE-11518:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766243/HIVE-11518.2.patch.txt

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9645 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-tez_bmj_schema_evolution.q-orc_merge5.q-vectorization_limit.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorization_10.q-vector_partitioned_date_time.q-vector_non_string_partition.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5630/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5630/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5630/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766243 - PreCommit-HIVE-TRUNK-Build

> Provide interface to adjust required resource for tez tasks
> ---
>
> Key: HIVE-11518
> URL: https://issues.apache.org/jira/browse/HIVE-11518
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-11518.1.patch.txt, HIVE-11518.2.patch.txt
>
>
> Resource requirements vary from task to task, but currently they are fixed 
> to one value (via hive.tez.container.size). It would be good to customize 
> resource requirements to match the expected work.
> The suggested interface is quite simple.
> {code}
> public interface ResourceCalculator {
>   Resource adjust(Resource resource, MapWork mapWork);
>   Resource adjust(Resource resource, ReduceWork reduceWork);
> }
> {code}
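
As a usage illustration, a hypothetical implementation could scale container
memory with the number of input paths in the MapWork. MapWork, ReduceWork and
the YARN Resource factory are real types, but this class is purely a sketch
and not part of the patch:
{code}
import org.apache.hadoop.hive.ql.plan.MapWork;
import org.apache.hadoop.hive.ql.plan.ReduceWork;
import org.apache.hadoop.yarn.api.records.Resource;

// Hypothetical ResourceCalculator implementation; names are illustrative.
public class PathCountResourceCalculator implements ResourceCalculator {
  @Override
  public Resource adjust(Resource resource, MapWork mapWork) {
    int paths = mapWork.getPathToAliases().size();
    // Grant 10% more memory per extra input path, capped at 2x the base.
    double factor = Math.min(2.0, 1.0 + 0.1 * Math.max(0, paths - 1));
    return Resource.newInstance((int) (resource.getMemory() * factor),
        resource.getVirtualCores());
  }

  @Override
  public Resource adjust(Resource resource, ReduceWork reduceWork) {
    return resource; // leave reducers unchanged in this sketch
  }
}
{code}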




