[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-25 Thread Hive QA (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628270#comment-16628270 ]

Hive QA commented on HIVE-16812:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941300/HIVE-16812.05.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14999 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=194)
  [druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_acid4] (batchId=159)
org.apache.hive.streaming.TestStreaming.testAutoRollTransactionBatch (batchId=323)
org.apache.hive.streaming.TestStreaming.testNoBuckets (batchId=323)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14052/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14052/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14052/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941300 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> ------------------------------------------------------------
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, HIVE-16812.05.patch
>
>
> The constructor of {{VectorizedOrcAcidRowBatchReader}} has:
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since the base and deltas are sorted by ROW__ID. So for each
> split of the base we can find the min/max ROW__ID and only load delete events
> from the deltas that fall in the [min, max] range. This will reduce the number
> of delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make
> sure to store the PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the ORC footer has an index of ROW__IDs, so we
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.
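A minimal, standalone sketch of the proposed filtering (illustrative only, not the patch itself: a single long key stands in for the (writeId, bucket, rowId) ROW__ID triple, and the bounds play the role of the per-split min/max that {{hive.acid.key.index}} would provide):
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeleteEventFilterSketch {
  // Real ACID keys are (originalWriteId, bucket, rowId) triples compared
  // lexicographically; a single long stands in for that here.
  static List<Long> filterDeletes(List<Long> deleteKeys, long minKey, long maxKey) {
    List<Long> inRange = new ArrayList<>();
    for (long key : deleteKeys) {
      if (key >= minKey && key <= maxKey) { // only events that can hit this split
        inRange.add(key);
      }
    }
    return inRange;
  }

  public static void main(String[] args) {
    // Bounds as they would be discovered from the split's key index.
    List<Long> deletes = Arrays.asList(50L, 150L, 180L, 500L);
    System.out.println(filterDeletes(deletes, 100L, 199L)); // prints [150, 180]
  }
}
{code}
Only the delete events that can possibly affect rows of the split survive, which is exactly the memory reduction the description asks for.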



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20632:

Attachment: HIVE-20632.01.patch

> Query with get_splits UDF fails if materialized view is created on queried table.
> ---------------------------------------------------------------------------------
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run a get_splits query: "select get_splits(select a from t1 where a > 5);" –
> this fails with AssertionError.
> {code:java}
> java.lang.AssertionError
> at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> ...
> {code}
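As a concrete illustration of the scenario, a minimal JDBC repro might look like the sketch below (hedged: the connection URL, the sample data, and the two-argument form of get_splits are illustrative assumptions, not taken from the report):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GetSplitsMvRepro {
  public static void main(String[] args) throws Exception {
    // Assumes a reachable HiveServer2; the URL is a placeholder.
    try (Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement st = con.createStatement()) {
      st.execute("create table t1 (a int) stored as orc"
          + " tblproperties ('transactional'='true')");
      st.execute("insert into t1 values (1), (6), (9)");
      st.execute("create materialized view mv as select a from t1 where a > 5");
      // Compiling the inner query triggers materialized view rewriting, which
      // per the stack trace above ends in DbTxnManager.getValidWriteIds.
      st.executeQuery("select get_splits(\"select a from t1 where a > 5\", 5)");
    }
  }
}
{code}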

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20632:

Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20632:

Attachment: (was: HIVE-20632.01.patch)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fail with LockException and cause memory leak.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20627:

Status: Patch Available  (was: Open)

> Concurrent async queries intermittently fail with LockException and cause memory leak.
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> When multiple async queries are executed from the same session, the async
> query execution DAGs share the same Hive object, which is set by the caller
> for all threads. When loading dynamic partitions, a MoveTask is created that
> re-creates the Hive object and closes the shared one, causing metastore
> connection failures for the other async execution threads that still access
> it. The same is seen when ReplDumpTask and ReplLoadTask are part of the DAG.
> *Call Stack:*
> {code:java}
> 2018-09-16T04:38:04,280 ERROR [load-dynamic-partitions-7]: metadata.Hive (Hive.java:call(2436)) - Exception when loading partition with parameters partPath=hdfs://mycluster/warehouse/tablespace/managed/hive/tbl_3bcvvdubni/.hive-staging_hive_2018-09-16_04-35-50_708_7776079613819042057-1147/-ext-1/age=55, table=tbl_3bcvvdubni, partSpec={age=55}, loadFileType=KEEP_EXISTING, listBucketingLevel=0, isAcid=true, hasFollowingStatsTask=true
> org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore
> at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.io.AcidUtils.getTableValidWriteIdListWithTxnList(AcidUtils.java:1791) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1756) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1714) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1976) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2415) [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2406) [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'validTxnList' is unset! Struct:GetValidWriteIdsRequest(fullTableNames:[default.tbl_3bcvvdubni], validTxnList:null)
> at org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.validate(GetValidWriteIdsRequest.java:396) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.validate(ThriftHiveMetastore.java) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.write(ThriftHiveMetastore.java) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_valid_write_ids(ThriftHiveMetastore.java:5443) ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_valid_write_ids(ThriftHiveMetastore.java:5435)
> ...
> {code}
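The description boils down to an object-lifetime bug: one thread closes a handle that other threads still hold. A tiny standalone model of that hazard (hedged: CloseableHandle merely stands in for the shared Hive object, and the sleep forces the unlucky interleaving):
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedHandleHazard {
  static class CloseableHandle {
    private volatile boolean open = true;
    void close() { open = false; } // drops the metastore connection
    void use() {
      if (!open) {
        throw new IllegalStateException("connection already closed");
      }
    }
  }

  public static void main(String[] args) throws Exception {
    CloseableHandle shared = new CloseableHandle(); // one handle set for all threads
    ExecutorService pool = Executors.newFixedThreadPool(2);
    // The MoveTask path: re-create "its" handle and close the old shared one.
    pool.submit(shared::close);
    Thread.sleep(100); // force the interleaving for the demo
    // Another async execution thread still uses the shared handle and fails.
    pool.submit(shared::use).get(); // throws ExecutionException(IllegalStateException)
    pool.shutdown();
  }
}
{code}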

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20632:

Status: Open  (was: Patch Available)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fail with LockException and cause memory leak.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20627:

Attachment: (was: HIVE-20627.01.patch)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fail with LockException and cause memory leak.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20627:

Attachment: HIVE-20627.01.patch


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fail with LockException and cause memory leak.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20627:

Status: Open  (was: Patch Available)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-25 Thread Hive QA (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628258#comment-16628258 ]

Hive QA commented on HIVE-16812:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 45s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 43s{color} | {color:red} ql: The patch generated 26 new + 920 unchanged - 10 fixed = 946 total (was 930) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 57s{color} | {color:red} ql generated 1 new + 2324 unchanged - 2 fixed = 2325 total (was 2326) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  7s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of keyIndex, which is known to be non-null in org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.findMinMaxKeys(OrcSplit, Configuration, Reader$Options): Redundant null check at VectorizedOrcAcidRowBatchReader.java:[line 394] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14052/dev-support/hive-personality.sh |
| git revision | master / a036e52 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14052/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-14052/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-14052/yetus/patch-asflicense-problems.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14052/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-25 Thread Hive QA (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628248#comment-16628248 ]

Hive QA commented on HIVE-20540:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941298/HIVE-20540.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14998 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14051/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14051/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14051/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941298 - PreCommit-HIVE-Build

> Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II
> ---------------------------------------------------------------------------------------------
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, HIVE-20540.3.patch
>
>
> Follow-up to HIVE-20510 with the remaining issues:
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, set up the VectorExpression in the correct place;
> it may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before it
> is processed; use a flag to achieve this (see the sketch below).
> cc [~gopalv]
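A minimal sketch of the guard-flag idea in item 3, with illustrative names rather than the committed BucketNumExpression code: the holder refuses to overwrite a pending value until a consumer marks it processed.
{code:java}
public class BucketNumGuardSketch {
  private long bucketNum;
  private boolean processed = true; // nothing pending initially

  void setBucketNum(long value) {
    if (!processed) {
      // Guard: the previous value has not been consumed yet.
      throw new IllegalStateException("previous bucket number not yet processed");
    }
    bucketNum = value;
    processed = false; // a new value is now pending
  }

  long consumeBucketNum() {
    processed = true; // safe to overwrite again
    return bucketNum;
  }

  public static void main(String[] args) {
    BucketNumGuardSketch guard = new BucketNumGuardSketch();
    guard.setBucketNum(3);
    System.out.println(guard.consumeBucketNum()); // 3
    guard.setBucketNum(7); // fine: the previous value was consumed first
  }
}
{code}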





[jira] [Updated] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mahesh kumar behera updated HIVE-20629:
---
Status: Patch Available  (was: Open)

> Hive incremental replication fails with events missing error if database is kept idle for more than an hour
> ------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-20629
> URL: https://issues.apache.org/jira/browse/HIVE-20629
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20629.01.patch, HIVE-20629.02.patch
>
>
> Start a source cluster with 2 databases. Replicate the databases to the target
> after doing some operations. Keep taking incremental dumps of both databases
> and keep replicating them to the target cluster. Keep one of the databases
> idle for more than 24 hrs. After 24 hrs, the incremental dump of the idle
> database fails with an events-missing error.
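A hedged sketch of the underlying mechanics (assumption, flagged here and in the comments: the metastore cleans up notification events older than a TTL, so an idle database's replication bookmark can fall behind the purge horizon):
{code:java}
public class EventsMissingSketch {
  // Models the check an incremental dump implicitly performs: the events it
  // wants to resume from must still be retained by the metastore.
  static void checkDumpRange(long lastReplicatedEventId, long oldestRetainedEventId) {
    if (lastReplicatedEventId + 1 < oldestRetainedEventId) {
      throw new IllegalStateException("Notification events "
          + (lastReplicatedEventId + 1) + ".." + (oldestRetainedEventId - 1)
          + " were cleaned up; incremental dump cannot proceed");
    }
  }

  public static void main(String[] args) {
    checkDumpRange(42, 40);   // active database: nothing purged past the bookmark
    checkDumpRange(42, 5000); // idle database: fails with an events-missing error
  }
}
{code}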





[jira] [Updated] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mahesh kumar behera updated HIVE-20629:
---
Attachment: HIVE-20629.02.patch






[jira] [Updated] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mahesh kumar behera updated HIVE-20629:
---
Status: Open  (was: Patch Available)






[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-25 Thread Hive QA (JIRA)


[ https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628217#comment-16628217 ]

Hive QA commented on HIVE-20540:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 49s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 39s{color} | {color:red} ql: The patch generated 1 new + 378 unchanged - 2 fixed = 379 total (was 380) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 52s{color} | {color:green} ql generated 0 new + 2325 unchanged - 1 fixed = 2325 total (was 2326) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  5s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14051/dev-support/hive-personality.sh |
| git revision | master / a036e52 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14051/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14051/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Follow-up to HIVE-20510 with the remaining issues:
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, use the correct place to set up the 
> VectorExpression; it may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this (see the sketch below).
> cc [~gopalv]
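
For item 3, a hedged sketch of the flag idea (illustrative only, not the
committed BucketNumExpression code):

{code}
// Refuse to overwrite the bucket number until the previous value has been
// consumed downstream.
final class BucketNumGuardSketch {
  private long bucketNum;
  private boolean consumed = true;

  void set(long value) {
    if (!consumed) {
      throw new IllegalStateException("previous bucket number not yet processed");
    }
    bucketNum = value;
    consumed = false;
  }

  long take() {
    consumed = true;
    return bucketNum;
  }
}
{code}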



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20631) Hive returns 20011 error code for re-triable error

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20631:
---
Attachment: HIVE-20631.02.patch

> Hive returns 20011 error code for re-triable error
> --
>
> Key: HIVE-20631
> URL: https://issues.apache.org/jira/browse/HIVE-20631
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20631.01.patch, HIVE-20631.02.patch
>
>
> In case of a network issue, repl load returns a non-retriable error code. 
> The scenario is: 
> 1. While copying a file, repl load finds that the source is not reachable and 
> goes for a copy retry.
> 2. While retrying, getting the file checksum fails due to the network issue, 
> so it is assumed that the source file is not present; in the next retry the 
> copy is attempted from the cm path.
> 3. By the next retry the network has recovered, but no file is found in the 
> cm path. This causes a non-retriable error to be returned (see the sketch 
> below).
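
For illustration, here is a minimal sketch of the retry shape described above
(all names are hypothetical, not Hive's actual repl code). The key point is
that a transient failure while probing the source, such as a checksum call
hitting a network error, should be retried rather than treated as proof that
the source file was deleted:

{code}
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical sketch; 'Fs' stands in for the real file system client.
final class ReplCopyRetrySketch {
  interface Fs {
    boolean exists(Path p) throws IOException;
    void copy(Path from, Path to) throws IOException;
  }

  static void copyWithRetry(Fs fs, Path src, Path cm, Path dst, int maxRetries)
      throws IOException, InterruptedException {
    IOException last = null;
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      try {
        if (fs.exists(src)) {
          fs.copy(src, dst);
        } else {
          fs.copy(cm, dst);        // source really gone: fall back to the cm copy
        }
        return;
      } catch (IOException transientFailure) {
        last = transientFailure;   // covers checksum/network hiccups too
        Thread.sleep(1000L * (attempt + 1));
      }
    }
    throw last;                    // should surface as a retriable error code
  }
}
{code}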



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20631) Hive returns 20011 error code for re-triable error

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20631:
---
Status: Patch Available  (was: Open)

> Hive returns 20011 error code for re-triable error
> --
>
> Key: HIVE-20631
> URL: https://issues.apache.org/jira/browse/HIVE-20631
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20631.01.patch, HIVE-20631.02.patch
>
>
> In case of a network issue, repl load returns a non-retriable error code. 
> The scenario is: 
> 1. While copying a file, repl load finds that the source is not reachable and 
> goes for a copy retry.
> 2. While retrying, getting the file checksum fails due to the network issue, 
> so it is assumed that the source file is not present; in the next retry the 
> copy is attempted from the cm path.
> 3. By the next retry the network has recovered, but no file is found in the 
> cm path. This causes a non-retriable error to be returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20631) Hive returns 20011 error code for re-triable error

2018-09-25 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20631:
---
Status: Open  (was: Patch Available)

> Hive returns 20011 error code for re-triable error
> --
>
> Key: HIVE-20631
> URL: https://issues.apache.org/jira/browse/HIVE-20631
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20631.01.patch, HIVE-20631.02.patch
>
>
> In case of a network issue, repl load returns a non-retriable error code. 
> The scenario is: 
> 1. While copying a file, repl load finds that the source is not reachable and 
> goes for a copy retry.
> 2. While retrying, getting the file checksum fails due to the network issue, 
> so it is assumed that the source file is not present; in the next retry the 
> copy is attempted from the cm path.
> 3. By the next retry the network has recovered, but no file is found in the 
> cm path. This causes a non-retriable error to be returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628202#comment-16628202
 ] 

Hive QA commented on HIVE-20535:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941279/HIVE-20535.16.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14998 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14050/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14050/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14050/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941279 - PreCommit-HIVE-Build

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.2.patch, HIVE-20535.3.patch, HIVE-20535.4.patch, 
> HIVE-20535.5.patch, HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be good to provide a pool size for concurrent compilation, so that 
> the administrator can limit the load.
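
As a rough illustration of the pool idea, a bounded "compile lock" can be
expressed with a fair Semaphore, with the pool size coming from a new
configuration knob. This is a sketch of the concept, not the patch itself:

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Sketch: at most 'size' sessions compile concurrently; size = 1 restores
// the old global compile lock behavior.
final class CompileLockPoolSketch {
  private final Semaphore permits;

  CompileLockPoolSketch(int size) {
    this.permits = new Semaphore(size, /* fair = */ true);
  }

  <T> T compile(Callable<T> compileTask) throws Exception {
    permits.acquire();
    try {
      return compileTask.call();
    } finally {
      permits.release();
    }
  }
}
{code}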



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20536) Add Surrogate Keys function to Hive

2018-09-25 Thread Andrew Sears (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628196#comment-16628196
 ] 

Andrew Sears commented on HIVE-20536:
-

Is this bringing some of the functionality of IDENTITY / Sequence / 
Auto_increment to Hive? Sounds useful!

Surrogate key functionality could be documented in the wiki here:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

> Add Surrogate Keys function to Hive
> ---
>
> Key: HIVE-20536
> URL: https://issues.apache.org/jira/browse/HIVE-20536
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20536.01.patch, HIVE-20536.02.patch, 
> HIVE-20536.03.patch, HIVE-20536.04.patch, HIVE-20536.05.patch, 
> HIVE-20536.06.patch, HIVE-20536.07.patch
>
>
> Surrogate keys are the ability to generate and use a unique integer for each 
> row in a table. If we have that ability, then in conjunction with the default 
> clause we get surrogate key functionality. Consider the following DDL:
> create table t1 (a string, b bigint default unique_long());
> We already have the default clause, wherein you can specify a function to 
> provide values. So what we need is a UDF that can generate unique longs for 
> each row, across queries, for a table.
> The idea is to use write_id. This is a column in the metastore table 
> TXN_COMPONENTS whose value is determined at compile time to be used during 
> query execution. Each query execution generates a new write_id, so we can 
> seed the UDF with this value during compilation.
> Then we statically allocate ranges for each task, from which it can draw the 
> next long. Say we divvy up the 64 bits such that 24 bits keep the original 
> txn usage of write_id, 16 bits identify task attempts, and the remaining 24 
> bits generate a new long for each row. This allows about 17M txns, 65K tasks, 
> and 17M rows per task; if any of those limits is hit, we can fail the query.
> Implementation-wise: serialize the write_id in the UDF's initialize(). Then 
> during execute() we find out which task attempt the current task is, and use 
> it along with the write_id to get the starting long, handing out a new value 
> on each invocation of execute().
> Here we assume the write_id can be determined at compile time, which should 
> be the case, but we need to figure out how to get a handle to it.
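
A minimal sketch of the proposed bit allocation (the class and field names,
and even the ordering of the three bit ranges, are illustrative assumptions,
not the actual Hive implementation):

{code}
// 24 bits of write_id + 16 bits of task attempt + 24 bits of per-task row
// counter = one unique 64-bit surrogate key per row.
final class SurrogateKeySketch {
  private static final int TXN_BITS = 24, TASK_BITS = 16, ROW_BITS = 24;
  private final long writeId;      // seeded at compile time
  private final long taskAttempt;  // discovered during execute()
  private long rowCounter;

  SurrogateKeySketch(long writeId, long taskAttempt) {
    if (writeId >= (1L << TXN_BITS) || taskAttempt >= (1L << TASK_BITS)) {
      throw new IllegalArgumentException("txn or task attempt limit exceeded");
    }
    this.writeId = writeId;
    this.taskAttempt = taskAttempt;
  }

  long next() {
    if (rowCounter >= (1L << ROW_BITS)) {
      throw new IllegalStateException("per-task row limit (~17M) exceeded");
    }
    return (writeId << (TASK_BITS + ROW_BITS)) | (taskAttempt << ROW_BITS) | rowCounter++;
  }
}
{code}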



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2018-09-25 Thread Andrew Sears (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628185#comment-16628185
 ] 

Andrew Sears commented on HIVE-11394:
-

Updated the Language Manual with basic syntax from JIRA notes.

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC2.2
> Fix For: 2.3.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch, HIVE-11394.094.patch, HIVE-11394.095.patch, 
> HIVE-11394.096.patch, HIVE-11394.097.patch, HIVE-11394.098.patch, 
> HIVE-11394.099.patch, HIVE-11394.0991.patch, HIVE-11394.0992.patch
>
>
> Add detail to the EXPLAIN output showing why Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows very detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> The optional clause defaults are not ONLY and SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say there was at 
> least one vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: 

[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628180#comment-16628180
 ] 

Hive QA commented on HIVE-20535:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
45s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 3 new + 142 unchanged - 6 
fixed = 145 total (was 148) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14050/dev-support/hive-personality.sh
 |
| git revision | master / a036e52 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14050/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14050/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.2.patch, HIVE-20535.3.patch, HIVE-20535.4.patch, 
> HIVE-20535.5.patch, HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be good to provide a pool size for concurrent compilation, so that 
> the administrator can limit the load.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12075) add analyze command to explictly cache file metadata in HBase metastore

2018-09-25 Thread Andrew Sears (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628174#comment-16628174
 ] 

Andrew Sears commented on HIVE-12075:
-

[~vitalii] updated the wiki for clarification on the feature removal in Hive 
3.0.

> add analyze command to explictly cache file metadata in HBase metastore
> ---
>
> Key: HIVE-12075
> URL: https://issues.apache.org/jira/browse/HIVE-12075
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12075.01.nogen.patch, HIVE-12075.01.patch, 
> HIVE-12075.02.patch, HIVE-12075.03.patch, HIVE-12075.04.patch, 
> HIVE-12075.nogen.patch, HIVE-12075.patch
>
>
> ANALYZE TABLE (spec as usual) CACHE METADATA



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628149#comment-16628149
 ] 

Hive QA commented on HIVE-20523:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941276/HIVE-20523.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 14999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[array_table_stats] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_analyze] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_join] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_no_row_serde] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization]
 (batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_numeric_overflows]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
 (batchId=71)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=188)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_join] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=130)
org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe 
(batchId=287)
org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsBinary.testOpenConnectionMetrics
 (batchId=256)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14049/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14049/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14049/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941276 - PreCommit-HIVE-Build

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate 
> when columns are complex data structures, like arrays.
> Having tables with an underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.
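
For illustration, a sketch of the kind of recursive estimate described (an
assumed shape only; the actual patch follows the ORC Writer implementation):

{code}
import java.util.List;
import java.util.Map;

// Sketch: descend into complex types instead of counting every row as 1.
final class RawSizeEstimatorSketch {
  static long estimate(Object value) {
    if (value == null) {
      return 1;                                  // null marker
    } else if (value instanceof CharSequence) {
      return ((CharSequence) value).length();    // variable-width text
    } else if (value instanceof List) {
      long size = 0;
      for (Object element : (List<?>) value) {
        size += estimate(element);               // arrays count per element
      }
      return size;
    } else if (value instanceof Map) {
      long size = 0;
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        size += estimate(e.getKey()) + estimate(e.getValue());
      }
      return size;
    }
    return 8;                                    // primitives: fixed width
  }
}
{code}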



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20604) Minor compaction disables ORC column stats

2018-09-25 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628125#comment-16628125
 ] 

Prasanth Jayachandran commented on HIVE-20604:
--

+1

> Minor compaction disables ORC column stats
> --
>
> Key: HIVE-20604
> URL: https://issues.apache.org/jira/browse/HIVE-20604
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20604.01.patch
>
>
> {noformat}
>   @Override
>   public org.apache.hadoop.hive.ql.exec.FileSinkOperator.RecordWriter
> getRawRecordWriter(Path path, Options options) throws IOException {
> final Path filename = AcidUtils.createFilename(path, options);
> final OrcFile.WriterOptions opts =
> OrcFile.writerOptions(options.getTableProperties(), 
> options.getConfiguration());
> if (!options.isWritingBase()) {
>   opts.bufferSize(OrcRecordUpdater.DELTA_BUFFER_SIZE)
>   .stripeSize(OrcRecordUpdater.DELTA_STRIPE_SIZE)
>   .blockPadding(false)
>   .compress(CompressionKind.NONE)
>   .rowIndexStride(0)
>   ;
> }
> {noformat}
> {{rowIndexStride(0)}} makes {{StripeStatistics.getColumnStatistics()}} return 
> objects, but with meaningless values, like the min/max of 
> {{IntegerColumnStatistics}} set to MIN_LONG/MAX_LONG.
> This interferes with the ability to infer the min ROW_ID for a split, and 
> also creates inefficient files.
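
For illustration, a reader could guard against these placeholder statistics
roughly as follows (the ORC types are real; whether Hive guards exactly this
way is an assumption):

{code}
import org.apache.orc.IntegerColumnStatistics;

// Sketch: MIN_LONG/MAX_LONG bounds are what rowIndexStride(0) effectively
// produces, so treat them as "no usable stats".
final class OrcStatsGuardSketch {
  static boolean hasUsableBounds(IntegerColumnStatistics stats) {
    return stats.getNumberOfValues() > 0
        && !(stats.getMinimum() == Long.MIN_VALUE
             && stats.getMaximum() == Long.MAX_VALUE);
  }
}
{code}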



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-25 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17684:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~mi...@cloudera.com] for the contribution!

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch, HIVE-17684.11.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use 
> of the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following logic to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
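
The check described above boils down to a heap-usage fraction like the
following (these are standard JDK APIs; the threshold comparison is sketched
from the configs quoted above):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Note: getUsed() counts unreclaimed garbage too, which is exactly the
// inaccuracy described above.
final class HeapFractionSketch {
  static double heapUsedFraction() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    return (double) heap.getUsed() / heap.getMax();
  }

  public static void main(String[] args) {
    // e.g. compare against hive.mapjoin.localtask.max.memory.usage (0.90)
    System.out.printf("heap used fraction: %.2f%n", heapUsedFraction());
  }
}
{code}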



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628108#comment-16628108
 ] 

Hive QA commented on HIVE-20523:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
52s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 1 new + 5 unchanged - 1 fixed 
= 6 total (was 6) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14049/dev-support/hive-personality.sh
 |
| git revision | master / 4137c21 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14049/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14049/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate 
> when columns are complex data structures, like arrays.
> Having tables with an underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-25 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20636:
---
Attachment: HIVE-20636.patch

> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20636.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-25 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-20636 started by Jesus Camacho Rodriguez.
--
> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-25 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-20636:
--


> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-25 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20636:
---
Status: Patch Available  (was: In Progress)

> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20615) CachedStore: Background refresh thread bug fixes

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628087#comment-16628087
 ] 

Hive QA commented on HIVE-20615:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941270/HIVE-20615.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 14974 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file 
(likely timed out) (batchId=228)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14048/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14048/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14048/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941270 - PreCommit-HIVE-Build

> CachedStore: Background refresh thread bug fixes
> 
>
> Key: HIVE-20615
> URL: https://issues.apache.org/jira/browse/HIVE-20615
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-20615.1.patch, HIVE-20615.1.patch, 
> HIVE-20615.1.patch
>
>
> Regression introduced in HIVE-18264. Fixes background thread starting and 
> refreshing of the table cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17231) ColumnizedDeleteEventRegistry.DeleteReaderValue optimization

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17231:
--
Description: 
 For unbucketed tables DeleteReaderValue will currently return all delete 
events.  Once we trust that
 the N in bucketN for a "base" split is reliable, all delete events not 
matching N can be skipped.

This is useful to protect against extreme cases where someone runs an 
update/delete on a partition that matches 10 billion rows, thus generating 
very many delete events.

Since HIVE-19890, the bucketid/writerid in an acid data file's name must match 
the bucketid/writerid in the ROW__ID in the data.

  was:
 For unbucketed tables DeleteReaderValue will currently return all delete 
events.  Once we trust that
 the N in bucketN for a "base" split is reliable, all delete events not 
matching N can be skipped.

This is useful to protect against extreme cases where someone runs an 
update/delete on a partition that matches 10 billion rows, thus generating 
very many delete events.



> ColumnizedDeleteEventRegistry.DeleteReaderValue optimization
> 
>
> Key: HIVE-17231
> URL: https://issues.apache.org/jira/browse/HIVE-17231
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Priority: Major
>
>  For unbucketed tables DeleteReaderValue will currently return all delete 
> events.  Once we trust that
>  the N in bucketN for a "base" split is reliable, all delete events not 
> matching N can be skipped.
> This is useful to protect against extreme cases where someone runs an 
> update/delete on a partition that matches 10 billion rows, thus generating 
> very many delete events.
> Since HIVE-19890, the bucketid/writerid in an acid data file's name must 
> match the bucketid/writerid in the ROW__ID in the data.
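
A hedged sketch of the proposed skip (the event shape and helper are
hypothetical; Hive's real delete events carry the writer id inside the
ROW__ID bucket property, decoded via BucketCodec):

{code}
import java.util.ArrayList;
import java.util.List;

final class DeleteEventFilterSketch {
  // Hypothetical event shape for illustration.
  static final class DeleteEvent {
    final int writerId;
    DeleteEvent(int writerId) { this.writerId = writerId; }
  }

  // Keep only the events that can apply to a "base" split named bucketN.
  static List<DeleteEvent> forSplit(List<DeleteEvent> all, int bucketN) {
    List<DeleteEvent> kept = new ArrayList<>();
    for (DeleteEvent e : all) {
      if (e.writerId == bucketN) {
        kept.add(e);
      }
    }
    return kept;
  }
}
{code}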



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20604) Minor compaction disables ORC column stats

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20604:
--
Target Version/s: 4.0.0
  Status: Patch Available  (was: Open)

[~prasanth_j] could you review, please?

> Minor compaction disables ORC column stats
> --
>
> Key: HIVE-20604
> URL: https://issues.apache.org/jira/browse/HIVE-20604
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20604.01.patch
>
>
> {noformat}
>   @Override
>   public org.apache.hadoop.hive.ql.exec.FileSinkOperator.RecordWriter
> getRawRecordWriter(Path path, Options options) throws IOException {
> final Path filename = AcidUtils.createFilename(path, options);
> final OrcFile.WriterOptions opts =
> OrcFile.writerOptions(options.getTableProperties(), 
> options.getConfiguration());
> if (!options.isWritingBase()) {
>   opts.bufferSize(OrcRecordUpdater.DELTA_BUFFER_SIZE)
>   .stripeSize(OrcRecordUpdater.DELTA_STRIPE_SIZE)
>   .blockPadding(false)
>   .compress(CompressionKind.NONE)
>   .rowIndexStride(0)
>   ;
> }
> {noformat}
> {{rowIndexStride(0)}} makes {{StripeStatistics.getColumnStatistics()}} return 
> objects, but with meaningless values, like the min/max of 
> {{IntegerColumnStatistics}} set to MIN_LONG/MAX_LONG.
> This interferes with the ability to infer the min ROW_ID for a split, and 
> also creates inefficient files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20604) Minor compaction disables ORC column stats

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20604:
--
Attachment: HIVE-20604.01.patch

> Minor compaction disables ORC column stats
> --
>
> Key: HIVE-20604
> URL: https://issues.apache.org/jira/browse/HIVE-20604
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20604.01.patch
>
>
> {noformat}
>   @Override
>   public org.apache.hadoop.hive.ql.exec.FileSinkOperator.RecordWriter
> getRawRecordWriter(Path path, Options options) throws IOException {
> final Path filename = AcidUtils.createFilename(path, options);
> final OrcFile.WriterOptions opts =
> OrcFile.writerOptions(options.getTableProperties(), 
> options.getConfiguration());
> if (!options.isWritingBase()) {
>   opts.bufferSize(OrcRecordUpdater.DELTA_BUFFER_SIZE)
>   .stripeSize(OrcRecordUpdater.DELTA_STRIPE_SIZE)
>   .blockPadding(false)
>   .compress(CompressionKind.NONE)
>   .rowIndexStride(0)
>   ;
> }
> {noformat}
> {{rowIndexStride(0)}} makes {{StripeStatistics.getColumnStatistics()}} return 
> objects, but with meaningless values, like the min/max of 
> {{IntegerColumnStatistics}} set to MIN_LONG/MAX_LONG.
> This interferes with the ability to infer the min ROW_ID for a split, and 
> also creates inefficient files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20556) Expose an API to retrieve the TBL_ID from TBLS in the metastore tables

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628071#comment-16628071
 ] 

Eugene Koifman commented on HIVE-20556:
---

[~jmarhuen] could you create an RB for this?
{{optional i64 id}} - would it be better to make it required and give it a 
default value, like -1? It seems odd that this is optional.

I see {{tbl.unsetId()}} in a number of places. Could you explain what it's for?

If this is a read-only field, should MTable.setId(long id) be public?
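
For context on the optional-versus-required question: Thrift optional fields
carry a presence bit, so callers can distinguish "unset" from any sentinel
like -1. A stand-in sketch of the generated semantics (an assumed shape; the
real accessors come from the Thrift compiler):

{code}
// Thrift-style optional field semantics, sketched with a stand-in class.
final class OptionalIdSketch {
  private long id;
  private boolean idSet;          // Thrift tracks this bit for optionals

  long getId() { return id; }
  boolean isSetId() { return idSet; }
  void setId(long id) { this.id = id; this.idSet = true; }
  void unsetId() { this.id = 0; this.idSet = false; }
}
{code}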



> Expose an API to retrieve the TBL_ID from TBLS in the metastore tables
> --
>
> Key: HIVE-20556
> URL: https://issues.apache.org/jira/browse/HIVE-20556
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Standalone Metastore
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20556.1.patch, HIVE-20556.10.patch, 
> HIVE-20556.11.patch, HIVE-20556.12.patch, HIVE-20556.13.patch, 
> HIVE-20556.14.patch, HIVE-20556.15.patch, HIVE-20556.2.patch, 
> HIVE-20556.3.patch, HIVE-20556.4.patch, HIVE-20556.5.patch, 
> HIVE-20556.6.patch, HIVE-20556.7.patch, HIVE-20556.8.patch, HIVE-20556.9.patch
>
>
> We have two options to do this:
> 1) Use the current MTable and add a field for this value.
> 2) Add an independent API call to the metastore that would return the TBL_ID.
> Option 1 is preferable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19302:
-
Attachment: HIVE-19302.6.patch
Status: Patch Available  (was: Open)

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal for a user to 
> fat-finger a table name.  Yet the volume and severity of the logging suggest 
> to the Hive administrator that there was a major issue.  Please change the 
> logging to INFO level, and do not print a stack trace, for such a trivial 
> error.
>  
> See the attached file for a sample of what a single "table not found" query 
> generates in the logs.
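
A sketch of the requested behavior (illustrative, not the committed patch):
log one INFO line with no stack trace for the routine user error, and keep
ERROR plus the stack trace for genuine failures:

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class TableNotFoundLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(TableNotFoundLoggingSketch.class);

  static void logCompileFailure(Exception e, boolean tableNotFound) {
    if (tableNotFound) {
      LOG.info("Table not found: {}", e.getMessage());   // no stack trace
    } else {
      LOG.error("Compilation failed", e);                // full stack trace
    }
  }
}
{code}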



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19302:
-
Status: Open  (was: Patch Available)

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal for a user to 
> fat-finger a table name.  Yet the volume and severity of the logging suggest 
> to the Hive administrator that there was a major issue.  Please change the 
> logging to INFO level, and do not print a stack trace, for such a trivial 
> error.
>  
> See the attached file for a sample of what a single "table not found" query 
> generates in the logs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20615) CachedStore: Background refresh thread bug fixes

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628058#comment-16628058
 ] 

Hive QA commented on HIVE-20615:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
13s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
12s{color} | {color:red} metastore-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14048/dev-support/hive-personality.sh
 |
| git revision | master / 4137c21 |
| Default Java | 1.8.0_111 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14048/yetus/branch-findbugs-standalone-metastore_metastore-server.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14048/yetus/whitespace-eol.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14048/yetus/patch-findbugs-standalone-metastore_metastore-server.txt
 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14048/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> CachedStore: Background refresh thread bug fixes
> 
>
> Key: HIVE-20615
> URL: https://issues.apache.org/jira/browse/HIVE-20615
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-20615.1.patch, HIVE-20615.1.patch, 
> HIVE-20615.1.patch
>
>
> Regression introduced in HIVE-18264. Fixes background thread starting and 
> refreshing of the table cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628044#comment-16628044
 ] 

Hive QA commented on HIVE-19302:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941261/HIVE-19302.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 14998 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSessionSparkSessionTimeout
 (batchId=246)
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSparkSessionTimeout
 (batchId=246)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14047/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14047/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14047/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941261 - PreCommit-HIVE-Build

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal for a user to 
> fat-finger a table name.  Yet the volume and severity of the logging suggest 
> to the Hive administrator that there was a major issue.  Please change the 
> logging to INFO level, and do not print a stack trace, for such a trivial 
> error.
>  
> See the attached file for a sample of what a single "table not found" query 
> generates in the logs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HiveServer2 By Default

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Attachment: HIVE-20619.1.patch
Status: Patch Available  (was: Open)

> Include MultiDelimitSerDe in HiveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.1.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1), but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16953) OrcRawRecordMerger.discoverOriginalKeyBounds issue if both split start and end are in the same stripe

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628034#comment-16628034
 ] 

Eugene Koifman commented on HIVE-16953:
---

A better long-term fix is to make VectorizedOrcAcidRowBatchReader serve both 
vectorized and non-vectorized reads, and to add a thin wrapper on top of it 
for non-vectorized reads that turns VRBs into rows.
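
As a rough, self-contained illustration of that thin-wrapper idea (generic types stand in for VectorizedRowBatch and a row type; none of the names below are the actual Hive API):

{code:java}
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Flattens an iterator of batches into an iterator of rows.
public class BatchToRowIterator<R> implements Iterator<R> {
  private final Iterator<List<R>> batches;
  private Iterator<R> current = Collections.emptyIterator();

  public BatchToRowIterator(Iterator<List<R>> batches) {
    this.batches = batches;
  }

  @Override
  public boolean hasNext() {
    // Advance to the next non-empty batch, if any.
    while (!current.hasNext() && batches.hasNext()) {
      current = batches.next().iterator();
    }
    return current.hasNext();
  }

  @Override
  public R next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }
}
{code}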

> OrcRawRecordMerger.discoverOriginalKeyBounds issue if both split start and 
> end are in the same stripe
> -
>
> Key: HIVE-16953
> URL: https://issues.apache.org/jira/browse/HIVE-16953
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Priority: Major
>
> if getOffset() and getMaxOffset() are inside
> * the same stripe - in this case we have minKey & isTail=false but 
> rowLength is never set.
> Not sure whether we can ever have a split like that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18778) Needs to capture input/output entities in explain

2018-09-25 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-18778:
--
Attachment: HIVE-18778.10.branch-3.patch

> Needs to capture input/output entities in explain
> -
>
> Key: HIVE-18778
> URL: https://issues.apache.org/jira/browse/HIVE-18778
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-18778-SparkPositive.patch, HIVE-18778.1.patch, 
> HIVE-18778.10.branch-3.patch, HIVE-18778.2.patch, HIVE-18778.3.patch, 
> HIVE-18778.4.patch, HIVE-18778.5.patch, HIVE-18778.6.patch, 
> HIVE-18778.7.patch, HIVE-18778.8.patch, HIVE-18778.9.branch-3.patch, 
> HIVE-18778.9.patch, HIVE-18778_TestCliDriver.patch, 
> HIVE-18788_SparkNegative.patch, HIVE-18788_SparkPerf.patch
>
>
> With Sentry enabled, commands like {{explain drop table foo;}} fail with:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in input privileges
>  The required privileges: (state=42000,code=4)
> {code}
> Sentry fails to authorize because the ExplainSemanticAnalyzer uses an 
> instance of DDLSemanticAnalyzer to analyze the explain query.
> {code}
> BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
> sem.analyze(input, ctx);
> sem.validate()
> {code}
> The input/output entities for this query are set in the above code. 
> However, they are never set on the instance of ExplainSemanticAnalyzer 
> itself and thus are not propagated into the HookContext in the calling Driver 
> code.
> {code}
> sem.analyze(tree, ctx); --> this results in calling the above code that uses 
> DDLSA
> hookCtx.update(sem); --> sem is an instance of ExplainSemanticAnalyzer, this 
> code attempts to update the HookContext with the input/output info from ESA 
> which is never set.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18778) Needs to capture input/output entities in explain

2018-09-25 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-18778:
--
Target Version/s: 4.0.0, 3.2.0  (was: 3.0.0)

> Needs to capture input/output entities in explain
> -
>
> Key: HIVE-18778
> URL: https://issues.apache.org/jira/browse/HIVE-18778
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-18778-SparkPositive.patch, HIVE-18778.1.patch, 
> HIVE-18778.10.branch-3.patch, HIVE-18778.2.patch, HIVE-18778.3.patch, 
> HIVE-18778.4.patch, HIVE-18778.5.patch, HIVE-18778.6.patch, 
> HIVE-18778.7.patch, HIVE-18778.8.patch, HIVE-18778.9.branch-3.patch, 
> HIVE-18778.9.patch, HIVE-18778_TestCliDriver.patch, 
> HIVE-18788_SparkNegative.patch, HIVE-18788_SparkPerf.patch
>
>
> With Sentry enabled, commands like {{explain drop table foo;}} fail with:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in input privileges
>  The required privileges: (state=42000,code=4)
> {code}
> Sentry fails to authorize because the ExplainSemanticAnalyzer uses an 
> instance of DDLSemanticAnalyzer to analyze the explain query.
> {code}
> BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
> sem.analyze(input, ctx);
> sem.validate()
> {code}
> The input/output entities for this query are set in the above code. 
> However, they are never set on the instance of ExplainSemanticAnalyzer 
> itself and thus are not propagated into the HookContext in the calling Driver 
> code.
> {code}
> sem.analyze(tree, ctx); --> this results in calling the above code that uses 
> DDLSA
> hookCtx.update(sem); --> sem is an instance of ExplainSemanticAnalyzer, this 
> code attempts to update the HookContext with the input/output info from ESA 
> which is never set.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (HIVE-17284) remove OrcRecordUpdater.deleteEventIndexBuilder

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17284:
--
Comment: was deleted

(was: this may not be the right thing to do.  ORC flattens structs 
({{ROW__ID}}) and will maintain min/max for individual columns.  To filter 
events we really need min/max {{ROW__ID}})

> remove OrcRecordUpdater.deleteEventIndexBuilder
> ---
>
> Key: HIVE-17284
> URL: https://issues.apache.org/jira/browse/HIVE-17284
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> There is no point in it. We know how many rows a delete_delta file has from 
> ORC and they are all the same type - so no need for AcidStats.
>  hive.acid.key.index has no value since delete_delta files are never split 
> and are not likely to have more than 1 stripe since they are very small.
> Also can remove KeyIndexBuilder.acidStats - we only have 1 type of event per 
> file
>  
> if doing this, make sure to fix {{OrcInputFormat.isOriginal(Reader)}} and 
> {{OrcInputFormat.isOriginal(Footer)}} etc



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17284) remove OrcRecordUpdater.deleteEventIndexBuilder

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17284:
--
Description: 
There is no point in it. We know how many rows a delete_delta file has from ORC 
and they are all the same type - so no need for AcidStats.
 hive.acid.key.index has no value since delete_delta files are never split and 
are not likely to have more than 1 stripe since they are very small.

Also can remove KeyIndexBuilder.acidStats - we only have 1 type of event per 
file

 

if doing this, make sure to fix {{OrcInputFormat.isOriginal(Reader)}} and 
{{OrcInputFormat.isOriginal(Footer)}} etc

There is new KeyIndexBuilder("delete") and new KeyIndexBuilder("insert").  The 
latter is needed in HIVE-16812; the former can be removed.

  was:
There is no point in it. We know how many rows a delete_delta file has from ORC 
and they are all the same type - so no need for AcidStats.
 hive.acid.key.index has no value since delete_delta files are never split and 
are not likely to have more than 1 stripe since they are very small.

Also can remove KeyIndexBuilder.acidStats - we only have 1 type of event per 
file

 

if doing this, make sure to fix {{OrcInputFormat.isOriginal(Reader)}} and 
{{OrcInputFormat.isOriginal(Footer)}} etc


> remove OrcRecordUpdater.deleteEventIndexBuilder
> ---
>
> Key: HIVE-17284
> URL: https://issues.apache.org/jira/browse/HIVE-17284
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> There is no point in it. We know how many rows a delete_delta file has from 
> ORC and they are all the same type - so no need for AcidStats.
>  hive.acid.key.index has no value since delete_delta files are never split 
> and are not likely to have more than 1 stripe since they are very small.
> Also can remove KeyIndexBuilder.acidStats - we only have 1 type of event per 
> file
>  
> if doing this, make sure to fix {{OrcInputFormat.isOriginal(Reader)}} and 
> {{OrcInputFormat.isOriginal(Footer)}} etc
> There is new KeyIndexBuilder("delete") and new KeyIndexBuilder("insert").  
> The latter is needed in HIVE-16812; the former can be removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628015#comment-16628015
 ] 

Eugene Koifman edited comment on HIVE-16812 at 9/25/18 10:47 PM:
-

Patch 5 should be ready for review.
{{VectorizedOrcAcidRowBatchReader}} examines the split's worth of insert events 
and, based on that, generates 2 sets of bounds to use to filter delete events 
before loading them into the in-memory structure.
The 1st set is min/max ROW__ID.
The 2nd is a SARG to push down to delete_delta files.

This is used by {{ColumnizedDeleteEventRegistry}} but not 
{{SortMergedDeleteEventRegistry}}.

A limitation is that it currently doesn't handle {{OrcSplit.isOriginal()}} 
files.  This should be done in a followup after HIVE-17917.

[~gopalv] could you review please https://reviews.apache.org/r/68846/


was (Author: ekoifman):
Patch 5 should be ready for review.
VectorizedOrcAcidRowBatchReader examines the split's worth of insert events and, 
based on that, generates 2 sets of bounds to use to filter delete events before 
loading them into the in-memory structure.
The 1st set is min/max ROW__ID.
The 2nd is a SARG to push down to delete_delta files.

This is used by ColumnizedDeleteEventRegistry but not 
SortMergedDeleteEventRegistry.

A limitation is that it currently doesn't handle {{OrcSplit.isOriginal()}} 
files.  This should be done in a followup after HIVE-17917.

[~gopalv] could you review please

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find min/max ROW__ID and only load events from the delta that 
> are in the [min,max] range.  This will reduce the number of delete events we load 
> in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply but we'd need to make 
> sure to store PKs in ORC index
> See {{OrcRawRecordMerger.discoverKeyBounds()}}
> {{hive.acid.key.index}} in Orc footer has an index of ROW__IDs so we should 
> know min/max easily for any file written by {{OrcRecordUpdater}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628018#comment-16628018
 ] 

Hive QA commented on HIVE-19302:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
51s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14047/dev-support/hive-personality.sh
 |
| git revision | master / 4137c21 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14047/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, the volume and severity of the logging suggest a major 
> issue.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628015#comment-16628015
 ] 

Eugene Koifman commented on HIVE-16812:
---

Patch 5 should be ready for review.
VectorizedOrcAcidRowBatchReader examines the split's worth of insert events and, 
based on that, generates 2 sets of bounds to use to filter delete events before 
loading them into the in-memory structure.
The 1st set is min/max ROW__ID.
The 2nd is a SARG to push down to delete_delta files.

This is used by ColumnizedDeleteEventRegistry but not 
SortMergedDeleteEventRegistry.

A limitation is that it currently doesn't handle {{OrcSplit.isOriginal()}} 
files.  This should be done in a followup after HIVE-17917.

[~gopalv] could you review please
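
For readers following along, a minimal sketch of what the second set of bounds could look like, assuming the storage-api SearchArgument builder, the ACID column name {{originalTransaction}}, and long-typed write-id bounds; how the patch actually builds and applies its SARG is not shown here:

{code:java}
import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;

public class DeleteEventSargSketch {
  // Keep only delete events whose originalTransaction lies within the
  // split's insert-event bounds; the column name and bound values here
  // are illustrative assumptions.
  static SearchArgument buildSarg(long minWriteId, long maxWriteId) {
    return SearchArgumentFactory.newBuilder()
        .startAnd()
          .between("originalTransaction", PredicateLeaf.Type.LONG,
              minWriteId, maxWriteId)
        .end()
        .build();
  }
}
{code}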

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find min/max ROW__ID and only load events from the delta that 
> are in the [min,max] range.  This will reduce the number of delete events we load 
> in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply but we'd need to make 
> sure to store PKs in ORC index
> See {{OrcRawRecordMerger.discoverKeyBounds()}}
> {{hive.acid.key.index}} in Orc footer has an index of ROW__IDs so we should 
> know min/max easily for any file written by {{OrcRecordUpdater}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-25 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16812:
--
Attachment: HIVE-16812.05.patch

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find min/max ROW__ID and only load events from the delta that 
> are in the [min,max] range.  This will reduce the number of delete events we load 
> in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply but we'd need to make 
> sure to store PKs in ORC index
> See {{OrcRawRecordMerger.discoverKeyBounds()}}
> {{hive.acid.key.index}} in Orc footer has an index of ROW__IDs so we should 
> know min/max easily for any file written by {{OrcRecordUpdater}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-25 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20540:
--
Attachment: HIVE-20540.3.patch

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Follow-up to HIVE-20510 with the remaining issues:
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, use the correct place to set up the VectorExpression. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this (see the sketch below).
> cc [~gopalv]
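
An illustrative sketch of the flag idea from item 3 (class and member names are assumptions, not the actual patch):

{code:java}
// Guards a pending bucket number with a flag so a new value cannot
// overwrite one that has not been processed yet.
public class BucketNumHolder {
  private int bucketNum;
  private boolean consumed = true; // nothing pending initially

  public void setBucketNum(int value) {
    if (!consumed) {
      throw new IllegalStateException("previous bucket number not yet processed");
    }
    bucketNum = value;
    consumed = false;
  }

  public int takeBucketNum() {
    consumed = true;
    return bucketNum;
  }
}
{code}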



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-25 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627991#comment-16627991
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20545:
-

Test failure unrelated.

> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters to Table/Partition objects. So we need 
> to support regex patterns, configured through HiveConf, that match parameters to be 
> filtered out of table and partition objects before serialization in HMS 
> notifications.
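
A minimal sketch of such filtering, assuming the pattern has already been read from HiveConf and that matching is done on parameter keys (both are assumptions; the actual patch may differ):

{code:java}
import java.util.Map;
import java.util.regex.Pattern;

public final class ParamFilterSketch {
  private ParamFilterSketch() {}

  // Drop any parameter whose key matches the configured exclude pattern
  // before the Table/Partition object is serialized for a notification.
  public static void filter(Map<String, String> params, Pattern excludePattern) {
    if (params == null || excludePattern == null) {
      return;
    }
    params.keySet().removeIf(k -> excludePattern.matcher(k).matches());
  }
}
{code}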



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627989#comment-16627989
 ] 

Hive QA commented on HIVE-20545:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941252/HIVE-20545.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 14997 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hive.service.auth.TestCustomAuthentication.org.apache.hive.service.auth.TestCustomAuthentication
 (batchId=248)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14046/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14046/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14046/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941252 - PreCommit-HIVE-Build

> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters to Table/Partition objects. So we need 
> to support regex patterns, configured through HiveConf, that match parameters to be 
> filtered out of table and partition objects before serialization in HMS 
> notifications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20634) DirectSQL does not retry in ORM mode while getting partitions by filter

2018-09-25 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri reassigned HIVE-20634:
-


> DirectSQL does not retry in ORM mode while getting partitions by filter
> ---
>
> Key: HIVE-20634
> URL: https://issues.apache.org/jira/browse/HIVE-20634
> Project: Hive
>  Issue Type: Bug
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Major
>
> The code path for getting partitions by filter is as follows,
> {code:java}
>   protected List<Partition> getPartitionsByFilterInternal(..) {
>     ...
>     @Override
>     protected boolean canUseDirectSql(GetHelper<List<Partition>> ctx) throws MetaException {
>       return directSql.generateSqlFilterForPushdown(ctx.getTable(), tree, filter);
>     }
>     ...
>   }
> {code}
> If directSql.generateSqlFilterForPushdown throws an exception, we should 
> return false from canUseDirectSql instead of propagating the exception. 
> The propagation of the exception causes the whole query to fail instead of 
> retrying with JDO.
> We should have code such as
> {code:java}
>   @Override
>   protected boolean canUseDirectSql(GetHelper<List<Partition>> ctx) throws MetaException {
>     try {
>       return directSql.generateSqlFilterForPushdown(ctx.getTable(), exprTree, filter);
>     } catch (final MetaException me) {
>       return false;
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627956#comment-16627956
 ] 

Hive QA commented on HIVE-20545:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
29s{color} | {color:blue} standalone-metastore/metastore-common in master has 
28 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
13s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
14s{color} | {color:red} metastore-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14046/dev-support/hive-personality.sh
 |
| git revision | master / 4137c21 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14046/yetus/branch-findbugs-standalone-metastore_metastore-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14046/yetus/patch-findbugs-standalone-metastore_metastore-server.txt
 |
| modules | C: standalone-metastore/metastore-common 
standalone-metastore/metastore-server U: standalone-metastore |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14046/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters to Table/Partition objects. So we need 
> to support regex patterns, configured through HiveConf, that match parameters to be 
> filtered 

[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627911#comment-16627911
 ] 

Hive QA commented on HIVE-20632:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941241/HIVE-20632.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14997 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout (batchId=318)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14045/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14045/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14045/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941241 - PreCommit-HIVE-Build

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query "select get_splits(select a from t1 where a > 5);" – 
> this fails with AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> 

[jira] [Commented] (HIVE-20600) Metastore connection leak

2018-09-25 Thread Damon Cortesi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627907#comment-16627907
 ] 

Damon Cortesi commented on HIVE-20600:
--

This may be the same issue as HIVE-20511.

> Metastore connection leak
> -
>
> Key: HIVE-20600
> URL: https://issues.apache.org/jira/browse/HIVE-20600
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.3
>Reporter: Damon Cortesi
>Priority: Major
> Attachments: HIVE-20600.patch, consume_threads.py
>
>
> Within the execute method of HiveServer2, there appears to be a connection 
> leak. With a fairly straightforward series of INSERT statements, the connection 
> count in the logs continues to increase over time. Under certain loads, this 
> can also consume all underlying threads of the Hive metastore and result in 
> HS2 becoming unresponsive to new connections.
> The log below is the result of some Python code executing a single insert 
> statement and then looping through a series of 10 more insert statements. We 
> can see there's one dangling connection left open after each execution, 
> leaving us with 12 open connections (11 from the execute statements + 1 from 
> HS2 startup).
> {code}
> 2018-09-19T17:14:32,108 INFO [main([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 1
>  2018-09-19T17:14:48,175 INFO [29049f74-73c4-4f48-9cf7-b4bfe524a85b 
> HiveServer2-Handler-Pool: Thread-31([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 2
>  2018-09-19T17:15:05,543 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:05,548 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
>  2018-09-19T17:15:05,932 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:05,935 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
>  2018-09-19T17:15:06,123 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 1
>  2018-09-19T17:15:06,126 INFO [HiveServer2-Background-Pool: Thread-36([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 2
> ...
>  2018-09-19T17:15:20,626 INFO [29049f74-73c4-4f48-9cf7-b4bfe524a85b 
> HiveServer2-Handler-Pool: Thread-31([])]: hive.metastore 
> (HiveMetaStoreClient.java:open(481)) - Opened a connection to metastore, 
> current connections: 12
>  2018-09-19T17:15:21,153 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,155 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,306 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,308 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,385 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 11
>  2018-09-19T17:15:21,387 INFO [HiveServer2-Background-Pool: Thread-162([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 12
>  2018-09-19T17:15:21,541 INFO [HiveServer2-Handler-Pool: Thread-31([])]: 
> hive.metastore (HiveMetaStoreClient.java:open(481)) - Opened a connection to 
> metastore, current connections: 13
>  2018-09-19T17:15:21,542 INFO [HiveServer2-Handler-Pool: Thread-31([])]: 
> hive.metastore (HiveMetaStoreClient.java:close(564)) - Closed a connection to 
> metastore, current connections: 12
> {code}
> Attached is a simple [impyla|https://github.com/cloudera/impyla] script that 
> triggers the condition.
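
A hypothetical JDBC analogue of that script, for readers without impyla. The URL and table name are placeholders, the table is assumed to exist, and the Hive JDBC driver is assumed to be on the classpath:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RepeatedInserts {
  public static void main(String[] args) throws Exception {
    // One session, a single INSERT followed by ten more in a loop, mirroring
    // the attached script's access pattern.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      stmt.execute("INSERT INTO leak_test VALUES (0)");
      for (int i = 1; i <= 10; i++) {
        stmt.execute("INSERT INTO leak_test VALUES (" + i + ")");
      }
    }
  }
}
{code}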



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627877#comment-16627877
 ] 

Hive QA commented on HIVE-20632:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
14s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 188 
unchanged - 0 fixed = 189 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 1 new + 590 unchanged - 2 
fixed = 591 total (was 592) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
11s{color} | {color:red} ql generated 1 new + 2325 unchanged - 1 fixed = 2326 
total (was 2326) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
14s{color} | {color:red} metastore-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 17s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Exception is caught when Exception is not thrown in 
org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(String, List, 
List, boolean, HiveTxnManager)  At Hive.java:is not thrown in 
org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(String, List, 
List, boolean, HiveTxnManager)  At Hive.java:[line 1576] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14045/dev-support/hive-personality.sh
 |
| git revision | master / a81f53a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14045/yetus/branch-findbugs-standalone-metastore_metastore-server.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14045/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14045/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14045/yetus/new-findbugs-ql.html
 |
| findbugs | 

[jira] [Updated] (HIVE-20603) "Wrong FS" error when inserting to partition after changing table location filesystem

2018-09-25 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20603:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master

> "Wrong FS" error when inserting to partition after changing table location 
> filesystem
> -
>
> Key: HIVE-20603
> URL: https://issues.apache.org/jira/browse/HIVE-20603
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20603.1.patch, HIVE-20603.2.patch, 
> HIVE-20603.3.patch
>
>
> Inserting into an existing partition, after changing a table's location to 
> point to a different HDFS filesystem:
> {noformat}
>query += "CREATE TABLE test_managed_tbl (id int, name string, dept string) 
> PARTITIONED BY (year int);\n"
> query += "INSERT INTO test_managed_tbl PARTITION (year=2016) VALUES 
> (8,'Henry','CSE');\n"
> query += "ALTER TABLE test_managed_tbl ADD PARTITION (year=2017);\n"
> query += "ALTER TABLE test_managed_tbl SET LOCATION 
>   
> 'hdfs://ns2/warehouse/tablespace/managed/hive/test_managed_tbl'"
> query += "INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES 
> (9,'Harris','CSE');\n"
> {noformat}
> Results in the following error:
> {noformat}
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://ns1/warehouse/tablespace/managed/hive/test_managed_tbl/year=2017, 
> expected: hdfs://ns2
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:240)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1580)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1595)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734)
> at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:4141)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1966)
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleStaticParts(MoveTask.java:477)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:397)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2701)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2372)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2048)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20224) ReplChangeManager.java Remove Logging Guards

2018-09-25 Thread Morio Ramdenbourg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Morio Ramdenbourg reassigned HIVE-20224:


Assignee: Morio Ramdenbourg

> ReplChangeManager.java Remove Logging Guards
> 
>
> Key: HIVE-20224
> URL: https://issues.apache.org/jira/browse/HIVE-20224
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Standalone Metastore
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Morio Ramdenbourg
>Priority: Trivial
>  Labels: newb, newbie, noob
>
> {code:java|title=metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java}
> if (LOG.isDebugEnabled()) {
>   LOG.debug("A file with the same content of {} already exists, ignore", 
> path.toString());
> }
> // becomes:
> LOG.debug("A file with the same content of {} already exists, ignore", path);
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Encoded URI: " + encodedUri);
> }
> // becomes:
> LOG.debug("Encoded URI: {}", encodedUri);
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Move " + file.toString() + " to trash");
> }
> // becomes:
> LOG.debug("Move {} to trash", file);
> ... and others
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread denys kuzmenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

denys kuzmenko updated HIVE-20535:
--
Attachment: HIVE-20535.16.patch

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.2.patch, HIVE-20535.3.patch, HIVE-20535.4.patch, 
> HIVE-20535.5.patch, HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> It is quite risky to remove the compile lock entirely.
> It would be good to provide a pool size for concurrent compilation, so 
> the administrator can limit the load.
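
A minimal sketch of the pool idea, using a semaphore sized by the new configuration (class and property names here are assumptions): callers would acquire before compiling and release in a finally block.

{code:java}
import java.util.concurrent.Semaphore;

// Replaces the single global compile lock with a semaphore whose size
// is configurable, allowing up to poolSize concurrent compilations.
public class BoundedCompileLock {
  private final Semaphore permits;

  public BoundedCompileLock(int poolSize) {
    this.permits = new Semaphore(poolSize, /* fair = */ true);
  }

  public void lock() throws InterruptedException {
    permits.acquire();
  }

  public void unlock() {
    permits.release();
  }
}
{code}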



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-25 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627851#comment-16627851
 ] 

Sahil Takiar commented on HIVE-17684:
-

+1 latest patch LGTM

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch, HIVE-17684.11.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
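
A standalone miniature of the check described above (the real logic lives in MapJoinMemoryExhaustionHandler and throws MapJoinMemoryExhaustionError; this version is only an illustration):

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapUsageCheckSketch {
  static void check(double maxUsageFraction) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    long used = bean.getHeapMemoryUsage().getUsed(); // counts uncollected garbage too
    long max = bean.getHeapMemoryUsage().getMax();
    double fraction = (double) used / max;
    if (fraction > maxUsageFraction) {
      // the real handler throws MapJoinMemoryExhaustionError here
      throw new RuntimeException(String.format(
          "hash table memory usage %.2f exceeded threshold %.2f",
          fraction, maxUsageFraction));
    }
  }
}
{code}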



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627843#comment-16627843
 ] 

Hive QA commented on HIVE-20627:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941234/HIVE-20627.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14996 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.common.metrics.metrics2.TestCodahaleMetrics.testFileReporting
 (batchId=274)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14044/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14044/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14044/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941234 - PreCommit-HIVE-Build

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> When multiple async queries are executed from the same session, multiple 
> async query execution DAGs share the same Hive object, which is set 
> by the caller for all threads. In the case of loading dynamic partitions, this creates a 
> MoveTask which re-creates the Hive object and closes the shared Hive object, 
> which causes metastore connection issues for other async execution threads that 
> still access it. This is also seen if ReplDumpTask and ReplLoadTask are part 
> of the DAG.
> *Call Stack:*
> {code:java}
> 2018-09-16T04:38:04,280 ERROR [load-dynamic-partitions-7]: metadata.Hive 
> (Hive.java:call(2436)) - Exception when loading partition with parameters 
> partPath=hdfs://mycluster/warehouse/tablespace/managed/hive/tbl_3bcvvdubni/.hive-staging_hive_2018-09-16_04-35-50_708_7776079613819042057-1147/-ext-1/age=55,
>  table=tbl_3bcvvdubni, partSpec={age=55}, loadFileType=KEEP_EXISTING, 
> listBucketingLevel=0, isAcid=true, hasFollowingStatsTask=true
> org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the 
> metastore
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableValidWriteIdListWithTxnList(AcidUtils.java:1791)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1756) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1714) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1976) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2415) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2406) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Caused by: org.apache.thrift.protocol.TProtocolException: Required field 
> 'validTxnList' is unset! 
> Struct:GetValidWriteIdsRequest(fullTableNames:[default.tbl_3bcvvdubni], 
> validTxnList:null)
> at 
> org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.validate(GetValidWriteIdsRequest.java:396)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.validate(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> 

[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2018-09-25 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Attachment: HIVE-20523.4.patch
Status: Patch Available  (was: Open)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate when 
> columns are complex data structures, like arrays.
> Tables with underestimated raw data size make Hive assign fewer containers 
> (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose a MapJoin instead of a 
> ShuffleJoin, which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.
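A rough sketch of the idea with a hypothetical helper (not the patch's actual code): size values recursively instead of charging 1 per row.
{code:java}
import java.util.List;
import java.util.Map;

// Hypothetical helper: estimate a value's raw data size by recursing
// into complex structures instead of counting every row as 1.
public class RawSizeEstimator {
  public static long estimate(Object value) {
    if (value == null) {
      return 1;                      // a null still occupies a slot
    } else if (value instanceof String) {
      return ((String) value).length();
    } else if (value instanceof List) {
      long total = 0;
      for (Object element : (List<?>) value) {
        total += estimate(element);  // arrays: sum over all elements
      }
      return total;
    } else if (value instanceof Map) {
      long total = 0;
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
        total += estimate(entry.getKey()) + estimate(entry.getValue());
      }
      return total;
    }
    return 8;                        // flat per-value cost for primitives
  }
}
{code}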



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-25 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627833#comment-16627833
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Hi [~KaiXu], yes, your issue is the same one we are trying to improve here.

Note, however, that if/when this change is integrated, 
{{MapJoinMemoryExhaustionException}} is not guaranteed to go away in your use 
case. That is, maybe your Spark executor really doesn't have enough memory to 
process the given table, and thus the exception is thrown correctly. But this 
change should reduce the chance of the exception being thrown in the wrong 
circumstances, when there is actually enough memory.

In the meantime, a workaround is to increase the JVM heap for your Spark 
executors.

[~stakiar] it looks like the last patch finally passed tests, so it can be 
integrated?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch, HIVE-17684.11.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use 
> of the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data that 
> the JVM just hasn't taken the time to reclaim yet. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2018-09-25 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Status: Open  (was: Patch Available)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate when 
> columns are complex data structures, like arrays.
> Tables with underestimated raw data size make Hive assign fewer containers 
> (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose a MapJoin instead of a 
> ShuffleJoin, which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20615) CachedStore: Background refresh thread bug fixes

2018-09-25 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-20615:

Attachment: HIVE-20615.1.patch

> CachedStore: Background refresh thread bug fixes
> 
>
> Key: HIVE-20615
> URL: https://issues.apache.org/jira/browse/HIVE-20615
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-20615.1.patch, HIVE-20615.1.patch, 
> HIVE-20615.1.patch
>
>
> Regression introduced in HIVE-18264. Fixes background thread starting and 
> refreshing of the table cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20430) CachedStore: bug fixes for TestEmbeddedHiveMetaStore, TestRemoteHiveMetaStore, TestMiniLlapCliDriver, TestMiniTezCliDriver, TestMinimrCliDriver

2018-09-25 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-20430:

  Resolution: Fixed
Hadoop Flags: Reviewed
Target Version/s: 4.0.0
  Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~daijy]

> CachedStore: bug fixes for TestEmbeddedHiveMetaStore, 
> TestRemoteHiveMetaStore, TestMiniLlapCliDriver, TestMiniTezCliDriver, 
> TestMinimrCliDriver
> ---
>
> Key: HIVE-20430
> URL: https://issues.apache.org/jira/browse/HIVE-20430
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-20430.1.patch, HIVE-20430.2.patch, 
> HIVE-20430.2.patch
>
>
> 1. getTable call needs to set TableType before returning
> 2. getTableObjectsByName should throw UnknownDBException when needed and 
> should not return null table objects
> 3. listTableNamesByFilter should fall back to ObjectStore till we have the 
> correct impl
> 4. listPartitionNamesPs and listPartitionsPsWithAuth are buggy
> 5. SharedCache.removePartition bug fix
> 6. removeTableColStats needs to remove all col stats when column name is null



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627812#comment-16627812
 ] 

Hive QA commented on HIVE-20627:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
46s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14044/dev-support/hive-personality.sh
 |
| git revision | master / 6137ee5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql service U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14044/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> When multiple async queries are executed from the same session, the async 
> query execution DAGs share the same Hive object, which is set by the caller 
> for all threads. When loading dynamic partitions, a MoveTask is created that 
> re-creates the Hive object and closes the shared one, which causes metastore 
> connection issues for the other async execution threads that still access 
> it. This is also seen if ReplDumpTask and ReplLoadTask are part 
> of the DAG.
> *Call Stack:*
> {code:java}
> 2018-09-16T04:38:04,280 ERROR [load-dynamic-partitions-7]: metadata.Hive 
> (Hive.java:call(2436)) - Exception 

[jira] [Commented] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-25 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627795#comment-16627795
 ] 

Peter Vary commented on HIVE-20595:
---

[~lpinter]: Could you rerun the tests to get green results?
Thanks,
Peter

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-25 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-20595:
-

Assignee: Laszlo Pinter

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20607) TxnHandler should use PreparedStatement to execute direct SQL queries.

2018-09-25 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627775#comment-16627775
 ] 

Sankar Hariappan commented on HIVE-20607:
-

01.patch committed to master.

Thanks [~daijy]!

> TxnHandler should use PreparedStatement to execute direct SQL queries.
> --
>
> Key: HIVE-20607
> URL: https://issues.apache.org/jira/browse/HIVE-20607
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20607.01.patch
>
>
> TxnHandler uses direct SQL queries to operate on the Txn-related 
> databases/tables in the Hive metastore RDBMS.
> Most of the methods are direct calls from the Metastore API and directly 
> append input string arguments to the SQL string.
> They need to use a parameterised PreparedStatement object to set these 
> arguments instead.
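The gist of the fix, as a minimal sketch (the query and the lock table/column names here are illustrative, not the exact statements TxnHandler runs):
{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ParameterizedQueryExample {
  // Unsafe pattern described above:
  //   "select HL_LOCK_EXT_ID from HIVE_LOCKS where HL_DB = '" + dbName + "'"
  // Safer pattern: bind the arguments instead of splicing them into the SQL.
  static ResultSet findLocks(Connection conn, String dbName, String tblName)
      throws SQLException {
    String sql = "select HL_LOCK_EXT_ID from HIVE_LOCKS"
        + " where HL_DB = ? and HL_TABLE = ?";
    PreparedStatement ps = conn.prepareStatement(sql);
    ps.setString(1, dbName);  // sent as a bound parameter, never parsed as SQL
    ps.setString(2, tblName);
    return ps.executeQuery();
  }
}
{code}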



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627776#comment-16627776
 ] 

Hive QA commented on HIVE-20535:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941207/HIVE-20535.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14043/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14043/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14043/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12941207/HIVE-20535.15.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941207 - PreCommit-HIVE-Build

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.2.patch, 
> HIVE-20535.3.patch, HIVE-20535.4.patch, HIVE-20535.5.patch, 
> HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be good to provide a pool size for concurrent compilation, so 
> the administrator can limit the load.
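One way to picture the proposal, as a minimal sketch (the pool-size setting is hypothetical; this is not the patch itself):
{code:java}
import java.util.concurrent.Semaphore;

// Illustrative: bound concurrent query compilation with a semaphore
// sized from a configurable pool size.
public class CompileLockPool {
  private final Semaphore slots;

  public CompileLockPool(int poolSize) {
    // poolSize == 1 behaves like the current single global compile lock;
    // larger values allow bounded concurrent compilation.
    this.slots = new Semaphore(poolSize, /* fair */ true);
  }

  public void runCompile(Runnable compileTask) throws InterruptedException {
    slots.acquire();
    try {
      compileTask.run();
    } finally {
      slots.release();
    }
  }
}
{code}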



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20607) TxnHandler should use PreparedStatement to execute direct SQL queries.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20607:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> TxnHandler should use PreparedStatement to execute direct SQL queries.
> --
>
> Key: HIVE-20607
> URL: https://issues.apache.org/jira/browse/HIVE-20607
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20607.01.patch
>
>
> TxnHandler uses direct SQL queries to operate on the Txn-related 
> databases/tables in the Hive metastore RDBMS.
> Most of the methods are direct calls from the Metastore API and directly 
> append input string arguments to the SQL string.
> They need to use a parameterised PreparedStatement object to set these 
> arguments instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627773#comment-16627773
 ] 

Hive QA commented on HIVE-17917:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941190/HIVE-17917.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14996 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCtasPartitioned (batchId=299)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14042/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14042/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14042/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941190 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Attachments: HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() computation is 
> currently (after HIVE-17458) done once per split.  It could instead be done 
> once per file (since the result is the same for each split of the same file) 
> and passed along in OrcSplit.
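A sketch of the per-file reuse idea (illustrative only; the proposal above is to carry the result along in OrcSplit rather than cache it):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative: compute a result once per file and reuse it for every
// split of that file, instead of recomputing it for each split.
public class PerFileResultCache<V> {
  private final Map<String, V> resultsByFile = new ConcurrentHashMap<>();

  public V getOrCompute(String filePath, Function<String, V> compute) {
    return resultsByFile.computeIfAbsent(filePath, compute);
  }
}
{code}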



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2018-09-25 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627772#comment-16627772
 ] 

Szehon Ho commented on HIVE-17300:
--

Hi Karen, sorry about the StringUtils comment; I missed that it's already 
imported.

For the OperationLog, I saw it's accessible from a ThreadLocal; I wonder if 
that will work.

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, features, patch
> Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, 
> HIVE-17300.5.patch, HIVE-17300.6.patch, HIVE-17300.7.patch, 
> HIVE-17300.7.patch, HIVE-17300.8.patch, HIVE-17300.8.patch, 
> HIVE-17300.8.patch, HIVE-17300.8.patch, HIVE-17300.9.patch, HIVE-17300.patch, 
> complete_success.png, full_mapred_stats.png, graph_with_mapred_stats.png, 
> last_stage_error.png, last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage
> Review request: https://reviews.apache.org/r/61663/
> Any input is welcome!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20544) TOpenSessionReq logs password and username

2018-09-25 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627770#comment-16627770
 ] 

Peter Vary commented on HIVE-20544:
---

[~klcopp]: The current commit rules require us to have a green run to push a 
change, even if we know the failed test is unrelated. If we see a test that is 
flaky (usually when it has failed more than a few times in previous runs), then 
we disable the test and file a jira for fixing it. Taking a look at the test 
history, it seems that this test should be fixed, so I would just reupload the 
patch with a different filename and hope for a clean run.
Thanks,
Peter

> TOpenSessionReq logs password and username
> --
>
> Key: HIVE-20544
> URL: https://issues.apache.org/jira/browse/HIVE-20544
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, patch, security
> Attachments: HIVE-20544.1.patch, HIVE-20544.2.patch, 
> HIVE-20544.3.patch, HIVE-20544.3.patch, HIVE-20544.patch, non-solution.patch, 
> working-solution.patch
>
>
> In 
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq,
>  if client protocol is unset, validate() and toString() prints both username 
> and password to logs.
> Logging a password is a security risk. We should hide the ***.
> =Edit= (no longer relevant, see comments)
> This issue is tricky since it is caused in a fully generated class. I've been 
> playing around and have found one working solution, but I'd truly appreciate 
> ideas for a more elegant solution or other input.
> The problem:
>  TCLIService.thrift is the template for generating all classes in 
> service-rpc. Struct TOpenSessionReq is OpenSession()'s one parameter and is 
> defined thus:
> {noformat}
> struct TOpenSessionReq {
>   1: required TProtocolVersion client_protocol = 
> TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
>   2: optional string username
>   3: optional string password
>   4: optional map<string, string> configuration
> }
> {noformat}
> In the generated class TOpenSessionReq.java, client_protocol is checked by a 
> validate() method, which is called quite a few times; if client_protocol is 
> not set, it throws a TProtocolException, passing along a toString(). This 
> toString() gets the names and values of all fields, including username and 
> password.
> Working solution:
>  * Create a separate struct containing only the username and password, and 
> pass it to OpenSession() as a second parameter. Since all fields in the new 
> struct are "optional", the generated validate() is empty – toString() is 
> never used. This involves changing core classes and breaks the "Each function 
> should take exactly one parameter" coding convention (detailed at 
> service-rpc/if/TCLIService.thrift:27).
>  See working-solution.patch.
> What doesn't work:
>  * Making client_protocol optional instead of required. Apparently this will 
> break everything.
>  * Overwriting toString() – TOpenSessionReq is a struct.
>  * Creating two Thrift structs, one struct for required (TRequiredReq) and 
> one for optional (TOptionalReq) fields, and nesting them in struct 
> TOpenSessionReq. This doesn't work because validate() in TOpenSessionReq can 
> call TOptionalReq.toString(), which prints the password to logs. This will 
> happen if TRequiredReq.client_protocol isn't set.
>  See non-solution.patch
>  * Asking Thrift devs to change their code. I wrote them an email but have no 
> expectations.
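A sketch of the struct split from the working solution (the struct name is illustrative):
{noformat}
// All fields optional, so the generated validate() is empty and
// toString() -- which would print the password -- is never invoked.
struct TOpenSessionCredentials {
  1: optional string username
  2: optional string password
}
// OpenSession() then takes TOpenSessionReq (minus the credentials)
// plus this struct as a second parameter.
{noformat}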



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19302:
-
Status: Open  (was: Patch Available)

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal for a user 
> to fat-finger a table name.  Yet, judging by the volume and severity of the 
> logging, the Hive administrator would think a major issue had occurred.  
> Please change the logging to INFO level, and do not print a stack trace, 
> for such a trivial error.
>  
> See the attached file for a sample of the logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19302:
-
Attachment: HIVE-19302.5.patch
Status: Patch Available  (was: Open)

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal for a user 
> to fat-finger a table name.  Yet, judging by the volume and severity of the 
> logging, the Hive administrator would think a major issue had occurred.  
> Please change the logging to INFO level, and do not print a stack trace, 
> for such a trivial error.
>  
> See the attached file for a sample of the logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20593) Load Data for partitioned ACID tables fails with bucketId out of range: -1

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627763#comment-16627763
 ] 

Eugene Koifman commented on HIVE-20593:
---

+1

> Load Data for partitioned ACID tables fails with bucketId out of range: -1
> --
>
> Key: HIVE-20593
> URL: https://issues.apache.org/jira/browse/HIVE-20593
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20593.1.patch, HIVE-20593.2.patch, 
> HIVE-20593.3.patch
>
>
> Load Data for ACID tables fails to load ORC files when it is converted to an 
> IAS (insert-as-select) job.
>  
> The tempTblObj inherits from the target table. However, the only table 
> property which needs to be inherited is the bucketing version. Properties 
> like transactional etc. should be ignored.
>  
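A sketch of the intended inheritance ("bucketing_version" is the standard table property key; the surrounding helper is illustrative):
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative: copy only the bucketing version into the temp table's
// properties; transactional and other properties are deliberately dropped.
public class TempTablePropsBuilder {
  public static Map<String, String> inherit(Map<String, String> targetProps) {
    Map<String, String> tempProps = new HashMap<>();
    String bucketingVersion = targetProps.get("bucketing_version");
    if (bucketingVersion != null) {
      tempProps.put("bucketing_version", bucketingVersion);
    }
    return tempProps;
  }
}
{code}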



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627748#comment-16627748
 ] 

Hive QA commented on HIVE-17917:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
14s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 
22s{color} | {color:red} root in master failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
53s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 
28s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 28s{color} 
| {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 1 new + 430 unchanged - 2 
fixed = 431 total (was 432) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
52s{color} | {color:red} ql generated 1 new + 2325 unchanged - 1 fixed = 2326 
total (was 2326) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.syntheticProps,
 which is known to be non-null in 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.handleOriginalFile(BitSet,
 ColumnVector[])  Redundant null check at 
VectorizedOrcAcidRowBatchReader.java:is known to be non-null in 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.handleOriginalFile(BitSet,
 ColumnVector[])  Redundant null check at 
VectorizedOrcAcidRowBatchReader.java:[line 495] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14042/dev-support/hive-personality.sh
 |
| git revision | master / 307bbca |
| Default Java | 1.8.0_111 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus/branch-compile-root.txt
 |
| findbugs | v3.0.0 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus/patch-compile-root.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus/patch-compile-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus/new-findbugs-ql.html
 |
| modules | C: . ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14042/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: 

[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-25 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627729#comment-16627729
 ] 

Alice Fan commented on HIVE-12812:
--

Hi [~ctang.ma] [~ychena] [~ngangam], Could you please help to review 
HIVE-12812.2.patch? This will be helpful to resolve 
[HIVE-20319|https://issues.apache.org/jira/browse/HIVE-20319] too. Thanks!


> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-12812.1.patch, HIVE-12812.2.patch, 
> HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When the union remove optimization is enabled, a union query with an 
> aggregate function writes its subquery intermediate results to subdirs, 
> which requires mapred.input.dir.recursive to be enabled in order to be 
> fetched. This property is not defined by default in Hive and is often 
> overlooked by users, which causes the query to fail and is hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever the union 
> remove optimization is enabled.
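Until then, the manual workaround is to enable both settings together in the session, e.g.:
{noformat}
set hive.optimize.union.remove=true;
set mapred.input.dir.recursive=true;
{noformat}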



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20601) EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-25 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20601:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener
> --
>
> Key: HIVE-20601
> URL: https://issues.apache.org/jira/browse/HIVE-20601
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20601.1.patch, HIVE-20601.2.patch
>
>
> Cause : EnvironmentContext not passed here:
> [https://github.com/apache/hive/blob/36c33ca066c99dfdb21223a711c0c3f33c85b943/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L726]
>  
> It will be useful to have the environmentContext passed to 
> DbNotificationListener in this case, to know if the alter happened due to a 
> stat change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-25 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20545:

Attachment: HIVE-20545.4.patch

> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters to Table/Partition objects. So we need 
> to allow regex patterns to be added through HiveConf that match the 
> parameters to be filtered out of Table and Partition objects before 
> serialization in HMS notifications.
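A sketch of the filtering step (illustrative only; the actual config wiring is in the patch):
{code:java}
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative: drop parameters whose keys match any configured exclude
// pattern before the object is serialized into a notification message.
public class NotificationParameterFilter {
  public static void filter(Map<String, String> parameters,
                            List<Pattern> excludePatterns) {
    if (parameters == null || excludePatterns == null) {
      return;
    }
    parameters.keySet().removeIf(key ->
        excludePatterns.stream().anyMatch(p -> p.matcher(key).matches()));
  }
}
{code}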



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-25 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627722#comment-16627722
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20545:
-

Attaching HIVE-20545.4.patch to run the tests again.

> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters to Table/Partition objects. So we need 
> to allow regex patterns to be added through HiveConf that match the 
> parameters to be filtered out of Table and Partition objects before 
> serialization in HMS notifications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-25 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627720#comment-16627720
 ] 

Alice Fan commented on HIVE-20595:
--

[~lpinter], oops. sorry.. that was an accident. 

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-25 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan reassigned HIVE-20595:


Assignee: (was: Alice Fan)

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-25 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627692#comment-16627692
 ] 

Eugene Koifman commented on HIVE-17917:
---

I don't think non-vector mode is a priority

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Attachments: HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() computation is 
> currently (after HIVE-17458) done once per split.  It could instead be done 
> once per file (since the result is the same for each split of the same file) 
> and passed along in OrcSplit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627666#comment-16627666
 ] 

Hive QA commented on HIVE-20535:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941207/HIVE-20535.15.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14991 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=193)

[druidmini_dynamic_partition.q,druidmini_test_ts.q,druidmini_expressions.q,druidmini_test_alter.q,druidmini_test_insert.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14041/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14041/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14041/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941207 - PreCommit-HIVE-Build

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.2.patch, 
> HIVE-20535.3.patch, HIVE-20535.4.patch, HIVE-20535.5.patch, 
> HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be good to provide a pool size for concurrent compilation, so 
> the administrator can limit the load.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20599) CAST(INTERVAL_DAY_TIME AS STRING) is throwing SemanticException

2018-09-25 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20599:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~nareshpr] for the contribution! Committed patch to master and branch-3.

> CAST(INTERVAL_DAY_TIME AS STRING) is throwing SemanticException
> ---
>
> Key: HIVE-20599
> URL: https://issues.apache.org/jira/browse/HIVE-20599
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.0
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
> Fix For: 4.0.0, 3.1.0
>
> Attachments: HIVE-20599-branch-3.patch, 
> HIVE-20599.1-branch-3.1.patch, HIVE-20599.1-branch-3.patch, 
> HIVE-20599.1.patch, HIVE-20599.2.patch, HIVE-20599.3.patch, HIVE-20599.4.patch
>
>
> SELECT CAST(from_utc_timestamp(timestamp '2018-05-02 15:30:30', 'PST') - 
> from_utc_timestamp(timestamp '1970-01-30 16:00:00', 'PST') AS STRING);
> throws below Exception
> {code:java}
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Wrong arguments ''PST'': No matching method for class 
> org.apache.hadoop.hive.ql.udf.UDFToString with (interval_day_time). Possible 
> choices: _FUNC_(bigint)  _FUNC_(binary)  _FUNC_(boolean)  _FUNC_(date)  
> _FUNC_(decimal(38,18))  _FUNC_(double)  _FUNC_(float)  _FUNC_(int)  
> _FUNC_(smallint)  _FUNC_(string)  _FUNC_(timestamp)  _FUNC_(tinyint)  
> _FUNC_(void) (state=42000,code=4){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18871) hive on tez execution error due to set hive.aux.jars.path to hdfs://

2018-09-25 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-18871:
-
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~qunyan]! Patch committed to master and branch-3

> hive on tez execution error due to set hive.aux.jars.path to hdfs://
> 
>
> Key: HIVE-18871
> URL: https://issues.apache.org/jira/browse/HIVE-18871
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.1, 4.0.0, 3.2.0
> Environment: hadoop 2.6.5
> hive 2.2.1
> tez 0.8.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-18871.1.patch, HIVE-18871.2.patch, 
> HIVE-18871.3.patch, HIVE-18871.4.patch, HIVE-18871.5.patch, 
> HIVE-18871.6.patch, HIVE-18871.7.patch, HIVE-18871.8.patch
>
>
> When the properties 
> hive.aux.jars.path=hdfs://mycluster/apps/hive/lib/guava.jar
> and hive.execution.engine=tez are set, executing any query fails with the 
> error log below:
> exec.Task: Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://mycluster/apps/hive/lib/guava.jar, expected: file:///
>  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:529)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
>  ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
>  ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337) 
> ~[hadoop-common-2.6.0.jar:?]
>  at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1905) 
> ~[hadoop-common-2.6.0.jar:?]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1007)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:252)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager$TezSessionPoolSession.openInternal(TezSessionPoolManager.java:622)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:206)
>  ~[hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:283) 
> ~[hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:155) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) 
> [hive-exec-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:429) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:445) 
> [hive-cli-2.1.1.jar:2.1.1]
>  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151) 
> 

[jira] [Commented] (HIVE-20607) TxnHandler should use PreparedStatement to execute direct SQL queries.

2018-09-25 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627660#comment-16627660
 ] 

Daniel Dai commented on HIVE-20607:
---

+1, LGTM.

> TxnHandler should use PreparedStatement to execute direct SQL queries.
> --
>
> Key: HIVE-20607
> URL: https://issues.apache.org/jira/browse/HIVE-20607
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20607.01.patch
>
>
> TxnHandler uses direct SQL queries to operate on Txn-related tables in the 
> Hive metastore RDBMS.
> Most of these methods are invoked directly through the Metastore API and 
> currently append input string arguments directly to the SQL text.
> They should instead use a parameterised PreparedStatement to bind these 
> arguments.
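> For illustration, a minimal sketch of the intended change (the table and 
> column names below are placeholders, not the actual TxnHandler queries):
> {code:java}
> import java.sql.Connection;
> import java.sql.PreparedStatement;
> import java.sql.ResultSet;
> import java.sql.SQLException;
>
> public class TxnQuerySketch {
>   // Unsafe pattern: the caller-supplied value is concatenated into the SQL.
>   static ResultSet findTxnsUnsafe(Connection db, String user) throws SQLException {
>     String sql = "SELECT TXN_ID FROM TXNS WHERE TXN_USER = '" + user + "'";
>     return db.createStatement().executeQuery(sql);
>   }
>
>   // Parameterised pattern: the JDBC driver binds the value safely.
>   static ResultSet findTxnsSafe(Connection db, String user) throws SQLException {
>     PreparedStatement ps = db.prepareStatement(
>         "SELECT TXN_ID FROM TXNS WHERE TXN_USER = ?");
>     ps.setString(1, user);
>     return ps.executeQuery();
>   }
> }
> {code}
> Besides removing the injection risk, the parameterised form lets the RDBMS 
> reuse the prepared plan across calls.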



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20601) EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-25 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20601:

Attachment: (was: HIVE-20601.1.branch-3.patch)

> EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener
> --
>
> Key: HIVE-20601
> URL: https://issues.apache.org/jira/browse/HIVE-20601
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20601.1.patch, HIVE-20601.2.patch
>
>
> Cause: the EnvironmentContext is not passed here:
> [https://github.com/apache/hive/blob/36c33ca066c99dfdb21223a711c0c3f33c85b943/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L726]
>  
> It would be useful to have the EnvironmentContext passed to 
> DbNotificationListener in this case, so the listener can tell whether the 
> alter happened due to a stats change.
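> For illustration, a hedged sketch of a listener consuming the propagated 
> context (the getEnvironmentContext() accessor and the STATS_GENERATED 
> property key are assumptions, not the committed API):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hive.metastore.MetaStoreEventListener;
> import org.apache.hadoop.hive.metastore.api.EnvironmentContext;
> import org.apache.hadoop.hive.metastore.api.MetaException;
> import org.apache.hadoop.hive.metastore.events.AlterPartitionEvent;
>
> public class StatsAwareListener extends MetaStoreEventListener {
>   public StatsAwareListener(Configuration conf) {
>     super(conf);
>   }
>
>   @Override
>   public void onAlterPartition(AlterPartitionEvent event) throws MetaException {
>     // Assumed accessor: the event carries the EnvironmentContext along.
>     EnvironmentContext ctx = event.getEnvironmentContext();
>     boolean statsOnly = ctx != null && ctx.getProperties() != null
>         && "true".equals(ctx.getProperties().get("STATS_GENERATED"));
>     if (statsOnly) {
>       // e.g. skip expensive downstream work for pure stats updates
>     }
>   }
> }
> {code}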



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20601) EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-25 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627653#comment-16627653
 ] 

Andrew Sherman commented on HIVE-20601:
---

Pushed to master, thanks [~bharos92]

> EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener
> --
>
> Key: HIVE-20601
> URL: https://issues.apache.org/jira/browse/HIVE-20601
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20601.1.branch-3.patch, HIVE-20601.1.patch, 
> HIVE-20601.2.patch
>
>
> Cause: the EnvironmentContext is not passed here:
> [https://github.com/apache/hive/blob/36c33ca066c99dfdb21223a711c0c3f33c85b943/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L726]
>  
> It would be useful to have the EnvironmentContext passed to 
> DbNotificationListener in this case, so the listener can tell whether the 
> alter happened due to a stats change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627635#comment-16627635
 ] 

ASF GitHub Bot commented on HIVE-20632:
---

GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/438

HIVE-20632: Query with get_splits UDF fails if materialized view is created 
on queried table.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-20632

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #438


commit 8356b2ea7d6c699e3a5057b34e5752b2c871aafc
Author: Sankar Hariappan 
Date:   2018-09-25T16:31:41Z

HIVE-20632: Query with get_splits UDF fails if materialized view is created 
on queried table.




> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query: select get_splits(select a from t1 where a > 5); 
> – this fails with an AssertionError, as the repro sketch and trace below show.
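> A minimal repro sketch over JDBC (the connection URL, credentials, sample 
> values, and the numSplits argument are assumptions for illustration):
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.Statement;
>
> public class GetSplitsRepro {
>   public static void main(String[] args) throws Exception {
>     try (Connection conn = DriverManager.getConnection(
>              "jdbc:hive2://localhost:10000/default", "hive", "");
>          Statement st = conn.createStatement()) {
>       st.execute("create table t1 (a int) stored as orc "
>           + "tblproperties ('transactional'='true')");
>       st.execute("insert into t1 values (1), (6), (10)");
>       st.execute("create materialized view mv as select a from t1 where a > 5");
>       // Before the fix, the next call dies in DbTxnManager.getValidWriteIds()
>       // with the AssertionError shown in the trace below.
>       st.execute("select get_splits('select a from t1 where a > 5', 5)");
>     }
>   }
> }
> {code}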
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> 

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-20632:
--
Labels: UDF materializedviews pull-request-available  (was: UDF 
materializedviews)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query: select get_splits(select a from t1 where a > 5); 
> – this fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at 

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Component/s: (was: Standalone Metastore)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query: select get_splits(select a from t1 where a > 5); 
> – this fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at 

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Component/s: (was: HiveServer2)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query: select get_splits(select a from t1 where a > 5); 
> – this fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at 

[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-25 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627612#comment-16627612
 ] 

Hive QA commented on HIVE-20535:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
47s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 3 new + 142 unchanged - 6 
fixed = 145 total (was 148) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14041/dev-support/hive-personality.sh
 |
| git revision | master / e161b01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14041/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14041/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.2.patch, 
> HIVE-20535.3.patch, HIVE-20535.4.patch, HIVE-20535.5.patch, 
> HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be better to expose a pool size for concurrent compilation, so that 
> the administrator can limit the load.
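> A sketch of the idea using a counting semaphore (class and method names 
> here are illustrative, not the committed implementation):
> {code:java}
> import java.util.concurrent.Semaphore;
>
> public class BoundedCompileLock {
>   private final Semaphore slots;
>
>   public BoundedCompileLock(int poolSize) {
>     // poolSize == 1 behaves like the old global lock; larger values let
>     // that many queries compile concurrently.
>     this.slots = new Semaphore(poolSize, /* fair = */ true);
>   }
>
>   public void acquireForCompile() throws InterruptedException {
>     slots.acquire();
>   }
>
>   public void releaseAfterCompile() {
>     slots.release();
>   }
> }
> {code}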



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-25 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627604#comment-16627604
 ] 

Sankar Hariappan commented on HIVE-20632:
-

[~jcamachorodriguez], [~maheshk114],

Can you please take a look at the patch?

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Materialized views, Standalone Metastore, 
> UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews
> Attachments: HIVE-20632.01.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query: select get_splits(select a from t1 where a > 5); 
> – this fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at 
