[jira] [Created] (HIVE-26831) Random temp ID for lock enqueue and commitTxn is not guaranteed negative

2022-12-09 Thread Matthew Sharp (Jira)
Matthew Sharp created HIVE-26831:


 Summary: Random temp ID for lock enqueue and commitTxn is not 
guaranteed negative
 Key: HIVE-26831
 URL: https://issues.apache.org/jira/browse/HIVE-26831
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0-alpha-2, 4.0.0-alpha-1
Reporter: Matthew Sharp


HIVE-23283 mentions the goal of generating a negative number to avoid any 
potential conflicts, but the current random number generation is not guaranteed 
to produce only negative values.

From the TxnHandler class:
{code:java}
private long generateTemporaryId() {
  return -1 * ThreadLocalRandom.current().nextLong();  
} {code}
If nextLong() returns a negative value, the negation produces a positive temporary ID. 
The odds of this causing a conflict may be low, and retries may hide it from users, 
but it should be fixed so that only negative values are ever returned.

Something like this may be best:
{code:java}
private long generateTemporaryId() { 
  return ThreadLocalRandom.current().nextLong(Long.MIN_VALUE, 0); 
}{code}
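
For reference, ThreadLocalRandom.nextLong(origin, bound) returns values in [origin, bound), so nextLong(Long.MIN_VALUE, 0) can only yield strictly negative longs. A minimal standalone sketch of the proposed generator (the class name and the brute-force check below are illustrative only, not part of TxnHandler):
{code:java}
import java.util.concurrent.ThreadLocalRandom;

public class TemporaryIdSketch {

  // Bounded nextLong(origin, bound) draws from [Long.MIN_VALUE, 0),
  // so the result is always negative, unlike negating an unbounded nextLong().
  private static long generateTemporaryId() {
    return ThreadLocalRandom.current().nextLong(Long.MIN_VALUE, 0);
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1_000_000; i++) {
      long id = generateTemporaryId();
      if (id >= 0) {
        throw new AssertionError("non-negative temporary id: " + id);
      }
    }
    System.out.println("all sampled temporary ids were negative");
  }
}
{code}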
 





[jira] [Created] (HIVE-26830) Update TPCDS30TB metastore dump with histograms

2022-12-09 Thread Alessandro Solimando (Jira)
Alessandro Solimando created HIVE-26830:
---

 Summary: Update TPCDS30TB metastore dump with histograms
 Key: HIVE-26830
 URL: https://issues.apache.org/jira/browse/HIVE-26830
 Project: Hive
  Issue Type: Improvement
  Components: Test
Affects Versions: 4.0.0-alpha-2
Reporter: Alessandro Solimando


Once histogram statistics are added, we should re-create the 30TB TPCDS setup, 
compute statistics, and update the metastore dump.





[jira] [Created] (HIVE-26829) Upgrade avro to 1.11.0

2022-12-09 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created HIVE-26829:
--

 Summary: Upgrade avro to 1.11.0
 Key: HIVE-26829
 URL: https://issues.apache.org/jira/browse/HIVE-26829
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0-alpha-2
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal


To address CVE-2021-43045, Avro needs to be upgraded to 1.11.0.





[jira] [Created] (HIVE-26828) Fix OOM for hybridgrace_hashjoin_2.q

2022-12-09 Thread Alessandro Solimando (Jira)
Alessandro Solimando created HIVE-26828:
---

 Summary: Fix OOM for hybridgrace_hashjoin_2.q
 Key: HIVE-26828
 URL: https://issues.apache.org/jira/browse/HIVE-26828
 Project: Hive
  Issue Type: Bug
  Components: Test, Tez
Affects Versions: 4.0.0-alpha-2
Reporter: Alessandro Solimando


The _hybridgrace_hashjoin_2.q_ test was disabled because it was failing intermittently with OOM 
(copied from the [flaky test 
output|http://ci.hive.apache.org/blue/organizations/jenkins/hive-flaky-check/detail/hive-flaky-check/597/tests/],
 in case it disappears):
{noformat}
property: qfile used as override with val: hybridgrace_hashjoin_2.q
property: run_disabled used as override with val: false
Setting hive-site: file:/home/jenkins/agent/workspace/hive-flaky-check/data/conf/tez//hive-site.xml
Initializing the schema to: 4.0.0
Metastore connection URL:  jdbc:derby:memory:junit_metastore_db;create=true
Metastore connection Driver :   org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:  APP
Metastore connection Password:   mine
Starting metastore schema initialization to 4.0.0
Initialization script hive-schema-4.0.0.derby.sql
Initialization script completed
Running: diff -a /home/jenkins/agent/workspace/hive-flaky-check/itests/qtest/target/qfile-results/clientpositive/hybridgrace_hashjoin_2.q.out /home/jenkins/agent/workspace/hive-flaky-check/ql/src/test/results/clientpositive/tez/hybridgrace_hashjoin_2.q.out
1954,1999d1953
< Status: Failed
< Vertex failed, vertexName=Map 2, vertexId=vertex_#ID#, diagnostics=[Vertex vertex_#ID# [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: z1 initializer failed, vertex=vertex_#ID# [Map 2], java.lang.RuntimeException: Failed to load plan: hdfs://localhost:45033/home/jenkins/agent/workspace/hive-flaky-check/itests/qtest/target/tmp/scratchdir/jenkins/88f705a8-2d67-4d0a-92fd-d9617faf4e46/hive_2022-12-08_02-25-15_569_4666093830564098399-1/jenkins/_tez_scratch_dir/5b786380-b362-45e0-ac10-0f835ef1d8d7/map.xml
<  A masked pattern was here 
< Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.OutOfMemoryError: GC overhead limit exceeded
< Serialization trace:
< childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
< childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
< aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
<  A masked pattern was here 
< Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
<  A masked pattern was here 
< ]
< [Masked Vertex killed due to OTHER_VERTEX_FAILURE]
< [Masked Vertex killed due to OTHER_VERTEX_FAILURE]
< [Masked Vertex killed due to OTHER_VERTEX_FAILURE]
< [Masked Vertex killed due to OTHER_VERTEX_FAILURE]
< [Masked Vertex killed due to OTHER_VERTEX_FAILURE]
< DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:5
< FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 2, vertexId=vertex_#ID#, diagnostics=[Vertex vertex_#ID# [Map 2] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: z1 initializer failed, vertex=vertex_#ID# [Map 2], java.lang.RuntimeException: Failed to load plan: hdfs://localhost:45033/home/jenkins/agent/workspace/hive-flaky-check/itests/qtest/target/tmp/scratchdir/jenkins/88f705a8-2d67-4d0a-92fd-d9617faf4e46/hive_2022-12-08_02-25-15_569_4666093830564098399-1/jenkins/_tez_scratch_dir/5b786380-b362-45e0-ac10-0f835ef1d8d7/map.xml
<  A masked pattern was here 
< Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.OutOfMemoryError: GC overhead limit exceeded
< Serialization trace:
< childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
< childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
< aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
<  A masked pattern was here 
< Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
<  A masked pattern was here 
< ][Masked Vertex killed due to OTHER_VERTEX_FAILURE][Masked Vertex killed due to OTHER_VERTEX_FAILURE][Masked Vertex killed due to OTHER_VERTEX_FAILURE][Masked Vertex killed due to OTHER_VERTEX_FAILURE][Masked Vertex killed due to OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:5
< PREHOOK: query: SELECT COUNT(*)
< FROM src1 x
< JOIN srcpart z1 ON (x.key = z1.key)
< JOIN src y1 ON (x.key = y1.key)
< JOIN srcpart z2 ON (x.value = z2.value)
< JOIN src y2 ON (x.value = y2.value)
< WHERE z1.key < '' AND z2.key < 'zz'
<  AND y1.value < '' AND y2.value < 'zz'
< PREHOOK: type: QUERY
< PREHOOK: Input: default@src
< PREHOOK: Input: default@src1
< PREHOOK: Input: default@srcpart
< PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11
< PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=12

[jira] [Created] (HIVE-26827) Add configs to workaround predicate issue with Parquet on TIMESTAMP data type

2022-12-09 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-26827:
-

 Summary: Add configs to workaround predicate issue with Parquet on 
TIMESTAMP data type
 Key: HIVE-26827
 URL: https://issues.apache.org/jira/browse/HIVE-26827
 Project: Hive
  Issue Type: Improvement
Reporter: Taraka Rama Rao Lethavadla
Assignee: Taraka Rama Rao Lethavadla


The query below fails:
{noformat}
select * from db.parquet_table_with_timestamp where created_date_utc between
'2022-11-05 00:01:01' and '2022-11-08 23:59:59'{noformat}
with the following error:
{noformat}
2022-11-10 06:43:36,751 [ERROR] [TezChild] |read.ParquetFilterPredicateConverter|: fail to build predicate filter leaf with errors org.apache.hadoop.hive.ql.metadata.HiveException: Conversion to Parquet FilterPredicate not supported for TIMESTAMP{noformat}
We can work around the issue by setting these configs at the session level:
 # set hive.optimize.index.filter=false;
 # set hive.optimize.ppd=false;

As part of this Jira, I propose adding these config hints to the above error message so that whoever encounters this problem can try the workaround.
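
For illustration only, a minimal JDBC sketch of applying the session-level workaround before running the failing query (the HiveServer2 URL is an assumption, not from this issue; the table and column names are taken from the query above):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TimestampPredicateWorkaround {
  public static void main(String[] args) throws Exception {
    // Assumed HiveServer2 endpoint; adjust host/port/database for your cluster.
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/db");
         Statement stmt = conn.createStatement()) {
      // Session-level workaround: stop pushing the TIMESTAMP predicate down to Parquet.
      stmt.execute("set hive.optimize.index.filter=false");
      stmt.execute("set hive.optimize.ppd=false");
      try (ResultSet rs = stmt.executeQuery(
          "select * from parquet_table_with_timestamp"
              + " where created_date_utc between '2022-11-05 00:01:01' and '2022-11-08 23:59:59'")) {
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}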

 





[jira] [Created] (HIVE-26826) Large table drop causes issues when interrupted

2022-12-09 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26826:
-

 Summary: Large table drop causes issues when interrupted
 Key: HIVE-26826
 URL: https://issues.apache.org/jira/browse/HIVE-26826
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: KIRTI RUGE


*Issue:*

1. Create a table with a large number of partitions (e.g. 10K partitions).

2. Drop this table (this takes a lot longer because of S3 and other calls).

3. Interrupt step 2, e.g. with "Ctrl+C" in Beeline or by cancelling the query in Hue.

4. Because interrupt handling has issues in this codepath, the query is not 
killed.

From this point onwards, other statements start waiting, as the lockManager 
does not release the locks completely.


