[jira] [Commented] (HIVE-13424) Refactoring the code to pass a QueryState object rather than HiveConf object
[ https://issues.apache.org/jira/browse/HIVE-13424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231678#comment-15231678 ] Hive QA commented on HIVE-13424: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797330/HIVE-13424.3.patch {color:green}SUCCESS:{color} +1 due to 14 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 389 failed/errored test(s), 9967 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_2_orc org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_orc org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_stats_orc org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join21 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join30 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_nulls org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_13 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_15 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_8 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_9 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucketpruning1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby_empty org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_semijoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_exists 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_not_in org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_union org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_views org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_column_names_with_leading_and_trailing_spaces org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_dpp org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_semijoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_count org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_create_merge_compressed org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join
[jira] [Comment Edited] (HIVE-13457) Create HS2 REST API endpoints for monitoring information
[ https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231656#comment-15231656 ] Thejas M Nair edited comment on HIVE-13457 at 4/8/16 5:21 AM: -- That's a great idea. I have also been thinking that creating and exposing codahale/dropwizard [health check|https://dropwizard.github.io/metrics/3.1.0/manual/healthchecks/] metrics would be very useful for monitoring. That way monitoring tools such as Ambari can give an alert that includes the problem being faced. For example, for HS2 - Health check (OK / error-message values): * metastore persistence * filesystem * thread capacity * memory usage was (Author: thejas): That's a great idea. I have also been thinking that creating and exposing codahale/dropwizard health check metrics would be very useful for monitoring. That way monitoring tools such as Ambari can give an alert that includes the problem being faced. For example, for HS2 - Health check (OK / error-message values): * metastore persistence * filesystem * thread capacity * memory usage > Create HS2 REST API endpoints for monitoring information > > > Key: HIVE-13457 > URL: https://issues.apache.org/jira/browse/HIVE-13457 > Project: Hive > Issue Type: Improvement >Reporter: Szehon Ho > > Similar to what is exposed in the HS2 web UI in HIVE-12338, it would be nice if > other UIs like admin tools or Hue can access and display this information as > well. Hence, we will create some REST endpoints to expose this information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13457) Create HS2 REST API endpoints for monitoring information
[ https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231656#comment-15231656 ] Thejas M Nair commented on HIVE-13457: -- That's a great idea. I have also been thinking that creating and exposing codahale/dropwizard health check metrics would be very useful for monitoring. That way monitoring tools such as Ambari can give an alert that includes the problem being faced. For example, for HS2 - Health check (OK / error-message values): * metastore persistence * filesystem * thread capacity * memory usage > Create HS2 REST API endpoints for monitoring information > > > Key: HIVE-13457 > URL: https://issues.apache.org/jira/browse/HIVE-13457 > Project: Hive > Issue Type: Improvement >Reporter: Szehon Ho > > Similar to what is exposed in the HS2 web UI in HIVE-12338, it would be nice if > other UIs like admin tools or Hue can access and display this information as > well. Hence, we will create some REST endpoints to expose this information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
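The health-check idea above can be sketched without the dropwizard dependency. The following is a minimal, self-contained approximation of the codahale HealthCheck registry pattern; every class and check name here is invented for illustration and is not HS2 code:

```java
// Illustrative sketch of a health-check registry in the codahale/dropwizard
// style: named checks, each returning OK or an error message, runnable in
// one call so a monitoring tool (e.g. Ambari) can poll them.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class HealthCheckSketch {
    // Result mirrors the OK / error-message shape described in the comment.
    public static final class Result {
        public final boolean healthy;
        public final String message;
        private Result(boolean healthy, String message) {
            this.healthy = healthy;
            this.message = message;
        }
        public static Result healthy() { return new Result(true, "OK"); }
        public static Result unhealthy(String msg) { return new Result(false, msg); }
    }

    private final Map<String, Supplier<Result>> checks = new LinkedHashMap<>();

    public void register(String name, Supplier<Result> check) {
        checks.put(name, check);
    }

    // Runs every registered check; a check that throws counts as unhealthy.
    public Map<String, Result> runAll() {
        Map<String, Result> results = new LinkedHashMap<>();
        checks.forEach((name, check) -> {
            try {
                results.put(name, check.get());
            } catch (RuntimeException e) {
                results.put(name, Result.unhealthy(e.getMessage()));
            }
        });
        return results;
    }

    public static void main(String[] args) {
        HealthCheckSketch registry = new HealthCheckSketch();
        registry.register("metastore-persistence", Result::healthy);
        registry.register("memory-usage", () -> Result.unhealthy("heap above 90%"));
        registry.runAll().forEach((name, r) ->
            System.out.println(name + ": " + (r.healthy ? "OK" : r.message)));
    }
}
```

A REST endpoint in the sense of HIVE-13457 would then serialize the map returned by `runAll()`.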
[jira] [Commented] (HIVE-13434) BaseSemanticAnalyzer.unescapeSQLString doesn't unescape \u0000 style character literals.
[ https://issues.apache.org/jira/browse/HIVE-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231584#comment-15231584 ] Hive QA commented on HIVE-13434: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797313/HIVE-13434.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 9982 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityPreemption org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat.org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.org.apache.hive.service.TestHS2ImpersonationWithRemoteMS {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7506/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7506/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7506/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 22 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12797313 - PreCommit-HIVE-TRUNK-Build > BaseSemanticAnalyzer.unescapeSQLString doesn't unescape \u0000 style > character literals. > > > Key: HIVE-13434 > URL: https://issues.apache.org/jira/browse/HIVE-13434 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.1.0 >Reporter: Kousuke Saruta >Assignee: Kousuke Saruta > Attachments: HIVE-13434.1.patch > > > BaseSemanticAnalyzer.unescapeSQLString method may have a fault. When "\u0061" > style character literals are passed to the method, they are not unescaped > successfully. > In the Spark SQL project, we referenced the unescaping logic and noticed this > issue (SPARK-14426) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
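As a sketch of the behavior the issue asks for, the following minimal unescaper turns \u0061-style literals into their characters. It is illustrative only, not Hive's actual unescapeSQLString fix:

```java
// Illustrative sketch: a minimal unescaper for backslash-u literals of the
// kind HIVE-13434 says unescapeSQLString mishandles. Not Hive's real code;
// it assumes well-formed input (four hex digits after the escape).
public class UnicodeUnescape {
    public static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            // A unicode literal is six characters: backslash, 'u', four hex digits.
            if (s.charAt(i) == '\\' && i + 5 < s.length() && s.charAt(i + 1) == 'u') {
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(s.charAt(i));
                i += 1;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u0061bc")); // prints "abc"
    }
}
```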
[jira] [Updated] (HIVE-13459) Cassandra Hive throws "Unable to find partitioner class 'org.apache.cassandra.dht.Murmur3Partitioner'"
[ https://issues.apache.org/jira/browse/HIVE-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yb updated HIVE-13459: -- Description: Using Hive to execute a select statement on Cassandra throws an error: hive> select * from genericquantity; OK Failed with exception java.io.IOException:java.lang.RuntimeException: org.apache.cassandra.exceptions.ConfigurationException: Unable to find partitioner class 'org.apache.cassandra.dht.Murmur3Partitioner' Time taken: 0.518 seconds middle:hive-1.2.0-hadoop-2.6.0-cassandra-2.1.6.jar > Cassandra Hive throws "Unable to find partitioner class > 'org.apache.cassandra.dht.Murmur3Partitioner'" > -- > > Key: HIVE-13459 > URL: https://issues.apache.org/jira/browse/HIVE-13459 > Project: Hive > Issue Type: Bug >Reporter: yb > > Using Hive to execute a select statement on Cassandra throws an > error: > hive> select * from genericquantity; > OK > Failed with exception java.io.IOException:java.lang.RuntimeException: > org.apache.cassandra.exceptions.ConfigurationException: Unable to find > partitioner class 'org.apache.cassandra.dht.Murmur3Partitioner' > Time taken: 0.518 seconds > middle:hive-1.2.0-hadoop-2.6.0-cassandra-2.1.6.jar -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6535) JDBC: async wait should happen during fetch for results
[ https://issues.apache.org/jira/browse/HIVE-6535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231519#comment-15231519 ] Vaibhav Gumashta commented on HIVE-6535: Discussed with [~thejas], and it makes sense to implement this as a non-standard API call. Looks like there is an expectation that Statement#execute be a blocking call. > JDBC: async wait should happen during fetch for results > --- > > Key: HIVE-6535 > URL: https://issues.apache.org/jira/browse/HIVE-6535 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, JDBC >Affects Versions: 0.14.0, 1.2.1, 2.0.0 >Reporter: Thejas M Nair >Assignee: Vaibhav Gumashta > Attachments: HIVE-6535.1.patch, HIVE-6535.2.patch > > > The Hive JDBC client waits for query completion during the execute() call. It would > be better to block in the JDBC driver for completion when the results are being > fetched. > This way the application using the Hive JDBC driver can do other tasks while > asynchronous query execution is happening, until it needs to fetch the result > set. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
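The division of labor discussed here (a non-blocking execute, with the wait deferred to fetch time) can be sketched with plain java.util.concurrent. The execute/fetch names below are hypothetical, not the Hive JDBC API:

```java
// Sketch of the fetch-time-wait pattern: execute() returns immediately with
// a handle, and the wait for query completion happens only when the
// application asks for results.
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncFetchSketch {
    // "execute": kick off the query and return a handle without blocking.
    static Future<String> execute(ExecutorService pool, String query) {
        return pool.submit(() -> {
            Thread.sleep(50);              // stand-in for server-side execution
            return "results of " + query;
        });
    }

    // "fetch": block here, at result-fetch time, not at submit time.
    static String fetch(Future<String> handle) {
        try {
            return handle.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> handle = execute(pool, "select count(*) from t");
        // ... the application is free to do other work here ...
        System.out.println(fetch(handle)); // prints "results of select count(*) from t"
        pool.shutdown();
    }
}
```

The comment's point is that a standard `Statement#execute` is expected to block, which is why the non-blocking variant belongs in a non-standard API call.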
[jira] [Commented] (HIVE-10176) skip.header.line.count causes values to be skipped when performing insert values
[ https://issues.apache.org/jira/browse/HIVE-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231496#comment-15231496 ] Hive QA commented on HIVE-10176: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797309/HIVE-10176.10.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9978 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7505/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7505/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7505/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12797309 - PreCommit-HIVE-TRUNK-Build > skip.header.line.count causes values to be skipped when performing insert > values > > > Key: HIVE-10176 > URL: https://issues.apache.org/jira/browse/HIVE-10176 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Wenbo Wang >Assignee: Vladyslav Pavlenko > Attachments: HIVE-10176.1.patch, HIVE-10176.10.patch, > HIVE-10176.2.patch, HIVE-10176.3.patch, HIVE-10176.4.patch, > HIVE-10176.5.patch, HIVE-10176.6.patch, HIVE-10176.7.patch, > HIVE-10176.8.patch, HIVE-10176.9.patch, data > > > When inserting values in to tables with TBLPROPERTIES > ("skip.header.line.count"="1") the first value listed is also skipped. > create table test (row int, name string) TBLPROPERTIES > ("skip.header.line.count"="1"); > load data local inpath '/root/data' into table test; > insert into table test values (1, 'a'), (2, 'b'), (3, 'c'); > (1, 'a') isn't inserted into the table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13395) Lost Update problem in ACID
[ https://issues.apache.org/jira/browse/HIVE-13395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-13395: -- Priority: Blocker (was: Critical) > Lost Update problem in ACID > --- > > Key: HIVE-13395 > URL: https://issues.apache.org/jira/browse/HIVE-13395 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.2.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > > ACID users can run into the Lost Update problem. > In Hive 1.2, Driver.recordValidTxns() (which records the snapshot to use for > the query) is called in Driver.compile(). > Now suppose two concurrent "update T set x = x + 1" are executed. (for > simplicity assume there is exactly 1 row in T) > What can happen is that both compile at the same time (more precisely before > acquireLocksAndOpenTxn() in runInternal() is called) and thus will lock in > the same snapshot, say the value of x = 7 in this snapshot. > Now 1 will get the lock on the row, the second will block. > Now 1 makes x = 8 and commits. > Now 2 proceeds and makes x = 8 again since in its snapshot x is still 7. > This specific issue is solved in Hive 1.3/2.0 (HIVE-11077 which is a large > patch that deals with multi-statement txns) by moving recordValidTxns() after > locks are acquired, which reduces the likelihood of this but doesn't eliminate > the problem. > > Even in the 1.3 version of the code, you could have the same issue. Assume the > same 2 queries: > Both start a txn, say txnid 9 and 10. Say 10 gets the lock first, 9 blocks. > 10 updates the row (so x = 8) and thus ReaderKey.currentTransactionId=10. > 10 commits. > Now 9 can proceed and it will get a snapshot that includes 10, i.e. it will > see x = 8 and it will write x = 9, but it will set > ReaderKey.currentTransactionId = 9. Thus when merge logic runs, it will see > x = 8 is the later version of this row, i.e. lost update. 
> The problem is that locks alone are insufficient for an MVCC architecture. > > At lower level Row ID has (originalTransactionId, rowid, bucket id, > currentTransactionId) and since on update/delete we do a table scan, we could > check that we are about to write a row with currentTransactionId < > (currentTransactionId of row we've read) and fail the query. Currently, > currentTransactionId is not surfaced at higher level where this check can be > made. > This would not work (efficiently) longer term where we want to support fast > update on user defined PK via streaming ingest. > Also, this would not work with multi statement txns since in that case we'd > lock in the snapshot at the start of the txn, but then 2nd, 3rd etc queries > would use the same snapshot and the locks for these queries would be acquired > after the snapshot is locked in so this would be the same situation as pre > HIVE-11077. > > > A more robust solution (commonly used with MVCC) is to keep track of start > and commit time (logical counter) of each transaction to detect if two txns > overlap. The 2nd part is to keep track of the write-set, i.e. which data (rows, > partitions, whatever appropriate level of granularity is) were modified by > any txn, and if 2 txns overlap in time and wrote the same element, abort the later > one. This is called the first-committer-wins rule. This requires a MS DB schema > change. > It would be most convenient to use the same sequence for txnId, start and > commit time (in which case txnid=start time). In this case we'd need to add > 1 field to the TXNS table. The complication here is that we'll be using elements > of the sequence faster and they are used as part of the file name of delta and > base dir and currently limited to 7 digits, which can be exceeded. So this > would require some thought to handling upgrade/migration. > Also, write-set tracking requires either an additional metastore table or > keeping info in HIVE_LOCKS around longer with new state. 
> > In the short term, on the SQL side of things we could (in auto commit mode only) > acquire the locks first and then open the txn AND update these locks with txn > id. > This implies another Thrift change to pass in lockId to openTxn. > The same would not work for the Streaming API since it opens several txns at once > and then acquires locks for each. > (Not sure if that is an issue or not since Streaming only does Insert.) > Either way this feels hacky. > > Here is one simple example why we need Write-Set tracking for multi-statement > txns > Consider transactions T ~1~ and T ~2~: > T ~1~: r ~1~\[x] -> w ~1~\[y] -> c ~1~ > T ~2~: w ~2~\[x] -> w ~2~\[y] -> c ~2~ > Suppose the order of operations is r ~1~\[x] w ~2~\[x] then a > conventional R/W lock manager w/o MVCC will block the write from T ~2~ > With MVCC we don't want
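The first-committer-wins rule described in the issue can be modeled in a few lines. This toy version (invented names, string row keys, a single logical counter for txn ids and start/commit times) only illustrates the overlap check, not the proposed metastore schema:

```java
// Toy model of first-committer-wins: each txn takes a start time from a
// logical counter and declares a write-set at commit; the commit fails if
// some txn that committed after this txn's start wrote an overlapping
// element. All names and the granularity are illustrative.
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FirstCommitterWins {
    private long counter = 0;
    // Committed txns: (commit time, write-set) pairs.
    private final List<Map.Entry<Long, Set<String>>> committed = new ArrayList<>();

    public synchronized long begin() { return ++counter; }

    /** Returns true if the commit succeeds, false if the txn must abort. */
    public synchronized boolean commit(long startTime, Set<String> writeSet) {
        for (Map.Entry<Long, Set<String>> c : committed) {
            // Overlapping in time and in write-set: the later committer aborts.
            if (c.getKey() > startTime && !Collections.disjoint(c.getValue(), writeSet)) {
                return false;
            }
        }
        committed.add(new AbstractMap.SimpleEntry<>(++counter, writeSet));
        return true;
    }

    public static void main(String[] args) {
        FirstCommitterWins mgr = new FirstCommitterWins();
        long t1 = mgr.begin();                                 // "update T set x = x + 1"
        long t2 = mgr.begin();                                 // the concurrent update
        System.out.println(mgr.commit(t2, Set.of("T/row1"))); // prints "true"
        System.out.println(mgr.commit(t1, Set.of("T/row1"))); // prints "false": lost update prevented
    }
}
```

This is exactly the scenario from the description: two overlapping updates of the same row, where the second committer is forced to abort instead of silently overwriting.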
[jira] [Updated] (HIVE-13187) hiveserver2 can suppress OOM errors in some cases
[ https://issues.apache.org/jira/browse/HIVE-13187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13187: -- Target Version/s: 2.1.0 (was: 2.0.1) > hiveserver2 can suppress OOM errors in some cases > - > > Key: HIVE-13187 > URL: https://issues.apache.org/jira/browse/HIVE-13187 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Priority: Critical > > Affects at least branch-2. > See trace in https://issues.apache.org/jira/browse/HIVE-13176 > This looks to be in src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java. > That catches Throwable in the thread and sends it further up. There are no > checks to see whether this is an Error or a general Exception - Errors end up > getting suppressed instead of killing HiveServer2. This is on the processing > threads. > It looks like the Handler threads have some kind of OOM checker on them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
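The fix direction implied by the report (distinguish Error from Exception when a worker thread catches Throwable) can be sketched as follows. This is a hypothetical illustration, not the actual TaskRunner patch:

```java
// Minimal sketch: treat Errors (e.g. OutOfMemoryError) differently from
// ordinary Exceptions instead of swallowing both under a catch (Throwable).
public class ErrorAwareRunner {
    public static String run(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Exception e) {
            return "failed: " + e.getMessage();   // recoverable: report upward
        } catch (Error e) {
            // Fatal JVM condition: do not suppress. A real server would
            // trigger shutdown here; we rethrow so the process can die.
            throw e;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { throw new IllegalStateException("bad plan"); }));
        try {
            run(() -> { throw new OutOfMemoryError("heap"); });
        } catch (Error e) {
            System.out.println("fatal error propagated: " + e.getMessage());
        }
    }
}
```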
[jira] [Updated] (HIVE-13176) OutOfMemoryError : GC overhead limit exceeded
[ https://issues.apache.org/jira/browse/HIVE-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13176: -- Target Version/s: 2.1.0 (was: 2.0.1) > OutOfMemoryError : GC overhead limit exceeded > -- > > Key: HIVE-13176 > URL: https://issues.apache.org/jira/browse/HIVE-13176 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Kavan Suresh >Assignee: Siddharth Seth > Attachments: dataNucleus.png, fs.png, shutdownhook.png > > > Detected leaks while testing hiveserver2 concurrency setup with LLAP. > 2016-02-26T12:50:58,131 ERROR [HiveServer2-Background-Pool: Thread-311030]: > operation.Operation (SQLOperation.java:run(230)) - Error running hive query: > org.apache.hive.service.cli.HiveSQLException: Error while processing > statement: FAILED: Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.StatsTask. GC overhead limit exceeded > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:333) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:177) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:73) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:227) > [hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_45] > at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_45] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > [hadoop-common-2.7.1.2.3.5.1-36.jar:?] 
> at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:239) > [hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [?:1.8.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [?:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_45] > at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13360) Refactoring Hive Authorization
[ https://issues.apache.org/jira/browse/HIVE-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231417#comment-15231417 ] Thejas M Nair commented on HIVE-13360: -- Regarding the change to move the IP address from the query context object (HiveAuthzContext/QueryContext) to HiveAuthenticationProvider: I don't think that is the right place for it. In HS2 HTTP mode, when proxies and Knox servers are between the end user and HS2, every request for a single session does not have to come via a single IP address. The current assumption in the Hive code base is that the IP address is valid for the entire session, but that is more of a bug. Also, HIVE-12777 provides the ability to serialize the session handle (equivalent to a JDBC connection identifier) and restore the session from that. The restoration could in theory happen from another machine with a different IP address. Considering this, the correct longer-term place for passing the IP address to authorization plugins is HiveAuthzContext/QueryContext. Also, QueryContext is not the best name for the class as it is passed for metastore API calls as well (HiveAuthorizer.filterListCmdObjects); IMO, something like "ActionContext" would be more appropriate. However, I don't think it's worth changing the name at the cost of changing the API. > Refactoring Hive Authorization > -- > > Key: HIVE-13360 > URL: https://issues.apache.org/jira/browse/HIVE-13360 > Project: Hive > Issue Type: Sub-task > Components: Security >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13360.01.patch, HIVE-13360.02.patch, > HIVE-13360.03.patch, HIVE-13360.04.patch, HIVE-13360.final.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4841) Add partition level hook to HiveMetaHook
[ https://issues.apache.org/jira/browse/HIVE-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231413#comment-15231413 ] Hive QA commented on HIVE-4841: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690295/HIVE-4841.4.patch.txt {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7504/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7504/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7504/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7504/source-prep.txt + [[ false == \t\r\u\e ]] + 
mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 7e0b08c..fee6669 master -> origin/master + git reset --hard HEAD HEAD is now at 7e0b08c HIVE-13360: Refactoring Hive Authorization (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/Driver.java.orig + git checkout master Already on 'master' Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at fee6669 HIVE-1: StatsOptimizer throws ClassCastException (Pengcheng Xiong, reviewed by Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12690295 - PreCommit-HIVE-TRUNK-Build > Add partition level hook to HiveMetaHook > > > Key: HIVE-4841 > URL: https://issues.apache.org/jira/browse/HIVE-4841 > Project: Hive > Issue Type: Improvement > Components: StorageHandler >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-4841.4.patch.txt, HIVE-4841.D11673.1.patch, > HIVE-4841.D11673.2.patch, HIVE-4841.D11673.3.patch > > > Current HiveMetaHook provides hooks for tables only. With partition level > hook, external storages also could be revised to exploit PPR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13421) Propagate job progress in operation status
[ https://issues.apache.org/jira/browse/HIVE-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231409#comment-15231409 ] Hive QA commented on HIVE-13421: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797272/HIVE-13421.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9980 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge_incompat1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7503/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7503/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7503/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12797272 - PreCommit-HIVE-TRUNK-Build > Propagate job progress in operation status > -- > > Key: HIVE-13421 > URL: https://issues.apache.org/jira/browse/HIVE-13421 > Project: Hive > Issue Type: Improvement >Reporter: Rajat Khandelwal >Assignee: Rajat Khandelwal > Attachments: HIVE-13421.01.patch, HIVE-13421.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13333) StatsOptimizer throws ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13333: --- Fix Version/s: 2.1.0 > StatsOptimizer throws ClassCastException > > > Key: HIVE-13333 > URL: https://issues.apache.org/jira/browse/HIVE-13333 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13333.01.patch, HIVE-13333.02.patch, > HIVE-13333.03.patch > > > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true repros the > issue. > In StatsOptimizer with return path on, we may have aggr($f0), aggr($f1) in GBY > and then select aggr($f1), aggr($f0) in SEL. > Thus we need to use colExp to find out which position corresponds to which. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13333) StatsOptimizer throws ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231385#comment-15231385 ] Pengcheng Xiong commented on HIVE-13333: Manually reran all the failed tests; cannot repro. Pushed to master. Thanks [~ashutoshc] for the review. > StatsOptimizer throws ClassCastException > > > Key: HIVE-13333 > URL: https://issues.apache.org/jira/browse/HIVE-13333 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > Attachments: HIVE-13333.01.patch, HIVE-13333.02.patch, > HIVE-13333.03.patch > > > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true repros the > issue. > In StatsOptimizer with return path on, we may have aggr($f0), aggr($f1) in GBY > and then select aggr($f1), aggr($f0) in SEL. > Thus we need to use colExp to find out which position corresponds to which. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13333) StatsOptimizer throws ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13333: --- Resolution: Fixed Status: Resolved (was: Patch Available) > StatsOptimizer throws ClassCastException > > > Key: HIVE-13333 > URL: https://issues.apache.org/jira/browse/HIVE-13333 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > Attachments: HIVE-13333.01.patch, HIVE-13333.02.patch, > HIVE-13333.03.patch > > > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true repros the > issue. > In StatsOptimizer with return path on, we may have aggr($f0), aggr($f1) in GBY > and then select aggr($f1), aggr($f0) in SEL. > Thus we need to use colExp to find out which position corresponds to which. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13439) JDBC: provide a way to retrieve GUID to query Yarn ATS
[ https://issues.apache.org/jira/browse/HIVE-13439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231381#comment-15231381 ] Thejas M Nair commented on HIVE-13439: -- +1 > JDBC: provide a way to retrieve GUID to query Yarn ATS > -- > > Key: HIVE-13439 > URL: https://issues.apache.org/jira/browse/HIVE-13439 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13439.1.patch, HIVE-13439.2.patch > > > HIVE-9673 added support for passing base64 encoded operation handles to ATS. > We should add a method on the client side to retrieve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
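As background on the base64-encoded operation handle mentioned in HIVE-13439, the round trip can be sketched with the JDK's own Base64 codec. The helper names below are hypothetical illustrations; the actual patch adds a method on the JDBC client side, not these helpers:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.UUID;

public class OperationHandleGuidDemo {
    // Hypothetical helper: encode an operation handle's GUID so a client
    // could pass it to an external system such as Yarn ATS.
    static String encodeGuid(UUID guid) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(guid.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical helper: recover the GUID from its base64 form.
    static UUID decodeGuid(String encoded) {
        byte[] raw = Base64.getUrlDecoder().decode(encoded);
        return UUID.fromString(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID guid = UUID.randomUUID();
        String wire = encodeGuid(guid);
        // Round trip: the decoded GUID must match the original.
        if (!decodeGuid(wire).equals(guid)) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("ok");
    }
}
```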
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: HIVE-9660.07.patch > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, > HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, > HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch, HIVE-9660.patch, > HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13341) Stats state is not captured correctly: differentiate load table and create table
[ https://issues.apache.org/jira/browse/HIVE-13341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13341: --- Status: Patch Available (was: Open) > Stats state is not captured correctly: differentiate load table and create > table > > > Key: HIVE-13341 > URL: https://issues.apache.org/jira/browse/HIVE-13341 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer, Statistics >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13341.01.patch, HIVE-13341.02.patch, > HIVE-13341.03.patch, HIVE-13341.04.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13341) Stats state is not captured correctly: differentiate load table and create table
[ https://issues.apache.org/jira/browse/HIVE-13341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13341: --- Attachment: HIVE-13341.04.patch > Stats state is not captured correctly: differentiate load table and create > table > > > Key: HIVE-13341 > URL: https://issues.apache.org/jira/browse/HIVE-13341 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer, Statistics >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13341.01.patch, HIVE-13341.02.patch, > HIVE-13341.03.patch, HIVE-13341.04.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13341) Stats state is not captured correctly: differentiate load table and create table
[ https://issues.apache.org/jira/browse/HIVE-13341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13341: --- Status: Open (was: Patch Available) > Stats state is not captured correctly: differentiate load table and create > table > > > Key: HIVE-13341 > URL: https://issues.apache.org/jira/browse/HIVE-13341 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer, Statistics >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13341.01.patch, HIVE-13341.02.patch, > HIVE-13341.03.patch, HIVE-13341.04.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: HIVE-9660.07.patch Addressing review comments. The biggest change is the index to kind change for lengths tracking > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, > HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, > HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13429) Tool to remove dangling scratch dir
[ https://issues.apache.org/jira/browse/HIVE-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-13429: -- Attachment: (was: HIVE-13429.2.patch) > Tool to remove dangling scratch dir > --- > > Key: HIVE-13429 > URL: https://issues.apache.org/jira/browse/HIVE-13429 > Project: Hive > Issue Type: Improvement >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-13429.1.patch, HIVE-13429.2.patch > > > We have seen cases where a user leaves the scratch dir behind, eventually > eating up HDFS storage. This can happen when a VM restarts, leaving Hive no > chance to run its shutdown hook. This applies to both HiveCli and HiveServer2. > Here we provide an external tool to clear dead scratch dirs as needed. > We need a way to identify which scratch dirs are in use. We rely on the HDFS > write lock for that. Here is how the HDFS write lock works: > 1. An HDFS client opens an HDFS file for write and closes it only at shutdown > 2. The cleanup process can try to open the HDFS file for write. If the client > holding this file is still running, we get an exception. Otherwise, we know the > client is dead > 3. If the HDFS client dies without closing the HDFS file, the NN reclaims the > lease after 10 min, i.e., the HDFS file held by the dead client is writable > again after 10 min > So here is how we remove dangling scratch directories in Hive: > 1. HiveCli/HiveServer2 opens a lock file with a well-known name in the scratch > directory and closes it only when we are about to drop the scratch directory > 2. A command line tool, cleardanglingscratchdir, checks every scratch > directory and tries to open the lock file for write. If it does not get an > exception, the owner is dead and we can safely remove the scratch directory > 3. The 10 min window means it is possible a HiveCli/HiveServer2 is dead but > we still cannot reclaim its scratch directory for another 10 min, but this > should be tolerable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
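The lock-file liveness check described above relies on HDFS write-lease semantics. As a rough local-filesystem analogue (using `java.nio` file locks instead of HDFS leases; the class and method names here are illustrative, not the patch's actual code), the probe looks like:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ScratchDirLockDemo {
    /** Returns true if the lock file is currently held (owner presumed alive). */
    static boolean ownerAlive(Path lockFile) throws IOException {
        try (FileChannel ch = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
            FileLock probe = ch.tryLock();
            if (probe == null) return true;   // held by another process
            probe.release();
            return false;                     // nobody holds it: owner is dead
        } catch (OverlappingFileLockException e) {
            return true;                      // held within this JVM
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempFile("scratchdir", ".lock");
        // Simulate a live HiveCli/HiveServer2 holding the lock file open.
        try (FileChannel owner = FileChannel.open(lock, StandardOpenOption.WRITE)) {
            FileLock held = owner.lock();
            if (!ownerAlive(lock)) throw new AssertionError("expected alive");
            held.release();
        }
        // Owner has "died" (lock released); cleanup may reclaim the scratch dir.
        if (ownerAlive(lock)) throw new AssertionError("expected dead");
        Files.delete(lock);
        System.out.println("ok");
    }
}
```

Unlike this sketch, the real tool also has to tolerate the 10-minute NameNode lease-recovery window described in point 3.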
[jira] [Updated] (HIVE-13429) Tool to remove dangling scratch dir
[ https://issues.apache.org/jira/browse/HIVE-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-13429: -- Attachment: HIVE-13429.2.patch > Tool to remove dangling scratch dir > --- > > Key: HIVE-13429 > URL: https://issues.apache.org/jira/browse/HIVE-13429 > Project: Hive > Issue Type: Improvement >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-13429.1.patch, HIVE-13429.2.patch > > > We have seen cases where a user leaves the scratch dir behind, eventually > eating up HDFS storage. This can happen when a VM restarts, leaving Hive no > chance to run its shutdown hook. This applies to both HiveCli and HiveServer2. > Here we provide an external tool to clear dead scratch dirs as needed. > We need a way to identify which scratch dirs are in use. We rely on the HDFS > write lock for that. Here is how the HDFS write lock works: > 1. An HDFS client opens an HDFS file for write and closes it only at shutdown > 2. The cleanup process can try to open the HDFS file for write. If the client > holding this file is still running, we get an exception. Otherwise, we know the > client is dead > 3. If the HDFS client dies without closing the HDFS file, the NN reclaims the > lease after 10 min, i.e., the HDFS file held by the dead client is writable > again after 10 min > So here is how we remove dangling scratch directories in Hive: > 1. HiveCli/HiveServer2 opens a lock file with a well-known name in the scratch > directory and closes it only when we are about to drop the scratch directory > 2. A command line tool, cleardanglingscratchdir, checks every scratch > directory and tries to open the lock file for write. If it does not get an > exception, the owner is dead and we can safely remove the scratch directory > 3. The 10 min window means it is possible a HiveCli/HiveServer2 is dead but > we still cannot reclaim its scratch directory for another 10 min, but this > should be tolerable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13452) StatsOptimizer should return no rows on empty table with group by
[ https://issues.apache.org/jira/browse/HIVE-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231208#comment-15231208 ] Ashutosh Chauhan commented on HIVE-13452: - Yeah, the difference is that MySQL/Postgres treat a constant in a group-by expression as a positional reference into the select list, in which case it doesn't make sense. In Hive, you can get either behavior via the {{hive.groupby.orderby.position.alias}} config. However, the important point here is that even queries like {{select count(*) from t1 group by c1;}} should return no result set for an empty table. Group by a constant essentially means: treat all rows as one grouping, so for an empty table {{group by 1}} should return no rows, while a plain {{select count(*) from t1}} should return one row with value 0. > StatsOptimizer should return no rows on empty table with group by > - > > Key: HIVE-13452 > URL: https://issues.apache.org/jira/browse/HIVE-13452 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > > {code} > create table t1 (a int); > analyze table t1 compute statistics; > analyze table t1 compute statistics for columns; > select count(1) from t1 group by 1; > set hive.compute.query.using.stats=true; > select count(1) from t1 group by 1; > {code} > In both cases the result set should be empty. However, with StatsOptimizer on, > Hive returns one row with value 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13405) Fix Connection Leak in OrcRawRecordMerger
[ https://issues.apache.org/jira/browse/HIVE-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231199#comment-15231199 ] Prasanth Jayachandran commented on HIVE-13405: -- Yeah. This will work in jdk7 onwards. +1 > Fix Connection Leak in OrcRawRecordMerger > - > > Key: HIVE-13405 > URL: https://issues.apache.org/jira/browse/HIVE-13405 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.0.0 >Reporter: Thomas Poepping >Assignee: Thomas Poepping > Attachments: HIVE-13405.patch > > > In OrcRawRecordMerger.getLastFlushLength, if the opened stream throws an > IOException on .available() or on .readLong(), the function will exit without > closing the stream. > This patch adds a try-with-resources to fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
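The pattern the HIVE-13405 patch applies can be sketched with plain JDK streams. This is an illustrative stand-in, not the actual `OrcRawRecordMerger.getLastFlushLength` code; the `TrackingStream` wrapper exists only so the example can observe that `close()` ran on the failure path:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class FlushLengthDemo {
    // Wraps DataInputStream so we can observe whether close() ran.
    static class TrackingStream extends DataInputStream {
        boolean closed = false;
        TrackingStream(byte[] data) { super(new ByteArrayInputStream(data)); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // try-with-resources closes the stream on every path, including when
    // readLong() throws (e.g. EOFException on a truncated side file).
    static long readFlushLength(TrackingStream in) throws IOException {
        try (TrackingStream stream = in) {
            return stream.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        // Happy path: 8 bytes available, long is read and the stream is closed.
        TrackingStream ok = new TrackingStream(ByteBuffer.allocate(8).putLong(42L).array());
        if (readFlushLength(ok) != 42L || !ok.closed) throw new AssertionError("happy path");

        // Failure path: too few bytes -> EOFException, but close() still runs.
        TrackingStream truncated = new TrackingStream(new byte[3]);
        boolean threw = false;
        try {
            readFlushLength(truncated);
        } catch (IOException expected) {
            threw = true;
        }
        if (!threw || !truncated.closed) throw new AssertionError("stream leaked");
        System.out.println("ok");
    }
}
```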
[jira] [Comment Edited] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231193#comment-15231193 ] Vikram Dixit K edited comment on HIVE-13282 at 4/7/16 10:05 PM: Yes. We can move this out to 2.1.0. This only happens in the case of reduce-side SMB in Tez. We have a simple workaround right now that will address this (disable SMB join in this case). The real fix would take a lot of refactoring of the code, which is more suited for master than a maintenance release. was (Author: vikram.dixit): Yes. We can move this out to 2.1.0. This only happens in case of reduce side SMB in tez. We have a simple workaround right now that will address this. The real fix would take a lot of refactoring the code which is more suited for master than a maintenance release. > GroupBy and select operator encounter ArrayIndexOutOfBoundsException > > > Key: HIVE-13282 > URL: https://issues.apache.org/jira/browse/HIVE-13282 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.1, 2.0.0, 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > > The group by and select operators run into the ArrayIndexOutOfBoundsException > when they incorrectly initialize themselves with tag 0 but the incoming tag > id is different. > {code} > select count(*) from > (select rt1.id from > (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1 > join > (select rt2.id from > (select t2.key as id, t2.value as od from tab_part t2 group by key, value) > rt2) vt2 > where vt1.id=vt2.id; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13240) GroupByOperator: Drop the hash aggregates when closing operator
[ https://issues.apache.org/jira/browse/HIVE-13240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13240: Attachment: HIVE-13240.03.patch The same patch, to get the logs for test failures if any, to see if they are related > GroupByOperator: Drop the hash aggregates when closing operator > --- > > Key: HIVE-13240 > URL: https://issues.apache.org/jira/browse/HIVE-13240 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-13240.03.patch, HIVE-13240.1.patch, > HIVE-13240.2.patch > > > GroupByOperator holds onto the Hash aggregates accumulated when the plan is > cached. > Drop the hashAggregates in case of error during forwarding to the next > operator. > Added for PTF, TopN and all GroupBy cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13282: Target Version/s: 1.2.2, 2.1.0 (was: 1.2.2, 2.1.0, 2.0.1) > GroupBy and select operator encounter ArrayIndexOutOfBoundsException > > > Key: HIVE-13282 > URL: https://issues.apache.org/jira/browse/HIVE-13282 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.1, 2.0.0, 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > > The group by and select operators run into the ArrayIndexOutOfBoundsException > when they incorrectly initialize themselves with tag 0 but the incoming tag > id is different. > {code} > select count(*) from > (select rt1.id from > (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1 > join > (select rt2.id from > (select t2.key as id, t2.value as od from tab_part t2 group by key, value) > rt2) vt2 > where vt1.id=vt2.id; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13176) OutOfMemoryError : GC overhead limit exceeded
[ https://issues.apache.org/jira/browse/HIVE-13176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231194#comment-15231194 ] Sergey Shelukhin commented on HIVE-13176: - This is targeting 2.0.1 but has no patch. Should it be moved out to 2.1.0? > OutOfMemoryError : GC overhead limit exceeded > -- > > Key: HIVE-13176 > URL: https://issues.apache.org/jira/browse/HIVE-13176 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Kavan Suresh >Assignee: Siddharth Seth > Attachments: dataNucleus.png, fs.png, shutdownhook.png > > > Detected leaks while testing hiveserver2 concurrency setup with LLAP. > 2016-02-26T12:50:58,131 ERROR [HiveServer2-Background-Pool: Thread-311030]: > operation.Operation (SQLOperation.java:run(230)) - Error running hive query: > org.apache.hive.service.cli.HiveSQLException: Error while processing > statement: FAILED: Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.StatsTask. GC overhead limit exceeded > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:333) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:177) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:73) > ~[hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:227) > [hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_45] > at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_45] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > [hadoop-common-2.7.1.2.3.5.1-36.jar:?] 
> at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:239) > [hive-jdbc-2.0.0.2.3.5.1-36-standalone.jar:2.0.0.2.3.5.1-36] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [?:1.8.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [?:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_45] > at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231191#comment-15231191 ] Sergey Shelukhin commented on HIVE-13370: - Removing 2.0.1 target. Please feel free to commit to branch-2 anyway and fix for 2.0.1 if this happens before the release. > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13370: Target Version/s: 1.3.0, 1.2.2, 2.1.0 (was: 1.3.0, 1.2.2, 2.1.0, 2.0.1) > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan >Priority: Minor > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case, we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13405) Fix Connection Leak in OrcRawRecordMerger
[ https://issues.apache.org/jira/browse/HIVE-13405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231189#comment-15231189 ] Sergey Shelukhin commented on HIVE-13405: - +0.9. [~prasanth_j] does this make sense? > Fix Connection Leak in OrcRawRecordMerger > - > > Key: HIVE-13405 > URL: https://issues.apache.org/jira/browse/HIVE-13405 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.0.0 >Reporter: Thomas Poepping >Assignee: Thomas Poepping > Attachments: HIVE-13405.patch > > > In OrcRawRecordMerger.getLastFlushLength, if the opened stream throws an > IOException on .available() or on .readLong(), the function will exit without > closing the stream. > This patch adds a try-with-resources to fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231180#comment-15231180 ] Sergey Shelukhin commented on HIVE-13282: - This is targeting 2.0.1 but has no patch. Should it be moved out to 2.1.0? > GroupBy and select operator encounter ArrayIndexOutOfBoundsException > > > Key: HIVE-13282 > URL: https://issues.apache.org/jira/browse/HIVE-13282 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.2.1, 2.0.0, 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > > The group by and select operators run into the ArrayIndexOutOfBoundsException > when they incorrectly initialize themselves with tag 0 but the incoming tag > id is different. > {code} > select count(*) from > (select rt1.id from > (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1 > join > (select rt2.id from > (select t2.key as id, t2.value as od from tab_part t2 group by key, value) > rt2) vt2 > where vt1.id=vt2.id; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13408) Issue appending HIVE_QUERY_ID without checking if the prefix already exists
[ https://issues.apache.org/jira/browse/HIVE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13408: Status: Patch Available (was: Open) > Issue appending HIVE_QUERY_ID without checking if the prefix already exists > --- > > Key: HIVE-13408 > URL: https://issues.apache.org/jira/browse/HIVE-13408 > Project: Hive > Issue Type: Bug > Components: Shims >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13408.1.patch, HIVE-13408.2.patch > > > {code} > We are resetting the hadoop caller context to HIVE_QUERY_ID:HIVE_QUERY_ID: > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13187) hiveserver2 can suppress OOM errors in some cases
[ https://issues.apache.org/jira/browse/HIVE-13187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231179#comment-15231179 ] Sergey Shelukhin commented on HIVE-13187: - This is targeting 2.0.1 but has no patch. Should it be moved out to 2.1.0? > hiveserver2 can suppress OOM errors in some cases > - > > Key: HIVE-13187 > URL: https://issues.apache.org/jira/browse/HIVE-13187 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Priority: Critical > > Affects at least branch-2. > See trace in https://issues.apache.org/jira/browse/HIVE-13176 > This looks to be in src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java. > That catches Throwable in the thread and sends it further up. There are no > checks to see if this is an Error or general Exception - Errors end up > getting suppressed, instead of killing HiveServer2. This is on the processing > threads. > It looks like the Handler threads have some kind of OOM checker on them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13255) FloatTreeReader.nextVector is expensive
[ https://issues.apache.org/jira/browse/HIVE-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13255: Fix Version/s: 2.0.1 2.1.0 > FloatTreeReader.nextVector is expensive > > > Key: HIVE-13255 > URL: https://issues.apache.org/jira/browse/HIVE-13255 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.1.0, 2.0.1 > > Attachments: HIVE-13255.1.patch, HIVE-13255.2.patch, > bytecode-size-after.png, bytecode-size-before.png, float-reader-perf.png, > q1-bottleneck.png, q1-warm-perf-map.png > > > Some TPCDS queries on 1TB scale shows FloatTreeReader on profile samples. It > is most likely because of multiple branching and polymorphic dispatch in > FloatTreeReader.nextVector() implementation. See attached image for sampling > profile output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13417) Some vector operators return "OP" as name
[ https://issues.apache.org/jira/browse/HIVE-13417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-13417: -- Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) > Some vector operators return "OP" as name > - > > Key: HIVE-13417 > URL: https://issues.apache.org/jira/browse/HIVE-13417 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Fix For: 2.1.0 > > Attachments: HIVE-13417.1.patch, HIVE-13417.2.patch, > HIVE-13417.3.patch, HIVE-13417.4.patch > > > Select/Group by/Filter/etc need to return the same name whether they are the > regular or the vector operators. If they don't, the regular path matching in > our optimizer code doesn't work on them. > From the code it looks like an attempt was made to follow this - unfortunately > getOperatorName is static and polymorphism doesn't work on these functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
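The static-method pitfall HIVE-13417 describes comes from the Java language itself: a static method in a subclass hides, rather than overrides, the superclass version, so calls are resolved against the compile-time type. A tiny stand-alone reproduction (the operator classes below are hypothetical stand-ins, not Hive's actual classes):

```java
public class StaticDispatchDemo {
    static class SelectOperator {
        static String getOperatorName() { return "SEL"; } // static: no dynamic dispatch
        String displayName() { return "SEL"; }            // instance: virtual
    }
    static class VectorSelectOperator extends SelectOperator {
        static String getOperatorName() { return "OP"; }  // hides, does not override
        @Override String displayName() { return "VSEL"; }
    }

    public static void main(String[] args) {
        SelectOperator op = new VectorSelectOperator();
        // Instance methods dispatch on the runtime type...
        if (!op.displayName().equals("VSEL")) throw new AssertionError("virtual dispatch");
        // ...but static methods resolve against the class named at the call
        // site, so code written against the base class never sees the
        // subclass's name, no matter what the runtime type is.
        if (!SelectOperator.getOperatorName().equals("SEL")) throw new AssertionError();
        if (!VectorSelectOperator.getOperatorName().equals("OP")) throw new AssertionError();
        System.out.println("ok");
    }
}
```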
[jira] [Commented] (HIVE-13417) Some vector operators return "OP" as name
[ https://issues.apache.org/jira/browse/HIVE-13417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231151#comment-15231151 ] Gunther Hagleitner commented on HIVE-13417: --- Committed to master. > Some vector operators return "OP" as name > - > > Key: HIVE-13417 > URL: https://issues.apache.org/jira/browse/HIVE-13417 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Fix For: 2.1.0 > > Attachments: HIVE-13417.1.patch, HIVE-13417.2.patch, > HIVE-13417.3.patch, HIVE-13417.4.patch > > > Select/Group by/Filter/etc need to return the same name whether they are the > regular or the vector operators. If they don't, the regular path matching in > our optimizer code doesn't work on them. > From the code it looks like an attempt was made to follow this - unfortunately > getOperatorName is static and polymorphism doesn't work on these functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13417) Some vector operators return "OP" as name
[ https://issues.apache.org/jira/browse/HIVE-13417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231125#comment-15231125 ] Gunther Hagleitner commented on HIVE-13417: --- Test failure is unrelated.
[jira] [Commented] (HIVE-13417) Some vector operators return "OP" as name
[ https://issues.apache.org/jira/browse/HIVE-13417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231107#comment-15231107 ] Hive QA commented on HIVE-13417: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797255/HIVE-13417.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9980 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7502/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7502/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7502/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12797255 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13455) JDBC: disable UT for Statement.cancel (TestJdbcDriver2#testQueryCancel)
[ https://issues.apache.org/jira/browse/HIVE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231099#comment-15231099 ] Vaibhav Gumashta commented on HIVE-13455: - The test calls fail() in a spawned thread and junit doesn't fail the test due to that. Will need to address that as well before re-enabling. > JDBC: disable UT for Statement.cancel (TestJdbcDriver2#testQueryCancel) > --- > > Key: HIVE-13455 > URL: https://issues.apache.org/jira/browse/HIVE-13455 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13455.1.patch > > > JDBC Statement.cancel doesn't seem to work. The related UT is also flaky as a > result. We should disable it till we fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
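The JUnit pitfall noted in the comment (fail() on a spawned thread never reaches the runner, because only exceptions thrown on the test thread are reported) can be sketched as follows. The helper name is illustrative, not the actual test code:

```java
import java.util.concurrent.atomic.AtomicReference;

public class ThreadFailureDemo {
    // Run a task on a new thread and return any Throwable it raised,
    // so the test thread can rethrow it where the JUnit runner will see it.
    static Throwable runAndCapture(Runnable task) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try { task.run(); } catch (Throwable t) { failure.set(t); }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        // An AssertionError thrown directly on a spawned thread is swallowed;
        // captured this way, the test thread can rethrow it.
        Throwable t = runAndCapture(() -> { throw new AssertionError("cancel failed"); });
        System.out.println("captured: " + t.getMessage());
    }
}
```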
[jira] [Updated] (HIVE-13420) Clarify HS2 WebUI Query 'Elapsed TIme'
[ https://issues.apache.org/jira/browse/HIVE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-13420: - Issue Type: Sub-task (was: Improvement) Parent: HIVE-12338 > Clarify HS2 WebUI Query 'Elapsed TIme' > -- > > Key: HIVE-13420 > URL: https://issues.apache.org/jira/browse/HIVE-13420 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: Elapsed Time.png, HIVE-13420.2.patch, HIVE-13420.patch, > Patched UI.2.png, Patched UI.png > > > Today the "Queries" section of the WebUI shows SQLOperations that are not > closed. > Elapsed time is thus a bit confusing, people might take this to mean query > runtime, actually it is the time since the operation was opened. The query > may be finished, but operation is not closed. Perhaps another timer column > is needed showing the runtime of the query to reduce this confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13455) JDBC: disable UT for Statement.cancel (TestJdbcDriver2#testQueryCancel)
[ https://issues.apache.org/jira/browse/HIVE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13455: Attachment: HIVE-13455.1.patch
[jira] [Updated] (HIVE-13455) JDBC: disable UT for Statement.cancel (TestJdbcDriver2#testQueryCancel)
[ https://issues.apache.org/jira/browse/HIVE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13455: Summary: JDBC: disable UT for Statement.cancel (TestJdbcDriver2#testQueryCancel) (was: JDBC: disable UT for Statement.cancel)
[jira] [Commented] (HIVE-13452) StatsOptimizer should return no rows on empty table with group by
[ https://issues.apache.org/jira/browse/HIVE-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231060#comment-15231060 ] Pengcheng Xiong commented on HIVE-13452: mysql {code} Database changed mysql> create table t1 (a int); Query OK, 0 rows affected (0.02 sec) mysql> select count(1) from t1 group by 1; ERROR 1056 (42000): Can't group on 'count(1)' mysql> select count(1) from t1; +--+ | count(1) | +--+ |0 | +--+ 1 row in set (0.00 sec) {code} > StatsOptimizer should return no rows on empty table with group by > - > > Key: HIVE-13452 > URL: https://issues.apache.org/jira/browse/HIVE-13452 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > > {code} > create table t1 (a int); > analyze table t1 compute statistics; > analyze table t1 compute statistics for columns; > select count(1) from t1 group by 1; > set hive.compute.query.using.stats=true; > select count(1) from t1 group by 1; > {code} > In both cases result set should be empty. However, with statsoptimizer on > Hive returns one row with value 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12741) HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0
[ https://issues.apache.org/jira/browse/HIVE-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-12741: -- Fix Version/s: 1.3.0 > HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0 > --- > > Key: HIVE-12741 > URL: https://issues.apache.org/jira/browse/HIVE-12741 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 2.0.0, 2.1.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12741.1.patch > > > HIVE-12187 was meant to fix the described memory leak, however because of > interaction with HIVE-12187 in branch-2.0/master, the fix fails to take > effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13456) JDBC: fix Statement.cancel
[ https://issues.apache.org/jira/browse/HIVE-13456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13456: Description: JDBC Statement.cancel is supposed to work by cancelling the underlying execution and freeing resources. However, in my testing, I see it failing in some runs for the same query. > JDBC: fix Statement.cancel > -- > > Key: HIVE-13456 > URL: https://issues.apache.org/jira/browse/HIVE-13456 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > > JDBC Statement.cancel is supposed to work by cancelling the underlying > execution and freeing resources. However, in my testing, I see it failing in > some runs for the same query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13452) StatsOptimizer should return no rows on empty table with group by
[ https://issues.apache.org/jira/browse/HIVE-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231065#comment-15231065 ] Pengcheng Xiong commented on HIVE-13452: postgres {code} dbtmp=# create table t1 (a int); CREATE TABLE dbtmp=# select count(1) from t1 group by 1; ERROR: aggregates not allowed in GROUP BY clause LINE 1: select count(1) from t1 group by 1; ^ dbtmp=# select count(1) from t1; count --- 0 (1 row) {code}
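Both engines agree on the semantics the optimizer must preserve: with a GROUP BY, an empty input produces zero rows, while an ungrouped count() produces a single row containing 0. A sketch of that guard for a stats-only answer (class and method names are hypothetical, not Hive's StatsOptimizer code):

```java
// A stats-based rewrite of count(*) must distinguish a plain aggregate
// (one row, value 0 on an empty table) from a grouped aggregate
// (zero rows on an empty table).
public class StatsAnswerDemo {
    static long[] answerCount(long rowCount, boolean hasGroupBy) {
        if (hasGroupBy && rowCount == 0) {
            return new long[0];           // empty group set -> no rows at all
        }
        return new long[] { rowCount };   // ungrouped count over all rows
    }

    public static void main(String[] args) {
        System.out.println(answerCount(0, false).length); // 1 (the row holds 0)
        System.out.println(answerCount(0, true).length);  // 0 (no rows)
    }
}
```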
[jira] [Commented] (HIVE-12741) HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0
[ https://issues.apache.org/jira/browse/HIVE-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231059#comment-15231059 ] Daniel Dai commented on HIVE-12741: --- Also pushed to branch-1.
[jira] [Updated] (HIVE-13439) JDBC: provide a way to retrieve GUID to query Yarn ATS
[ https://issues.apache.org/jira/browse/HIVE-13439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13439: Attachment: HIVE-13439.2.patch > JDBC: provide a way to retrieve GUID to query Yarn ATS > -- > > Key: HIVE-13439 > URL: https://issues.apache.org/jira/browse/HIVE-13439 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 1.2.1, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-13439.1.patch, HIVE-13439.2.patch > > > HIVE-9673 added support for passing base64 encoded operation handles to ATS. > We should a method on client side to retrieve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12741) HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0
[ https://issues.apache.org/jira/browse/HIVE-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231046#comment-15231046 ] Daniel Dai commented on HIVE-12741: --- This affects branch-1 as well. The reason for this leak is Driver.compile is nested, and we only invoke destroy once for this case: {code} at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402) at org.apache.hadoop.hive.ql.optimizer.IndexUtils.createRootTask(IndexUtils.java:223) at org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler.getIndexBuilderMapRedTask(CompactIndexHandler.java:151) at org.apache.hadoop.hive.ql.index.TableBasedIndexHandler.getIndexBuilderMapRedTask(TableBasedIndexHandler.java:108) at org.apache.hadoop.hive.ql.index.TableBasedIndexHandler.generateIndexBuildTaskList(TableBasedIndexHandler.java:92) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.getIndexBuilderMapRed(DDLSemanticAnalyzer.java:1228) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterIndexRebuild(DDLSemanticAnalyzer.java:1175) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:408) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:464) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:318) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1188) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:406) at 
sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code}
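The shape of the leak in the stack trace above can be sketched with a toy registry. This is an illustrative model, not Hive's ShutdownHookManager API: every entry into compile registers the driver, the nested index-rebuild path enters compile a second time, and a single destroy removes only one registration:

```java
import java.util.ArrayList;
import java.util.List;

public class HookLeakDemo {
    // Stand-in for a shutdown-hook registry that keeps strong references.
    static final List<Object> hooks = new ArrayList<>();

    static void compile(Object driver, int depth) {
        hooks.add(driver);                          // registered on every entry
        if (depth > 0) compile(driver, depth - 1);  // nested compile re-enters
    }

    static void destroy(Object driver) {
        hooks.remove(driver);                       // removes one registration
    }

    public static void main(String[] args) {
        Object driver = new Object();
        compile(driver, 1);   // enters compile twice (outer + nested)
        destroy(driver);      // called once when the operation closes
        // One registration, and the Driver it references, stays forever.
        System.out.println("leaked registrations: " + hooks.size()); // 1
    }
}
```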
[jira] [Updated] (HIVE-13360) Refactoring Hive Authorization
[ https://issues.apache.org/jira/browse/HIVE-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13360: --- Affects Version/s: 2.0.0 > Refactoring Hive Authorization > -- > > Key: HIVE-13360 > URL: https://issues.apache.org/jira/browse/HIVE-13360 > Project: Hive > Issue Type: Sub-task > Components: Security >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13360.01.patch, HIVE-13360.02.patch, > HIVE-13360.03.patch, HIVE-13360.04.patch, HIVE-13360.final.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13360) Refactoring Hive Authorization
[ https://issues.apache.org/jira/browse/HIVE-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231022#comment-15231022 ] Pengcheng Xiong commented on HIVE-13360: Manually reran all the test cases that failed; could not reproduce any of the failures. Pushed to master. Thanks [~ashutoshc] for the review.
[jira] [Updated] (HIVE-13360) Refactoring Hive Authorization
[ https://issues.apache.org/jira/browse/HIVE-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13360: --- Resolution: Fixed Status: Resolved (was: Patch Available)
[jira] [Updated] (HIVE-13360) Refactoring Hive Authorization
[ https://issues.apache.org/jira/browse/HIVE-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13360: --- Attachment: HIVE-13360.final.patch
[jira] [Assigned] (HIVE-12968) genNotNullFilterForJoinSourcePlan: needs to merge predicates into the multi-AND
[ https://issues.apache.org/jira/browse/HIVE-12968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-12968: --- Assignee: Ashutosh Chauhan (was: Gopal V) > genNotNullFilterForJoinSourcePlan: needs to merge predicates into the > multi-AND > --- > > Key: HIVE-12968 > URL: https://issues.apache.org/jira/browse/HIVE-12968 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Ashutosh Chauhan >Priority: Minor > Attachments: HIVE-12968.1.patch, HIVE-12968.2.patch, > HIVE-12968.3.patch, HIVE-12968.4.patch, HIVE-12968.5.patch, > HIVE-12968.6.patch, HIVE-12968.7.patch > > > {code} > predicate: ((cbigint is not null and cint is not null) and cint BETWEEN > 100 AND 300) (type: boolean) > {code} > does not fold the IS_NULL on cint, because of the structure of the AND clause. > For example, see {{tez_dynpart_hashjoin_1.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12959) LLAP: Add task scheduler timeout when no nodes are alive
[ https://issues.apache.org/jira/browse/HIVE-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231010#comment-15231010 ] Prasanth Jayachandran commented on HIVE-12959: -- [~sseth] Could you please review? This patch needs tez-0.8.3-SNAPSHOT for compilation. > LLAP: Add task scheduler timeout when no nodes are alive > > > Key: HIVE-12959 > URL: https://issues.apache.org/jira/browse/HIVE-12959 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12959.1.patch, HIVE-12959.2.patch > > > When there are no llap daemons running task scheduler should have a timeout > to fail the query instead of waiting forever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13455) JDBC: disable UT for Statement.cancel
[ https://issues.apache.org/jira/browse/HIVE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13455: Component/s: JDBC HiveServer2
[jira] [Updated] (HIVE-12959) LLAP: Add task scheduler timeout when no nodes are alive
[ https://issues.apache.org/jira/browse/HIVE-12959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12959: - Attachment: HIVE-12959.2.patch
[jira] [Updated] (HIVE-13455) JDBC: disable UT for Statement.cancel
[ https://issues.apache.org/jira/browse/HIVE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-13455: Affects Version/s: 1.2.1 2.0.0
[jira] [Updated] (HIVE-13413) add a llapstatus command line tool
[ https://issues.apache.org/jira/browse/HIVE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13413: -- Attachment: HIVE-13413.03.patch Updated patch with review comments addressed. Thanks for the reviews. > add a llapstatus command line tool > -- > > Key: HIVE-13413 > URL: https://issues.apache.org/jira/browse/HIVE-13413 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13413.01.patch, HIVE-13413.02.patch, > HIVE-13413.03.patch, appComplete, invalidApp, oneContainerDown, running, > starting > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13450) LLAP: wire encryption for HDFS
[ https://issues.apache.org/jira/browse/HIVE-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230959#comment-15230959 ] Sergey Shelukhin commented on HIVE-13450: - I thought we weren't sure if it would work in LLAP... are we? In that case a no-op I guess > LLAP: wire encryption for HDFS > -- > > Key: HIVE-13450 > URL: https://issues.apache.org/jira/browse/HIVE-13450 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Siddharth Seth > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13438) Add a service check script for llap
[ https://issues.apache.org/jira/browse/HIVE-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230955#comment-15230955 ] Sergey Shelukhin commented on HIVE-13438: - I meant executing it as a separate command in llapservicedriver, so that the user could do it. I guess it's not necessary for some cases, should be ok > Add a service check script for llap > --- > > Key: HIVE-13438 > URL: https://issues.apache.org/jira/browse/HIVE-13438 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13438.1.patch, HIVE-13438.2.patch > > > We want to have a test script that can be run by an installer such as ambari > that makes sure that the service is up and running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13342) Improve logging in llap decider for llap
[ https://issues.apache.org/jira/browse/HIVE-13342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230950#comment-15230950 ] Vikram Dixit K commented on HIVE-13342: --- [~sseth] Yes. The other log lines should tell us which operator interferes with running in llap. I changed the exception to use the right configuration variable from HiveConf. However, there is currently no way to get the values a configuration can take from code. I think it is better to not add more configuration to enable/disable the mode = all behavior. If the user is not sure if they can run in llap, they need to use mode = auto. The mode = all behavior only prevents further checking on the query if it can be run in llap. If under mode all, query cannot be run in llap because some parts of the plan cannot be run in it, it makes sense to stop the user from proceeding. If you feel strongly about needing the flag, I can add one but I am not convinced at this point in time. > Improve logging in llap decider for llap > > > Key: HIVE-13342 > URL: https://issues.apache.org/jira/browse/HIVE-13342 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13342.1.patch, HIVE-13342.2.patch > > > Currently we do not log our decisions with respect to llap. Are we running > everything in llap mode or only parts of the plan. We need more logging. > Also, if llap mode is all but for some reason, we cannot run the work in llap > mode, fail and throw an exception advise the user to change the mode to auto. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13287) Add logic to estimate stats for IN operator
[ https://issues.apache.org/jira/browse/HIVE-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230934#comment-15230934 ] Jesus Camacho Rodriguez commented on HIVE-13287: Some new q files might need to be updated still. > Add logic to estimate stats for IN operator > --- > > Key: HIVE-13287 > URL: https://issues.apache.org/jira/browse/HIVE-13287 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13287.01.patch, HIVE-13287.02.patch, > HIVE-13287.patch > > > Currently, IN operator is considered in the default case: reduces the input > rows to the half. This may lead to wrong estimates for the number of rows > produced by Filter operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13287) Add logic to estimate stats for IN operator
[ https://issues.apache.org/jira/browse/HIVE-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230932#comment-15230932 ] Jesus Camacho Rodriguez commented on HIVE-13287: I have uploaded a new patch. In short, the original patch took the original number of columns as zero in some cases (from evaluatedRowCount); the new patch fixes that issue.
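A common way to estimate the rows surviving an IN predicate, instead of the default halving rule the issue describes, is (#values / NDV) clamped to 1, with a guard for the zero-count edge case the comment mentions. The formula and all names below are illustrative assumptions, not Hive's actual implementation:

```java
// Estimate rows out of "col IN (v1, ..., vk)" from table stats:
// selectivity ~ k / NDV(col), clamped to [0, 1].
public class InSelectivityDemo {
    static long estimateInRows(long rows, long ndv, int numInValues) {
        if (rows == 0 || ndv <= 0) return 0;  // guard: empty input / no stats
        double sel = Math.min(1.0, (double) numInValues / ndv);
        return Math.max(1, Math.round(rows * sel));
    }

    public static void main(String[] args) {
        System.out.println(estimateInRows(1000, 100, 5)); // 50
        System.out.println(estimateInRows(1000, 2, 5));   // 1000 (clamped)
        System.out.println(estimateInRows(0, 0, 5));      // 0
    }
}
```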
[jira] [Updated] (HIVE-13320) Apply HIVE-11544 to explicit conversions as well as implicit ones
[ https://issues.apache.org/jira/browse/HIVE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nita Dembla updated HIVE-13320: --- Attachment: HIVE-13320.2.patch > Apply HIVE-11544 to explicit conversions as well as implicit ones > - > > Key: HIVE-13320 > URL: https://issues.apache.org/jira/browse/HIVE-13320 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Nita Dembla > Attachments: HIVE-13320.1.patch, HIVE-13320.2.patch, > HIVE-13320.2.patch > > > Parsing 1 million blank values through cast(x as int) is 3x slower than > parsing a valid single digit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13287) Add logic to estimate stats for IN operator
[ https://issues.apache.org/jira/browse/HIVE-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13287 started by Jesus Camacho Rodriguez.
[jira] [Updated] (HIVE-13287) Add logic to estimate stats for IN operator
[ https://issues.apache.org/jira/browse/HIVE-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13287: --- Status: Patch Available (was: In Progress) > Add logic to estimate stats for IN operator > --- > > Key: HIVE-13287 > URL: https://issues.apache.org/jira/browse/HIVE-13287 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13287.01.patch, HIVE-13287.patch > > > Currently, IN operator is considered in the default case: reduces the input > rows to the half. This may lead to wrong estimates for the number of rows > produced by Filter operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13287) Add logic to estimate stats for IN operator
[ https://issues.apache.org/jira/browse/HIVE-13287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13287: --- Status: Open (was: Patch Available) > Add logic to estimate stats for IN operator > --- > > Key: HIVE-13287 > URL: https://issues.apache.org/jira/browse/HIVE-13287 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13287.01.patch, HIVE-13287.patch > > > Currently, IN operator is considered in the default case: reduces the input > rows to the half. This may lead to wrong estimates for the number of rows > produced by Filter operators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13413) add a llapstatus command line tool
[ https://issues.apache.org/jira/browse/HIVE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230841#comment-15230841 ] Prasanth Jayachandran commented on HIVE-13413: -- LGTM, +1. Left minor comments in RB. > add a llapstatus command line tool > -- > > Key: HIVE-13413 > URL: https://issues.apache.org/jira/browse/HIVE-13413 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-13413.01.patch, HIVE-13413.02.patch, appComplete, > invalidApp, oneContainerDown, running, starting > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13413) add a llapstatus command line tool
[ https://issues.apache.org/jira/browse/HIVE-13413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13413:
----------------------------------
    Attachment: HIVE-13413.02.patch

Thanks for the review. Updated patch attached.

bq. Create a follow-up? I guess at this point this tool is just used as health check/status of daemons. Per daemon configurations are obtained via JMX?
Done

bq. daemon webaddress/status page currently shows Error 404. Is that part of this jira or another?
This is being added in HIVE-13398

bq. populateAppStatusFromLlapRegistry(). do we need to create new Configuration object? reuse already created object?
Creating a new instance, since we're modifying a field to set the instance name. Don't want to modify the original configuration used by the class.

bq. llapExtraInstances.add(llapInstance); This line adds nulls to the list, right? I don't see it used anywhere other than logging. use boolean instead?
This was not supposed to be adding LlapInstances. Changed to add containerId. While that's not used, it could be useful for logging in the future.

bq. nit: remove deadcode. // String nmUrl = (String) containerParams.get("hostUrl");
Done

bq. wow. Map>>
That was painful to deal with :(

> add a llapstatus command line tool
> ----------------------------------
>
>                 Key: HIVE-13413
>                 URL: https://issues.apache.org/jira/browse/HIVE-13413
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: HIVE-13413.01.patch, HIVE-13413.02.patch, appComplete, invalidApp, oneContainerDown, running, starting
>
>

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230735#comment-15230735 ] HeeSoo Kim commented on HIVE-6090: -- [~thiruvel] Would you check the error of your last patch? Does anyone review this ticket? > Audit logs for HiveServer2 > -- > > Key: HIVE-6090 > URL: https://issues.apache.org/jira/browse/HIVE-6090 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Labels: audit, hiveserver > Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.1.patch, > HIVE-6090.3.patch, HIVE-6090.4.patch, HIVE-6090.patch > > > HiveMetastore has audit logs and would like to audit all queries or requests > to HiveServer2 also. This will help in understanding how the APIs were used, > queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Dholakia updated HIVE-12049: -- Attachment: HIVE-12049.18.patch > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.15.patch, HIVE-12049.16.patch, HIVE-12049.17.patch, > HIVE-12049.18.patch, HIVE-12049.2.patch, HIVE-12049.3.patch, > HIVE-12049.4.patch, HIVE-12049.5.patch, HIVE-12049.6.patch, > HIVE-12049.7.patch, HIVE-12049.9.patch, new-driver-profiles.png, > old-driver-profiles.png > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
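The HIVE-12049 description above can be sketched concretely. This is an illustration only (the format and function names are our assumptions, not Hive's actual SerDe): rows are pre-serialized into one length-prefixed blob at write time, so the fetch path can stream the blob's bytes as-is with no per-row deserialization or translation.

```python
import struct

def serialize_row_batch(rows):
    """Pack a batch of UTF-8 string rows into a single length-prefixed
    blob, standing in for the thrift-encoded value blob the final task
    would write to the output file."""
    out = bytearray()
    for row in rows:
        data = row.encode('utf-8')
        # 4-byte big-endian length prefix, then the row payload.
        out += struct.pack('>I', len(data)) + data
    return bytes(out)

def fetch_blob(blob):
    """The fetch side ships the blob unchanged -- the client decodes it,
    so the server does no per-row work at fetch time."""
    return blob
```

The design point is that serialization cost moves from the (serial, concurrency-sensitive) fetch path into the (parallel) task execution.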
[jira] [Resolved] (HIVE-13451) LLAP: wire encryption for shuffle
[ https://issues.apache.org/jira/browse/HIVE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-13451. --- Resolution: Duplicate Assignee: (was: Siddharth Seth) > LLAP: wire encryption for shuffle > - > > Key: HIVE-13451 > URL: https://issues.apache.org/jira/browse/HIVE-13451 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13438) Add a service check script for llap
[ https://issues.apache.org/jira/browse/HIVE-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230648#comment-15230648 ] Vikram Dixit K commented on HIVE-13438: --- Fixed the error. [~sershe] Can you elaborate a little more? Are you suggesting that we could run the query as part of starting the service itself? We could add that too but we still need something to run end-to-end (from starting the shell onwards). > Add a service check script for llap > --- > > Key: HIVE-13438 > URL: https://issues.apache.org/jira/browse/HIVE-13438 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13438.1.patch, HIVE-13438.2.patch > > > We want to have a test script that can be run by an installer such as ambari > that makes sure that the service is up and running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13450) LLAP: wire encryption for HDFS
[ https://issues.apache.org/jira/browse/HIVE-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230642#comment-15230642 ] Siddharth Seth commented on HIVE-13450: --- [~sershe] - I believe this will be handled by HDFS configuration, and certain paths setup to use encryption. Is there anything specific to be done ? > LLAP: wire encryption for HDFS > -- > > Key: HIVE-13450 > URL: https://issues.apache.org/jira/browse/HIVE-13450 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Siddharth Seth > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13446) LLAP: set default management protocol acls to deny all
[ https://issues.apache.org/jira/browse/HIVE-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230638#comment-15230638 ] Siddharth Seth commented on HIVE-13446: --- We could also ensure that the user connecting is the same user that the process is running as. Only HiveServer should have access to the management protocol at the moment. > LLAP: set default management protocol acls to deny all > -- > > Key: HIVE-13446 > URL: https://issues.apache.org/jira/browse/HIVE-13446 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > The user needs to set the acls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
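The deny-by-default policy plus the same-user check suggested above can be sketched as follows (a hypothetical helper, not Hive's actual ACL API):

```python
def allow_management_access(caller_user, process_user, acl_users=()):
    """Deny by default: only the user the daemon runs as (or an explicitly
    configured ACL entry, e.g. the HiveServer principal) may invoke the
    management protocol."""
    return caller_user == process_user or caller_user in acl_users
```

With an empty ACL this reduces to exactly the same-user check proposed in the comment.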
[jira] [Updated] (HIVE-13438) Add a service check script for llap
[ https://issues.apache.org/jira/browse/HIVE-13438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13438: -- Attachment: HIVE-13438.2.patch > Add a service check script for llap > --- > > Key: HIVE-13438 > URL: https://issues.apache.org/jira/browse/HIVE-13438 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-13438.1.patch, HIVE-13438.2.patch > > > We want to have a test script that can be run by an installer such as ambari > that makes sure that the service is up and running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13415) Decouple Sessions from thrift binary transport
[ https://issues.apache.org/jira/browse/HIVE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230621#comment-15230621 ] Szehon Ho commented on HIVE-13415: -- This just seems to remove it without making it configurable? > Decouple Sessions from thrift binary transport > -- > > Key: HIVE-13415 > URL: https://issues.apache.org/jira/browse/HIVE-13415 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Rajat Khandelwal >Assignee: Rajat Khandelwal > Attachments: HIVE-13415.01.patch > > > Current behaviour is: > * Open a thrift binary transport > * create a session > * close the transport > Then the session gets closed. Consequently, all the operations running in the > session also get killed. > Whereas, if you open an HTTP transport, and close, the enclosing sessions are > not closed. > This seems like a bad design, having transport and sessions tightly coupled. > I'd like to fix this. > The issue that introduced it is > [HIVE-9601|https://github.com/apache/hive/commit/48bea00c48853459af64b4ca9bfdc3e821c4ed82] > Relevant discussions at > [here|https://issues.apache.org/jira/browse/HIVE-11485?focusedCommentId=15223546=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15223546], > > [here|https://issues.apache.org/jira/browse/HIVE-11485?focusedCommentId=15223827=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15223827] > and mentioned links on those comments. > Another thing that seems like a slightly bad design is this line of code in > ThriftBinaryCLIService: > {noformat} > server.setServerEventHandler(serverEventHandler); > {noformat} > Whereas serverEventHandler is defined by the base class, with no users except > one sub-class(ThriftBinaryCLIService), violating the separation of concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13420) Clarify HS2 WebUI Query 'Elapsed TIme'
[ https://issues.apache.org/jira/browse/HIVE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-13420: - Status: Patch Available (was: Open) > Clarify HS2 WebUI Query 'Elapsed TIme' > -- > > Key: HIVE-13420 > URL: https://issues.apache.org/jira/browse/HIVE-13420 > Project: Hive > Issue Type: Improvement > Components: Diagnosability >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: Elapsed Time.png, HIVE-13420.2.patch, HIVE-13420.patch, > Patched UI.2.png, Patched UI.png > > > Today the "Queries" section of the WebUI shows SQLOperations that are not > closed. > Elapsed time is thus a bit confusing, people might take this to mean query > runtime, actually it is the time since the operation was opened. The query > may be finished, but operation is not closed. Perhaps another timer column > is needed showing the runtime of the query to reduce this confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
[ https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230452#comment-15230452 ] Qiuzhuang Lian commented on HIVE-11981: --- For more info, when I compact the table in hive 1.2 version, we see TreeReaderFactory errors as reported in this issue: https://issues.apache.org/jira/browse/HIVE-13432 > ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized) > -- > > Key: HIVE-11981 > URL: https://issues.apache.org/jira/browse/HIVE-11981 > Project: Hive > Issue Type: Bug > Components: Hive, Transactions >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, > HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, > HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, > HIVE-11981.091.patch, HIVE-11981.092.patch, HIVE-11981.093.patch, > HIVE-11981.094.patch, HIVE-11981.095.patch, HIVE-11981.096.patch, > HIVE-11981.097.patch, HIVE-11981.098.patch, HIVE-11981.099.patch, > HIVE-11981.0991.patch, HIVE-11981.0992.patch, ORC Schema Evolution Issues.docx > > > High priority issues with schema evolution for the ORC file format. > Schema evolution here is limited to adding new columns and a few cases of > column type-widening (e.g. int to bigint). > Renaming columns, deleting column, moving columns and other schema evolution > were not pursued due to lack of importance and lack of time. Also, it > appears a much more sophisticated metadata would be needed to support them. > The biggest issues for users have been adding new columns for ACID table > (HIVE-11421 Support Schema evolution for ACID tables) and vectorization > (HIVE-10598 Vectorization borks when column is added to table). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13420) Clarify HS2 WebUI Query 'Elapsed TIme'
[ https://issues.apache.org/jira/browse/HIVE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230328#comment-15230328 ] Aihua Xu commented on HIVE-13420: - The patch looks good. +1. > Clarify HS2 WebUI Query 'Elapsed TIme' > -- > > Key: HIVE-13420 > URL: https://issues.apache.org/jira/browse/HIVE-13420 > Project: Hive > Issue Type: Improvement > Components: Diagnosability >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: Elapsed Time.png, HIVE-13420.2.patch, HIVE-13420.patch, > Patched UI.2.png, Patched UI.png > > > Today the "Queries" section of the WebUI shows SQLOperations that are not > closed. > Elapsed time is thus a bit confusing, people might take this to mean query > runtime, actually it is the time since the operation was opened. The query > may be finished, but operation is not closed. Perhaps another timer column > is needed showing the runtime of the query to reduce this confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13427) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-13427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu resolved HIVE-13427. - Resolution: Fixed Committed. Thanks Szehon. > Update committer list > - > > Key: HIVE-13427 > URL: https://issues.apache.org/jira/browse/HIVE-13427 > Project: Hive > Issue Type: Bug >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-13427.patch > > > Please update committer list: > Name: Aihua Xu > Apache ID: aihuaxu > Organization: Cloudera > Name: Yongzhi Chen > Apache ID: ychena > Organization: Cloudera -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13235) Insert from select generates incorrect result when hive.optimize.constant.propagation is on
[ https://issues.apache.org/jira/browse/HIVE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230290#comment-15230290 ] Aihua Xu commented on HIVE-13235:

[~ashutoshc] I don't have a final solution yet. My current approach seems to fix the issue but also breaks valid constant propagation. I think it's in the right direction: for select operators, an alias and internal name are not enough. We should keep another columnName when the expression maps to a table column (e.g., select col1 as alias). The parent ops should see only col1 while the child ops see only alias; right now we ignore col1 and always use alias. I'm working on it, but it seems to need bigger changes. Will create an RB when it's ready.

> Insert from select generates incorrect result when hive.optimize.constant.propagation is on
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13235
>                 URL: https://issues.apache.org/jira/browse/HIVE-13235
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-13235.1.patch, HIVE-13235.2.patch, HIVE-13235.3.patch
>
>
> The following query returns an incorrect result when constant optimization is turned on. The subquery happens to have an alias p1, the same as the input partition name. The constant optimizer will incorrectly fold it to the constant.
> When the constant optimizer is turned off, we get the correct result.
> {noformat}
> set hive.cbo.enable=false;
> set hive.optimize.constant.propagation = true;
> create table t1(c1 string, c2 double) partitioned by (p1 string, p2 string);
> create table t2(p1 double, c2 string);
> insert into table t1 partition(p1='40', p2='p2') values('c1', 0.0);
> INSERT OVERWRITE TABLE t2 select if((c2 = 0.0), c2, '0') as p1, 2 as p2 from t1 where c1 = 'c1' and p1 = '40';
> select * from t2;
> 40	2
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
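The alias-shadowing problem discussed above can be reduced to a small name-resolution sketch (hypothetical, illustration only — not Hive's ConstantPropagate code): the buggy lookup lets the filter constant for partition column `p1` shadow the select alias `p1`, which is how `'40'` ends up in the output.

```python
def resolve_buggy(name, filter_constants, select_aliases):
    """Bug sketch: the filter constant for partition column p1 is consulted
    first, shadowing the select alias p1 that names a different expression."""
    if name in filter_constants:
        return filter_constants[name]
    return select_aliases.get(name)

def resolve_fixed(name, filter_constants, select_aliases):
    """A select alias names the select expression; only fall back to a
    filter constant when no alias of that name exists at this level."""
    if name in select_aliases:
        return select_aliases[name]
    return filter_constants.get(name)
```

This mirrors the comment's point: parent operators should resolve `col1` while child operators resolve `alias`, and conflating the two namespaces is what makes the optimizer fold the wrong thing.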
[jira] [Commented] (HIVE-11351) Column Found in more than One Tables/Subqueries
[ https://issues.apache.org/jira/browse/HIVE-11351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230278#comment-15230278 ] Aihua Xu commented on HIVE-11351: - This does look the same issue as HIVE-13235. This fix would work I guess, but it will break many valid constant propagation. I'm still working on it. > Column Found in more than One Tables/Subqueries > --- > > Key: HIVE-11351 > URL: https://issues.apache.org/jira/browse/HIVE-11351 > Project: Hive > Issue Type: Bug > Environment: HIVE 1.1.0 >Reporter: MK >Assignee: Alina Abramova > Attachments: HIVE-11351-branch-1.0.patch > > > when execute a script: > INSERT overwrite TABLE tmp.tmp_dim_cpttr_categ1 >SELECT DISTINCT cur.categ_id AS categ_id, >cur.categ_code AS categ_code, >cur.categ_name AS categ_name, >cur.categ_parnt_id AS categ_parnt_id, >par.categ_name AS categ_parnt_name, >cur.mc_site_id AS mc_site_id >FROM tmp.tmp_dim_cpttr_categ cur >LEFT OUTER JOIN tmp.tmp_dim_cpttr_categ par >ON cur.categ_parnt_id = par.categ_id; > error occur : SemanticException Column categ_name Found in more than One > Tables/Subqueries > when modify the alias categ_name to categ_name_cur, it will be execute > successfully. > INSERT overwrite TABLE tmp.tmp_dim_cpttr_categ1 >SELECT DISTINCT cur.categ_id AS categ_id, >cur.categ_code AS categ_code, >cur.categ_name AS categ_name_cur, >cur.categ_parnt_id AS categ_parnt_id, >par.categ_name AS categ_parnt_name, >cur.mc_site_id AS mc_site_id >FROM tmp.tmp_dim_cpttr_categ cur >LEFT OUTER JOIN tmp.tmp_dim_cpttr_categ par >ON cur.categ_parnt_id = par.categ_id; > this happen when we upgrade hive from 0.10 to 1.1.0 . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9534) incorrect result set for query that projects a windowed aggregate
[ https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230269#comment-15230269 ] Aihua Xu commented on HIVE-9534:

[~leftylev] I just updated the doc. I also filed HIVE-13453 to track the improvement to address the current limitation.

> incorrect result set for query that projects a windowed aggregate
> -----------------------------------------------------------------
>
>                 Key: HIVE-9534
>                 URL: https://issues.apache.org/jira/browse/HIVE-9534
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>            Reporter: N Campbell
>            Assignee: Aihua Xu
>             Fix For: 2.1.0
>
>         Attachments: HIVE-9534.1.patch, HIVE-9534.2.patch, HIVE-9534.3.patch, HIVE-9534.4.patch
>
>
> The result set returned by Hive has one row instead of 5.
> {code}
> select avg(distinct tsint.csint) over () from tsint
>
> create table if not exists TSINT (RNUM int , CSINT smallint)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
>
> 0|\N
> 1|-1
> 2|0
> 3|1
> 4|10
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
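The expected semantics for the failing query above — `avg(distinct ...) OVER ()` yields one row per input row, each carrying the same distinct average, with NULLs excluded from the aggregate but their rows preserved — can be sketched as (illustration only, using the TSINT data from the report):

```python
def avg_distinct_over_all(values):
    """Model avg(distinct x) OVER (): the aggregate is computed once over
    the distinct non-null values, then repeated for every input row --
    an unbounded window must not collapse the result set."""
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)
    avg = sum(distinct) / float(len(distinct))
    return [avg] * len(values)
```

For CSINT = NULL, -1, 0, 1, 10 the distinct non-null values average to 2.5, and the correct result set has five rows of 2.5 — not the single row Hive was returning.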
[jira] [Assigned] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function
[ https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-13453: --- Assignee: Aihua Xu (was: Harish Butani) > Support ORDER BY and windowing clause in partitioning clause with distinct > function > --- > > Key: HIVE-13453 > URL: https://issues.apache.org/jira/browse/HIVE-13453 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Reporter: Aihua Xu >Assignee: Aihua Xu > > Current distinct function on partitioning doesn't support order by and > windowing clause due to performance reason. Explore an efficient way to > support that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function
[ https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13453: Fix Version/s: (was: 2.1.0) > Support ORDER BY and windowing clause in partitioning clause with distinct > function > --- > > Key: HIVE-13453 > URL: https://issues.apache.org/jira/browse/HIVE-13453 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Reporter: Aihua Xu >Assignee: Aihua Xu > > Current distinct function on partitioning doesn't support order by and > windowing clause due to performance reason. Explore an efficient way to > support that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13400) Following up HIVE-12481, add retry for Zookeeper service discovery
[ https://issues.apache.org/jira/browse/HIVE-13400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230099#comment-15230099 ] Hive QA commented on HIVE-13400: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12797180/HIVE-13400.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9964 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-vector_distinct_2.q-load_dyn_part2.q-join1.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7497/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7497/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7497/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12797180 - PreCommit-HIVE-TRUNK-Build > Following up HIVE-12481, add retry for Zookeeper service discovery > -- > > Key: HIVE-13400 > URL: https://issues.apache.org/jira/browse/HIVE-13400 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13400.1.patch, HIVE-13400.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-12077:
------------------------------------
    Status: Patch Available  (was: Open)

The batch size for the MSCK REPAIR command can be configured with the newly introduced property "hive.msck.repair.batch.size". If the value is greater than zero, partitions are added in batches of the configured size. The default value is zero, which means all partitions are added in a single call rather than batchwise.

> MSCK Repair table should fix partitions in batches
> --------------------------------------------------
>
>                 Key: HIVE-12077
>                 URL: https://issues.apache.org/jira/browse/HIVE-12077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Ryan P
>            Assignee: Chinna Rao Lalam
>         Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, HIVE-12077.3.patch
>
>
> If a user attempts to run MSCK REPAIR TABLE on a directory with a large number of untracked partitions, HMS will OOME. I suspect this is because it attempts to do one large bulk load in an effort to save time. Ultimately this can lead to a collection so large in size that HMS eventually hits an Out of Memory Exception.
> Instead I suggest that Hive include a configurable batch size that HMS can use to break up the load.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
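The batching semantics described above can be sketched as follows — a minimal illustration (not Hive's actual code), with `add_batch` standing in for the metastore bulk-add call:

```python
def add_partitions_batched(partitions, add_batch, batch_size=0):
    """Mimic hive.msck.repair.batch.size semantics: 0 means one bulk call
    (the old behavior that could OOM the metastore), > 0 means the
    partitions are added in chunks of batch_size. Returns the number of
    metastore calls made."""
    if batch_size <= 0:
        add_batch(list(partitions))
        return 1
    calls = 0
    for i in range(0, len(partitions), batch_size):
        add_batch(partitions[i:i + batch_size])
        calls += 1
    return calls
```

Batching bounds the size of any single in-memory collection on the metastore side, trading a few extra RPCs for predictable memory use.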
[jira] [Updated] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-12077: Attachment: HIVE-12077.3.patch > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, > HIVE-12077.3.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12077) MSCK Repair table should fix partitions in batches
[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-12077: Attachment: HIVE-12077.2.patch > MSCK Repair table should fix partitions in batches > --- > > Key: HIVE-12077 > URL: https://issues.apache.org/jira/browse/HIVE-12077 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryan P >Assignee: Chinna Rao Lalam > Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch > > > If a user attempts to run MSCK REPAIR TABLE on a directory with a large > number of untracked partitions HMS will OOME. I suspect this is because it > attempts to do one large bulk load in an effort to save time. Ultimately this > can lead to a collection so large in size that HMS eventually hits an Out of > Memory Exception. > Instead I suggest that Hive include a configurable batch size that HMS can > use to break up the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
[ https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229977#comment-15229977 ] Qiuzhuang Lian commented on HIVE-11981:
---
No, our table doesn't have any struct column. Here is the DDL:

CREATE TABLE `my_orc_table`(
  `id` string, `store_no_from` string, `store_name_from` string, `store_no_to` string,
  `store_name_to` string, `order_unit_no_from` string, `order_unit_name_from` string,
  `order_unit_no_to` string, `order_unit_name_to` string, `store_no` string,
  `store_name` string, `company_no` string, `order_unit_no` string, `order_unit_name` string,
  `item_no` string, `item_code` string, `item_name` string, `brand_no` string,
  `brand_name` string, `category_no` string, `sku_no` string, `size_no` string,
  `size_kind` string, `bill_no` string, `status` tinyint, `bill_type` int,
  `in_out_flag` tinyint, `ref_bill_no` string, `ref_bill_type` int, `biz_type` int,
  `account_type` tinyint, `bill_date` date, `cost` decimal(12,2), `balance_offset` int,
  `balance_qty` int, `factory_in_offset` int, `factory_in_qty` int,
  `factory_in_diff_offset` int, `factory_in_diff_qty` int, `transit_in_offset` int,
  `transit_in_qty` int, `transit_out_offset` int, `transit_out_qty` int,
  `in_diff_offset` int, `in_diff_qty` int, `out_diff_offset` int, `out_diff_qty` int,
  `transit_in_account_offset` int, `transit_in_account_qty` int,
  `transit_out_account_offset` int, `transit_out_account_qty` int,
  `in_diff_account_offset` int, `in_diff_account_qty` int,
  `out_diff_account_offset` int, `out_diff_account_qty` int,
  `lock_offset` int, `lock_qty` int, `occupied_offset` int, `occupied_qty` int,
  `backup_offset` int, `backup_qty` int, `guest_bad_offset` int, `guest_bad_qty` int,
  `original_bad_offset` int, `original_bad_qty` int, `bad_transit_offset` int,
  `bad_transit_qty` int, `bad_diff_offset` int, `bad_diff_qty` int,
  `return_offset` int, `return_qty` int, `borrow_offset` int, `borrow_qty` int,
  `create_time` timestamp, `create_timestamp` timestamp, `update_time` timestamp,
  `sharding_flag` string, `yw_update_time` timestamp, `hive_create_time` timestamp,
  `biz_date` int)
CLUSTERED BY (id) INTO 10 BUCKETS
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION 'hdfs://nn:9000/hive/warehouse/lqz.db/my_orc_table'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
  'last_modified_by'='hive', 'last_modified_time'='1460015324',
  'numFiles'='23', 'numRows'='33828471',
  'orc.compress'='SNAPPY', 'orc.create.index'='true', 'orc.stripe.size'='67108864',
  'rawDataSize'='92332902940', 'totalSize'='1474582939',
  'transactional'='true', 'transient_lastDdlTime'='1460015745')

> ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
> ------------------------------------------------------------------
>
>                 Key: HIVE-11981
>                 URL: https://issues.apache.org/jira/browse/HIVE-11981
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11981.01.patch, HIVE-11981.02.patch, HIVE-11981.03.patch, HIVE-11981.05.patch, HIVE-11981.06.patch, HIVE-11981.07.patch, HIVE-11981.08.patch, HIVE-11981.09.patch, HIVE-11981.091.patch, HIVE-11981.092.patch, HIVE-11981.093.patch, HIVE-11981.094.patch, HIVE-11981.095.patch, HIVE-11981.096.patch, HIVE-11981.097.patch, HIVE-11981.098.patch, HIVE-11981.099.patch, HIVE-11981.0991.patch, HIVE-11981.0992.patch, ORC Schema Evolution Issues.docx
>
> High-priority issues with schema evolution for the ORC file format.
> Schema evolution here is limited to adding new columns and a few cases of column type widening (e.g. int to bigint).
> Renaming columns, deleting columns, moving columns, and other schema evolution were not pursued due to lack of importance and lack of time. Also, it appears much more sophisticated metadata would be needed to support them.
> The biggest issues for users have been adding new columns for ACID tables (HIVE-11421 Support Schema evolution for ACID tables) and vectorization (HIVE-10598 Vectorization borks when column is added to table).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
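The supported evolution cases described in the issue (adding columns, widening int to bigint) are exercised with plain HiveQL DDL. A minimal sketch, using a hypothetical table `t` and column `qty` (not from this issue):

```sql
-- Append a new column at the end of the schema (supported case).
ALTER TABLE t ADD COLUMNS (new_col STRING);

-- Widen a column's type, e.g. INT to BIGINT (supported case).
-- CHANGE COLUMN repeats the name when only the type changes.
ALTER TABLE t CHANGE COLUMN qty qty BIGINT;
```

Renames, drops, and reorders via CHANGE COLUMN are the cases the issue explicitly scopes out.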
[jira] [Commented] (HIVE-11981) ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
[ https://issues.apache.org/jira/browse/HIVE-11981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229965#comment-15229965 ] Matt McCline commented on HIVE-11981:
---
[~qiuzhuang] What is the DDL for the table? Does it have a column of type STRUCT?
[jira] [Updated] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12878:
Status: Patch Available (was: In Progress)

> Support Vectorization for TEXTFILE and other formats
> ----------------------------------------------------
>
>                 Key: HIVE-12878
>                 URL: https://issues.apache.org/jira/browse/HIVE-12878
>             Project: Hive
>          Issue Type: New Feature
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12878.01.patch, HIVE-12878.02.patch, HIVE-12878.03.patch, HIVE-12878.04.patch, HIVE-12878.05.patch, HIVE-12878.06.patch, HIVE-12878.07.patch
>
> Support vectorizing when the input format is TEXTFILE and other formats for better Map Vertex performance.
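For context, vectorized execution in Hive is gated by session-level settings. A sketch of how the feature would be enabled once this patch lands; the two `use.*.serde.deserialize` property names are assumptions based on this patch series and should be verified against the final commit:

```sql
-- Turn on vectorized execution globally.
SET hive.vectorized.execution.enabled=true;

-- Allow vectorized reading of row-serde input formats such as TEXTFILE
-- (property names assumed from HIVE-12878; not yet documented).
SET hive.vectorized.use.vector.serde.deserialize=true;
SET hive.vectorized.use.row.serde.deserialize=true;
```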
[jira] [Updated] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12878:
Attachment: HIVE-12878.07.patch
[jira] [Updated] (HIVE-12878) Support Vectorization for TEXTFILE and other formats
[ https://issues.apache.org/jira/browse/HIVE-12878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12878:
Status: In Progress (was: Patch Available)