Re: [DISCUSS] Hive 3.2
A few comments. I am going to move forward with a VOTE on Hive 3.2 next.

-----Original Message-----
From: Matt McCline
Sent: Monday, October 26, 2020 2:12 PM
To: dev@hive.apache.org
Subject: RE: [EXTERNAL] Re: [DISCUSS] Hive 3.2

Hi László,

Thank you for your response.

Since 3.1.3-rc0 was tagged on Jan 13, master has gained 3156 commits that are not in this tag.

I mostly wanted to address the huge number of changes in master.

We could do a 3.1.3 release with a modest number of changes, and a 3.2 with perhaps many or all of the 3,000+ changes in master.

What do you think?

Matt

-----Original Message-----
From: László Bodor
Sent: Monday, October 26, 2020 4:19 AM
To: dev@hive.apache.org
Subject: [EXTERNAL] Re: [DISCUSS] Hive 3.2

Sorry, I posted an incorrect link for 3.1.3-rc0; the correct one is:
https://github.com/apache/hive/releases/tag/release-3.1.3-rc0

On Mon, 26 Oct 2020 at 12:17, László Bodor wrote:

> Hey!
>
> I'm also interested in PMCs' opinion. I think it should be released
> from branch-3, otherwise, it's a 4.0, right? (which is a heavier
> discussion, and I don't know what Hive4 will be about.) On 3.x we have
> an official 3.1.2 and an abandoned 3.1.3-rc0, which is not yet
> released as far as I can see. I guess the next release is supposed to
> be 3.1.3 as we haven't changed tez/hadoop/orc dependencies since that,
> and I don't think branch-3 was actively maintained.
>
> Regards,
> Laszlo Bodor
>
> On Thu, 22 Oct 2020 at 21:24, Matt McCline wrote:
>
>> Hey,
>> Hive master is about 2 years ahead of 3.1 - it seems like time to
>> release those changes.
>> So, let us have community discussion about creating a Hive 3.2 release.
>> I volunteer to be the release manager. I have not done that before,
>> so I will need help.
>> I will start a VOTE thread soon, but I would like to hear some
>> opinions first.
>>
>> Thank you,
>> Matt
>>
>> (It is unclear if there are enough major features or dependencies on
>> projects that necessitate a major version bump)
[jira] [Created] (HIVE-24387) Metastore access through JDBC handler does not use correct database accessor
Jesus Camacho Rodriguez created HIVE-24387:
----------------------------------------------

             Summary: Metastore access through JDBC handler does not use correct database accessor
                 Key: HIVE-24387
                 URL: https://issues.apache.org/jira/browse/HIVE-24387
             Project: Hive
          Issue Type: Bug
          Components: JDBC storage handler
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez

There are some differences in the SQL syntax that the database accessor generates for each RDBMS. For the metastore, we always end up with the default accessor, which leads to errors, e.g., when a limit query is executed against a Postgres-backed metastore.

{code}
Error: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Error while trying to get column names: ERROR: syntax error at or near "{"
  Position: 200 (state=,code=0)

SELECT "TBL_COLUMN_GRANT_ID", "COLUMN_NAME", "CREATE_TIME", "GRANT_OPTION", "GRANTOR", "GRANTOR_TYPE", "PRINCIPAL_NAME", "PRINCIPAL_TYPE", "TBL_COL_PRIV", "TBL_ID", "AUTHORIZER" FROM "TBL_COL_PRIVS" {LIMIT 1}
{code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
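
To illustrate the kind of mismatch described in this ticket, here is a minimal Python sketch of picking a dialect-specific helper from the JDBC URL rather than always falling back to a generic one. The class and function names (GenericAccessor, PostgresAccessor, OracleAccessor, accessor_for) are hypothetical and are not the classes of Hive's JDBC storage handler; the generic form simply mirrors the {LIMIT 1} text seen in the error above.

```python
class GenericAccessor:
    """Hypothetical default: emits the brace form seen in the error message."""

    def add_limit(self, sql: str, limit: int) -> str:
        return f"{sql} {{LIMIT {limit}}}"


class PostgresAccessor(GenericAccessor):
    """Postgres accepts a plain LIMIT clause."""

    def add_limit(self, sql: str, limit: int) -> str:
        return f"{sql} LIMIT {limit}"


class OracleAccessor(GenericAccessor):
    """Oracle 12c and later accept FETCH FIRST n ROWS ONLY."""

    def add_limit(self, sql: str, limit: int) -> str:
        return f"{sql} FETCH FIRST {limit} ROWS ONLY"


# Assumed mapping from JDBC URL prefix to accessor; illustrative only.
_ACCESSORS = {
    "jdbc:postgresql:": PostgresAccessor,
    "jdbc:oracle:": OracleAccessor,
}


def accessor_for(jdbc_url: str) -> GenericAccessor:
    """Return the accessor whose URL prefix matches, else the generic one."""
    for prefix, cls in _ACCESSORS.items():
        if jdbc_url.startswith(prefix):
            return cls()
    return GenericAccessor()


if __name__ == "__main__":
    query = 'SELECT "TBL_ID" FROM "TBL_COL_PRIVS"'
    print(accessor_for("jdbc:postgresql://localhost/metastore").add_limit(query, 1))
```

If the lookup is skipped and the generic accessor is always used, a Postgres backend receives the brace form and rejects it, which matches the symptom reported here.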
Re: Result of the TPC-DS benchmark on Hive master branch
Hi Zoltan,

I have run another fresh TPC-DS test using the latest commit. Here is the summary:

Commits used:
1) Hive, master, e9f72e654750de208227d46a22e983413b080c6c (HIVE-24366, Thu Nov 12)
2) Tez, 0.10.0, 22fec6c0ecc7ebe6f6f28800935cc6f69794dad5 (CHANGES.txt updated with TEZ-4238, Thu Oct 8)

Scenario:
1) create a database consisting of external tables from a 100GB TPC-DS text dataset
2) create a database consisting of ORC tables
3) compute column statistics, set tez.runtime.compress=false
4) run TPC-DS queries and check the results

Configuration:
1) set hive.execution.engine=tez, hive.execution.mode=container
2) set hive.cbo.enable=true

Experiment #1: hive.optimize.shared.work.dppunion=true

Query 2 fails:
java.lang.IllegalArgumentException: Edge [Reducer 9 : org.apache.hadoop.hive.ql.exec.tez.ReduceTezProcessor] -> [Map 6 : org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] ({ BROADCAST : org.apache.tez.runtime.library.input.UnorderedKVInput >> PERSISTED >> org.apache.tez.runtime.library.output.UnorderedKVOutput >> NullEdgeManager }) already defined!

Query 14 fails:
org.apache.hadoop.hive.ql.parse.SemanticException: EXCEPT and INTERSECT operations are only supported with Cost Based Optimizations enabled. Please set 'hive.cbo.enable' to true!

Query 59 fails:
java.lang.IllegalArgumentException: Edge [Reducer 6 : org.apache.hadoop.hive.ql.exec.tez.ReduceTezProcessor] -> [Map 4 : org.apache.hadoop.hive.ql.exec.tez.MapTezProcessor] ({ BROADCAST : org.apache.tez.runtime.library.input.UnorderedKVInput >> PERSISTED >> org.apache.tez.runtime.library.output.UnorderedKVOutput >> NullEdgeManager }) already defined!

Experiment #2: hive.optimize.shared.work.dppunion=false

Query 14 fails:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException EXCEPT and INTERSECT operations are only supported with Cost Based Optimizations enabled. Please set 'hive.cbo.enable' to true!

Summary:
1. With hive.optimize.shared.work.dppunion=true, queries 2 and 59 fail. Please see the attachment for stack traces.
2. Query 14 fails in both cases, and it seems like another bug. Note that hive.cbo.enable is set to true when running query 14.
3. For some queries, the number of rows is different between the two experiments. In most cases, it seems to be rounding errors, but the difference is rather large for some queries (e.g., query 29 and 58). Please see the attachment for the result.

I could open a new Jira for this issue, or create a sub-task of HIVE-24384. Or perhaps HIVE-24384 is already enough. So please let me know which would be good for you.

(I have automated the entire experiment, so if you would like to see the result of testing a new commit, I would be happy to rerun the experiment and get back to you.)

Cheers,

--- Sungwoo

On Thu, Nov 12, 2020 at 10:49 PM Zoltan Haindrich wrote:

> Hey Sungwoo!
>
> On 11/12/20 10:23 AM, Sungwoo Park wrote:
> > Hi Zoltan,
> >
> > I used the same hive-site.xml for the previous test (which was okay) and
> > the new test (which failed), so my guess is that it is perhaps due to a
> > commit since the previous test. Let me try later to identify the commit
> > that fails query 14, with the hope that identifying such a commit might be
> > useful in debugging.
>
> That would definitely help - if you could share the 2 commit hashes; it
> might be possible that we could guess it from the commit message or
> something.
>
> > Another question: is HIVE-24360 part of a solution to the problem of
> > hive.optimize.shared.work.dppunion?
> > I have tried the latest commit (which includes HIVE-24360) using the TPC-DS
> > benchmark, and it seems like the problem still exists.
>
> Yes, HIVE-24360 should have fixed that - do you still see an exception
> coming from tez-api reporting edge errors?
> I will also pick these changes for a smaller benchmark run soon...but I'm
> not running any right now.
> Could you also note for which query you've seen the exception - so that I
> could also check it.
> Could you please open a jira about this - and add the actual exception
> trace/etc if available?
>
> cheers,
> Zoltan
>
> > Cheers,
> >
> > --- Sungwoo
> >
> > On Mon, Nov 9, 2020 at 6:18 PM Zoltan Haindrich wrote:
> >
> >> Hey Sungwoo!
> >>
> >> Regarding Q14 / "java.lang.RuntimeException: equivalence mapping violation"
> >>
> >> From the stack trace you shared it seems like the mapper has already
> >> seen both the filter and the ast node earlier - and they are in separate
> >> mapping groups. (Which is unfortunate) I think it won't be simple to
> >> track that down - it will definitely need some debugging.
> >> The best would be to have a repro query for it...
> >>
> >> note: we already run q14 in TestTezPerf*Driver - might it be
> >> possible that we've disabled some features in the hive-site.xml for these
> >> tests; and that's why we haven't seen it before?
> >>
> >> cheers,
> >> Zoltan
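
As a rough illustration of the row-count comparison mentioned in point 3 of the summary, the following Python sketch flags queries whose result sizes differ between two runs and separates small deviations from large ones. The function name, the tolerance value, and the sample numbers are all hypothetical; they are not taken from the attachment or from the actual benchmark output.

```python
def compare_row_counts(run_a, run_b, rel_tolerance=0.01):
    """Classify per-query row-count differences between two runs.

    run_a and run_b map a query name to its number of result rows.
    Returns a dict with an entry only for queries that differ:
    "missing" (present in one run only), "minor" (relative difference
    within rel_tolerance) or "large".
    """
    report = {}
    for query in sorted(set(run_a) | set(run_b)):
        a = run_a.get(query)
        b = run_b.get(query)
        if a is None or b is None:
            report[query] = "missing"
        elif a != b:
            # max(a, b) is non-zero here because a != b and counts are >= 0.
            relative = abs(a - b) / max(a, b)
            report[query] = "minor" if relative <= rel_tolerance else "large"
    return report


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    dppunion_on = {"qA": 100, "qB": 1000, "qC": 50}
    dppunion_off = {"qA": 100, "qB": 1001, "qC": 80, "qD": 3}
    print(compare_row_counts(dppunion_on, dppunion_off))
```

A check of this kind could run at the end of an automated experiment, so that only the queries classified as "large" or "missing" need manual inspection.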
Re: Credits page - Edits
Hi,

svn co https://svn.apache.org/repos/asf/hive/hcatalog-historical/site/

worked for me.

Narayanan

On Fri, Nov 13, 2020 at 1:54 PM Anishek Agarwal wrote:

> Hello,
>
> For hive under
> https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-Newcommitters
> the step
> svn co https://svn.apache.org/repos/asf/hive/site hive-site
> says the URL is incorrect, anyone knows what the correct location is for
> the above repo ?
>
> thanks
> anishek
[jira] [Created] (HIVE-24386) Add builder methods for GetTablesRequest and GetPartitionsRequest to HiveMetaStoreClient
Narayanan Venkateswaran created HIVE-24386:
----------------------------------------------

             Summary: Add builder methods for GetTablesRequest and GetPartitionsRequest to HiveMetaStoreClient
                 Key: HIVE-24386
                 URL: https://issues.apache.org/jira/browse/HIVE-24386
             Project: Hive
          Issue Type: Sub-task
          Components: Hive
            Reporter: Narayanan Venkateswaran
            Assignee: Narayanan Venkateswaran

Builder methods for GetTablesRequest and GetPartitionsRequest should be added to the HiveMetaStoreClient class.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
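
For readers unfamiliar with the term, the sketch below shows in Python what a builder for a request object generally looks like. It is only an illustration of the pattern: TablesRequest and TablesRequestBuilder are hypothetical stand-ins with made-up fields, not the Thrift GetTablesRequest struct or any method that exists on HiveMetaStoreClient.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TablesRequest:
    """Hypothetical, simplified request object (not the Thrift struct)."""

    db_name: str
    table_names: Tuple[str, ...] = ()
    catalog: Optional[str] = None


class TablesRequestBuilder:
    """Collects optional settings step by step, then builds the request."""

    def __init__(self, db_name: str):
        self._db_name = db_name
        self._table_names = []
        self._catalog = None

    def table(self, name: str) -> "TablesRequestBuilder":
        self._table_names.append(name)
        return self  # returning self allows chained calls

    def catalog(self, name: str) -> "TablesRequestBuilder":
        self._catalog = name
        return self

    def build(self) -> TablesRequest:
        if not self._db_name:
            raise ValueError("db_name is required")
        return TablesRequest(self._db_name, tuple(self._table_names), self._catalog)


if __name__ == "__main__":
    request = TablesRequestBuilder("default").table("t1").table("t2").build()
    print(request)
```

The usual motivation for this pattern is that callers set only the fields they care about, and validation happens in one place when build() is called.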
[jira] [Created] (HIVE-24385) Does the Hive date data type have a value range?
罗鹏程 created HIVE-24385:
----------------------------------------------

             Summary: Does the Hive date data type have a value range?
                 Key: HIVE-24385
                 URL: https://issues.apache.org/jira/browse/HIVE-24385
             Project: Hive
          Issue Type: Task
          Components: CLI
    Affects Versions: 2.1.1
         Environment: hive2.1
            Reporter: 罗鹏程

The column's data type is date. Inserting the value "-99-99" gives null; inserting the value "-99-99" gives "2020-10-19".

After changing the data type to string, inserting the value "-99-99" again gives "-99-99".

Does date have a value range?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24384) SharedWorkOptimizer improvements
Zoltan Haindrich created HIVE-24384:
----------------------------------------------

             Summary: SharedWorkOptimizer improvements
                 Key: HIVE-24384
                 URL: https://issues.apache.org/jira/browse/HIVE-24384
             Project: Hive
          Issue Type: Improvement
            Reporter: Zoltan Haindrich
            Assignee: Zoltan Haindrich

This started as a small feature addition, but due to the sheer volume of the q.out changes it's better to make smaller changes at a time, which means more tickets...

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24383) Add Table type to HPL/SQL
Attila Magyar created HIVE-24383:
----------------------------------------------

             Summary: Add Table type to HPL/SQL
                 Key: HIVE-24383
                 URL: https://issues.apache.org/jira/browse/HIVE-24383
             Project: Hive
          Issue Type: Improvement
          Components: hpl/sql
            Reporter: Attila Magyar
            Assignee: Attila Magyar

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Credits page - Edits
Hello,

For Hive, under
https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-Newcommitters
the step

svn co https://svn.apache.org/repos/asf/hive/site hive-site

reports that the URL is incorrect. Does anyone know the correct location for the above repo?

thanks
anishek