[jira] [Commented] (HIVE-4243) Fix column names in FileSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939676#comment-14939676 ] Owen O'Malley commented on HIVE-4243: - The first two tests pass locally with the patch rebased to master. The last two tests are unrelated and fail on master without the patch. > Fix column names in FileSinkOperator > > > Key: HIVE-4243 > URL: https://issues.apache.org/jira/browse/HIVE-4243 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 1.3.0, 2.0.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, > HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch > > > All of the ObjectInspectors given to SerDe's by FileSinkOperator have virtual > column names. Since the files are part of tables, Hive knows the column > names. For self-describing file formats like ORC, having the real column > names will improve the understandability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
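The improvement HIVE-4243 describes, replacing positional virtual names with the table's real column names, can be sketched outside Hive. This is an illustrative Python sketch, not Hive's implementation; the function name and the `_col` prefix convention for virtual columns are the only assumptions, and the prefix does match what Hive's planner generates.

```python
def resolve_column_names(virtual_names, table_schema):
    """Map positional virtual names (_col0, _col1, ...) back to the
    table's real column names when the table schema is known -- the
    kind of substitution HIVE-4243 wants FileSinkOperator to do for
    self-describing formats like ORC. Illustrative only."""
    resolved = []
    for i, name in enumerate(virtual_names):
        if name.startswith("_col") and i < len(table_schema):
            resolved.append(table_schema[i])  # real name from the table
        else:
            resolved.append(name)             # keep non-virtual names
    return resolved
```

With a two-column table, `resolve_column_names(["_col0", "_col1"], ["id", "name"])` yields the real names `["id", "name"]`, which is what a self-describing file format would then record.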
[jira] [Updated] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11973: Attachment: (was: HIVE-11973.1.patch) > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as follows to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But equality works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11973: Attachment: HIVE-11973.1.patch Not sure why this patch is not in the waiting list of the pre-commit build. Re-attaching it. > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as follows to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But equality works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
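The behavior the report asks for, letting `IN` coerce string literals to DATE the way `=` already does, can be sketched in Python. This is not Hive code; the function names are invented for illustration, and the coercion rule (parse `YYYY-MM-DD` literals when the column side is a date) is only what the bug report implies the fix should do.

```python
from datetime import date, datetime

def coerce_to_date(value):
    """Convert a 'YYYY-MM-DD' string literal to a date, mimicking the
    implicit string-to-date coercion Hive applies for `=` comparisons."""
    if isinstance(value, date):
        return value
    return datetime.strptime(value, "%Y-%m-%d").date()

def date_in(column_value, literals):
    """Evaluate `column IN (...)` with string literals coerced to DATE
    first -- the behavior HIVE-11973 asks the IN operator to share."""
    return coerce_to_date(column_value) in {coerce_to_date(v) for v in literals}

# The failing query's predicate, evaluated with coercion instead of
# rejecting the mixed {date IN (string, string)} argument types:
date_in(date(2000, 3, 22), ["2000-03-22", "2001-03-22"])  # True
```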
[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe
[ https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11785: Attachment: HIVE-11785.2.patch Update all the affected thrift code and update the unit test baselines. > Support escaping carriage return and new line for LazySimpleSerDe > - > > Key: HIVE-11785 > URL: https://issues.apache.org/jira/browse/HIVE-11785 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.0.0 > > Attachments: HIVE-11785.2.patch, HIVE-11785.patch, test.parquet > > > Create the table and perform the queries as follows. You will see different > results when the setting changes. > The expected result should be: > {noformat} > 1 newline > here > 2 carriage return > 3 both > here > {noformat} > {noformat} > hive> create table repo (lvalue int, charstring string) stored as parquet; > OK > Time taken: 0.34 seconds > hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo; > Loading data to table default.repo > chgrp: changing ownership of > 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not > belong to hive > Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, > rawDataSize=0] > OK > Time taken: 0.732 seconds > hive> set hive.fetch.task.conversion=more; > hive> select * from repo; > OK > 1 newline > here > here carriage return > 3 both > here > Time taken: 0.253 seconds, Fetched: 3 row(s) > hive> set hive.fetch.task.conversion=none; > hive> select * from repo; > Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3 > Total jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1441752031022_0006, Tracking URL = > http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/ > Kill Command = > 
/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job > -kill job_1441752031022_0006 > Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0 > 2015-09-09 11:35:54,127 Stage-1 map = 0%, reduce = 0% > 2015-09-09 11:36:04,664 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.98 > sec > MapReduce Total cumulative CPU time: 2 seconds 980 msec > Ended Job = job_1441752031022_0006 > MapReduce Jobs Launched: > Stage-Stage-1: Map: 1 Cumulative CPU: 2.98 sec HDFS Read: 4251 HDFS > Write: 51 SUCCESS > Total MapReduce CPU Time Spent: 2 seconds 980 msec > OK > 1 newline > NULL NULL > 2 carriage return > NULL NULL > 3 both > NULL NULL > Time taken: 25.131 seconds, Fetched: 6 row(s) > hive> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
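The feature under discussion, escaping embedded carriage returns and newlines so a line-oriented text SerDe does not split records mid-field, can be sketched as a backslash escape scheme. This is an illustrative sketch, not LazySimpleSerDe's actual byte-level encoding; the function names and the choice of `\n`/`\r` escape sequences are assumptions.

```python
def escape_crlf(field, escape_char="\\"):
    """Escape CR and LF (and the escape char itself) so the field fits
    on one physical line of a delimited text file."""
    return (field.replace(escape_char, escape_char * 2)  # escape the escape first
                 .replace("\n", escape_char + "n")
                 .replace("\r", escape_char + "r"))

def unescape_crlf(field, escape_char="\\"):
    """Reverse escape_crlf when reading the row back."""
    out, i = [], 0
    while i < len(field):
        c = field[i]
        if c == escape_char and i + 1 < len(field):
            nxt = field[i + 1]
            out.append({"n": "\n", "r": "\r", escape_char: escape_char}.get(nxt, nxt))
            i += 2
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

The design point is order: the escape character must be doubled before `\n`/`\r` are rewritten, otherwise the reader cannot tell a literal backslash-n from an escaped newline.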
[jira] [Commented] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940004#comment-14940004 ] Aihua Xu commented on HIVE-11980: - The test failures seem unrelated to the patch. [~szehon] can you help review the change? > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > When we create a new table via CTAS from a table whose table-level serde is > Parquet and whose partition-level serde is JSON, the following > exception is currently thrown if there are struct fields. > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable, similar to getStructFieldData. 
> {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
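The shape of the fix described above (accept a plain `List` as well as an `ArrayWritable` instead of throwing) can be sketched in Python. The `ArrayWritable` class here is a stand-in for Hadoop's Java type, and the function name mirrors the method in the stack trace; everything else is illustrative.

```python
class ArrayWritable:
    """Stand-in for Hadoop's ArrayWritable (simplified for illustration)."""
    def __init__(self, values):
        self._values = list(values)

    def get(self):
        return self._values

def get_struct_fields_data_as_list(data):
    """Sketch of the fixed inspector method: handle List in addition to
    ArrayWritable, rather than rejecting it with 'Cannot inspect'."""
    if data is None:
        return None
    if isinstance(data, ArrayWritable):
        return data.get()
    if isinstance(data, list):  # the case HIVE-11980 adds support for
        return data
    raise TypeError("Cannot inspect " + type(data).__name__)
```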
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Assignee: (was: Eugene Koifman) > Integrate HCatalog with locking > --- > > Key: HIVE-6207 > URL: https://issues.apache.org/jira/browse/HIVE-6207 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.0 >Reporter: Alan Gates > Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.patch > > > HCatalog currently ignores any locks created by Hive users. It should > respect the locks Hive creates as well as create locks itself when locking is > configured. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940066#comment-14940066 ] Thejas M Nair commented on HIVE-11175: -- Sorry about the delay in reviewing the patch! The changes look good. However, it looks like the new test is failing because of a difference in the jar paths seen in the q.out files. It also looks like some of the old UDF test cases need to be updated. We should check for the presence of an additional input entity and one less output entity. The test failure in TestHCatClient.testTableSchemaPropagation seems to be unrelated. > create function using jar does not work with sql std authorization > -- > > Key: HIVE-11175 > URL: https://issues.apache.org/jira/browse/HIVE-11175 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.2.0 >Reporter: Olaf Flebbe >Assignee: Olaf Flebbe > Fix For: 2.0.0 > > Attachments: HIVE-11175.1.patch > > > {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} > fails with an error requiring ADMIN privileges to access the local foo.jar > resource. Same for HDFS (DFS_URI). > The problem is that the semantic analysis enforces the ADMIN privilege for > write, but the jar is clearly input, not output. > Patch and test case appended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12005: - Affects Version/s: 2.0.0 > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940147#comment-14940147 ] Prasanth Jayachandran commented on HIVE-12005: -- LGTM, +1. Pending tests. Are we getting rid of JDBC as well? JDBC is giving us more trouble, related to maxKeyPrefix limits across different DBs. We are patching it here and there to work around the limits imposed by different DBs. > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940101#comment-14940101 ] Daniel Dai commented on HIVE-10752: --- This should be committed to branch-1 as well. Also created HIVE-12006 to redo it in the right way. > Revert HIVE-5193 > > > Key: HIVE-10752 > URL: https://issues.apache.org/jira/browse/HIVE-10752 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10752.patch > > > Revert HIVE-5193 since it breaks pig+hcatalog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-12006. --- Resolution: Duplicate Sorry, I didn't realize there is one already. Sure, I will take a look. > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11720: Attachment: HIVE-11720.4.patch [~thejas] It is indeed needed. Added the change. > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11972) [Refactor] Improve determination of dynamic partitioning columns in FileSink Operator
[ https://issues.apache.org/jira/browse/HIVE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940137#comment-14940137 ] Prasanth Jayachandran commented on HIVE-11972: -- Looks much cleaner now. LGTM, +1 > [Refactor] Improve determination of dynamic partitioning columns in FileSink > Operator > - > > Key: HIVE-11972 > URL: https://issues.apache.org/jira/browse/HIVE-11972 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11972.2.patch, HIVE-11972.3.patch, > HIVE-11972.4.patch, HIVE-11972.patch > > > Currently it uses column names to locate DP columns, which is brittle since > column names may change during planning and optimization phases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12005) Remove hbase based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940155#comment-14940155 ] Ashutosh Chauhan commented on HIVE-12005: - Yup, I indeed plan to remove JDBC as well. > Remove hbase based stats collection mechanism > - > > Key: HIVE-12005 > URL: https://issues.apache.org/jira/browse/HIVE-12005 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12005.patch > > > Currently, HBase is one of the mechanisms to collect and store statistics. I > have never come across anyone using it. The FileSystem-based collection mechanism > has been the default for a few releases and is working well. We shall remove the > HBase stats collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12007: - Attachment: HIVE-12007.patch > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940241#comment-14940241 ] Szehon Ho commented on HIVE-12007: -- Backward compatibility is important, +1 > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12007) Hive LDAP Authenticator should allow just Domain without baseDN (for AD)
[ https://issues.apache.org/jira/browse/HIVE-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940267#comment-14940267 ] Naveen Gangam commented on HIVE-12007: -- code posted for review at https://reviews.apache.org/r/38936/ > Hive LDAP Authenticator should allow just Domain without baseDN (for AD) > > > Key: HIVE-12007 > URL: https://issues.apache.org/jira/browse/HIVE-12007 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12007.patch > > > When the baseDN is not configured but only the Domain has been set in > hive-site.xml, LDAP Atn provider cannot locate the user in the directory. > Authentication fails in such cases. This is a change from the prior > implementation where the auth request succeeds based on being able to bind to > the directory. This has been called out in the design doc in HIVE-7193. > But we should allow this for now for backward compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
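One plausible way an LDAP authenticator can work from a bare domain without a configured baseDN is the common Active Directory convention of deriving the search base from the DNS domain. This is only a sketch of that convention, not necessarily what the HIVE-12007 patch does; the function name is invented for illustration.

```python
def domain_to_base_dn(domain):
    """Derive an LDAP search base from a DNS domain, e.g.
    'ad.example.com' -> 'dc=ad,dc=example,dc=com' (a common
    Active Directory convention)."""
    return ",".join("dc=" + part for part in domain.split(".") if part)
```

A provider using this rule could fall back to `domain_to_base_dn(domain)` whenever only the domain is set in hive-site.xml, preserving the prior bind-based behavior for existing deployments.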
[jira] [Commented] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940108#comment-14940108 ] Aihua Xu commented on HIVE-12006: - Hi Daniel, I already created the task HIVE-10755 for it and have the patch available. Can you take a look at that? > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11896) CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive default partition when inserting data
[ https://issues.apache.org/jira/browse/HIVE-11896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-11896. - Resolution: Fixed Assignee: Ashutosh Chauhan (was: Pengcheng Xiong) This is fixed via HIVE-11972 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive > default partition when inserting data > -- > > Key: HIVE-11896 > URL: https://issues.apache.org/jira/browse/HIVE-11896 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Ashutosh Chauhan > > To repro, run dynpart_sort_opt_vectorization.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11896) CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive default partition when inserting data
[ https://issues.apache.org/jira/browse/HIVE-11896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11896: Fix Version/s: 2.0.0 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): deal with hive > default partition when inserting data > -- > > Key: HIVE-11896 > URL: https://issues.apache.org/jira/browse/HIVE-11896 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Ashutosh Chauhan > Fix For: 2.0.0 > > > To repro, run dynpart_sort_opt_vectorization.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940106#comment-14940106 ] Daniel Dai commented on HIVE-10752: --- Committed to branch-1. > Revert HIVE-5193 > > > Key: HIVE-10752 > URL: https://issues.apache.org/jira/browse/HIVE-10752 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10752.patch > > > Revert HIVE-5193 since it breaks pig+hcatalog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12006) Enable Columnar Pushdown for RC/ORC File for HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-12006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940125#comment-14940125 ] Aihua Xu commented on HIVE-12006: - Thanks. I have had that patch for a while, but haven't gotten it committed. I'd appreciate it if you can review it. :) > Enable Columnar Pushdown for RC/ORC File for HCatLoader > --- > > Key: HIVE-12006 > URL: https://issues.apache.org/jira/browse/HIVE-12006 > Project: Hive > Issue Type: Improvement > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 1.3.0, 2.0.0 > > > This was initially enabled by HIVE-5193. However, HIVE-10752 reverted it since > there was an issue in the original implementation. > We shall fix the issue and re-enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12004) SDPO doesnt set colExprMap correctly on new RS
[ https://issues.apache.org/jira/browse/HIVE-12004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940144#comment-14940144 ] Prasanth Jayachandran commented on HIVE-12004: -- LGTM, +1. Pending tests > SDPO doesnt set colExprMap correctly on new RS > -- > > Key: HIVE-12004 > URL: https://issues.apache.org/jira/browse/HIVE-12004 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.2.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12004.patch > > > As a result plan gets into a bad state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940187#comment-14940187 ] Szehon Ho commented on HIVE-11980: -- Looks simple, +1 > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > When we create a new table via CTAS from a table whose table-level serde is > Parquet and whose partition-level serde is JSON, the following > exception is currently thrown if there are struct fields. > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable, similar to getStructFieldData. > {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940183#comment-14940183 ] Ratandeep Ratti commented on HIVE-11878: Hi [~jdere] I got some time to look into this today. I incorporated your suggestion where I create a fresh classloader when a new session is created. I use, as parent, the thread context classloader for the freshly created session classloader (see RB: https://reviews.apache.org/r/38663/). I have some doubts about using the thread context classloader as the parent. This does not seem to provide clean isolation of jars/resources between different sessions. Case in point: a thread context classloader could be a previous session's classloader. This can happen when the same thread was used to work on a previous session, and is now being used to work on the current session. The thread context classloader could contain a different implementation of the same class also present in the session classloader. Do you see this as a problem? Another potential problem I'm thinking about -- which is present in the proposed approach (see RB) -- is that in HiveServer2 any worker thread can serve any request by mapping it to a persistent session. Couldn't this lead to a situation where, for a specific session, the session-specific classloader (conf.getClassLoader()) and the thread context classloader end up being different? Say we have two worker threads t1 and t2. The very first query is handled by t1, where a fresh session s1 is created along with a fresh classloader c1, which is set as the session-specific classloader and the thread context classloader. The next query for the same session is handled by t2. I guess since it is the same session s1, we do not create a fresh classloader. The session-specific classloader is c1, but since it is a different thread and no classloader has been set on it, the thread will have the system classloader as its context classloader. 
Couldn't this cause potential CNF exceptions? If I understood correctly, this problem also exists in the current implementation, doesn't it? > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created, Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent, and *u2* becomes the new > ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). 
> Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch
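The delegation problem described in the qtest scenario can be avoided by not chaining loaders at all. A hypothetical sketch (not the committed patch): keep one session-level `URLClassLoader` and append each registered jar's URL to it, so every UDF class is defined by the same loader and can resolve classes from all registered jars.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch, not the committed patch: the CNF scenario above comes
// from creating a new URLClassLoader per registered jar, so a class defined
// by an earlier loader (u1) can never see jars registered later (j2). One
// alternative is a single session-level loader that jars are appended to.
public class SessionClassLoader extends URLClassLoader {
    public SessionClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    // URLClassLoader.addURL is protected; expose it so registering a jar can
    // append to the existing loader instead of wrapping it in a new one.
    public void addJar(URL jarUrl) {
        addURL(jarUrl);
    }
}
```

With this shape, a class like *c1* is defined by the same loader that also sees *j2*, so resolving *c2* succeeds regardless of registration order.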
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Component/s: HCatalog > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
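The fix direction described above can be sketched as issuing two updates per heartbeat instead of one. This is a hedged illustration, not the actual TxnHandler code; the table and column names (TXNS, TXN_LAST_HEARTBEAT, HIVE_LOCKS, HL_LAST_HEARTBEAT, HL_TXNID) follow the metastore transaction schema but should be treated as illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Hedged sketch of the fix direction, not actual TxnHandler code: a txn
// heartbeat should refresh the heartbeat timestamp on the transaction row
// AND on its lock rows, so SHOW LOCKS stays accurate.
public class TxnHeartbeatSql {
    public static List<String> heartbeatStatements(long txnId, long nowMillis) {
        return Arrays.asList(
            "UPDATE TXNS SET TXN_LAST_HEARTBEAT = " + nowMillis
                + " WHERE TXN_ID = " + txnId,
            // The piece this issue reports as missing: the txn's locks.
            "UPDATE HIVE_LOCKS SET HL_LAST_HEARTBEAT = " + nowMillis
                + " WHERE HL_TXNID = " + txnId);
    }
}
```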
[jira] [Commented] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940380#comment-14940380 ] Eugene Koifman commented on HIVE-11914: --- this should have a test in TestStreaming which is easier after HIVE-11983 is committed > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12009) Add IMetastoreClient.heartbeat(long[] lockIds)
[ https://issues.apache.org/jira/browse/HIVE-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-12009. --- Resolution: Won't Fix > Add IMetastoreClient.heartbeat(long[] lockIds) > -- > > Key: HIVE-12009 > URL: https://issues.apache.org/jira/browse/HIVE-12009 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > The current API only allows heartbeating 1 lock ID (external ID) at a time. > For multi-statement txns we can have multiple locks per txn and should be > able to do it in 1 call. > Used from DbTxnManager.heartbeat() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12009) Add IMetastoreClient.heartbeat(long[] lockIds)
[ https://issues.apache.org/jira/browse/HIVE-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940386#comment-14940386 ] Eugene Koifman commented on HIVE-12009: --- Actually, this should not be necessary. The only time you can have > 1 'ext' lock is when there is a transaction open, so it's better to heartbeat the txn > Add IMetastoreClient.heartbeat(long[] lockIds) > -- > > Key: HIVE-12009 > URL: https://issues.apache.org/jira/browse/HIVE-12009 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > The current API only allows heartbeating 1 lock ID (external ID) at a time. > For multi-statement txns we can have multiple locks per txn and should be > able to do it in 1 call. > Used from DbTxnManager.heartbeat() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
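The Won't Fix reasoning above reduces to a simple decision, sketched below with a hypothetical helper (not DbTxnManager's actual code): if a transaction is open, heartbeat the transaction, which covers all of its locks in one call; otherwise there can be at most one external lock, so no batch API is needed.

```java
// Hypothetical helper illustrating the Won't Fix reasoning, not actual
// DbTxnManager code: more than one external lock implies an open txn, and
// heartbeating the txn covers all of its locks, so a heartbeat(long[])
// batch API buys nothing.
public class HeartbeatTarget {
    public static String choose(long txnId, long[] lockIds) {
        if (txnId > 0) {
            return "txn:" + txnId;        // one call covers every lock
        }
        if (lockIds.length != 1) {
            throw new IllegalStateException(
                "multiple locks without an open txn should not happen");
        }
        return "lock:" + lockIds[0];
    }
}
```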
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Issue Type: Improvement (was: Bug) > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and 
sr_ticket_number > is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), > ss_ticket_number (type:
[jira] [Commented] (HIVE-12003) Hive Streaming API : Add check to ensure table is transactional
[ https://issues.apache.org/jira/browse/HIVE-12003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940402#comment-14940402 ] Roshan Naik commented on HIVE-12003: I am revising this patch to exclude the -w option > Hive Streaming API : Add check to ensure table is transactional > --- > > Key: HIVE-12003 > URL: https://issues.apache.org/jira/browse/HIVE-12003 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: HIVE-12003.patch > > > Check if TBLPROPERTIES ('transactional'='true') is set when opening a connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
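The check this issue asks for amounts to reading the table's parameters map before opening a streaming connection. A minimal sketch, assuming the endpoint can fetch the table's TBLPROPERTIES as a map (not the actual HCatalog streaming code):

```java
import java.util.Map;

// Minimal sketch of the requested check, assuming the streaming endpoint can
// read the table's parameters map; not the actual HCatalog streaming code.
public class TransactionalCheck {
    public static boolean isTransactional(Map<String, String> tblProps) {
        // TBLPROPERTIES ('transactional'='true') marks a table ACID-capable.
        return tblProps != null
            && "true".equalsIgnoreCase(tblProps.get("transactional"));
    }
}
```

A connection open would then fail fast with a clear error instead of surfacing a confusing failure later in the write path.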
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: HIVE-11969.02.patch Updated > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
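The background-start idea above can be sketched with a single-thread executor: kick off the expensive AM launch at CLI startup and only block on the future when the first query actually needs the session. Class and method names here are illustrative, not Hive's actual API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hedged sketch of the idea (names are illustrative, not Hive's classes):
// start the Tez session on a background thread at CLI startup; the first
// query waits on the Future only if the AM is not up yet.
public class BackgroundTezSession {
    private final ExecutorService pool = Executors.newSingleThreadExecutor();
    private Future<String> session;   // stand-in for a real TezSession handle

    public void startAsync() {
        // Placeholder for the expensive AM launch.
        session = pool.submit(() -> "session-open");
    }

    public String awaitSession() throws Exception {
        try {
            return session.get();     // blocks only until the AM is ready
        } finally {
            pool.shutdown();
        }
    }
}
```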
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.16.patch Update patch to avoid conflicts > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.16.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Description: TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the associated locks. This makes SHOW LOCKS confusing/misleading. This is especially visible in Streaming API use cases which use TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) was:TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the associated locks. This makes SHOW LOCKS confusing/misleading. > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Affects Version/s: (was: 0.14.0) 2.0.0 > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and 
sr_ticket_number > is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), >
[jira] [Updated] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11720: Attachment: HIVE-11720.4.patch > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11898) support default partition in metastoredirectsql
[ https://issues.apache.org/jira/browse/HIVE-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11898: Description: Right now, direct SQL intentionally skips the processing for default partition for PPD case; the SQL query fails and we fall back to JDO. Add support for default partition based on the same rules as JDO (don't return it) > support default partition in metastoredirectsql > --- > > Key: HIVE-11898 > URL: https://issues.apache.org/jira/browse/HIVE-11898 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11898.01.patch, HIVE-11898.02.patch, > HIVE-11898.patch > > > Right now, direct SQL intentionally skips the processing for default > partition for PPD case; the SQL query fails and we fall back to JDO. Add > support for default partition based on the same rules as JDO (don't return it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
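The "same rules as JDO (don't return it)" behavior can be sketched as a post-filter on partition names. This is a hedged illustration of the intended behavior, not MetaStoreDirectSql itself; "__HIVE_DEFAULT_PARTITION__" is Hive's default partition name (hive.exec.default.partition.name).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hedged sketch of the intended behavior, not MetaStoreDirectSql itself:
// drop the default partition from direct-SQL results the way JDO does.
public class DefaultPartitionFilter {
    static final String DEFAULT_NAME = "__HIVE_DEFAULT_PARTITION__";

    public static List<String> dropDefault(List<String> partNames) {
        return partNames.stream()
            .filter(p -> !p.contains(DEFAULT_NAME))
            .collect(Collectors.toList());
    }
}
```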
[jira] [Updated] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-11983: --- Attachment: HIVE-11983.4.patch Uploading v4 patch; found that the patch was not applying due to the above noted -w option. This patch is created without the -w option and applies cleanly using 'patch -p0' and 'git apply -p 0'. RB, however, still doesn't like it. This patch is on top of commit SHA 24988f7 (HIVE-11972) > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
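The fix named in the description, hashing each record to pick its bucket, can be sketched as below. This is a hedged illustration, not the streaming API's actual code: the key's hash code stands in for Hive's bucketing hash, and the standard (hash & Integer.MAX_VALUE) % numBuckets rule maps it to a bucket, so records spread across buckets instead of a whole TransactionBatch landing in one.

```java
// Hedged sketch of per-record hash-based bucket assignment, not the
// streaming API's actual code. Masking the sign bit keeps negative hash
// codes mapping to a valid bucket index.
public class BucketAssigner {
    public static int bucketFor(Object bucketKey, int numBuckets) {
        int hash = (bucketKey == null) ? 0 : bucketKey.hashCode();
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```

The important property is determinism: the same bucketing key always lands in the same bucket, which is what table bucketing requires.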
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9695: -- Attachment: HIVE-9695.patch > Redundant filter operator in reducer Vertex when CBO is disabled > > > Key: HIVE-9695 > URL: https://issues.apache.org/jira/browse/HIVE-9695 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 0.14.0 >Reporter: Mostafa Mokhtar >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-9695.patch > > > There is a redundant filter operator in reducer Vertex when CBO is disabled. > Query > {code} > select > ss_item_sk, ss_ticket_number, ss_store_sk > from > store_sales a, store_returns b, store > where > a.ss_item_sk = b.sr_item_sk > and a.ss_ticket_number = b.sr_ticket_number > and ss_sold_date_sk between 2450816 and 2451500 > and sr_returned_date_sk between 2450816 and 2451500 > and s_store_sk = ss_store_sk; > {code} > Plan snippet > {code} > Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE > Column stats: COMPLETE > Filter Operator > predicate: (_col1 = _col27) and (_col8 = _col34)) and > _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) > and (_col49 = _col6)) (type: boolean) > {code} > Full plan with CBO disabled > {code} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 > (SIMPLE_EDGE) > DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: b > filterExpr: ((sr_item_sk is not null and sr_ticket_number > is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: > boolean) > Statistics: Num rows: 2370038095 Data size: 170506118656 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (sr_item_sk is not null and sr_ticket_number > 
is not null) (type: boolean) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: sr_item_sk (type: int), > sr_ticket_number (type: int) > sort order: ++ > Map-reduce partition columns: sr_item_sk (type: int), > sr_ticket_number (type: int) > Statistics: Num rows: 706893063 Data size: 6498502768 > Basic stats: COMPLETE Column stats: COMPLETE > value expressions: sr_returned_date_sk (type: int) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: store > filterExpr: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 3256276 Basic stats: > COMPLETE Column stats: COMPLETE > Filter Operator > predicate: s_store_sk is not null (type: boolean) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: s_store_sk (type: int) > sort order: + > Map-reduce partition columns: s_store_sk (type: int) > Statistics: Num rows: 1704 Data size: 6816 Basic stats: > COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: a > filterExpr: (((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 > AND 2451500) (type: boolean) > Statistics: Num rows: 28878719387 Data size: 2405805439460 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ss_item_sk is not null and ss_ticket_number > is not null) and ss_store_sk is not null) (type: boolean) > Statistics: Num rows: 8405840828 Data size: 110101408700 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: ss_item_sk (type: int), > ss_ticket_number (type: int) >
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: HIVE-11969.02.patch > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11969: Attachment: (was: HIVE-11969.02.patch) > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11914) When transactions gets a heartbeat, it doesn't update the lock heartbeat.
[ https://issues.apache.org/jira/browse/HIVE-11914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11914: -- Attachment: HIVE-11914.patch prelim patch > When transactions gets a heartbeat, it doesn't update the lock heartbeat. > - > > Key: HIVE-11914 > URL: https://issues.apache.org/jira/browse/HIVE-11914 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.0.1 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11914.patch > > > TxnHandler.heartbeatTxn() updates the timestamp on the txn but not on the > associated locks. This makes SHOW LOCKS confusing/misleading. > This is especially visible in Streaming API use cases which use > TxnHandler.heartbeatTxnRange(HeartbeatTxnRangeRequest rqst) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results
[ https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11517: Fix Version/s: 1.0.0 1.2.0 > Vectorized auto_smb_mapjoin_14.q produces different results > --- > > Key: HIVE-11517 > URL: https://issues.apache.org/jira/browse/HIVE-11517 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 2.0.0 > > Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch > > > Converted Q file to use ORC and turned on vectorization. > The query: > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > produces 10 instead of 22. > The query: > {code} > select src1.key, src1.cnt1, src2.cnt1 from > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq1 group by key > ) src1 > join > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq2 group by key > ) src2 > {code} > produces: > {code} > 0 3 3 > 2 1 1 > 4 1 1 > 5 3 3 > 8 1 1 > 9 1 1 > {code} > instead of: > {code} > 0 9 9 > 2 1 1 > 4 1 1 > 5 9 9 > 8 1 1 > 9 1 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
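The expected counts in the description follow from join cardinality: a key occurring m times in tbl1 and n times in tbl2 yields m * n joined rows, so a key with 3 matches on each side should contribute 9 rows per group, not 3 (and the totals 9+1+1+9+1+1 = 22 versus the buggy 3+1+1+3+1+1 = 10). A trivial check of that arithmetic:

```java
// Trivial arithmetic behind the expected output: a join key occurring m
// times on the left and n times on the right produces m * n joined rows,
// which is why the correct per-key counts are 9 (3 x 3), not 3.
public class JoinCardinality {
    public static int joinedRows(int leftOccurrences, int rightOccurrences) {
        return leftOccurrences * rightOccurrences;
    }
}
```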
[jira] [Commented] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results
[ https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940454#comment-14940454 ] Matt McCline commented on HIVE-11517: - Added branch-1.0 and branch-1.2 > Vectorized auto_smb_mapjoin_14.q produces different results > --- > > Key: HIVE-11517 > URL: https://issues.apache.org/jira/browse/HIVE-11517 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.0.0, 1.2.0, 1.3.0, 2.0.0 > > Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch > > > Converted Q file to use ORC and turned on vectorization. > The query: > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > produces 10 instead of 22. > The query: > {code} > select src1.key, src1.cnt1, src2.cnt1 from > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq1 group by key > ) src1 > join > ( > select key, count(*) as cnt1 from > ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join > tbl2 b on a.key = b.key > ) subq2 group by key > ) src2 > {code} > produces: > {code} > 0 3 3 > 2 1 1 > 4 1 1 > 5 3 3 > 8 1 1 > 9 1 1 > {code} > instead of: > {code} > 0 9 9 > 2 1 1 > 4 1 1 > 5 9 9 > 8 1 1 > 9 1 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940485#comment-14940485 ] Prasanth Jayachandran commented on HIVE-11928: -- The test ran successfully but the results are not posted due to "403 Forbidden" error. Copy pasting the results here from http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/consoleFull {code} {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764451/HIVE-11928.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5485/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12764451 - PreCommit-HIVE-TRUNK-Build 2015-10-01 13:16:51,702 ERROR JIRAService.postComment:176 Encountered error attempting to post comment to HIVE-11928 java.lang.RuntimeException: 403 Forbidden at org.apache.hive.ptest.execution.JIRAService.postComment(JIRAService.java:171) at org.apache.hive.ptest.execution.PTest.publishJiraComment(PTest.java:242) at org.apache.hive.ptest.execution.PTest.run(PTest.java:216) at org.apache.hive.ptest.api.server.TestExecutor.run(TestExecutor.java:120) {code} > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Attachments: HIVE-11928-branch-1.patch, HIVE-11928.1.patch, > HIVE-11928.1.patch, HIVE-11928.2.patch, HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11976) Extend CBO rules to being able to apply rules only once on a given operator
[ https://issues.apache.org/jira/browse/HIVE-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940530#comment-14940530 ] Szehon Ho commented on HIVE-11976: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764445/HIVE-11976.01.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9625 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-orc_merge6.q-vector_outer_join0.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5484/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5484/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5484/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} > Extend CBO rules to being able to apply rules only once on a given operator > --- > > Key: HIVE-11976 > URL: https://issues.apache.org/jira/browse/HIVE-11976 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11976.01.patch, HIVE-11976.patch > > > Create a way to bail out quickly from HepPlanner if the rule has been already > applied on a certain operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11928: - Attachment: HIVE-11928-branch-1.patch Uploading new patch for branch-1 > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
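For context, the limit in question is protobuf's default 64 MB cap on parsed message size; as in HIVE-11592, the usual remedy is to raise the limit (via `CodedInputStream.setSizeLimit`) before parsing a section whose serialized length is already known. The fragment below is only an illustrative sketch of computing such a limit, not the patch itself; the helper name and the 1 KB slack are assumptions:

```java
public class ProtobufLimitSketch {
    // protobuf's default message size limit: 64 MB.
    static final int DEFAULT_PROTOBUF_LIMIT = 64 << 20;

    // Pick the size limit to install before parsing a section (e.g. the ORC
    // footer) whose serialized length is known from the file postscript.
    // The 1 KB slack is an arbitrary illustrative cushion.
    static int limitFor(long sectionLength) {
        return (int) Math.max(sectionLength + 1024L, DEFAULT_PROTOBUF_LIMIT);
    }

    public static void main(String[] args) {
        System.out.println(limitFor(1_000L));       // small footer: default limit is enough
        System.out.println(limitFor(200_000_000L)); // oversized footer: limit raised past it
    }
}
```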
[jira] [Commented] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940528#comment-14940528 ] Szehon Ho commented on HIVE-11973: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764251/HIVE-11973.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9642 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5483/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5483/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5483/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query : > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The 
arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I added explicit casts to work around the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But the equality comparison works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
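The behavior the reporter expects from IN is what equality already does: coerce the string literal to the column's DATE type before comparing. The sketch below is not Hive's implementation; it is a minimal stand-in (class and method names are hypothetical) that shows the coerce-then-compare semantics the explicit CAST workaround performs by hand:

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

public class DateInSketch {
    // Illustrative stand-in (not Hive's code): evaluate "column IN (...)"
    // by coercing each 'yyyy-MM-dd' string literal to a DATE first, exactly
    // what CAST('2000-03-22' AS DATE) does explicitly in the workaround.
    static boolean dateIn(LocalDate column, List<String> literals) {
        return literals.stream()
                .map(LocalDate::parse)      // string literal -> DATE
                .anyMatch(column::equals);  // then compare like "="
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.parse("2000-03-22");
        System.out.println(dateIn(d, Arrays.asList("2000-03-22", "2001-03-22"))); // true
        System.out.println(dateIn(d, Arrays.asList("1999-01-01")));               // false
    }
}
```

With this coercion in the semantic analyzer, `d_date IN ('2000-03-22','2001-03-22')` would type-check the same way `d_date = '2000-03-22'` already does.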
[jira] [Commented] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940549#comment-14940549 ] Eugene Koifman commented on HIVE-11983: --- DelimitedInputWriter:215 - why did you change this to StringBuffer? AbstractRecordWriter:80 - why change how the class is loaded? StrictJsonWriter: the two constructors seem identical except for HiveConf. Could the first one use this(endPoint, null)? getObjectInspectorsForBucketedCols() seems exactly the same as in DelimitedInputWriter getBucketFields() - same as above The write(long, byte[]) methods on the two writers: one calls reorderFields(), the other does not. Is that intentional? TestStreaming: this has driver.run("set ") - Driver doesn't support the "set" command, so all of these are guaranteed to fail. > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
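The proposed fix — hashing each record to pick its bucket — can be sketched as below. This is an illustrative sketch, not the patch code; the class and method names are hypothetical, and it assumes the record exposes an object whose `hashCode()` serves as the bucketing key:

```java
public class BucketHashSketch {
    // Sketch of the proposed fix: derive the bucket from the record's
    // bucketing key instead of reusing one bucket per TransactionBatch.
    // Masking with Integer.MAX_VALUE keeps the result non-negative even
    // when hashCode() returns Integer.MIN_VALUE (where Math.abs would fail).
    static int bucketFor(Object bucketKey, int numBuckets) {
        return (bucketKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int b1 = bucketFor("customer-42", 8);
        int b2 = bucketFor("customer-42", 8);
        System.out.println(b1 == b2);          // same key always lands in the same bucket
        System.out.println(b1 >= 0 && b1 < 8); // bucket id stays in range
    }
}
```

Deriving the bucket from the key, rather than the batch, also keeps streamed rows consistent with how bucketed reads expect data to be distributed.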
[jira] [Updated] (HIVE-12011) unable to create temporary table using CTAS if regular table with that name already exists
[ https://issues.apache.org/jira/browse/HIVE-12011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12011: --- Attachment: HIVE-12011.01.patch > unable to create temporary table using CTAS if regular table with that name > already exists > -- > > Key: HIVE-12011 > URL: https://issues.apache.org/jira/browse/HIVE-12011 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12011.01.patch > > > CTAS temporary table query fails if regular table with the same name already > exists. > Steps to reproduce the issue: > {noformat} > hive> use dbtemptable; > OK > Time taken: 0.273 seconds > hive> create table a(i int); > OK > Time taken: 0.297 seconds > hive> create temporary table a(i int); > OK > Time taken: 0.165 seconds > hive> create table b(i int); > OK > Time taken: 0.212 seconds > hive> create temporary table b as select * from a; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > Table already exists: dbtemptable.b > hive> create table c(i int); > OK > Time taken: 0.264 seconds > hive> create temporary table b as select * from c; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > Table already exists: dbtemptable.b > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12012) select query on json table with map type column fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-12012: -- Attachment: HIVE-12012.1.patch JsonSerDe seems to only support map values of type string. Attaching patch and test cases. [~sushanth], can you take a look? There do not seem to be many tests for JsonSerDe and I want to make sure I'm not breaking any behavior here. > select query on json table with map type column fails > - > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940534#comment-14940534 ] Szehon Ho commented on HIVE-11786: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764455/HIVE-11786.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5487/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5487/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5487/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > Deprecate the use of redundant column in colunm stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.1.patch, HIVE-11786.1.patch, > HIVE-11786.2.patch, HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS, PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, PARTITION_NAME since these tables already have > foreign key like TBL_ID, or PART_ID referencing to TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (sometimes difficult) in column stats related feature > implementation. 
For example, when renaming a table, we have to update > the TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA first deprecates the use of these columns at the HMS code level. A > follow-up JIRA is to be opened to focus on the DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11928) ORC footer and metadata section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11928: - Summary: ORC footer and metadata section can also exceed protobuf message limit (was: ORC footer section can also exceed protobuf message limit) > ORC footer and metadata section can also exceed protobuf message limit > -- > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940532#comment-14940532 ] Szehon Ho commented on HIVE-11928: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764451/HIVE-11928.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5485/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5485/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11928-branch-1.patch, HIVE-11928-branch-1.patch, > HIVE-11928.1.patch, HIVE-11928.1.patch, HIVE-11928.2.patch, > HIVE-11928.2.patch, HIVE-11928.3.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11684) Implement limit pushdown through outer join in CBO
[ https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940533#comment-14940533 ] Szehon Ho commented on HIVE-11684: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764453/HIVE-11684.12.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9642 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5486/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5486/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5486/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} > Implement limit pushdown through outer join in CBO > -- > > Key: HIVE-11684 > URL: https://issues.apache.org/jira/browse/HIVE-11684 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, > HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, > HIVE-11684.07.patch, HIVE-11684.08.patch, HIVE-11684.09.patch, > HIVE-11684.10.patch, HIVE-11684.11.patch, HIVE-11684.12.patch, > HIVE-11684.12.patch, HIVE-11684.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12012) select query on json table with map containing numeric values fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-12012: -- Summary: select query on json table with map containing numeric values fails (was: select query on json table with map type column fails) > select query on json table with map containing numeric values fails > --- > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12003) Hive Streaming API : Add check to ensure table is transactional
[ https://issues.apache.org/jira/browse/HIVE-12003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-12003: --- Attachment: HIVE-12003.2.patch > Hive Streaming API : Add check to ensure table is transactional > --- > > Key: HIVE-12003 > URL: https://issues.apache.org/jira/browse/HIVE-12003 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: HIVE-12003.2.patch, HIVE-12003.patch > > > Check if TBLPROPERTIES ('transactional'='true') is set when opening connection -- This message was sent by Atlassian JIRA (v6.3.4#6332)
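The guard this issue asks for amounts to inspecting the table's properties when the streaming connection is opened. A minimal sketch of that check follows; the class and method names are hypothetical, only the `'transactional'='true'` property key/value comes from the issue:

```java
import java.util.Map;

public class TransactionalCheckSketch {
    // Sketch of the proposed guard: a streaming connection should be
    // rejected unless the target table carries
    // TBLPROPERTIES ('transactional'='true').
    static boolean isTransactional(Map<String, String> tblProperties) {
        return Boolean.parseBoolean(tblProperties.getOrDefault("transactional", "false"));
    }

    public static void main(String[] args) {
        System.out.println(isTransactional(Map.of("transactional", "true"))); // true
        System.out.println(isTransactional(Map.of()));                        // false
    }
}
```

Failing fast here turns a confusing downstream write error into a clear configuration error at connection time.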
[jira] [Commented] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size
[ https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940465#comment-14940465 ] Thejas M Nair commented on HIVE-11720: -- +1 > Allow HiveServer2 to set custom http request/response header size > - > > Key: HIVE-11720 > URL: https://issues.apache.org/jira/browse/HIVE-11720 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, > HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch > > > In HTTP transport mode, authentication information is sent over as part of > HTTP headers. Sometimes (observed when Kerberos is used) the default buffer > size for the headers is not enough, resulting in an HTTP 413 FULL head error. > We can expose those as customizable params. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6859) Test JIRA
[ https://issues.apache.org/jira/browse/HIVE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940518#comment-14940518 ] Hive QA commented on HIVE-6859: --- Test comment. > Test JIRA > - > > Key: HIVE-6859 > URL: https://issues.apache.org/jira/browse/HIVE-6859 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-6859.1.patch, HIVE-6859.2.patch, HIVE-6859.patch, > HIVE-6891.4.patch, HIVE-6891.5.patch, HIVE-6891.6.patch, HIVE-6891.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11995) Remove repetitively setting permissions in insert/load overwrite partition
[ https://issues.apache.org/jira/browse/HIVE-11995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940525#comment-14940525 ] Szehon Ho commented on HIVE-11995: -- Posting on behalf of HiveQA which was locked out: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764495/HIVE-11995.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9625 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-script_pipe.q-mapjoin_decimal.q-transform_ppr2.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_skewtable org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5482/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5482/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5482/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} > Remove repetitively setting permissions in insert/load overwrite partition > -- > > Key: HIVE-11995 > URL: https://issues.apache.org/jira/browse/HIVE-11995 > Project: Hive > Issue Type: Bug > Components: Security >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11995.patch > > > When hive.warehouse.subdir.inherit.perms is set to true, insert/load > overwrite .. partition sets table and partition permissions repeatedly, which > is unnecessary and causes performance issues, especially in cases where > multiple levels of partitions are involved. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11923) allow qtests to run via a single client session for tez and llap
[ https://issues.apache.org/jira/browse/HIVE-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11923: - Attachment: HIVE-11923.2.patch Another try for precommit test. > allow qtests to run via a single client session for tez and llap > > > Key: HIVE-11923 > URL: https://issues.apache.org/jira/browse/HIVE-11923 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Affects Versions: 1.3.0, 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-11923.1.txt, HIVE-11923.2.branchllap.txt, > HIVE-11923.2.patch, HIVE-11923.2.txt, HIVE-11923.2.txt, > HIVE-11923.branch-1.txt > > > Launching a new session - AM and containers for each test adds unnecessary > overheads. Running via a single session should reduce the run time > significantly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12012) select query on json table with map type column fails
[ https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940586#comment-14940586 ] Jason Dere commented on HIVE-12012: --- Looks like the same problem reported in HCATALOG-630 > select query on json table with map type column fails > - > > Key: HIVE-12012 > URL: https://issues.apache.org/jira/browse/HIVE-12012 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Jagruti Varia >Assignee: Jason Dere > Attachments: HIVE-12012.1.patch > > > select query on json table throws this error if table contains map type > column: > {noformat} > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > {noformat} > steps to reproduce the issue: > {noformat} > hive> create table c_complex(a array<string>,b map<string,int>) row format > serde 'org.apache.hive.hcatalog.data.JsonSerDe'; > OK > Time taken: 0.319 seconds > hive> insert into table c_complex select array('aaa'),map('aaa',1) from > studenttab10k limit 2; > Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413 > Total jobs = 1 > Launching Job 1 out of 1 > Status: Running (Executing on YARN cluster with App id > application_1440603993714_0010) > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED > KILLED > > Map 1 .. SUCCEEDED 1 100 0 > 0 > Reducer 2 .. 
SUCCEEDED 1 100 0 > 0 > > VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.75 s > > > Loading data to table default.c_complex > Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, > rawDataSize=0] > OK > Time taken: 13.706 seconds > hive> select * from c_complex; > OK > Failed with exception > java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: > org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not > numeric, can not use numeric value accessors > at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26] > Time taken: 0.115 seconds > hive> select count(*) from c_complex; > OK > 2 > Time taken: 0.205 seconds, Fetched: 1 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940455#comment-14940455 ] Sergey Shelukhin commented on HIVE-11969: - Forgot to add the config flag, will do in next iteration > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, > HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
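The idea in this issue — start the slow AM/session bring-up concurrently with CLI startup, then block only when a query first needs it — can be sketched with a plain `Future`. This is a hypothetical illustration, not Hive's actual code; the class, method names, and the sleep standing in for AM startup latency are assumptions:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundStartSketch {
    // Kick off the expensive session startup on a background thread as the
    // CLI starts; callers hold the Future instead of a ready session.
    static Future<String> startSessionAsync(ExecutorService pool) {
        return pool.submit(() -> {
            Thread.sleep(50); // stands in for AM startup latency on a busy cluster
            return "session-ready";
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> session = startSessionAsync(pool);
        // ... the user runs local commands here while the AM spins up ...
        System.out.println(session.get()); // the first query blocks only if startup is unfinished
        pool.shutdown();
    }
}
```

The win is overlap: if the user spends any time on local commands, the AM is often ready by the time `get()` is called.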
[jira] [Updated] (HIVE-12010) Tests should use FileSystem based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12010: Attachment: HIVE-12010.patch > Tests should use FileSystem based stats collection mechanism > > > Key: HIVE-12010 > URL: https://issues.apache.org/jira/browse/HIVE-12010 > Project: Hive > Issue Type: Task > Components: Statistics >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12010.patch > > > Although fs based collection mechanism is default for last few releases, > tests still use jdbc for stats collection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939399#comment-14939399 ] Hive QA commented on HIVE-11175: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12743388/HIVE-11175.1.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9640 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_create_func1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_udf_using org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_nonexistent_resource org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_udf_local_resource org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithDfsResource org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithLocalResource org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5481/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5481/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5481/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12743388 - PreCommit-HIVE-TRUNK-Build > create function using jar does not work with sql std authorization > -- > > Key: HIVE-11175 > URL: https://issues.apache.org/jira/browse/HIVE-11175 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.2.0 >Reporter: Olaf Flebbe >Assignee: Olaf Flebbe > Fix For: 2.0.0 > > Attachments: HIVE-11175.1.patch > > > {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} > fails with an error requiring ADMIN privileges to access the local foo.jar > resource. Same for HDFS (DFS_URI). > The problem is that the semantic analysis enforces the ADMIN privilege for write, > but the jar is clearly an input, not an output. > Patch and test case appended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940613#comment-14940613 ] Hive QA commented on HIVE-11642: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764669/HIVE-11642.16.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9696 tests executed *Failed tests:* {noformat} TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5488/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5488/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5488/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764669 - PreCommit-HIVE-TRUNK-Build > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.16.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940619#comment-14940619 ] Eugene Koifman commented on HIVE-11983: --- also, in createRecordUpdater() {noformat} - .statementId(-1) - .finalDestination(partitionPath)); {noformat} is removed. This is wrong - these lines must be there. > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
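The fix called for in the description (hash each record's bucketing-column value to pick its bucket, rather than assigning one bucket per TransactionBatch) can be sketched in a few lines of Java. This is an illustrative sketch, not Hive's actual bucketing code; `BucketAssigner` and `bucketFor` are made-up names:

```java
// Illustrative sketch of per-record bucket assignment (assumption: not the
// real Hive streaming code). Each record is hashed on its bucketing column,
// so records spread across buckets instead of one bucket per batch.
public class BucketAssigner {
    static int bucketFor(Object bucketColumnValue, int numBuckets) {
        // Mask the sign bit so the modulo result is always in [0, numBuckets).
        return (bucketColumnValue.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // Records with different keys land in different buckets, instead of
        // all going to the single bucket chosen for the TransactionBatch.
        for (String key : new String[]{"a", "b", "c"}) {
            System.out.println(key + " -> bucket " + bucketFor(key, 8));
        }
    }
}
```

The sign-bit mask matters: `hashCode()` may be negative, and `%` in Java preserves the dividend's sign, so masking keeps the bucket index non-negative.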
[jira] [Commented] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940647#comment-14940647 ] Daniel Dai commented on HIVE-10755: --- The approach should be fine. We also need to avoid appending ids multiple times to conf to avoid unnecessary warning from ColumnProjectionUtils.getReadColumnIDs. We shall also add a test case which causes HIVE-10752. A simple join + foreach should reproduce HIVE-10752: {code} A = load '" + COMPLEX_TABLE + "' using org.apache.hive.hcatalog.pig.HCatLoader(); B = load '" + COMPLEX_TABLE + "' using org.apache.hive.hcatalog.pig.HCatLoader(); C = join A by name, B by name; D = foreach C generate B::studentid; {code} > Rework on HIVE-5193 to enhance the column oriented table access > -- > > Key: HIVE-10755 > URL: https://issues.apache.org/jira/browse/HIVE-10755 > Project: Hive > Issue Type: Sub-task > Components: HCatalog >Affects Versions: 1.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.0.0 > > Attachments: HIVE-10755.patch > > > Add the support of column pruning for column oriented table access which was > done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. > In 1.3.0, the patch posted by Viray didn't work, probably due to some jar > reference. That seems to have been fixed and that patch works in 2.0.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
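The de-duplication Daniel suggests (avoid appending the same read-column ids to the conf multiple times) could look like the following sketch. The `merge` helper is hypothetical, not part of `ColumnProjectionUtils` or HCatalog's real API:

```java
// Hedged sketch: merge newly requested read-column ids into the ones already
// configured without appending duplicates, so getReadColumnIDs has no
// repeated entries to warn about. Names here are illustrative.
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class ReadColumnIds {
    static List<Integer> merge(List<Integer> alreadySet, List<Integer> toAppend) {
        // LinkedHashSet keeps first-seen order while dropping duplicates.
        LinkedHashSet<Integer> ids = new LinkedHashSet<>(alreadySet);
        ids.addAll(toAppend);
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        // A self-join reads the same column from both sides; its id should
        // still appear only once in the merged projection list.
        System.out.println(merge(List.of(0, 1), List.of(1, 3)));
    }
}
```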
[jira] [Commented] (HIVE-12002) correct implementation typo
[ https://issues.apache.org/jira/browse/HIVE-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940696#comment-14940696 ] Hive QA commented on HIVE-12002: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764468/HIVE-12002.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5489/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5489/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5489/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764468 - PreCommit-HIVE-TRUNK-Build > correct implementation typo > --- > > Key: HIVE-12002 > URL: https://issues.apache.org/jira/browse/HIVE-12002 > Project: Hive > Issue Type: Improvement > Components: Beeline, HCatalog, Metastore >Affects Versions: 1.2.1 >Reporter: Alex Moundalexis >Assignee: Alex Moundalexis >Priority: Trivial > Labels: newbie, typo > Attachments: HIVE-12002.patch > > > The term "implemenation" is seen in HiveMetaStore INFO logs. Correcting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ratandeep Ratti updated HIVE-11878: --- Attachment: HIVE-11878_approach3_per_session_clasloader.patch > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940705#comment-14940705 ] Ratandeep Ratti commented on HIVE-11878: s/above to/above two/ > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12015) LLAP: merge master into branch
[ https://issues.apache.org/jira/browse/HIVE-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12015. - Resolution: Fixed > LLAP: merge master into branch > -- > > Key: HIVE-12015 > URL: https://issues.apache.org/jira/browse/HIVE-12015 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940690#comment-14940690 ] Yongzhi Chen commented on HIVE-11973: - The one test failure is not related. Its age is more than 300. > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL: > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query: > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0, the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as given to get past the error: > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But it works without casting: > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940701#comment-14940701 ] Ratandeep Ratti commented on HIVE-11878: Also note: The above to problems, I think, should also exist in Hive currently. Am I missing something here? > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. 
Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} Class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
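The parent-first delegation that the HIVE-11878 description hinges on can be demonstrated with a small self-contained Java program. `TracingLoader` and `definerVia` are illustrative names, not Hive code; the point is only that a class ends up *defined* by the first ancestor loader that can find it, no matter which child loader you hand to `Class.forName`:

```java
// Hedged illustration of parent-first classloader delegation. We build the
// u1 <- u2 chain from the description; asking u2 for java.lang.String still
// resolves it via the bootstrap loader, because every parent gets first try.
public class DelegationDemo {
    static class TracingLoader extends ClassLoader {
        final String label;
        TracingLoader(String label, ClassLoader parent) {
            super(parent);
            this.label = label;
        }
        // The default loadClass already delegates to the parent first; we
        // only add a trace so the delegation is visible.
        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            Class<?> c = super.loadClass(name, resolve);
            System.out.println(label + " asked for " + name + ", defined by "
                    + (c.getClassLoader() == null ? "bootstrap" : c.getClassLoader()));
            return c;
        }
    }

    // Returns the loader that actually *defines* className when the lookup
    // starts at the child loader u2.
    static ClassLoader definerVia(String className) {
        try {
            ClassLoader u1 = new TracingLoader("u1", DelegationDemo.class.getClassLoader());
            ClassLoader u2 = new TracingLoader("u2", u1);
            return Class.forName(className, true, u2).getClassLoader();
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Prints null: the bootstrap loader defined it, u2 never got the chance.
        System.out.println(definerVia("java.lang.String"));
    }
}
```

In Hive's scenario, the same mechanism means *c1* is defined by *u1* (the first loader whose path contains *j1*), so lookups triggered from *c1* never see the jars known only to *u2*.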
[jira] [Assigned] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-12013: --- Assignee: Sergey Shelukhin > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12013: Description: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized, and we need tests to pass before merging. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. was: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12013: Description: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run, so it should be ok to proceed as follows. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. was: Tests cannot be parallelized before we merge, and tests cannot run fast enough (they did once, so I guess cannot always run fast enough) when they are not parallelized. LLAP is off by default, we did see tests pass recently (with the exception of a few out file diffs), and some tests will still be run. We will disable most of the LLAP q tests for now, merge, enable parallelism and re-enable. > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10571) HiveMetaStoreClient should close existing thrift connection before its reconnect
[ https://issues.apache.org/jira/browse/HIVE-10571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940716#comment-14940716 ] Lefty Leverenz commented on HIVE-10571: --- Thanks [~ctang.ma]. My email message with that commit ID only shows master -- how strange. The backports for 1.0.2 and 1.2.2 should also be listed in Fix Version/s so that release notes can pick up this jira when those versions are released. > HiveMetaStoreClient should close existing thrift connection before its > reconnect > > > Key: HIVE-10571 > URL: https://issues.apache.org/jira/browse/HIVE-10571 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10571.patch, HIVE-10571.patch, HIVE-10571.patch > > > HiveMetaStoreClient should first close its existing thrift connection, > whether it is already dead or still alive, before opening another > connection in its reconnect() method. Otherwise, it might lead to huge > resource accumulation or leaks on the HMS side when the client keeps retrying. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
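The close-before-reopen pattern the HIVE-10571 description asks for can be sketched as follows. `Client`, `reconnect`, and the counting fake transport are illustrative stand-ins, not HiveMetaStoreClient's actual internals:

```java
// Hedged sketch of reconnect(): close the old transport unconditionally
// (dead or alive) before opening a new one, so retries cannot pile up
// connections on the metastore side. Names are illustrative.
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

public class ReconnectDemo {
    static class Client {
        private Closeable transport;              // stands in for the Thrift TTransport
        private final Supplier<Closeable> opener;

        Client(Supplier<Closeable> opener) {
            this.opener = opener;
            this.transport = opener.get();        // initial connection
        }

        void reconnect() {
            if (transport != null) {
                // Close whether or not the old connection is still usable;
                // either way it must not be leaked.
                try { transport.close(); } catch (IOException ignored) { }
                transport = null;
            }
            transport = opener.get();
        }
    }

    // Tiny fake transport so the open/close balance is observable.
    static int opened = 0, closed = 0;

    public static void main(String[] args) {
        Client c = new Client(() -> { opened++; return () -> closed++; });
        c.reconnect();
        c.reconnect();
        // 3 opens (1 initial + 2 reconnects) balanced by 2 closes: nothing leaked.
        System.out.println(opened + " opened, " + closed + " closed");
    }
}
```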
[jira] [Commented] (HIVE-11765) SMB Join fails in Hive 1.2
[ https://issues.apache.org/jira/browse/HIVE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940717#comment-14940717 ] Prasanth Jayachandran commented on HIVE-11765: -- I just tried in hive-1.2.1 release binary. Still unable to reproduce. Tried again with mr and tez. > SMB Join fails in Hive 1.2 > -- > > Key: HIVE-11765 > URL: https://issues.apache.org/jira/browse/HIVE-11765 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.0, 1.2.1 >Reporter: Na Yang >Assignee: Prasanth Jayachandran > Attachments: employee (1).csv > > > SMB join on Hive 1.2 fails with the following stack trace : > {code} > java.io.IOException: java.lang.reflect.InvocationTargetException > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:173) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) > at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252) > ... 11 more > Caused by: java.lang.IndexOutOfBoundsException: toIndex = 5 > at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004) > at java.util.ArrayList.subList(ArrayList.java:996) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.getSchemaOnRead(RecordReaderFactory.java:161) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:66) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:202) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:230) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:163) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1104) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67) > {code} > This error happens after adding the patch of HIVE-10591. Reverting HIVE-10591 > fixes this exception. 
> Steps to reproduce: > {code} > SET hive.enforce.sorting=true; > SET hive.enforce.bucketing=true; > SET hive.exec.dynamic.partition=true; > SET mapreduce.reduce.import.limit=-1; > SET hive.optimize.bucketmapjoin=true; > SET hive.optimize.bucketmapjoin.sortedmerge=true; > SET hive.auto.convert.join=true; > SET hive.auto.convert.sortmerge.join=true; > create Table table1 (empID int, name varchar(64), email varchar(64), company > varchar(64), age int) clustered by (age) sorted by (age ASC) INTO 384 buckets > stored as ORC; > create Table table2 (empID int, name varchar(64), email varchar(64), company > varchar(64), age int) clustered by (age) sorted by (age ASC) into 384 buckets > stored as ORC; > create Table table_tmp (empID int, name varchar(64), email varchar(64), > company varchar(64), age int); > load data local inpath '/tmp/employee.csv' into table table_tmp; > INSERT OVERWRITE table table1 select * from table_tmp; > INSERT OVERWRITE table table2 select * from table_tmp; > SELECT table1.age, table2.age from table1 inner join table2 on >
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.15.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.17.patch > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12013) LLAP: disable most llap tests before merge
[ https://issues.apache.org/jira/browse/HIVE-12013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12013. - Resolution: Fixed Fix Version/s: llap > LLAP: disable most llap tests before merge > -- > > Key: HIVE-12013 > URL: https://issues.apache.org/jira/browse/HIVE-12013 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: llap > > > Tests cannot be parallelized before we merge, and tests cannot run fast > enough (they did once, so I guess cannot always run fast enough) when they > are not parallelized, and we need tests to pass before merging. > LLAP is off by default, we did see tests pass recently (with the exception of > a few out file diffs), and some tests will still be run, so it should be ok > to proceed as follows. > We will disable most of the LLAP q tests for now, merge, enable parallelism > and re-enable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.15.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: (was: HIVE-11642.16.patch) > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.17.patch, HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940637#comment-14940637 ] Rui Li commented on HIVE-11473: --- Hi [~xuefuz], the latest test result is [here|http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/view/All/job/PreCommit-HIVE-SPARK-Build/lastBuild/]. {{parquet_join}} still fails. But it passes on my machine (using your updated tarball). Do we need to do some cleanup for the pre-commit test? Or would you mind trying that test on your side? Thanks. I also noticed snapshots of hive jars are uploaded [here|http://repository.apache.org/snapshots/org/apache/hive/]. We need to make sure to run {{mvn clean install -DskipTests -Phadoop-2}} under hive-home before the test, so that the test won't pick up a snapshot from an external repo. > Upgrade Spark dependency to 1.5 [Spark Branch] > -- > > Key: HIVE-11473 > URL: https://issues.apache.org/jira/browse/HIVE-11473 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Jimmy Xiang >Assignee: Rui Li > Attachments: HIVE-11473.1-spark.patch, HIVE-11473.2-spark.patch, > HIVE-11473.3-spark.patch, HIVE-11473.3-spark.patch > > > In Spark 1.5, the SparkListener interface is changed. So HoS may fail to create > the spark client if the un-implemented event callback method is invoked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in column stats related tables
[ https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940698#comment-14940698 ] Chaoyu Tang commented on HIVE-11786: The test failure is not related to the patch. > Deprecate the use of redundant column in column stats related tables > > > Key: HIVE-11786 > URL: https://issues.apache.org/jira/browse/HIVE-11786 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11786.1.patch, HIVE-11786.1.patch, > HIVE-11786.2.patch, HIVE-11786.patch > > > The stats tables such as TAB_COL_STATS, PART_COL_STATS have redundant columns > such as DB_NAME, TABLE_NAME, PARTITION_NAME since these tables already have > foreign keys like TBL_ID or PART_ID referencing TBLS or PARTITIONS. > These redundant columns violate database normalization rules and cause a lot > of inconvenience (and sometimes difficulty) in column stats related feature > implementation. For example, when renaming a table, we have to update > the TABLE_NAME column in these tables as well, which is unnecessary. > This JIRA is first to deprecate the use of these columns at the HMS code level. A > follow-up JIRA is to be opened to focus on DB schema change and upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Attachment: HIVE-11553.07.patch HiveQA didn't pick it up... trying the same patch again > use basic file metadata cache in ETLSplitStrategy-related paths > --- > > Key: HIVE-11553 > URL: https://issues.apache.org/jira/browse/HIVE-11553 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, > HIVE-11553.03.patch, HIVE-11553.04.patch, HIVE-11553.06.patch, > HIVE-11553.06.patch, HIVE-11553.07.patch, HIVE-11553.patch > > > This is the first step; uses the simple footer-getting API, without PPD. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.03.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.15.patch > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.15.patch, HIVE-11642.15.patch, HIVE-11642.16.patch, > HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940719#comment-14940719 ] Sergey Shelukhin commented on HIVE-11675: - I have a patch long in the works but I keep getting distracted by other random crap. > make use of file footer PPD API in ETL strategy or separate strategy > > > Key: HIVE-11675 > URL: https://issues.apache.org/jira/browse/HIVE-11675 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Need to take a look at the best flow. It won't be much different if we do a > filtering metastore call for each partition. So perhaps we'd need the custom > sync point/batching after all. > Or we can make it opportunistic and not fetch any footers unless they can be > pushed down to the metastore or fetched from the local cache; that way the only slow > threaded op is directory listings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939580#comment-14939580 ] Jesus Camacho Rodriguez commented on HIVE-11634: 1. That's what I thought, no problem. 2. OK, this case should be solved or studied as part of this JIRA. 3. That's fine; we can create a new JIRA case for that. But maybe I would then remove the changes in PcrExprProcFactory.java and follow up in the new JIRA, as that code is not working as expected. What do you think? 4. Maybe I didn't explain it properly. The idea is that you would only prepend non-partition columns, and without clustering them, but iff the NDV in the IN clause is reduced. In any case, we can create a new JIRA for this too, and maybe assign it to me? As I see it, the modification to the original optimization that you have just created should not be too complicated. > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) > -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, > HIVE-11634.96.patch, HIVE-11634.97.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from 
src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate, whereas we can prune partition > (ds='2000-04-10'). > The optimization is to rewrite the above query into the following. > {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by the partition pruner to prune partitions which otherwise would not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
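The effect of the rewrite above can be illustrated with a small Python sketch (illustrative only, not Hive code): projecting the partition-column values out of the struct IN list yields a partition-only predicate the pruner can apply up front.

```python
# Sketch of the HIVE-11634 rewrite idea: from an IN list over
# struct(ds, key), derive the set of partition (ds) values so that
# partitions outside that set can be pruned before scanning.

partitions = ["2000-04-08", "2000-04-09", "2000-04-10"]

# IN list from: struct(ds, key) in (struct('2000-04-08',1), struct('2000-04-09',2))
in_list = [("2000-04-08", 1), ("2000-04-09", 2)]

# Derived predicate: struct(ds) IN (struct('2000-04-08'), struct('2000-04-09'))
pruned_ds = {ds for ds, _key in in_list}

surviving = [p for p in partitions if p in pruned_ds]
print(surviving)  # ['2000-04-08', '2000-04-09'] -- ds='2000-04-10' is pruned
```

The derived predicate is a strict weakening of the original one, so ANDing it in never changes query results; it only gives the pruner something it can evaluate on partition columns alone.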
[jira] [Commented] (HIVE-11543) Provide log4j properties migration tool
[ https://issues.apache.org/jira/browse/HIVE-11543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940728#comment-14940728 ] Prasanth Jayachandran commented on HIVE-11543: -- https://issues.apache.org/jira/browse/LOG4J2-952 added properties file based configuration back to log4j2. This is available as part of the recent log4j2 2.4 release. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 Need to investigate further whether old properties files are still compatible. > Provide log4j properties migration tool > --- > > Key: HIVE-11543 > URL: https://issues.apache.org/jira/browse/HIVE-11543 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Blocker > > For log4j2 migration, if users are performing upgrades then we need to > provide a tool for converting the existing log4j.properties file to a > log4j2.xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12016) Update log4j2 version to 2.4
[ https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12016: - Attachment: HIVE-12016.1.patch [~gopalv]/[~sershe] can someone take a look at this patch? > Update log4j2 version to 2.4 > > > Key: HIVE-12016 > URL: https://issues.apache.org/jira/browse/HIVE-12016 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12016.1.patch > > > The latest 2.4 release of log4j2 brought back properties file based > configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 > bump up the version number to 2.4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
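For context on the properties-based configuration that log4j2 2.4 restored: a minimal properties-style config might look like the sketch below. Note the log4j2 properties syntax differs from old log4j 1.x properties files (which is why old files are not automatically compatible); all names here are illustrative, and the exact keys should be checked against the log4j2 configuration manual.

```properties
# Minimal illustrative log4j2 2.4 properties configuration (not Hive's actual config).
status = warn
name = HiveLog4j2Sketch

appenders = console
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %-5p [%t]: %c{2} - %m%n

rootLogger.level = info
rootLogger.appenderRefs = console
rootLogger.appenderRef.console.ref = console
```

Unlike log4j 1.x's `log4j.rootLogger=INFO, console` shorthand, log4j2 builds the component tree from dotted keys, so a mechanical migration tool would need to restructure keys rather than just rename them.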
[jira] [Commented] (HIVE-11997) Add ability to send Compaction Jobs to specific queue
[ https://issues.apache.org/jira/browse/HIVE-11997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940773#comment-14940773 ] Hive QA commented on HIVE-11997: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764492/HIVE-11997.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5492/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5492/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5492/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764492 - PreCommit-HIVE-TRUNK-Build > Add ability to send Compaction Jobs to specific queue > - > > Key: HIVE-11997 > URL: https://issues.apache.org/jira/browse/HIVE-11997 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11997.patch > > > need new HiveConf param to specify queue name -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12016) Update log4j2 version to 2.4
[ https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12016: - Attachment: HIVE-12016.2.patch > Update log4j2 version to 2.4 > > > Key: HIVE-12016 > URL: https://issues.apache.org/jira/browse/HIVE-12016 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch > > > The latest 2.4 release of log4j2 brought back properties file based > configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4 > bump up the version number to 2.4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm
[ https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11954: --- Attachment: HIVE-11954.02.patch > Extend logic to choose side table in MapJoin Conversion algorithm > - > > Key: HIVE-11954 > URL: https://issues.apache.org/jira/browse/HIVE-11954 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, > HIVE-11954.patch, HIVE-11954.patch > > > Selection of the side table (in-memory/hash table) in the MapJoin Conversion > algorithm needs to be more sophisticated. > In an N-way Map Join, Hive should pick as the side table (in-memory table) the input stream > that has the least cost in producing its relation (like TS(FIL|Proj)*). > A cost-based choice needs an extended cost model; without the return path it's going to > be hard to do this. > For the time being we could employ a modified cost-based algorithm for side > table selection. > The new algorithm is described below: > 1. Identify the candidate set of inputs for the side table (in-memory/hash table) > from the inputs (based on conditional task size) > 2. For each of the inputs, identify its cost and memory requirement. Cost is 1 for > each heavyweight relation op (Join, GB, PTF/Windowing, TF, etc.). The cost for > an input is the total number of heavyweight ops in its branch. > 3. Order the set from #1 by cost & memory requirement (ascending order) > 4. Pick the first element from #3 as the side table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
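The four steps of the proposed selection algorithm can be sketched in a few lines of Python (a minimal sketch with made-up input shapes, not the actual Hive implementation):

```python
# Illustrative sketch of the HIVE-11954 side-table selection proposal.
# Each input branch is described by an estimated size, a memory
# requirement, and the operators in its branch; shapes are made up.

HEAVY_OPS = {"JOIN", "GB", "PTF", "WINDOWING", "TF"}

def pick_side_table(inputs, conditional_task_size):
    # Step 1: candidates are inputs small enough for the hash-table side.
    candidates = [i for i in inputs if i["size"] <= conditional_task_size]
    if not candidates:
        return None  # no input can be broadcast; fall back to shuffle join
    # Step 2: cost = number of heavyweight ops in the branch.
    for i in candidates:
        i["cost"] = sum(1 for op in i["ops"] if op in HEAVY_OPS)
    # Step 3: order by (cost, memory requirement), ascending.
    candidates.sort(key=lambda i: (i["cost"], i["memory"]))
    # Step 4: pick the first element as the side (in-memory) table.
    return candidates[0]["name"]

inputs = [
    {"name": "a", "size": 50, "memory": 40, "ops": ["TS", "FIL", "JOIN"]},
    {"name": "b", "size": 30, "memory": 20, "ops": ["TS", "FIL"]},
    {"name": "c", "size": 500, "memory": 400, "ops": ["TS"]},
]
print(pick_side_table(inputs, conditional_task_size=100))  # b
```

Here input "c" is excluded at step 1 for exceeding the task-size threshold, and "b" beats "a" because its branch contains no heavyweight ops: cheap-to-produce branches are preferred for the in-memory side since they are rebuilt per task.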
[jira] [Updated] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11445: --- Affects Version/s: 2.0.0 > CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby > distinct does not work > - > > Key: HIVE-11445 > URL: https://issues.apache.org/jira/browse/HIVE-11445 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Fix For: 2.0.0 > > Attachments: HIVE-11445.01.patch, HIVE-11445.02.patch, > HIVE-11445.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)