[jira] [Updated] (HIVE-17845) insert fails if target table columns are not lowercase
[ https://issues.apache.org/jira/browse/HIVE-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R updated HIVE-17845:
    Status: Patch Available (was: In Progress)

> insert fails if target table columns are not lowercase
> ------------------------------------------------------
> Key: HIVE-17845
> URL: https://issues.apache.org/jira/browse/HIVE-17845
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Minor
> Fix For: 2.3.0
> Attachments: HIVE-17845.patch
>
> e.g.,
> INSERT INTO TABLE EMP(ID,NAME) select * FROM SRC;
> FAILED: SemanticException 1:27 '[ID,NAME]' in insert schema specification are
> not found among regular columns of default.EMP nor dynamic partition
> columns.. Error encountered near token 'NAME'
> Whereas the insert below succeeds:
> INSERT INTO TABLE EMP(id,name) select * FROM SRC;

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17845) insert fails if target table columns are not lowercase
[ https://issues.apache.org/jira/browse/HIVE-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R updated HIVE-17845:
    Status: In Progress (was: Patch Available)

> insert fails if target table columns are not lowercase
> ------------------------------------------------------
> Key: HIVE-17845
> URL: https://issues.apache.org/jira/browse/HIVE-17845
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.1
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Minor
> Fix For: 2.3.0
> Attachments: HIVE-17845.patch
>
> e.g.,
> INSERT INTO TABLE EMP(ID,NAME) select * FROM SRC;
> FAILED: SemanticException 1:27 '[ID,NAME]' in insert schema specification are
> not found among regular columns of default.EMP nor dynamic partition
> columns.. Error encountered near token 'NAME'
> Whereas the insert below succeeds:
> INSERT INTO TABLE EMP(id,name) select * FROM SRC;
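The failure above comes down to identifier resolution: HiveQL identifiers are case-insensitive and Hive stores column names in lower case, so an insert schema written as `EMP(ID,NAME)` must be lower-cased before matching. A hypothetical sketch (not Hive's actual code) of the check that fixes this class of bug:

```python
# Hypothetical sketch of validating an insert schema specification against a
# table's columns. Hive keeps column names lower-cased, so the user-supplied
# names must be lower-cased too -- comparing them verbatim reproduces the
# "not found among regular columns" error described in this issue.

def validate_insert_schema(target_columns, insert_columns):
    """Return the insert columns that match no target column, comparing
    case-insensitively as HiveQL identifiers require."""
    known = {c.lower() for c in target_columns}
    return [c for c in insert_columns if c.lower() not in known]

# EMP(ID, NAME) against stored columns [id, name]: nothing unresolved.
assert validate_insert_schema(["id", "name"], ["ID", "NAME"]) == []
# A genuinely unknown column is still caught.
assert validate_insert_schema(["id", "name"], ["NAME_44"]) == ["NAME_44"]
```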
[jira] [Assigned] (HIVE-17863) Vectorization: Two Q files produce wrong PTF query results
[ https://issues.apache.org/jira/browse/HIVE-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17863:

> Vectorization: Two Q files produce wrong PTF query results
> ----------------------------------------------------------
> Key: HIVE-17863
> URL: https://issues.apache.org/jira/browse/HIVE-17863
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
>
> vector_windowing_multipartitioning.q
> vector_windowing_order_null.q
[jira] [Commented] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212089#comment-16212089 ] Xuefu Zhang commented on HIVE-16601:
Thanks for the update. Personally I like the way the app name is formatted. However, the job group portion is less readable. It would be great to format the job group in a similar way to the app name. (Instead of just "", maybe we can have "query_id="). Thoughts?

> Display Session Id and Query Name / Id in Spark UI
> --------------------------------------------------
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
> We should display the session id for each HoS application launched, and the
> query name / id and DAG id for each Spark job launched. Hive-on-MR does
> something similar via the {{mapred.job.name}} parameter. The query name is
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523
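The naming discussed in this review thread is simple string formatting: embed the session id in the Spark app name and prefix the job-group description with the query id so both read alike in the Spark UI. A hypothetical sketch of that convention (the exact format strings are assumptions, not Hive's code):

```python
# Hypothetical sketch of the naming convention under review: the app name
# carries the HoS session id, and the job group carries the query id with a
# "query_id=" prefix, as suggested in the comment above. Both helpers and
# their formats are illustrative assumptions.

def format_app_name(session_id):
    return "Hive on Spark (sessionId = {})".format(session_id)

def format_job_group(query_id):
    # Prefix with "query_id=" rather than showing the bare id.
    return "query_id={}".format(query_id)

assert format_app_name("abc-123") == "Hive on Spark (sessionId = abc-123)"
assert format_job_group("hive_20171020_q1") == "query_id=hive_20171020_q1"
```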
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458:
    Attachment: HIVE-17458.03.patch

patch 3 - checkpoint:
- can fully vectorize original file reads in the absence of delete events
- otherwise falls back to VectorizedOrcAcidRowReader
- if the request is using LLAP and needs ROW__IDs projected, it will fail ungracefully

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---------------------------------------------------------------
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, HIVE-17458.03.patch
>
> VectorizedOrcAcidRowBatchReader will not be used for original files. This
> will likely look like a perf regression when converting a table from
> non-acid to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops
> will not vectorize until major compaction.
> There is no reason why this should be the case. Just like
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other
> files in the logical tranche/bucket and calculate the offset for the
> RowBatch of the split. (Presumably getRecordReader().getRowNumber() works
> the same in vector mode.)
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer
> it from the file path... which in particular simplifies
> OrcInputFormat.determineSplitStrategies()
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458:
    Status: Patch Available (was: Open)

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---------------------------------------------------------------
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, HIVE-17458.03.patch
>
> VectorizedOrcAcidRowBatchReader will not be used for original files. This
> will likely look like a perf regression when converting a table from
> non-acid to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops
> will not vectorize until major compaction.
> There is no reason why this should be the case. Just like
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other
> files in the logical tranche/bucket and calculate the offset for the
> RowBatch of the split. (Presumably getRecordReader().getRowNumber() works
> the same in vector mode.)
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer
> it from the file path... which in particular simplifies
> OrcInputFormat.determineSplitStrategies()
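The offset idea in the issue description - look at the other files in the logical bucket and compute where the current split's rows start - can be sketched as a prefix sum over per-file row counts. The file names and counts below are made up for illustration; this is not Hive's implementation:

```python
# Hypothetical sketch of computing the first synthetic row number for an
# 'original' (pre-acid) file: sum the row counts of the files that precede it
# in the same logical bucket, as the issue description proposes. Names and
# counts are illustrative only.

def first_row_offset(bucket_files, split_file):
    """bucket_files: ordered list of (file_name, num_rows) for one bucket.
    Returns the row number at which split_file's rows begin."""
    offset = 0
    for name, num_rows in bucket_files:
        if name == split_file:
            return offset
        offset += num_rows
    raise ValueError("split file not found in bucket: " + split_file)

files = [("000000_0", 100), ("000000_0_copy_1", 50), ("000000_0_copy_2", 75)]
assert first_row_offset(files, "000000_0_copy_2") == 150
```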
[jira] [Commented] (HIVE-17771) Implement commands to manage resource plan.
[ https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212017#comment-16212017 ] Sergey Shelukhin commented on HIVE-17771:
+1 pending tests.

> Implement commands to manage resource plan.
> -------------------------------------------
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
> Issue Type: Sub-task
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: HIVE-17771.01.patch, HIVE-17771.02.patch, HIVE-17771.03.patch
>
> Please see parent jira about llap workload management.
> This jira is to implement create and show resource plan commands in hive to
> configure resource plans for llap workload. The following are the proposed
> commands implemented as part of the jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN plan_name;
> SHOW RESOURCE PLANS;
> ALTER RESOURCE PLAN plan_name SET QUERY_PARALLELISM = parallelism;
> ALTER RESOURCE PLAN plan_name RENAME TO new_name;
> ALTER RESOURCE PLAN plan_name ACTIVATE;
> ALTER RESOURCE PLAN plan_name DISABLE;
> ALTER RESOURCE PLAN plan_name ENABLE;
> DROP RESOURCE PLAN;
> It will be followed up with more jiras to manage pools, triggers and copy
> resource plans.
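The command list above maps onto a small state model per plan: a name, a query parallelism, and an enabled/active state. A minimal in-memory sketch of that model follows; it is an illustration only, not Hive's metastore-backed implementation, and the class and method names are invented:

```python
# Hypothetical in-memory model of the resource-plan commands listed above:
# CREATE, ALTER ... SET QUERY_PARALLELISM, RENAME TO, ENABLE/DISABLE, DROP.
# This is a sketch of the command semantics only.

class ResourcePlans:
    def __init__(self):
        self.plans = {}  # name -> {"parallelism": int, "enabled": bool}

    def create(self, name, parallelism):
        self.plans[name] = {"parallelism": parallelism, "enabled": False}

    def set_parallelism(self, name, parallelism):
        self.plans[name]["parallelism"] = parallelism

    def rename(self, old, new):
        self.plans[new] = self.plans.pop(old)

    def enable(self, name, on=True):
        self.plans[name]["enabled"] = on

    def drop(self, name):
        del self.plans[name]

rp = ResourcePlans()
rp.create("daily", 4)
rp.set_parallelism("daily", 8)
rp.rename("daily", "default")
rp.enable("default")
assert rp.plans["default"] == {"parallelism": 8, "enabled": True}
```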
[jira] [Updated] (HIVE-17817) Stabilize crossproduct warning message output order
[ https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17817:
    Resolution: Fixed
    Fix Version/s: 3.0.0
    Status: Resolved (was: Patch Available)

Pushed to master. Thanks, Zoltan!

> Stabilize crossproduct warning message output order
> ---------------------------------------------------
> Key: HIVE-17817
> URL: https://issues.apache.org/jira/browse/HIVE-17817
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Fix For: 3.0.0
> Attachments: HIVE-17817.01.patch, HIVE-17817.02.patch
>
> {{CrossProductCheck}} warning printout sometimes happens in reverse order;
> which reduces people's confidence in the test's reliability.
[jira] [Updated] (HIVE-17607) remove ColumnStatsDesc usage from columnstatsupdatetask
[ https://issues.apache.org/jira/browse/HIVE-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17607:
    Resolution: Fixed
    Fix Version/s: 3.0.0
    Status: Resolved (was: Patch Available)

Pushed to master. Thanks, Gergely.

> remove ColumnStatsDesc usage from columnstatsupdatetask
> -------------------------------------------------------
> Key: HIVE-17607
> URL: https://issues.apache.org/jira/browse/HIVE-17607
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Gergely Hajós
> Fix For: 3.0.0
> Attachments: HIVE-17607.1.patch, HIVE-17607.2.patch, HIVE-17607.3.patch
>
> it's not entirely connected to this task... it should either have its own
> descriptor, or the work should take on the tablename/coltype/colname payload
[jira] [Updated] (HIVE-17578) Create a TableRef object for Table/Partition
[ https://issues.apache.org/jira/browse/HIVE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17578:
    Resolution: Fixed
    Fix Version/s: 3.0.0
    Status: Resolved (was: Patch Available)

Pushed to master. Thanks, Gergely.

> Create a TableRef object for Table/Partition
> --------------------------------------------
> Key: HIVE-17578
> URL: https://issues.apache.org/jira/browse/HIVE-17578
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Gergely Hajós
> Fix For: 3.0.0
> Attachments: HIVE-17578.1.patch
>
> a quick {{git grep DbName |grep -i TableName}} uncovers quite a lot of places
> where the fully qualified {{dbname.tablename}} is being produced,
> and most of the time the Table object is also present, which might as well
> have a method to service a tableref.
> There might be some hidden bugs because of this... because at some places the
> fully qualified table name is produced earlier...
> example callsite:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java#L63
> and called method:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L620
[jira] [Updated] (HIVE-17473) implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17473:
    Resolution: Fixed
    Fix Version/s: 3.0.0
    Status: Resolved (was: Patch Available)

Committed to master. Thanks for the reviews!

> implement workload management pools
> -----------------------------------
> Key: HIVE-17473
> URL: https://issues.apache.org/jira/browse/HIVE-17473
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: 3.0.0
> Attachments: HIVE-17473.01.patch, HIVE-17473.03.patch, HIVE-17473.04.patch, HIVE-17473.patch
[jira] [Commented] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211991#comment-16211991 ] Sahil Takiar commented on HIVE-16601:
[~xuefuz], attached updated screenshots.

> Display Session Id and Query Name / Id in Spark UI
> --------------------------------------------------
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
> We should display the session id for each HoS application launched, and the
> query name / id and DAG id for each Spark job launched. Hive-on-MR does
> something similar via the {{mapred.job.name}} parameter. The query name is
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523
[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-16601:
    Attachment: Spark UI Applications List.png
                Spark UI Jobs List.png

> Display Session Id and Query Name / Id in Spark UI
> --------------------------------------------------
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
> We should display the session id for each HoS application launched, and the
> query name / id and DAG id for each Spark job launched. Hive-on-MR does
> something similar via the {{mapred.job.name}} parameter. The query name is
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523
[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-16601:
    Attachment: (was: Spark UI Jobs List.png)

> Display Session Id and Query Name / Id in Spark UI
> --------------------------------------------------
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch
>
> We should display the session id for each HoS application launched, and the
> query name / id and DAG id for each Spark job launched. Hive-on-MR does
> something similar via the {{mapred.job.name}} parameter. The query name is
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523
[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI
[ https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-16601:
    Attachment: (was: Spark UI Applications List.png)

> Display Session Id and Query Name / Id in Spark UI
> --------------------------------------------------
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch
>
> We should display the session id for each HoS application launched, and the
> query name / id and DAG id for each Spark job launched. Hive-on-MR does
> something similar via the {{mapred.job.name}} parameter. The query name is
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523
[jira] [Commented] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.
[ https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211976#comment-16211976 ] Eugene Koifman commented on HIVE-10378:
[~osayankin] could you include a test with the fix?

> Hive Update statement set keyword work with lower case only and doesn't give
> any error if wrong column name specified in the set clause.
> ----------------------------------------------------------------------------
> Key: HIVE-10378
> URL: https://issues.apache.org/jira/browse/HIVE-10378
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0, 1.1.0
> Environment: Hadoop: 2.6.0
> Hive: 1.0.0/1.1.0
> OS: Linux
> Reporter: Vineet Kandpal
> Assignee: Oleksiy Sayankin
> Fix For: 2.3.2
> Attachments: HIVE-10378.1.patch
>
> Brief: The UPDATE statement's SET clause works with lower-case column names
> only, and gives no error if a wrong column name is specified in the SET
> clause.
> Steps to reproduce:
> 1. Create a table with transactional properties:
> create table customer(id int, name string, email string) clustered by (id)
> into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
> 2. Insert data into the transactional table:
> insert into table customer values
> (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com');
> 3. Check the result:
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.299 seconds)
> 4. Update a column using the UPPER-case name (NAME); the column value is not
> updated:
> 0: jdbc:hive2://localhost:1> update customer set NAME = 'notworking' where id = 1;
> INFO : Table default.customer stats: [numFiles=10, numRows=3, totalSize=6937, rawDataSize=0]
> No rows affected (20.343 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.321 seconds)
> 5. Update the same column using the LOWER-case name (name); the column value
> is updated:
> 0: jdbc:hive2://localhost:1> update customer set name = 'working' where id = 1;
> INFO : Table default.customer stats: [numFiles=11, numRows=3, totalSize=7699, rawDataSize=0]
> No rows affected (19.74 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | working        | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.333 seconds)
> 6. We have also seen that if an incorrect column name is used in the SET
> clause of the update statement, the query is accepted and a job is executed.
> There should be validation of the column name used in the SET clause:
> 0: jdbc:hive2://localhost:1> update customer set name_44 = 'working' where id = 1;
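Both symptoms in the report (upper-case names silently ignored, unknown names silently accepted) come from how SET-clause columns are resolved. A hypothetical sketch of the validation step 6 asks for, resolving names case-insensitively and rejecting unknown ones, follows; it is illustrative only and not Hive's actual resolver:

```python
# Hypothetical sketch of SET-clause column validation: resolve each
# user-written column against the table schema case-insensitively (fixing the
# "NAME" vs "name" symptom) and raise on unknown names (fixing the silent
# acceptance of "name_44"). Function and error text are invented.

def resolve_set_columns(table_columns, set_columns):
    by_lower = {c.lower(): c for c in table_columns}
    resolved = {}
    for col in set_columns:
        if col.lower() not in by_lower:
            raise ValueError("Invalid column in SET clause: " + col)
        resolved[by_lower[col.lower()]] = col  # canonical name -> as written
    return resolved

# "NAME" resolves to the real column "name"; "name_44" is rejected.
assert resolve_set_columns(["id", "name", "email"], ["NAME"]) == {"name": "NAME"}
try:
    resolve_set_columns(["id", "name", "email"], ["name_44"])
    assert False, "expected a validation error"
except ValueError:
    pass
```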
[jira] [Commented] (HIVE-17578) Create a TableRef object for Table/Partition
[ https://issues.apache.org/jira/browse/HIVE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211975#comment-16211975 ] Ashutosh Chauhan commented on HIVE-17578:
+1

> Create a TableRef object for Table/Partition
> --------------------------------------------
> Key: HIVE-17578
> URL: https://issues.apache.org/jira/browse/HIVE-17578
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Gergely Hajós
> Attachments: HIVE-17578.1.patch
>
> a quick {{git grep DbName |grep -i TableName}} uncovers quite a lot of places
> where the fully qualified {{dbname.tablename}} is being produced,
> and most of the time the Table object is also present, which might as well
> have a method to service a tableref.
> There might be some hidden bugs because of this... because at some places the
> fully qualified table name is produced earlier...
> example callsite:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java#L63
> and called method:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L620
[jira] [Resolved] (HIVE-17862) Update copyright date in NOTICE
[ https://issues.apache.org/jira/browse/HIVE-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-17862.
    Resolution: Fixed

> Update copyright date in NOTICE
> -------------------------------
> Key: HIVE-17862
> URL: https://issues.apache.org/jira/browse/HIVE-17862
> Project: Hive
> Issue Type: Task
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Trivial
> Fix For: 2.3.1
[jira] [Assigned] (HIVE-17862) Update copyright date in NOTICE
[ https://issues.apache.org/jira/browse/HIVE-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-17862:

> Update copyright date in NOTICE
> -------------------------------
> Key: HIVE-17862
> URL: https://issues.apache.org/jira/browse/HIVE-17862
> Project: Hive
> Issue Type: Task
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Trivial
> Fix For: 2.3.1
[jira] [Commented] (HIVE-17607) remove ColumnStatsDesc usage from columnstatsupdatetask
[ https://issues.apache.org/jira/browse/HIVE-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211946#comment-16211946 ] Ashutosh Chauhan commented on HIVE-17607:
+1

> remove ColumnStatsDesc usage from columnstatsupdatetask
> -------------------------------------------------------
> Key: HIVE-17607
> URL: https://issues.apache.org/jira/browse/HIVE-17607
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Gergely Hajós
> Attachments: HIVE-17607.1.patch, HIVE-17607.2.patch, HIVE-17607.3.patch
>
> it's not entirely connected to this task... it should either have its own
> descriptor, or the work should take on the tablename/coltype/colname payload
[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211942#comment-16211942 ] Andrew Sherman commented on HIVE-17826:
Thanks [~aihuaxu]. I did think about that, but I am not sure how to do it in a simple and clean way. cleanupOperationLog() is called from Operation.close(), so adding a delay inline would prevent the session from terminating, which seems weird. And doing it asynchronously makes it more complicated. But as we just discussed IRL I will think about it some more.

> Error writing to RandomAccessFile after operation log is closed
> ---------------------------------------------------------------
> Key: HIVE-17826
> URL: https://issues.apache.org/jira/browse/HIVE-17826
> Project: Hive
> Issue Type: Bug
> Reporter: Andrew Sherman
> Assignee: Andrew Sherman
> Attachments: HIVE-17826.1.patch
>
> We are seeing the error from HS2 process stdout.
> {noformat}
> 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to non-started appender query-file-appender
> 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to non-started appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 for appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing Appender query-file-appender
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error writing to RandomAccessFile /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>     at org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114)
>     at org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103)
>     at org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136)
>     at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105)
>     at org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89)
>     at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>     at org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112)
>     at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>     at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>     at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390)
>     at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378)
>     at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362)
>     at org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79)
>     at org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385)
>     at org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103)
>     at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43)
>     at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:28)
>     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Stream Closed
>     at java.io.RandomAccessFile.writeBytes(Native Method)
>     at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
>     at org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:111)
>     ... 25 more
> {noformat}
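The stack trace shows a race: the async logger thread delivers events to the per-query routing appender after the operation log's file has been closed, producing "Stream Closed". A hypothetical sketch of the defensive pattern, a sink that atomically marks itself stopped and drops late writes instead of raising, is below. This illustrates the race only; Hive's actual fix may differ:

```python
# Hypothetical sketch of guarding an appender against writes after close.
# The close flag and the append path share one lock, so a late event from
# another thread is dropped rather than hitting a closed stream.

import threading

class GuardedSink:
    def __init__(self):
        self._lock = threading.Lock()
        self._closed = False
        self.lines = []
        self.dropped = 0

    def append(self, line):
        with self._lock:
            if self._closed:
                self.dropped += 1  # late event: drop instead of raising
                return
            self.lines.append(line)

    def close(self):
        with self._lock:
            self._closed = True

sink = GuardedSink()
sink.append("query started")
sink.close()
sink.append("late event after close")
assert sink.lines == ["query started"] and sink.dropped == 1
```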
[jira] [Assigned] (HIVE-17856) MM tables - IOW is broken
[ https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17856:
    Assignee: Steve Yeom

> MM tables - IOW is broken
> -------------------------
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Sergey Shelukhin
> Assignee: Steve Yeom
> Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should
> never have allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas,
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
>
> drop table iow1_mm;
> create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}
[jira] [Assigned] (HIVE-17858) MM - some union cases are broken
[ https://issues.apache.org/jira/browse/HIVE-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17858:
    Assignee: Sergey Shelukhin

> MM - some union cases are broken
> --------------------------------
> Key: HIVE-17858
> URL: https://issues.apache.org/jira/browse/HIVE-17858
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: mm-gap-1
>
> mm_all test no longer runs on LLAP; if it's executed in LLAP, one can see
> that some union cases no longer work.
> Queries on partunion_mm, skew_dp_union_mm produce no results.
> I'm not sure what part of "integration" broke it.
[jira] [Updated] (HIVE-11266) count(*) wrong result based on table statistics for external tables
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11266: --- Target Version/s: 3.0.0, 2.4.0 (was: 3.0.0, 2.4.0, 2.3.1) > count(*) wrong result based on table statistics for external tables > --- > > Key: HIVE-11266 > URL: https://issues.apache.org/jira/browse/HIVE-11266 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Simone Battaglia >Assignee: Jesus Camacho Rodriguez >Priority: Blocker > Fix For: 3.0.0 > > Attachments: HIVE-11266.01.patch, HIVE-11266.patch > > > Hive returns a wrong count result on an external table with table statistics if > I change the table's data files. > This is the scenario in detail: > 1) create external table my_table (...) location 'my_location'; > 2) analyze table my_table compute statistics; > 3) change/add/delete one or more files in the 'my_location' directory; > 4) select count(*) from my_table; > In this case the count query doesn't generate an MR job and returns the result > based on table statistics. This result is wrong because it is based on > statistics stored in the Hive metastore and doesn't take into account > modifications introduced on the data files. > Obviously, setting "hive.compute.query.using.stats" to FALSE, this problem > doesn't occur, but the default value of this property is TRUE. > I think that this post on Stack Overflow, which shows another type of bug > in the case of multiple inserts, is related to the one that I reported: > http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table -- This message was sent by Atlassian JIRA (v6.4.14#64029)
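[editor's note] The workaround mentioned in the HIVE-11266 report above can be sketched as follows; hive.compute.query.using.stats is the property named in the report, and my_table is the reporter's placeholder table:
{code:sql}
-- Sketch of the workaround from the report above.
-- Option 1: stop answering count(*) from metastore statistics for this session.
SET hive.compute.query.using.stats=false;
SELECT count(*) FROM my_table;

-- Option 2: refresh the stale statistics after changing files under 'my_location'.
ANALYZE TABLE my_table COMPUTE STATISTICS;
SELECT count(*) FROM my_table;
{code}
Either way the count reflects the files actually present, at the cost of a full scan or a stats recomputation.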
[jira] [Updated] (HIVE-17430) Add LOAD DATA test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17430: --- Target Version/s: (was: 2.3.1) > Add LOAD DATA test for blobstores > - > > Key: HIVE-17430 > URL: https://issues.apache.org/jira/browse/HIVE-17430 > Project: Hive > Issue Type: Test > Components: Tests >Affects Versions: 2.2.0 >Reporter: Yuzhou Sun >Assignee: Yuzhou Sun > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17430.patch > > > This patch introduces load_data.q regression tests into the hive-blobstore > qtest module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17636) Add multiple_agg.q test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17636: --- Target Version/s: (was: 2.3.1) > Add multiple_agg.q test for blobstores > -- > > Key: HIVE-17636 > URL: https://issues.apache.org/jira/browse/HIVE-17636 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ran Gu >Assignee: Ran Gu > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17636.patch > > > This patch introduces multiple_agg.q regression tests into the hive-blobstore > qtest module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17729) Add Database & Explain related blobstore tests
[ https://issues.apache.org/jira/browse/HIVE-17729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211936#comment-16211936 ] Jesus Camacho Rodriguez commented on HIVE-17729: Changing targeted fix version to 2.3.2 as this is not a blocker and we are releasing 2.3.1. > Add Database & Explain related blobstore tests > -- > > Key: HIVE-17729 > URL: https://issues.apache.org/jira/browse/HIVE-17729 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Rentao Wu >Assignee: Rentao Wu > Attachments: HIVE-17729.patch > > > This patch introduces the following regression tests into the hive-blobstore > qtest module: > * create_database.q -> tests tables with location inherited from database > * multiple_db.q -> tests query spanning multiple databases > * explain.q -> tests EXPLAIN INSERT OVERWRITE command > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17729) Add Database & Explain related blobstore tests
[ https://issues.apache.org/jira/browse/HIVE-17729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17729: --- Target Version/s: 3.0.0, 2.4.0, 2.3.2 (was: 3.0.0, 2.4.0, 2.3.1) > Add Database & Explain related blobstore tests > -- > > Key: HIVE-17729 > URL: https://issues.apache.org/jira/browse/HIVE-17729 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Rentao Wu >Assignee: Rentao Wu > Attachments: HIVE-17729.patch > > > This patch introduces the following regression tests into the hive-blobstore > qtest module: > * create_database.q -> tests tables with location inherited from database > * multiple_db.q -> tests query spanning multiple databases > * explain.q -> tests EXPLAIN INSERT OVERWRITE command > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.
[ https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10378: --- Fix Version/s: (was: 2.3.1) 2.3.2 > Hive Update statement set keyword work with lower case only and doesn't give > any error if wrong column name specified in the set clause. > > > Key: HIVE-10378 > URL: https://issues.apache.org/jira/browse/HIVE-10378 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0, 1.1.0 > Environment: Hadoop: 2.6.0 > Hive : 1.0.0/1.1.0 > OS:Linux >Reporter: Vineet Kandpal >Assignee: Oleksiy Sayankin > Fix For: 2.3.2 > > Attachments: HIVE-10378.1.patch > > > Brief: Hive Update statement set keyword work with lower case only and > doesn't give any error if wrong column name specified in the set clause. > Steps to reproduce: > following are the steps performed for the same: > 1. Create Table with transactional properties. > create table customer(id int ,name string, email string) clustered by (id) > into 2 buckets stored as orc TBLPROPERTIES('transactional'='true') > 2. Insert data into transactional table: > insert into table customer values > (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com') > 3. Search result: > 0: jdbc:hive2://localhost:1> select * from customer; > +--++--+--+ > | customer.id | customer.name | customer.email | > +--++--+--+ > | 2| user2 | us...@user1.com | > | 3| user3 | us...@gmail.com | > | 1| user1 | us...@user1.com | > +--++--+--+ > 3 rows selected (0.299 seconds) > 4. 
Update a table column with a where clause. Below, the column name is used in > UPPER case (NAME) and the column value is not updated: > 0: jdbc:hive2://localhost:1> update customer set NAME = > 'notworking' where id = 1; > INFO : Table default.customer stats: [numFiles=10, numRows=3, > totalSize=6937, rawDataSize=0] > No rows affected (20.343 seconds) > 0: jdbc:hive2://localhost:1> select * from customer; > +-------------+----------------+------------------+ > | customer.id | customer.name | customer.email | > +-------------+----------------+------------------+ > | 2 | user2 | us...@user1.com | > | 3 | user3 | us...@gmail.com | > | 1 | user1 | us...@user1.com | > +-------------+----------------+------------------+ > 3 rows selected (0.321 seconds) > 5. Update a table column with a where clause. Below, the column name is used in > lower case (name) and the column value is updated: > 0: jdbc:hive2://localhost:1> update customer set name = 'working' > where id = 1; > INFO : Table default.customer stats: [numFiles=11, numRows=3, > totalSize=7699, rawDataSize=0] > No rows affected (19.74 seconds) > 0: jdbc:hive2://localhost:1> select * from customer; > +-------------+----------------+------------------+ > | customer.id | customer.name | customer.email | > +-------------+----------------+------------------+ > | 2 | user2 | us...@user1.com | > | 3 | user3 | us...@gmail.com | > | 1 | working | us...@user1.com | > +-------------+----------------+------------------+ > 3 rows selected (0.333 seconds) > 0: jdbc:hive2://localhost:1> > 6. We have also seen that if we put an incorrect column name in the set clause > of the update statement, it accepts the query and executes a job. There should > be validation of the column name used in the set clause. > 0: jdbc:hive2://localhost:1> update customer set name_44 = > 'working' where id = 1; > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
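[editor's note] Consolidating the steps quoted in the HIVE-10378 report above into a single script (schema and values taken from the report; behavior is as described for the affected 1.0.0/1.1.0 versions):
{code:sql}
-- Transactional table from step 1 of the report.
create table customer(id int, name string, email string)
  clustered by (id) into 2 buckets
  stored as orc tblproperties('transactional'='true');

-- Step 4: upper-case column name, reported to silently update nothing.
update customer set NAME = 'notworking' where id = 1;

-- Step 5: lower-case column name, reported to work.
update customer set name = 'working' where id = 1;

-- Step 6: a non-existent column is accepted instead of being rejected.
update customer set name_44 = 'working' where id = 1;
{code}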
[jira] [Commented] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.
[ https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211938#comment-16211938 ] Jesus Camacho Rodriguez commented on HIVE-10378: Changing targeted fix version to 2.3.2 as this is not a blocker and we are releasing 2.3.1. > Hive Update statement set keyword work with lower case only and doesn't give > any error if wrong column name specified in the set clause. > > > Key: HIVE-10378 > URL: https://issues.apache.org/jira/browse/HIVE-10378 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0, 1.1.0 > Environment: Hadoop: 2.6.0 > Hive : 1.0.0/1.1.0 > OS:Linux >Reporter: Vineet Kandpal >Assignee: Oleksiy Sayankin > Fix For: 2.3.2 > > Attachments: HIVE-10378.1.patch > > > Brief: Hive Update statement set keyword work with lower case only and > doesn't give any error if wrong column name specified in the set clause. > Steps to reproduce: > following are the steps performed for the same: > 1. Create Table with transactional properties. > create table customer(id int ,name string, email string) clustered by (id) > into 2 buckets stored as orc TBLPROPERTIES('transactional'='true') > 2. Insert data into transactional table: > insert into table customer values > (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com') > 3. Search result: > 0: jdbc:hive2://localhost:1> select * from customer; > +--++--+--+ > | customer.id | customer.name | customer.email | > +--++--+--+ > | 2| user2 | us...@user1.com | > | 3| user3 | us...@gmail.com | > | 1| user1 | us...@user1.com | > +--++--+--+ > 3 rows selected (0.299 seconds) > 4. 
Update a table column with a where clause. Below, the column name is used in > UPPER case (NAME) and the column value is not updated: > 0: jdbc:hive2://localhost:1> update customer set NAME = > 'notworking' where id = 1; > INFO : Table default.customer stats: [numFiles=10, numRows=3, > totalSize=6937, rawDataSize=0] > No rows affected (20.343 seconds) > 0: jdbc:hive2://localhost:1> select * from customer; > +-------------+----------------+------------------+ > | customer.id | customer.name | customer.email | > +-------------+----------------+------------------+ > | 2 | user2 | us...@user1.com | > | 3 | user3 | us...@gmail.com | > | 1 | user1 | us...@user1.com | > +-------------+----------------+------------------+ > 3 rows selected (0.321 seconds) > 5. Update a table column with a where clause. Below, the column name is used in > lower case (name) and the column value is updated: > 0: jdbc:hive2://localhost:1> update customer set name = 'working' > where id = 1; > INFO : Table default.customer stats: [numFiles=11, numRows=3, > totalSize=7699, rawDataSize=0] > No rows affected (19.74 seconds) > 0: jdbc:hive2://localhost:1> select * from customer; > +-------------+----------------+------------------+ > | customer.id | customer.name | customer.email | > +-------------+----------------+------------------+ > | 2 | user2 | us...@user1.com | > | 3 | user3 | us...@gmail.com | > | 1 | working | us...@user1.com | > +-------------+----------------+------------------+ > 3 rows selected (0.333 seconds) > 0: jdbc:hive2://localhost:1> > 6. We have also seen that if we put an incorrect column name in the set clause > of the update statement, it accepts the query and executes a job. There should > be validation of the column name used in the set clause. > 0: jdbc:hive2://localhost:1> update customer set name_44 = > 'working' where id = 1; > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17819) Add sampling.q test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211935#comment-16211935 ] Jesus Camacho Rodriguez commented on HIVE-17819: Changing targeted fix version to 2.3.2 as this is not a blocker and we are releasing 2.3.1. > Add sampling.q test for blobstores > -- > > Key: HIVE-17819 > URL: https://issues.apache.org/jira/browse/HIVE-17819 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ran Gu > Attachments: HIVE-17819.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17819) Add sampling.q test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17819: --- Target Version/s: 2.3.2 (was: 2.3.1) > Add sampling.q test for blobstores > -- > > Key: HIVE-17819 > URL: https://issues.apache.org/jira/browse/HIVE-17819 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ran Gu > Attachments: HIVE-17819.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17820) Add buckets.q test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211934#comment-16211934 ] Jesus Camacho Rodriguez commented on HIVE-17820: Changing targeted fix version to 2.3.2 as this is not a blocker and we are releasing 2.3.1. > Add buckets.q test for blobstores > - > > Key: HIVE-17820 > URL: https://issues.apache.org/jira/browse/HIVE-17820 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ran Gu > Attachments: HIVE-17820.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17820) Add buckets.q test for blobstores
[ https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17820: --- Target Version/s: 2.3.2 (was: 2.3.1) > Add buckets.q test for blobstores > - > > Key: HIVE-17820 > URL: https://issues.apache.org/jira/browse/HIVE-17820 > Project: Hive > Issue Type: Test > Components: Tests >Reporter: Ran Gu > Attachments: HIVE-17820.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17856) MM tables - IOW is broken
[ https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17856: Description: The following tests were removed from mm_all during "integration"... I should have never allowed such manner of integration. MM logic should have been kept intact until ACID logic could catch up. Alas, here we are. {noformat} drop table iow0_mm; create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow0_mm select key from intermediate; insert into table iow0_mm select key + 1 from intermediate; select * from iow0_mm order by key; insert overwrite table iow0_mm select key + 2 from intermediate; select * from iow0_mm order by key; drop table iow0_mm; drop table iow1_mm; create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow1_mm partition (key2) select key as k1, key from intermediate union all select key as k1, key from intermediate; insert into table iow1_mm partition (key2) select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate; select * from iow1_mm order by key, key2; drop table iow1_mm; {noformat} {noformat} drop table simple_mm; create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only"); insert into table simple_mm select key from intermediate; -insert overwrite table simple_mm select key from intermediate; {noformat} was: 
The following tests were removed from mm_all during "integration"... I should have never allowed such manner of integration. MM logic should have been kept intact until ACID logic could catch up. Alas, here we are. Additionally multi-IOW tests may produce incorrect results. They were/are commented out in mm_all. {noformat} drop table iow0_mm; create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow0_mm select key from intermediate; insert into table iow0_mm select key + 1 from intermediate; select * from iow0_mm order by key; insert overwrite table iow0_mm select key + 2 from intermediate; select * from iow0_mm order by key; drop table iow0_mm; drop table iow1_mm; create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow1_mm partition (key2) select key as k1, key from intermediate union all select key as k1, key from intermediate; insert into table iow1_mm partition (key2) select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate; select * from iow1_mm order by key, key2; drop table iow1_mm; {noformat} {noformat} drop table simple_mm; create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only"); insert into table simple_mm select key from intermediate; -insert overwrite table simple_mm select key from intermediate; {noformat} > MM tables - IOW is broken > - > > Key: HIVE-17856 > URL: 
https://issues.apache.org/jira/browse/HIVE-17856 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Sergey Shelukhin > Labels: mm-gap-1 > > The following tests were removed from mm_all during "integration"... I should > have never allowed such manner of integration. > MM logic should have been kept intact until ACID logic could catch up. Alas, > here we are. > {noformat} > drop table iow0_mm; > create table iow0_mm(key int) tblproperties("transactional"="true", > "transactional_properties"="insert_only"); > insert overwrite table iow0_mm select key from intermediate; > insert into table iow0_mm select key + 1 from intermediate; > select * from iow0_mm order by key; > insert overwrite table iow0_mm select key + 2 from intermediate; > select * from iow0_mm order by key; > drop
[jira] [Updated] (HIVE-17793) Parameterize Logging Messages
[ https://issues.apache.org/jira/browse/HIVE-17793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17793: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Beluga! > Parameterize Logging Messages > - > > Key: HIVE-17793 > URL: https://issues.apache.org/jira/browse/HIVE-17793 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Fix For: 3.0.0 > > Attachments: HIVE-17793.1.patch, HIVE-17793.2.patch > > > * Use SLF4J parameterized logging > * Remove use of archaic Util's "stringifyException" and simply allow logging > framework to handle formatting of output. Also saves having to create the > error message and then throwing it away when the logging level is set higher > than the logging message > * Add some {{LOG.isDebugEnabled}} around complex debug messages -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17657: Labels: mm-gap-2 (was: mm-gap-1) > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-2 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
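[editor's note] For orientation on the HIVE-17657 discussion above, the kind of round trip the mm_exim.q test would exercise looks roughly like this (paths and the copy's table name are illustrative, not taken from the test):
{code:sql}
-- Hypothetical export/import round trip for an insert-only (MM) table.
-- The directory tree written under the export path is where base_x/delta_x_y
-- names, and therefore the txn-id remapping discussed above, become visible.
EXPORT TABLE simple_mm TO '/tmp/mm_export';
IMPORT TABLE simple_mm_copy FROM '/tmp/mm_export';
{code}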
[jira] [Updated] (HIVE-17855) conversion to MM tables via alter may be broken
[ https://issues.apache.org/jira/browse/HIVE-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17855: Labels: mm-gap-2 (was: mm-gap-1) > conversion to MM tables via alter may be broken > --- > > Key: HIVE-17855 > URL: https://issues.apache.org/jira/browse/HIVE-17855 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Sergey Shelukhin > Labels: mm-gap-2 > > {noformat} > git difftool 77511070dd^ 77511070dd -- */mm_conversions.q > {noformat} > Looks like during ACID "integration" alter was simply quietly changed to > create+insert, because it's broken. > I asked to keep feature parity with every change but I should have rather > insisted on it and -1d all the patches that didn't... This is just annoying. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17817) Stabilize crossproduct warning message output order
[ https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211920#comment-16211920 ] Hive QA commented on HIVE-17817: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893009/HIVE-17817.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11309 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7388/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7388/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7388/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12893009 - PreCommit-HIVE-Build > Stabilize crossproduct warning message output order > --- > > Key: HIVE-17817 > URL: https://issues.apache.org/jira/browse/HIVE-17817 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17817.01.patch, HIVE-17817.02.patch > > > {{CrossProductCheck}} warning printout sometimes happens in reverse order; > which reduces people's confidence in the test's reliability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17799) Add Ellipsis For Truncated Query In Hive Lock
[ https://issues.apache.org/jira/browse/HIVE-17799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211917#comment-16211917 ] Ashutosh Chauhan commented on HIVE-17799: - +1 > Add Ellipsis For Truncated Query In Hive Lock > - > > Key: HIVE-17799 > URL: https://issues.apache.org/jira/browse/HIVE-17799 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-17799.1.patch > > > [HIVE-16334] introduced truncation for storing queries in ZK lock nodes. > This Jira is to add ellipsis into the query to let the operator know that > truncation has occurred and therefore they will not find the specific query > in their logs, only a prefix match will work. > {code:sql} > -- Truncation of query may be confusing to operator > -- Without truncation > SELECT * FROM TABLE WHERE COL=1 > -- With truncation (operator will not find this query in workload) > SELECT * FROM TABLE > -- With truncation (operator will know this is only a prefix match) > SELECT * FROM TABLE... > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17807) Execute maven commands in batch mode for ptests
[ https://issues.apache.org/jira/browse/HIVE-17807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17807: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks, Vijay! > Execute maven commands in batch mode for ptests > --- > > Key: HIVE-17807 > URL: https://issues.apache.org/jira/browse/HIVE-17807 > Project: Hive > Issue Type: Bug >Reporter: Vijay Kumar >Assignee: Vijay Kumar > Fix For: 3.0.0 > > Attachments: HIVE-17807.patch > > > No need to run in interactive mode in CI environment. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
[ https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211914#comment-16211914 ] Eugene Koifman commented on HIVE-17645: --- There is a single LM for the warehouse. Having multiple TMs is like having multiple Sessions - they can lock the same resources as long as the requested locks are compatible. The issue is that each TM has its own transaction context, i.e. it is in a different transaction. For Spark reads, each Query Fragment uses a separate TM (and ValidTxnList), so from the overall query perspective this creates Read Committed semantics. (to be continued) > MM tables patch conflicts with HIVE-17482 (Spark/Acid integration) > -- > > Key: HIVE-17645 > URL: https://issues.apache.org/jira/browse/HIVE-17645 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > Labels: mm-gap-2 > > MM code introduces > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr() > {noformat} > in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_). > HIVE-17482 adds a mode where a TransactionManager not associated with the > session should be used. This will need to be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17853: Description: The {{RetryingMetaStoreClient}} is used to automatically reconnect to the Hive metastore, after client timeout, transparently to the user. In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations will be attempted as the login-user ({{oozie}}), as opposed to the effective user ({{mithun}}). We should have a fix for this shortly. was: The {{RetryingMetaStoreClient}} is used to automatically reconnect to the Hive metastore, after client timeout, transparently to the user. In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further metastore operations will be attempted as the login-user ({{oozie}}), as opposed to the effective user ({{mithunr}}). We should have a fix for this shortly. > RetryingMetaStoreClient loses UGI impersonation-context when reconnecting > after timeout > --- > > Key: HIVE-17853 > URL: https://issues.apache.org/jira/browse/HIVE-17853 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 2.4.0, 2.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome >Priority: Critical > > The {{RetryingMetaStoreClient}} is used to automatically reconnect to the > Hive metastore, after client timeout, transparently to the user. > In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating > a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find > that the reconnect causes the {{UGI.doAs()}} context to be lost. 
Any further > metastore operations will be attempted as the login-user ({{oozie}}), as > opposed to the effective user ({{mithun}}). > We should have a fix for this shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17857) Upgrade to orc 1.4
[ https://issues.apache.org/jira/browse/HIVE-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211907#comment-16211907 ] Ashutosh Chauhan commented on HIVE-17857: - cc: [~sershe] [~prasanth_j] > Upgrade to orc 1.4 > -- > > Key: HIVE-17857 > URL: https://issues.apache.org/jira/browse/HIVE-17857 > Project: Hive > Issue Type: Task > Components: ORC >Reporter: Ashutosh Chauhan > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17856) MM tables - IOW is broken
[ https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17856: Description: The following tests were removed from mm_all during "integration"... I should have never allowed such manner of integration. MM logic should have been kept intact until ACID logic could catch up. Alas, here we are. {noformat} drop table iow0_mm; create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow0_mm select key from intermediate; insert into table iow0_mm select key + 1 from intermediate; select * from iow0_mm order by key; insert overwrite table iow0_mm select key + 2 from intermediate; select * from iow0_mm order by key; drop table iow0_mm; drop table iow1_mm; create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow1_mm partition (key2) select key as k1, key from intermediate union all select key as k1, key from intermediate; insert into table iow1_mm partition (key2) select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate; select * from iow1_mm order by key, key2; drop table iow1_mm; {noformat} {noformat} drop table simple_mm; create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only"); insert into table simple_mm select key from intermediate; -insert overwrite table simple_mm select key from intermediate; {noformat} was: 
The following tests were removed from mm_all {noformat} drop table iow0_mm; create table iow0_mm(key int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow0_mm select key from intermediate; insert into table iow0_mm select key + 1 from intermediate; select * from iow0_mm order by key; insert overwrite table iow0_mm select key + 2 from intermediate; select * from iow0_mm order by key; drop table iow0_mm; drop table iow1_mm; create table iow1_mm(key int) partitioned by (key2 int) tblproperties("transactional"="true", "transactional_properties"="insert_only"); insert overwrite table iow1_mm partition (key2) select key as k1, key from intermediate union all select key as k1, key from intermediate; insert into table iow1_mm partition (key2) select key + 1 as k1, key from intermediate union all select key as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key from intermediate union all select key + 4 as k1, key from intermediate; select * from iow1_mm order by key, key2; insert overwrite table iow1_mm partition (key2) select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, key + 2 from intermediate; select * from iow1_mm order by key, key2; drop table iow1_mm; {noformat} {noformat} drop table simple_mm; create table simple_mm(key int) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only"); insert into table simple_mm select key from intermediate; -insert overwrite table simple_mm select key from intermediate; {noformat} > MM tables - IOW is broken > - > > Key: HIVE-17856 > URL: https://issues.apache.org/jira/browse/HIVE-17856 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Sergey Shelukhin > Labels: mm-gap-1 > > The following tests were removed from mm_all during "integration"... 
I should > have never allowed such manner of integration. > MM logic should have been kept intact until ACID logic could catch up. Alas, > here we are. > {noformat} > drop table iow0_mm; > create table iow0_mm(key int) tblproperties("transactional"="true", > "transactional_properties"="insert_only"); > insert overwrite table iow0_mm select key from intermediate; > insert into table iow0_mm select key + 1 from intermediate; > select * from iow0_mm order by key; > insert overwrite table iow0_mm select key + 2 from intermediate; > select * from iow0_mm order by key; > drop table iow0_mm; > drop table iow1_mm; > create table iow1_mm(key int) partitioned by (key2 int) > tblproperties("transactional"="true", > "transactional_properties"="insert_only"); > insert overwrite table iow1_mm partition (key2) > select key as k1, key from intermediate
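The failing scenario above reduces to insert-overwrite (IOW) semantics on an insert-only (MM) table: INSERT INTO must accumulate deltas, while INSERT OVERWRITE must make all earlier deltas invisible to readers. A minimal Python simulation of that directory-level behavior — directory names here are a simplification, not Hive's exact on-disk layout:

```python
# Simulate MM-table directory state. INSERT INTO adds a delta_x_x dir;
# INSERT OVERWRITE must hide all earlier deltas from readers.
# Naming is illustrative only.

def insert_into(dirs, txn, rows):
    dirs[f"delta_{txn}_{txn}"] = rows
    return dirs

def insert_overwrite(dirs, txn, rows):
    # An overwrite supersedes everything written before it.
    dirs.clear()
    dirs[f"base_{txn}"] = rows
    return dirs

def read_visible(dirs):
    out = []
    for name in sorted(dirs):
        out.extend(dirs[name])
    return sorted(out)

state = {}
insert_overwrite(state, 1, [97, 98])   # IOW: only these rows survive
insert_into(state, 2, [98, 99])        # accumulates a new delta
snapshot1 = read_visible(state)        # both writes visible
insert_overwrite(state, 3, [99, 100])  # prior dirs must become obsolete
snapshot2 = read_visible(state)
```

The bug report is that the second overwrite path does not behave this way for MM tables; the sketch only pins down the expected reader-visible results.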
[jira] [Updated] (HIVE-17855) conversion to MM tables via alter may be broken
[ https://issues.apache.org/jira/browse/HIVE-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17855: Description: {noformat} git difftool 77511070dd^ 77511070dd -- */mm_conversions.q {noformat} Looks like during ACID "integration" alter was simply quietly changed to create+insert, because it's broken. I asked to keep feature parity with every change but I should have rather insisted on it and -1d all the patches that didn't... This is just annoying. was: {noformat} git difftool 77511070dd 77511070dd^ -- */mm_conversions.q {noformat} Looks like during ACID "integration" alter was simply quietly changed to create+insert, because it's broken. I asked to keep feature parity with every change but I should have rather insisted on it and -1d all the patches that didn't... This is just annoying. > conversion to MM tables via alter may be broken > --- > > Key: HIVE-17855 > URL: https://issues.apache.org/jira/browse/HIVE-17855 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Sergey Shelukhin > Labels: mm-gap-1 > > {noformat} > git difftool 77511070dd^ 77511070dd -- */mm_conversions.q > {noformat} > Looks like during ACID "integration" alter was simply quietly changed to > create+insert, because it's broken. > I asked to keep feature parity with every change but I should have rather > insisted on it and -1d all the patches that didn't... This is just annoying. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-16964) _orc_acid_version file is missing
[ https://issues.apache.org/jira/browse/HIVE-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom resolved HIVE-16964. --- Resolution: Won't Fix > _orc_acid_version file is missing > - > > Key: HIVE-16964 > URL: https://issues.apache.org/jira/browse/HIVE-16964 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Steve Yeom > > OrcRecordUpdater creates OrcRecordUpdater.ACID_FORMAT in the dir that it > creates - but there is nothing in Hive.moveAcidFiles() that copies it to its final > location. > It doesn't look like CompactorMR even attempts to create it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16964) _orc_acid_version file is missing
[ https://issues.apache.org/jira/browse/HIVE-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211889#comment-16211889 ] Steve Yeom commented on HIVE-16964: --- Talked with Eugene. Also checked with the current Hive master code with the unit test, "TestTxnCommands2#testNonAcidToAcidConversion1". 1. Currently Hive.moveAcidFiles() does not move a "_orc_acid_version" file. This static method is called by the MoveTask for the Hive session of th. I.e., the FileSinkOperator in a map-reduce task creates such a file, but the MoveTask does not move the file to the final destination dir. 2. The intention for creating a "_orc_acid_version" file is to handle the case where we have multiple versions of ACID file formats. I.e., in that case, we need format version info somewhere, either in the Metastore or in the directory. As Eugene indicated, currently for ACID tables, inserters/deleters create delta directories independently and readers read the relevant dirs without conflicts with writers via snapshot isolation. So there can be multiple versions of delta directories per partition or table directory, since compactors are not in sync with writers. In that case, one "_orc_acid_version" file may be needed per delta dir. 3. Possibly, as in the case of micromanaged tables, we can remove the steps that create directories in a staging area and perform a MoveTask to move the delta and base directories along with the _orc_acid_version file(s) to a final destination. Thus, based on the points above, I think we can lower the priority of this jira, since its fix (moving such a file to the final destination) may not be used at all for HDP 3.0. 
> _orc_acid_version file is missing > - > > Key: HIVE-16964 > URL: https://issues.apache.org/jira/browse/HIVE-16964 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Steve Yeom > > OrcRecordUpdater creates OrcRecordUpdater.ACID_FORMAT in the dir that it > creates - but there is nothing in Hive.moveAcidFiles() that copies it to its final > location. > It doesn't look like CompactorMR even attempts to create it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
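The failure mode described in points 1 above — data files get moved but the version sidecar is left behind — can be sketched in a few lines of Python. This is a simulation of the idea only; file names are illustrative and the real logic lives in Hive.moveAcidFiles():

```python
# Simulate a "move" step that relocates only data files and silently
# drops the _orc_acid_version marker written alongside them.
# "bucket_*" and the sidecar name mirror ORC ACID conventions, simplified.

ACID_FORMAT = "_orc_acid_version"  # sidecar marker written by the record updater

def move_acid_files(staging, final, include_sidecars=False):
    for name in list(staging):
        if name.startswith("bucket_") or (include_sidecars and name == ACID_FORMAT):
            final[name] = staging.pop(name)
    return final

staging = {"bucket_00000": b"rows", ACID_FORMAT: b"2"}
buggy = move_acid_files(dict(staging), {})                          # marker lost
fixed = move_acid_files(dict(staging), {}, include_sidecars=True)   # marker kept
```

The proposed direction in point 3 — writing deltas directly to the final location — would remove the move step entirely, making the sidecar copy moot.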
[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
[ https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211886#comment-16211886 ] Sergey Shelukhin commented on HIVE-17645: - Spark+ACID only uses the hacky mode for select queries so it should be ok as long as we don't get it from the session for selects. However, a larger concern I have is this... how does it work at all if a different TxnManager has non-shared state with the main one? They'd be able to take locks separately in parallel for the same things. And if they don't have non-shared state (rely on the same metastore DB, ZK/DB lock paths, etc) then what's the problem with getting a different txn manager? > MM tables patch conflicts with HIVE-17482 (Spark/Acid integration) > -- > > Key: HIVE-17645 > URL: https://issues.apache.org/jira/browse/HIVE-17645 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > Labels: mm-gap-2 > > MM code introduces > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr() > {noformat} > in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_). > HIVE-17482 adds a mode where a TransactionManager not associated with the > session should be used. This will need to be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
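The correctness concern in the comment above — two transaction managers with independent lock state can both "acquire" an exclusive lock on the same resource — can be demonstrated with a toy lock manager. Pure simulation, assuming nothing about Hive's actual lock implementation:

```python
# Two lock managers with independent in-memory state both grant an
# exclusive lock on the same resource (mutual exclusion is lost).
# Managers backed by one shared store conflict correctly.

class LockManager:
    def __init__(self, store=None):
        # `store` stands in for shared backing state (metastore DB, ZK path).
        self.store = store if store is not None else set()

    def try_exclusive(self, resource):
        if resource in self.store:
            return False  # someone else holds it
        self.store.add(resource)
        return True

# Independent state: both acquisitions "succeed" -> the bug.
a, b = LockManager(), LockManager()
independent = (a.try_exclusive("db.tbl"), b.try_exclusive("db.tbl"))

# Shared state: the second acquisition is correctly rejected.
shared = set()
c, d = LockManager(shared), LockManager(shared)
coordinated = (c.try_exclusive("db.tbl"), d.try_exclusive("db.tbl"))
```

This is exactly the distinction the comment draws: if the managers share backing state there is no conflict, and if they do not, parallel lock grants become possible.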
[jira] [Updated] (HIVE-17854) LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader
[ https://issues.apache.org/jira/browse/HIVE-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17854: -- Description: getRowNumber() is required to read data in non-acid tables that were converted to acid. Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader this functionality is not available making it impossible to vectorize reads (that need ROW__ID)/updates of non-acid-to-acid tables with LLAP (until major compaction) in the presence of any deletes. cc [~t3rmin4t0r], [~sershe], [~teddy.choi] was: getRowNumber() is required to read data in non-acid tables that were converted to acid. Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader this functionality is not available making it impossible to vectorize reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in the presence of any deletes. cc [~t3rmin4t0r], [~sershe], [~teddy.choi] > LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader > > > Key: HIVE-17854 > URL: https://issues.apache.org/jira/browse/HIVE-17854 > Project: Hive > Issue Type: Bug > Components: llap, Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > > getRowNumber() is required to read data in non-acid tables that were > converted to acid. > Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader > this functionality is not available making it impossible to vectorize reads > (that need ROW__ID)/updates of non-acid-to-acid tables with LLAP (until major > compaction) in the presence of any deletes. > cc [~t3rmin4t0r], [~sershe], [~teddy.choi] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17854) LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader
[ https://issues.apache.org/jira/browse/HIVE-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17854: -- Description: getRowNumber() is required to read data in non-acid tables that were converted to acid. Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader this functionality is not available making it impossible to vectorize reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in the presence of any deletes. cc [~t3rmin4t0r], [~sershe], [~teddy.choi] was: getRowNumber() is required to read data in non-acid tables that were converted to acid. Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader this functionality is not available making it impossible to vectorize reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in the presence of any deletes. > LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader > > > Key: HIVE-17854 > URL: https://issues.apache.org/jira/browse/HIVE-17854 > Project: Hive > Issue Type: Bug > Components: llap, Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > > getRowNumber() is required to read data in non-acid tables that were > converted to acid. > Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader > this functionality is not available making it impossible to vectorize > reads/updates of non-acid-to-acid tables with LLAP (until major compaction) > in the presence of any deletes. > cc [~t3rmin4t0r], [~sershe], [~teddy.choi] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
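The reason getRowNumber() matters here: a file written before the acid conversion carries no per-row ROW__ID on disk, so the reader must synthesize one from its position in the file. A hedged sketch of that synthesis — the (txn, bucket, rowId) triple mirrors the ROW__ID struct, but values and the helper name are illustrative:

```python
# Synthesize ROW__ID for rows from an originally non-acid file:
# (originalTxnId, bucket, rowNumber). Without the reader exposing its
# current row number, rowNumber cannot be computed when resuming mid-file.

def synthesize_row_ids(rows, original_txn=0, bucket=0, start_row=0):
    # start_row would come from the reader's getRowNumber() equivalent.
    out = []
    for offset, row in enumerate(rows):
        row_id = (original_txn, bucket, start_row + offset)
        out.append((row_id, row))
    return out

# A batch starting at file row 100 in bucket 1:
tagged = synthesize_row_ids(["a", "b", "c"], original_txn=0, bucket=1, start_row=100)
```

Applying deletes then requires matching these synthesized ids against delete deltas, which is why the gap blocks vectorized reads in the presence of any deletes.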
[jira] [Commented] (HIVE-17771) Implement commands to manage resource plan.
[ https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211854#comment-16211854 ] Hive QA commented on HIVE-17771: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892992/HIVE-17771.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 11310 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch (batchId=284) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=228) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=228) {noformat} Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7387/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7387/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7387/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892992 - PreCommit-HIVE-Build > Implement commands to manage resource plan. > --- > > Key: HIVE-17771 > URL: https://issues.apache.org/jira/browse/HIVE-17771 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17771.01.patch, HIVE-17771.02.patch, > HIVE-17771.03.patch > > > Please see parent jira about llap workload management. > This jira is to implement create and show resource plan commands in hive to > configure resource plans for llap workload. The following are the proposed > commands implemented as part of the jira: > CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism; > SHOW RESOURCE PLAN plan_name; > SHOW RESOURCE PLANS; > ALTER RESOURCE PLAN plan_name SET QUERY_PARALLELISM = parallelism; > ALTER RESOURCE PLAN plan_name RENAME TO new_name; > ALTER RESOURCE PLAN plan_name ACTIVATE; > ALTER RESOURCE PLAN plan_name DISABLE; > ALTER RESOURCE PLAN plan_name ENABLE; > DROP RESOURCE PLAN; > It will be followed up with more jiras to manage pools, triggers and copy > resource plans. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout
[ https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17853: --- > RetryingMetaStoreClient loses UGI impersonation-context when reconnecting > after timeout > --- > > Key: HIVE-17853 > URL: https://issues.apache.org/jira/browse/HIVE-17853 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 2.4.0, 2.2.1 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome >Priority: Critical > > The {{RetryingMetaStoreClient}} is used to automatically reconnect to the > Hive metastore, after client timeout, transparently to the user. > In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating > a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find > that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further > metastore operations will be attempted as the login-user ({{oozie}}), as > opposed to the effective user ({{mithun}}). > We should have a fix for this shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
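The bug above is a classic retry-wrapper pitfall: the reconnect path runs outside the saved impersonation context, so the new connection authenticates as the login user. A simplified Python simulation of the broken and fixed behavior — user names mirror the example in the report, and the mechanics stand in for UGI.doAs():

```python
# Simulate reconnect-after-timeout. The naive path forgets which
# effective user the original connection was opened for.

LOGIN_USER = "oozie"  # the process's login user (the impersonator)

class Client:
    def __init__(self, user):
        self.user = user  # user the connection is authenticated as

def reconnect_naive():
    # Bug: reconnects with whatever identity the process has.
    return Client(LOGIN_USER)

def reconnect_with_saved_context(effective_user):
    # Fix sketch: capture the effective user at construction time and
    # re-enter that context (doAs) when rebuilding the connection.
    return Client(effective_user)

effective = "mithun"
after_timeout_buggy = reconnect_naive()
after_timeout_fixed = reconnect_with_saved_context(effective)
```

The real fix would wrap the reconnect call in the originally captured UGI's doAs(), rather than plumbing the user name through as shown here.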
[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
[ https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211844#comment-16211844 ] Eugene Koifman commented on HIVE-17645: --- Ask [~jdere] > MM tables patch conflicts with HIVE-17482 (Spark/Acid integration) > -- > > Key: HIVE-17645 > URL: https://issues.apache.org/jira/browse/HIVE-17645 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > Labels: mm-gap-2 > > MM code introduces > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr() > {noformat} > in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_). > HIVE-17482 adds a mode where a TransactionManager not associated with the > session should be used. This will need to be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211834#comment-16211834 ] Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 10:10 PM: Well, parts were broken by invalid merge 42a38577bc including commit 1b0d8df58e that removed a bunch of code... the old code from around 77511070dd (not sure if this is valid, this was the final MM-ACID integration commit for the file) needs to be restored. was (Author: sershe): Well, parts were broken by invalid merge 42a38577bc that removed a bunch of code... > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-1 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211834#comment-16211834 ] Sergey Shelukhin commented on HIVE-17657: - Well, parts were broken by invalid merge 42a38577bc that removed a bunch of code... > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-1 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
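The txn-id remapping idea in the issue description — each archived delta gets a freshly allocated txn id on import, and a compacted range like delta_6_9 collapses to a single-txn delta_x_x — can be sketched as follows. Directory naming and the allocation scheme are illustrative assumptions, not the eventual design:

```python
# Remap archived delta directory names to new txn ids on import.
# One new txn per archived dir; a multi-txn range (e.g. delta_6_9,
# produced by concatenate) maps to a single-txn delta_x_x.
import re

def remap_deltas(archive_dirs, next_txn):
    mapped = []
    for name in archive_dirs:
        m = re.fullmatch(r"delta_(\d+)_(\d+)", name)
        if not m:
            continue  # non-delta entries (aborted dirs pruned at export)
        mapped.append(f"delta_{next_txn}_{next_txn}")
        next_txn += 1
    return mapped, next_txn

dirs, txn = remap_deltas(["delta_1_1", "delta_6_9", "delta_10_10"], next_txn=501)
```

As the description notes, committing one txn per directory means import is no longer a single atomic operation, which is what motivates the batch-commit API and read_only-table ideas.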
[jira] [Updated] (HIVE-17695) collapse union all produced directories into delta directory name suffix for MM
[ https://issues.apache.org/jira/browse/HIVE-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17695: Priority: Minor (was: Major) > collapse union all produced directories into delta directory name suffix for > MM > --- > > Key: HIVE-17695 > URL: https://issues.apache.org/jira/browse/HIVE-17695 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Priority: Minor > > this has special handling for writes resulting from a Union All query > In the full Acid case at least, these subdirs get collapsed in favor of > statementId based dir names (delta_x_y_stmtId). It would be cleaner/simpler > to make MM follow the same logic. (full acid does it in Hive.moveFiles(), I > think) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
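The collapse proposed above amounts to a renaming: per-branch union subdirectories become statementId-suffixed delta names, matching full-acid conventions. A minimal sketch, assuming one statement id per union branch in branch order (subdirectory names are illustrative):

```python
# Collapse union-all branch subdirectories into statementId-suffixed
# delta directory names (delta_x_y_stmtId), as full acid does.

def collapse_union_subdirs(txn, subdirs):
    # Assumption: branch order determines the statement id.
    return [f"delta_{txn}_{txn}_{stmt}" for stmt, _ in enumerate(subdirs)]

collapsed = collapse_union_subdirs(7, ["HIVE_UNION_SUBDIR_1", "HIVE_UNION_SUBDIR_2"])
```

Keeping one naming scheme means readers need no special-case traversal for MM union writes, which is the cleanliness argument the issue makes.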
[jira] [Updated] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
[ https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17645: Labels: mm-gap-2 (was: ) > MM tables patch conflicts with HIVE-17482 (Spark/Acid integration) > -- > > Key: HIVE-17645 > URL: https://issues.apache.org/jira/browse/HIVE-17645 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > Labels: mm-gap-2 > > MM code introduces > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr() > {noformat} > in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_). > HIVE-17482 adds a mode where a TransactionManager not associated with the > session should be used. This will need to be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
[ https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211822#comment-16211822 ] Sergey Shelukhin commented on HIVE-17645: - Hmm.. how can it be valid to have multiple txn managers in a single HS2? > MM tables patch conflicts with HIVE-17482 (Spark/Acid integration) > -- > > Key: HIVE-17645 > URL: https://issues.apache.org/jira/browse/HIVE-17645 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman > > MM code introduces > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr() > {noformat} > in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_). > HIVE-17482 adds a mode where a TransactionManager not associated with the > session should be used. This will need to be addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-17646) MetaStoreUtils.isToInsertOnlyTable(Map<String, String> props) is not needed
[ https://issues.apache.org/jira/browse/HIVE-17646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-17646. - Resolution: Not A Bug See the last comment. > MetaStoreUtils.isToInsertOnlyTable(Map<String, String> props) is not needed > --- > > Key: HIVE-17646 > URL: https://issues.apache.org/jira/browse/HIVE-17646 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > > TransactionValidationListener is where all the logic to verify > "transactional" & "transactional_properties" should be -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17647) DDLTask.generateAddMmTasks(Table tbl) should not start transactions
[ https://issues.apache.org/jira/browse/HIVE-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17647: Labels: mm-gap-2 (was: ) > DDLTask.generateAddMmTasks(Table tbl) should not start transactions > --- > > Key: HIVE-17647 > URL: https://issues.apache.org/jira/browse/HIVE-17647 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-2 > > This method has > {noformat} > if (txnManager.isTxnOpen()) { > mmWriteId = txnManager.getCurrentTxnId(); > } else { > mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); > txnManager.commitTxn(); > } > {noformat} > this should throw if there is no open transaction. It should never open one. > In general the logic seems suspect. Looks like the intent is to move all > existing files into a delta_x_x/ when a plain table is converted to an MM table. > This seems like something that needs to be done from under an Exclusive lock > to prevent concurrent Insert operations writing data under table/partition > root. But this is too late to acquire locks which should be done from the > Driver.acquireLocks() (or else have a deadlock detector since acquiring them > here would break all-or-nothing lock acquisition semantics currently required > w/o deadlock detector) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
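The quoted Java opens (and immediately commits) a transaction whenever none is open. A sketch of the stricter behavior the report asks for — reuse the already-open txn, fail fast otherwise — translated into Python for illustration; class and method names mirror the quoted snippet but are simplified stand-ins:

```python
# Stricter control flow: never open a txn inside generateAddMmTasks-style
# code; require the caller (Driver) to have opened it already.

class TxnManager:
    def __init__(self, open_txn_id=None):
        self._open = open_txn_id

    def is_txn_open(self):
        return self._open is not None

    def current_txn_id(self):
        return self._open

def mm_write_id(txn_mgr):
    if not txn_mgr.is_txn_open():
        # Instead of silently opening and committing a txn, fail fast.
        raise RuntimeError("conversion to MM requires an already-open transaction")
    return txn_mgr.current_txn_id()

wid = mm_write_id(TxnManager(open_txn_id=42))
```

Failing fast here also keeps lock acquisition in Driver.acquireLocks(), which is the report's point about preserving all-or-nothing lock semantics.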
[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809 ] Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 9:56 PM: --- This was broken even in tests after one of the master merges. Also even before that, when it was "working" in tests, it looks like some commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm not sure it added anything in return. Need to take a look at that first, and then fix the runtime error (some file is missing). After that, mm_exim test needs to be reenabled. was (Author: sershe): This was broken even in tests after one of the master merges. Also even before that, when it was "working" in tests, it looks like some commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm not sure it added anything in return. Need to take a look at that first, and then fix the runtime error (some file is missing). > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-1 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. 
(all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809 ] Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 9:55 PM: --- This was broken even in tests after one of the master merges. Also even before that, when it was "working" in tests, it looks like some commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm not sure it added anything in return. Need to take a look at that first, and then fix the runtime error (some file is missing). was (Author: sershe): This was broken even in tests after one of the master merges. Also even before that, when it was "working" in tests, it looks like some commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to getValidMmDirectoriesFromTableOrPart, and I'm not sure it added anything in return. Need to take a look at that first, and then fix the runtime error (some file is missing). > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-1 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). 
This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
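The txn-id remapping idea discussed in HIVE-17657 above — mapping an archived delta such as delta_6_9 to a single new transaction delta_x_x on import — can be sketched as below. This is an illustrative, hypothetical helper, not actual Hive code; it only assumes Hive's `delta_<minTxn>_<maxTxn>` directory naming convention.

```java
// Hypothetical sketch of remapping an exported delta directory name to a
// new transaction id on import, per the HIVE-17657 discussion above.
// Assumes Hive's delta_<minTxn>_<maxTxn> naming; not actual Hive code.
public class DeltaRemap {
    // Map an archived delta dir (e.g. "delta_6_9", as produced by
    // concatenate) onto one newly opened txn, yielding "delta_x_x".
    static String remapToSingleTxn(String deltaDir, long newTxnId) {
        String[] parts = deltaDir.split("_");
        if (parts.length != 3 || !parts[0].equals("delta")) {
            throw new IllegalArgumentException("not a delta dir: " + deltaDir);
        }
        return "delta_" + newTxnId + "_" + newTxnId;
    }

    public static void main(String[] args) {
        // A compacted delta_6_9 from the archive becomes one new txn delta.
        System.out.println(remapToSingleTxn("delta_6_9", 100)); // delta_100_100
    }
}
```

This matches the stipulation in the comment above: the mapping is only sound if no "obsolete" delta (e.g. delta_6_6) covering part of the same range remains in the archive.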
[jira] [Updated] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17657: Labels: mm-gap-1 (was: ) > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > Labels: mm-gap-1 > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809 ] Sergey Shelukhin commented on HIVE-17657: - This was broken even in tests after one of the master merges. Also even before that, when it was "working" in tests, it looks like some commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to getValidMmDirectoriesFromTableOrPart, and I'm not sure it added anything in return. Need to take a look at that first, and then fix the runtime error (some file is missing). > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17657) export/import for MM tables is broken
[ https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17657: Summary: export/import for MM tables is broken (was: Does ExIm for MM tables work?) > export/import for MM tables is broken > - > > Key: HIVE-17657 > URL: https://issues.apache.org/jira/browse/HIVE-17657 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman > > there is mm_exim.q but it's not clear from the tests what file structure it > creates > On import the txnids in the directory names would have to be remapped if > importing to a different cluster. Perhaps export can be smart and export > highest base_x and accretive deltas (minus aborted ones). Then import can > ...? It would have to remap txn ids from the archive to new txn ids. This > would then mean that import is made up of several transactions rather than 1 > atomic op. (all locks must belong to a transaction) > One possibility is to open a new txn for each dir in the archive (where > start/end txn of file name is the same) and commit all of them at once (need > new TMgr API for that). This assumes using a shared lock (if any!) and thus > allows other inserts (not related to import) to occur. > What if you have delta_6_9, such as a result of concatenate? If we stipulate > that this must mean that there is no delta_6_6 or any other "obsolete" delta > in the archive we can map it to a new single txn delta_x_x. > Add read_only mode for tables (useful in general, may be needed for upgrade > etc) and use that to make the above atomic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17660) Compaction for MM runs Cleaner - needs test once IOW is supported
[ https://issues.apache.org/jira/browse/HIVE-17660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17660: --- Assignee: Eugene Koifman > Compaction for MM runs Cleaner - needs test once IOW is supported > - > > Key: HIVE-17660 > URL: https://issues.apache.org/jira/browse/HIVE-17660 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > Deletion of aborted deltas happens from CompactorMR.run() i.e. from Worker > but the Worker still sets compaction_queue entry to READY_FOR_CLEANING. > This is not needed if there are no base_N dirs which can be created by Insert > Overwrite > In this case we can't delete deltas < N until we know no one is reading them, > i.e. in Cleaner -- This message was sent by Atlassian JIRA (v6.4.14#64029)
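The cleaning condition described in HIVE-17660 above — once a base_N exists (e.g. from INSERT OVERWRITE), deltas at or below N are obsolete but may only be deleted by the Cleaner once no reader can still be using them — can be illustrated with a small check. This is a hypothetical sketch assuming Hive's `base_<txn>` / `delta_<min>_<max>` naming, not actual Hive code.

```java
// Hypothetical illustration of the HIVE-17660 condition above: given a
// base_N directory, a delta whose txn range ends at or below N is obsolete.
// Deleting it must still wait for the Cleaner (no active readers).
public class ObsoleteCheck {
    static boolean isObsoleteGivenBase(String deltaDir, long baseTxn) {
        // deltaDir follows delta_<minTxn>_<maxTxn>
        String[] parts = deltaDir.split("_");
        long maxTxn = Long.parseLong(parts[2]);
        return maxTxn <= baseTxn;
    }

    public static void main(String[] args) {
        System.out.println(isObsoleteGivenBase("delta_6_9", 10));   // true
        System.out.println(isObsoleteGivenBase("delta_11_11", 10)); // false
    }
}
```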
[jira] [Updated] (HIVE-17673) JavaUtils.extractTxnId() etc
[ https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17673: Status: Patch Available (was: Open) > JavaUtils.extractTxnId() etc > > > Key: HIVE-17673 > URL: https://issues.apache.org/jira/browse/HIVE-17673 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin >Priority: Minor > Attachments: HIVE-17673.patch > > > these should be in AcidUtils -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17673) JavaUtils.extractTxnId() etc
[ https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17673: Attachment: HIVE-17673.patch [~ekoifman] can you take a look? I'll look at the history of getValidMmDirectoriesFromTableOrPart and either remove it or file another jira to fix whatever was broken by the removal of its callers. > JavaUtils.extractTxnId() etc > > > Key: HIVE-17673 > URL: https://issues.apache.org/jira/browse/HIVE-17673 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin >Priority: Minor > Attachments: HIVE-17673.patch > > > these should be in AcidUtils -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17673) JavaUtils.extractTxnId() etc
[ https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17673: --- Assignee: Sergey Shelukhin > JavaUtils.extractTxnId() etc > > > Key: HIVE-17673 > URL: https://issues.apache.org/jira/browse/HIVE-17673 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin >Priority: Minor > > these should be in AcidUtils -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore.
[ https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17839: -- Status: Patch Available (was: Open) Forgot to redefine the work for the antrun plugin in standalone metastore. > Cannot generate thrift definitions in standalone-metastore. > --- > > Key: HIVE-17839 > URL: https://issues.apache.org/jira/browse/HIVE-17839 > Project: Hive > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Alan Gates > Attachments: HIVE-17839.patch > > > mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift > sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 > fix. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore.
[ https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17839: -- Attachment: HIVE-17839.patch > Cannot generate thrift definitions in standalone-metastore. > --- > > Key: HIVE-17839 > URL: https://issues.apache.org/jira/browse/HIVE-17839 > Project: Hive > Issue Type: Bug >Reporter: Harish Jaiprakash >Assignee: Alan Gates > Attachments: HIVE-17839.patch > > > mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift > sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 > fix. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17841) implement applying the resource plan
[ https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211748#comment-16211748 ] Hive QA commented on HIVE-17841: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892986/HIVE-17841.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 11306 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testClusterFractions (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testDestroyAndReturn (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueName (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueing (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReopen (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuse (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithDifferentPool (batchId=279) org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithQueueing (batchId=279) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=228) 
org.apache.hive.jdbc.TestTriggersWorkloadManager.org.apache.hive.jdbc.TestTriggersWorkloadManager (batchId=228) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth (batchId=242) org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth (batchId=242) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7386/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7386/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7386/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892986 - PreCommit-HIVE-Build > implement applying the resource plan > > > Key: HIVE-17841 > URL: https://issues.apache.org/jira/browse/HIVE-17841 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17841.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-17676) Enable JDBC + MiniLLAP tests in HIVE-17508 after HIVE-17566
[ https://issues.apache.org/jira/browse/HIVE-17676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-17676. -- Resolution: Won't Fix The tests are already enabled with HIVE-17508. Closing this as won't fix. > Enable JDBC + MiniLLAP tests in HIVE-17508 after HIVE-17566 > --- > > Key: HIVE-17676 > URL: https://issues.apache.org/jira/browse/HIVE-17676 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211742#comment-16211742 ] Matt McCline commented on HIVE-17471: - Yes. +1 LGTM tests pending. (Do we need more tests?) > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Sergey Shelukhin > Attachments: HIVE-17471.patch > > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17471: Status: Patch Available (was: Open) > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Sergey Shelukhin > Attachments: HIVE-17471.patch > > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17471: Attachment: HIVE-17471.patch [~mmccline] does this make sense? > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Sergey Shelukhin > Attachments: HIVE-17471.patch > > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17471: --- Assignee: Sergey Shelukhin (was: Teddy Choi) > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Sergey Shelukhin > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16850: Description: This would work for unit-testing, but would need to be fixed for production. {noformat} HiveTxnManager txnManager = SessionState.get().getTxnMgr(); if (txnManager.isTxnOpen()) { mmWriteId = txnManager.getCurrentTxnId(); } else { mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); txnManager.commitTxn(); } {noformat} was: {noformat} HiveTxnManager txnManager = SessionState.get().getTxnMgr(); if (txnManager.isTxnOpen()) { mmWriteId = txnManager.getCurrentTxnId(); } else { mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); txnManager.commitTxn(); } {noformat} > Converting table to insert-only acid may open a txn in an inappropriate place > - > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Labels: mm-gap-2 > Fix For: hive-14535 > > > This would work for unit-testing, but would need to be fixed for production. > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr(); > if (txnManager.isTxnOpen()) { > mmWriteId = txnManager.getCurrentTxnId(); > } else { > mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); > txnManager.commitTxn(); > } > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16850: Description: {noformat} HiveTxnManager txnManager = SessionState.get().getTxnMgr(); if (txnManager.isTxnOpen()) { mmWriteId = txnManager.getCurrentTxnId(); } else { mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); txnManager.commitTxn(); } {noformat} > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Labels: mm-gap-2 > Fix For: hive-14535 > > > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr(); > if (txnManager.isTxnOpen()) { > mmWriteId = txnManager.getCurrentTxnId(); > } else { > mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); > txnManager.commitTxn(); > } > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16850: Labels: mm-gap-2 (was: ) > Converting table to insert-only acid may open a txn in an inappropriate place > - > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Labels: mm-gap-2 > Fix For: hive-14535 > > > This would work for unit-testing, but would need to be fixed for production. > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr(); > if (txnManager.isTxnOpen()) { > mmWriteId = txnManager.getCurrentTxnId(); > } else { > mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); > txnManager.commitTxn(); > } > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16850: Summary: Converting table to insert-only acid may open a txn in an inappropriate place (was: Only open a new transaction when there's no currently opened transaction) > Converting table to insert-only acid may open a txn in an inappropriate place > - > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Labels: mm-gap-2 > Fix For: hive-14535 > > > {noformat} > HiveTxnManager txnManager = SessionState.get().getTxnMgr(); > if (txnManager.isTxnOpen()) { > mmWriteId = txnManager.getCurrentTxnId(); > } else { > mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser()); > txnManager.commitTxn(); > } > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16850: Attachment: (was: HIVE-16850.patch) > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Fix For: hive-14535 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
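The snippet quoted in the HIVE-16850 updates above reuses an already-open transaction and otherwise opens (and immediately commits) a new one just to obtain an id — the behavior the issue flags as inappropriate outside unit tests. A self-contained mock of that control flow is shown below; the `TxnManager` interface is a hypothetical stand-in, not Hive's actual `HiveTxnManager`.

```java
// Minimal mock of the reuse-or-open pattern from the HIVE-16850 snippet.
// TxnManager is a hypothetical stand-in for Hive's HiveTxnManager.
public class TxnPattern {
    interface TxnManager {
        boolean isTxnOpen();
        long getCurrentTxnId();
        long openTxn();
        void commitTxn();
    }

    // Reuse the current txn id if one is open; otherwise open a new txn
    // solely to get an id and commit it right away -- the "opening
    // transactions in random places" the comment above objects to.
    static long getMmWriteId(TxnManager txnManager) {
        if (txnManager.isTxnOpen()) {
            return txnManager.getCurrentTxnId();
        }
        long id = txnManager.openTxn();
        txnManager.commitTxn();
        return id;
    }

    public static void main(String[] args) {
        TxnManager closed = new TxnManager() {
            public boolean isTxnOpen() { return false; }
            public long getCurrentTxnId() { return -1; }
            public long openTxn() { return 42; }
            public void commitTxn() { }
        };
        System.out.println(getMmWriteId(closed)); // opens and commits: 42
    }
}
```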
[jira] [Commented] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0
[ https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211718#comment-16211718 ] Eugene Koifman commented on HIVE-17698: --- +1 > FileSinkDesk.getMergeInputDirName() uses stmtId=0 > - > > Key: HIVE-17698 > URL: https://issues.apache.org/jira/browse/HIVE-17698 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17698.patch > > > this is certainly wrong for multi statement txn but may also affect writes > from Union All queries if these are made to follow full Acid convention > _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211715#comment-16211715 ] Eugene Koifman commented on HIVE-16850: --- yes, the point is that it should not. We can't be opening transactions in random places > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Fix For: hive-14535 > > Attachments: HIVE-16850.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17828) Metastore: mysql upgrade scripts to 3.0.0 is broken
[ https://issues.apache.org/jira/browse/HIVE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211695#comment-16211695 ] Sergey Shelukhin commented on HIVE-17828: - +1 pending tests > Metastore: mysql upgrade scripts to 3.0.0 is broken > --- > > Key: HIVE-17828 > URL: https://issues.apache.org/jira/browse/HIVE-17828 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-17828.1.patch, HIVE-17828.2.patch > > > {code} > +-+ > | | > +-+ > | Finished upgrading MetaStore schema from 2.2.0 to 2.3.0 | > +-+ > 1 row in set, 1 warning (0.00 sec) > mysql> source upgrade-2.3.0-to-3.0.0.mysql.sql > ++ > || > ++ > | Upgrading MetaStore schema from 2.3.0 to 3.0.0 | > ++ > {code} > {code} > -- > CREATE TABLE WM_RESOURCEPLAN ( > `RP_ID` bigint(20) NOT NULL, > `NAME` varchar(128) NOT NULL, > `QUERY_PARALLELISM` int(11), > `STATUS` varchar(20) NOT NULL, > PRIMARY KEY (`RP_ID`), > KEY `UNIQUE_WM_RESOURCEPLAN` (`NAME`), > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > -- > ERROR 1064 (42000): You have an error in your SQL syntax; check the manual > that corresponds to your MySQL server version for the right syntax to use > near ') ENGINE=InnoDB DEFAULT CHARSET=latin1' at line 8 > -- > CREATE TABLE WM_POOL > ( > `POOL_ID` bigint(20) NOT NULL, > `RP_ID` bigint(20) NOT NULL, > `PATH` varchar(1024) NOT NULL, > `PARENT_POOL_ID` bigint(20), > `ALLOC_FRACTION` DOUBLE, > `QUERY_PARALLELISM` int(11), > PRIMARY KEY (`POOL_ID`), > KEY `UNIQUE_WM_POOL` (`RP_ID`, `PATH`), > CONSTRAINT `WM_POOL_FK1` FOREIGN KEY (`RP_ID`) REFERENCES > `WM_RESOURCEPLAN` (`RP_ID`), > CONSTRAINT `WM_POOL_FK2` FOREIGN KEY (`PARENT_POOL_ID`) REFERENCES > `WM_POOL` (`POOL_ID`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > -- > ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes > -- > CREATE TABLE WM_TRIGGER > ( > `TRIGGER_ID` bigint(20) NOT NULL, > `RP_ID` bigint(20) NOT NULL, > `NAME` varchar(128) NOT 
NULL, > `TRIGGER_EXPRESSION` varchar(1024), > `ACTION_EXPRESSION` varchar(1024), > PRIMARY KEY (`TRIGGER_ID`), > KEY `UNIQUE_WM_TRIGGER` (`RP_ID`, `NAME`), > CONSTRAINT `WM_TRIGGER_FK1` FOREIGN KEY (`RP_ID`) REFERENCES > `WM_RESOURCEPLAN` (`RP_ID`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > -- > ERROR 1215 (HY000): Cannot add foreign key constraint > -- > CREATE TABLE WM_POOL_TO_TRIGGER > ( > `POOL_ID` bigint(20) NOT NULL, > `TRIGGER_ID` bigint(20) NOT NULL, > PRIMARY KEY (`POOL_ID`, `TRIGGER_ID`), > CONSTRAINT `WM_POOL_TO_TRIGGER_FK1` FOREIGN KEY (`POOL_ID`) REFERENCES > `WM_POOL` (`POOL_ID`), > CONSTRAINT `WM_POOL_TO_TRIGGER_FK2` FOREIGN KEY (`TRIGGER_ID`) REFERENCES > `WM_TRIGGER` (`TRIGGER_ID`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > -- > ERROR 1215 (HY000): Cannot add foreign key constraint > -- > CREATE TABLE WM_MAPPING > ( > `MAPPING_ID` bigint(20) NOT NULL, > `RP_ID` bigint(20) NOT NULL, > `ENTITY_TYPE` varchar(10) NOT NULL, > `ENTITY_NAME` varchar(128) NOT NULL, > `POOL_ID` bigint(20) NOT NULL, > `ORDERING int, > PRIMARY KEY (`MAPPING_ID`), > KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`), > CONSTRAINT `WM_MAPPING_FK1` FOREIGN KEY (`RP_ID`) REFERENCES > `WM_RESOURCEPLAN` (`RP_ID`), > CONSTRAINT `WM_MAPPING_FK2` FOREIGN KEY (`POOL_ID`) REFERENCES `WM_POOL` > (`POOL_ID`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1; > -- > ERROR 1064 (42000): You have an error in your SQL syntax; check the manual > that corresponds to your MySQL server version for the right syntax to use > near 'MAPPING_ID`), > KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`' at line 8 > -- > UPDATE VERSION SET SCHEMA_VERSION='3.0.0', VERSION_COMMENT='Hive release > version 3.0.0' where VER_ID=1 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
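The first ERROR 1064 quoted above comes from the trailing comma after the last KEY clause in `WM_RESOURCEPLAN` (MySQL rejects a comma immediately before the closing parenthesis of a CREATE TABLE body). A hypothetical helper illustrating that specific fix — not part of any Hive patch:

```java
// Illustrative only: strip a trailing comma before the closing ")" of a
// CREATE TABLE body, the syntax slip behind the first ERROR 1064 above.
public class DdlFix {
    static String stripTrailingComma(String ddl) {
        // remove a comma that is followed only by whitespace and ")"
        return ddl.replaceAll(",(\\s*)\\)", "$1)");
    }

    public static void main(String[] args) {
        String bad = "CREATE TABLE T (\n  `A` int,\n  KEY `K` (`A`),\n) ENGINE=InnoDB";
        // The comma after the KEY clause is removed; interior commas remain.
        System.out.println(stripTrailingComma(bad));
    }
}
```

The ERROR 1071 ("key was too long; max key length is 767 bytes") is a separate problem: on older InnoDB, the `KEY UNIQUE_WM_POOL (RP_ID, PATH)` index over a `varchar(1024)` column exceeds the index key prefix limit, which is why the follow-up patch comment mentions the 767 limit on MySQL 5.6.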
[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0
[ https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17698: Status: Patch Available (was: Open) > FileSinkDesk.getMergeInputDirName() uses stmtId=0 > - > > Key: HIVE-17698 > URL: https://issues.apache.org/jira/browse/HIVE-17698 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17698.patch > > > this is certainly wrong for multi statement txn but may also affect writes > from Union All queries if these are made to follow full Acid convention > _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0
[ https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17698: Attachment: HIVE-17698.patch [~ekoifman] can you take a look? thanks > FileSinkDesk.getMergeInputDirName() uses stmtId=0 > - > > Key: HIVE-17698 > URL: https://issues.apache.org/jira/browse/HIVE-17698 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17698.patch > > > this is certainly wrong for multi statement txn but may also affect writes > from Union All queries if these are made to follow full Acid convention > _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0
[ https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17698: --- Assignee: Sergey Shelukhin > FileSinkDesk.getMergeInputDirName() uses stmtId=0 > - > > Key: HIVE-17698 > URL: https://issues.apache.org/jira/browse/HIVE-17698 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > > this is certainly wrong for multi statement txn but may also affect writes > from Union All queries if these are made to follow full Acid convention > _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17828) Metastore: mysql upgrade scripts to 3.0.0 is broken
[ https://issues.apache.org/jira/browse/HIVE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-17828:
-----------------------------------------
    Attachment: HIVE-17828.2.patch

I did not see the issue in mysql 5.7.10. But when I downgraded to 5.6.38 I saw the 767 limit for the key column. Fixed it in the new patch. I don't see the other issues that [~gopalv] mentioned even with the 5.6 version.

> Metastore: mysql upgrade scripts to 3.0.0 is broken
> ---------------------------------------------------
>
>                 Key: HIVE-17828
>                 URL: https://issues.apache.org/jira/browse/HIVE-17828
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-17828.1.patch, HIVE-17828.2.patch
>
> {code}
> +---------------------------------------------------------+
> |                                                         |
> +---------------------------------------------------------+
> | Finished upgrading MetaStore schema from 2.2.0 to 2.3.0 |
> +---------------------------------------------------------+
> 1 row in set, 1 warning (0.00 sec)
> mysql> source upgrade-2.3.0-to-3.0.0.mysql.sql
> +------------------------------------------------+
> |                                                |
> +------------------------------------------------+
> | Upgrading MetaStore schema from 2.3.0 to 3.0.0 |
> +------------------------------------------------+
> {code}
> {code}
> --
> CREATE TABLE WM_RESOURCEPLAN (
>     `RP_ID` bigint(20) NOT NULL,
>     `NAME` varchar(128) NOT NULL,
>     `QUERY_PARALLELISM` int(11),
>     `STATUS` varchar(20) NOT NULL,
>     PRIMARY KEY (`RP_ID`),
>     KEY `UNIQUE_WM_RESOURCEPLAN` (`NAME`),
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') ENGINE=InnoDB DEFAULT CHARSET=latin1' at line 8
> --
> CREATE TABLE WM_POOL
> (
>     `POOL_ID` bigint(20) NOT NULL,
>     `RP_ID` bigint(20) NOT NULL,
>     `PATH` varchar(1024) NOT NULL,
>     `PARENT_POOL_ID` bigint(20),
>     `ALLOC_FRACTION` DOUBLE,
>     `QUERY_PARALLELISM` int(11),
>     PRIMARY KEY (`POOL_ID`),
>     KEY `UNIQUE_WM_POOL` (`RP_ID`, `PATH`),
>     CONSTRAINT `WM_POOL_FK1` FOREIGN KEY (`RP_ID`) REFERENCES `WM_RESOURCEPLAN` (`RP_ID`),
>     CONSTRAINT `WM_POOL_FK2` FOREIGN KEY (`PARENT_POOL_ID`) REFERENCES `WM_POOL` (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
> --
> CREATE TABLE WM_TRIGGER
> (
>     `TRIGGER_ID` bigint(20) NOT NULL,
>     `RP_ID` bigint(20) NOT NULL,
>     `NAME` varchar(128) NOT NULL,
>     `TRIGGER_EXPRESSION` varchar(1024),
>     `ACTION_EXPRESSION` varchar(1024),
>     PRIMARY KEY (`TRIGGER_ID`),
>     KEY `UNIQUE_WM_TRIGGER` (`RP_ID`, `NAME`),
>     CONSTRAINT `WM_TRIGGER_FK1` FOREIGN KEY (`RP_ID`) REFERENCES `WM_RESOURCEPLAN` (`RP_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_POOL_TO_TRIGGER
> (
>     `POOL_ID` bigint(20) NOT NULL,
>     `TRIGGER_ID` bigint(20) NOT NULL,
>     PRIMARY KEY (`POOL_ID`, `TRIGGER_ID`),
>     CONSTRAINT `WM_POOL_TO_TRIGGER_FK1` FOREIGN KEY (`POOL_ID`) REFERENCES `WM_POOL` (`POOL_ID`),
>     CONSTRAINT `WM_POOL_TO_TRIGGER_FK2` FOREIGN KEY (`TRIGGER_ID`) REFERENCES `WM_TRIGGER` (`TRIGGER_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_MAPPING
> (
>     `MAPPING_ID` bigint(20) NOT NULL,
>     `RP_ID` bigint(20) NOT NULL,
>     `ENTITY_TYPE` varchar(10) NOT NULL,
>     `ENTITY_NAME` varchar(128) NOT NULL,
>     `POOL_ID` bigint(20) NOT NULL,
>     `ORDERING int,
>     PRIMARY KEY (`MAPPING_ID`),
>     KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`),
>     CONSTRAINT `WM_MAPPING_FK1` FOREIGN KEY (`RP_ID`) REFERENCES `WM_RESOURCEPLAN` (`RP_ID`),
>     CONSTRAINT `WM_MAPPING_FK2` FOREIGN KEY (`POOL_ID`) REFERENCES `WM_POOL` (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'MAPPING_ID`),
> KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`' at line 8
> --
> UPDATE VERSION SET SCHEMA_VERSION='3.0.0', VERSION_COMMENT='Hive release version 3.0.0' where VER_ID=1
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
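The ERROR 1071 in the log above is InnoDB's 767-byte index-key limit on MySQL 5.6 (without innodb_large_prefix). A back-of-the-envelope sketch of why `UNIQUE_WM_POOL` (`RP_ID`, `PATH`) trips it; the 2-byte varchar length prefix and per-charset byte widths are assumptions about InnoDB's key accounting, but the order of magnitude is what matters:

```java
// Rough arithmetic for the WM_POOL key: RP_ID (bigint, 8 bytes) plus
// PATH varchar(1024) under latin1 (1 byte/char, plus an assumed 2-byte
// length prefix) comfortably exceeds InnoDB's 767-byte key limit.
public class KeyLengthCheck {
    // Per-index-key limit on MySQL 5.6 without innodb_large_prefix.
    static final int INNODB_KEY_LIMIT = 767;

    static int varcharKeyBytes(int chars, int bytesPerChar) {
        return chars * bytesPerChar + 2; // + length prefix (assumption)
    }

    public static void main(String[] args) {
        int bigint = 8;                          // RP_ID
        int path = varcharKeyBytes(1024, 1);     // PATH, latin1
        int total = bigint + path;
        System.out.println(total);                      // 1034
        System.out.println(total > INNODB_KEY_LIMIT);   // true -> ERROR 1071
    }
}
```

This is why the fix needs either a shorter `PATH` column, a prefix index, or a newer MySQL/row format where the large-prefix limit does not apply.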
[jira] [Commented] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211674#comment-16211674 ] Sergey Shelukhin commented on HIVE-16850: - Hmm.. the attached patch is already applied to DDLTask. > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Fix For: hive-14535 > > Attachments: HIVE-16850.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17748) REplCopyTaks.execut(DriverContext)
[ https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17748: Attachment: HIVE-17748.patch [~ekoifman] can you take a look? thnx > REplCopyTaks.execut(DriverContext) > -- > > Key: HIVE-17748 > URL: https://issues.apache.org/jira/browse/HIVE-17748 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17748.patch > > > has > {noformat} > Path fromPath = work.getFromPaths()[0]; > toPath = work.getToPaths()[0]; > {noformat} > should this throw if from/to paths have > 1 element? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17748) REplCopyTaks.execut(DriverContext)
[ https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17748: Status: Patch Available (was: Open) > REplCopyTaks.execut(DriverContext) > -- > > Key: HIVE-17748 > URL: https://issues.apache.org/jira/browse/HIVE-17748 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17748.patch > > > has > {noformat} > Path fromPath = work.getFromPaths()[0]; > toPath = work.getToPaths()[0]; > {noformat} > should this throw if from/to paths have > 1 element? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17748) ReplCopyTask doesn't support multi-file CopyWork
[ https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17748: Summary: ReplCopyTask doesn't support multi-file CopyWork (was: REplCopyTaks.execut(DriverContext)) > ReplCopyTask doesn't support multi-file CopyWork > > > Key: HIVE-17748 > URL: https://issues.apache.org/jira/browse/HIVE-17748 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > Attachments: HIVE-17748.patch > > > has > {noformat} > Path fromPath = work.getFromPaths()[0]; > toPath = work.getToPaths()[0]; > {noformat} > should this throw if from/to paths have > 1 element? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
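The quoted snippet copies only element [0] of each array. A minimal sketch of the behavior the ticket asks about: either iterate over all from/to pairs or fail fast when the arrays disagree, instead of silently dropping paths. `copyOne` is a hypothetical stand-in for the real per-file copy logic:

```java
// Sketch of multi-file CopyWork handling: validate the arrays agree,
// then copy every pair rather than just index 0. copyOne(...) is a
// hypothetical placeholder, not ReplCopyTask's actual copy routine.
public class ReplCopySketch {
    static void copyAll(String[] fromPaths, String[] toPaths) {
        if (fromPaths.length != toPaths.length) {
            throw new IllegalArgumentException(
                "from/to path counts differ: " + fromPaths.length + " vs " + toPaths.length);
        }
        for (int i = 0; i < fromPaths.length; i++) {
            copyOne(fromPaths[i], toPaths[i]);
        }
    }

    static void copyOne(String from, String to) {
        System.out.println("copy " + from + " -> " + to);
    }

    public static void main(String[] args) {
        copyAll(new String[] {"/staging/a", "/staging/b"},
                new String[] {"/warehouse/a", "/warehouse/b"});
    }
}
```

Whether a mismatch should throw (as here) or be impossible by construction in CopyWork is exactly the question the description raises.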
[jira] [Reopened] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reopened HIVE-16850: --- it's not fixed - this code is still there > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Fix For: hive-14535 > > Attachments: HIVE-16850.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17748) REplCopyTaks.execut(DriverContext)
[ https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17748: --- Assignee: Sergey Shelukhin > REplCopyTaks.execut(DriverContext) > -- > > Key: HIVE-17748 > URL: https://issues.apache.org/jira/browse/HIVE-17748 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Sergey Shelukhin > > has > {noformat} > Path fromPath = work.getFromPaths()[0]; > toPath = work.getToPaths()[0]; > {noformat} > should this throw if from/to paths have > 1 element? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17848) Bucket Map Join : Implement an efficient way to minimize loading hash table
[ https://issues.apache.org/jira/browse/HIVE-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17848 started by Deepak Jaiswal. - > Bucket Map Join : Implement an efficient way to minimize loading hash table > --- > > Key: HIVE-17848 > URL: https://issues.apache.org/jira/browse/HIVE-17848 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > In bucket mapjoin, each task loads its own copy of hash table which is > inefficient as load is IO heavy and due to multiple copies of same hash > table, the tables may get GCed on a busy system. > Implement a subcache with softreference to each hash table corresponding to > its bucketID such that it can be reused by a task. > This needs changes from Tez side to push bucket id to TezProcessor. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
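The subcache described above can be sketched with a map of SoftReferences keyed by bucket id: a task reuses an already-loaded hash table when one is still reachable, and reloads it only when it was never cached or has been reclaimed under memory pressure. This is an illustrative shape, not Hive's actual implementation; the hash-table type is left generic.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

// Sketch: per-bucket hash-table subcache with SoftReferences, so the
// JVM may reclaim entries on a busy system instead of OOMing, while
// repeat lookups of a live entry skip the IO-heavy load.
public class BucketHashTableCache<T> {
    private final Map<Integer, SoftReference<T>> cache = new ConcurrentHashMap<>();

    public T get(int bucketId, IntFunction<T> loader) {
        SoftReference<T> ref = cache.get(bucketId);
        T table = (ref == null) ? null : ref.get();
        if (table == null) {                  // never loaded, or already GCed
            table = loader.apply(bucketId);   // the expensive load
            cache.put(bucketId, new SoftReference<>(table));
        }
        return table;
    }

    public static void main(String[] args) {
        BucketHashTableCache<String> cache = new BucketHashTableCache<>();
        String first = cache.get(3, b -> "hash-table-for-bucket-" + b);
        // While `first` is strongly reachable, the soft reference cannot be
        // cleared, so the second lookup reuses the cached instance.
        System.out.println(first == cache.get(3, b -> "reloaded-" + b)); // true
    }
}
```

Note the check-then-act in `get` means two concurrent tasks can race and both load the same bucket; for a cache that is usually acceptable (one redundant load, last writer wins), but a `computeIfAbsent`-style guard could close it.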
[jira] [Assigned] (HIVE-17851) Bucket Map Join : Pick correct number of buckets
[ https://issues.apache.org/jira/browse/HIVE-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-17851: - > Bucket Map Join : Pick correct number of buckets > > > Key: HIVE-17851 > URL: https://issues.apache.org/jira/browse/HIVE-17851 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > > CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) > CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE; > CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) CLUSTERED > BY (key) INTO 4 BUCKETS STORED AS TEXTFILE; > select a.key, a.value, b.value > from tab a join tab_part b on a.key = b.key; > In above case, if tab_part is bigger then it should be the streaming side and > the smaller side should create two hash tables. However, currently, it > blindly picks 4 as number of buckets as it is the maximum number of buckets > among all the tables involved in the join and create 4 hash tables. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
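On one reading of the example above, the number of hash tables should follow the streaming (bigger) side's bucket count, here tab_part's 2, rather than the maximum bucket count across all joined tables (4). A hedged sketch of that choice; the real planner logic involves much more than table size:

```java
// Sketch: pick the hash-table count from the streaming (bigger) table's
// bucket count instead of blindly taking the maximum across the join.
// Sizes and bucket counts here are illustrative stand-ins for planner stats.
public class BucketCountSketch {
    static int pickNumHashTables(long leftSize, int leftBuckets,
                                 long rightSize, int rightBuckets) {
        // The bigger table streams; build one hash table per streaming bucket.
        return leftSize >= rightSize ? leftBuckets : rightBuckets;
    }

    public static void main(String[] args) {
        // tab_part: 2 buckets, bigger; tab: 4 buckets, smaller.
        System.out.println(pickNumHashTables(10_000_000L, 2,
                                             1_000_000L, 4)); // 2, not max(2, 4)
    }
}
```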
[jira] [Resolved] (HIVE-16850) Only open a new transaction when there's no currently opened transaction
[ https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-16850. - Resolution: Done Assignee: Eugene Koifman Looks like this is already fixed. > Only open a new transaction when there's no currently opened transaction > > > Key: HIVE-16850 > URL: https://issues.apache.org/jira/browse/HIVE-16850 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Wei Zheng >Assignee: Eugene Koifman > Fix For: hive-14535 > > Attachments: HIVE-16850.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths
[ https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211650#comment-16211650 ]

Hive QA commented on HIVE-17696:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892984/HIVE-17696.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11309 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7385/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7385/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7385/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892984 - PreCommit-HIVE-Build

> Vectorized reader does not seem to be pushing down projection columns in certain code paths
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17696
>                 URL: https://issues.apache.org/jira/browse/HIVE-17696
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-17696.patch
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
>     MessageType tableSchema;
>     if (indexAccess) {
>       List<Integer> indexSequence = new ArrayList<>();
>       // Generates a sequence list of indexes
>       for (int i = 0; i < columnNamesList.size(); i++) {
>         indexSequence.add(i);
>       }
>       tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, columnNamesList,
>           indexSequence);
>     } else {
>       tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, columnNamesList,
>           columnTypesList);
>     }
>     indexColumnsWanted = ColumnProjectionUtils.getReadColumnIDs(configuration);
>     if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
>       requestedSchema = DataWritableReadSupport.getSchemaByIndex(tableSchema, columnNamesList,
>           indexColumnsWanted);
>     } else {
>       requestedSchema = fileSchema;
>     }
>     this.reader = new ParquetFileReader(
>         configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} method.
> The else condition passes in fileSchema instead of using tableSchema like we do in the {{DataWritableReadSupport.init()}} method. Does this cause projection columns to be missed when we read parquet files? We should probably just reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} method here.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
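The core of the concern is index-based projection: the fallback branch hands the reader the full fileSchema instead of the pruned tableSchema. A simplified, schema-as-list sketch of what getSchemaByIndex does in spirit (the real method operates on Parquet MessageType, not strings):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of index-based column projection: keep only the requested
// column positions out of the (already pruned) table schema. Using the
// full file schema here instead would hand the reader every column.
public class ProjectionSketch {
    static List<String> schemaByIndex(List<String> columns, List<Integer> wanted) {
        List<String> projected = new ArrayList<>();
        for (int i : wanted) {
            projected.add(columns.get(i));
        }
        return projected;
    }

    public static void main(String[] args) {
        List<String> tableSchema = Arrays.asList("id", "name", "ts");
        // Project columns 0 and 2, as indexColumnsWanted would request.
        System.out.println(schemaByIndex(tableSchema, Arrays.asList(0, 2))); // [id, ts]
    }
}
```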