from:"Eugene Koifman (JIRA)"

[jira] [Commented] (HIVE-20431) txn stats write ID check triggers on set location

2018-08-21 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588145#comment-16588145
 ] 

Eugene Koifman commented on HIVE-20431:
---

can you explain what this is meant to accomplish?  
There seem to be more changes than the title suggests.

Also, new enums with comments like 
"// Note: used in RenamePartitionDesc, not here." seem like a hack...

> txn stats write ID check triggers on set location
> -
>
> Key: HIVE-20431
> URL: https://issues.apache.org/jira/browse/HIVE-20431
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20431.01.patch, HIVE-20431.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20287) Document the differences between managed and external tables

2018-08-21 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587941#comment-16587941
 ] 

Eugene Koifman commented on HIVE-20287:
---

Only Managed tables can be transactional, which implies no update/delete/merge 
on External.

> Document the differences between managed and external tables
> 
>
> Key: HIVE-20287
> URL: https://issues.apache.org/jira/browse/HIVE-20287
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Major
>
> We should document all the differences between managed and external tables. I 
> need everyone's help for that though.
> This is what I know:
> * ARCHIVE/UNARCHIVE - DDLTask - Only works for managed tables
> * TRUNCATE - DDLSemanticAnalyzer - Only works for managed tables
> * MERGE/CONCATENATE - HiveRelOpMaterializationValidator - Only works for 
> managed tables
> * Constraints - DDLSemanticAnalyzer -  (NOT NULL, DEFAULT, CHECK, only RELY 
> is allowed)
> * IMPORT - ImportSemanticAnalyzer - This has some wild restrictions I didn't 
> follow for external tables
> * Query Results Caching - https://issues.apache.org/jira/browse/HIVE-18513 
> SemanticAnalyzer
>  
> Hortonworks has extra documentation listing these things:
> * Query cache
> * Materialized views, except in a limited way
> * Default statistics gathering
> * Compute queries using statistics
> * Automatic runtime filtering
> * File merging after insert
>  
> It'd be great if someone (from Hortonworks or otherwise) could elaborate on 
> those.




[jira] [Commented] (HIVE-18453) ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet support

2018-08-21 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587805#comment-16587805
 ] 

Eugene Koifman commented on HIVE-18453:
---

[~ikryvenko], the build bot didn't pick up patch 6 and it doesn't compile for 
me.
The output of describe table commands in create_transactional.q.out doesn't 
show transactional=true table property anywhere - I don't think you are 
actually creating transactional tables.  Instead of testing various select 
statements (which I think would already be covered elsewhere), I would add some 
update/delete commands - the system will throw exceptions if the table is not 
transactional.

{{MetastoreConf.CREATE_TABLES_AS_ACID}} is a global config prop. If it's true, 
then any Create Table statement is examined and if the table can be made 
transactional=true, it will be made so, w/o the user explicitly specifying 
transactional=true.
{{HiveConf.HIVE_CREATE_TABLES_AS_INSERT_ONLY}} works similarly.  If it's true, 
the system will make a Create Table stmt create a table with {{tblproperties 
("transactional"="true", "transactional_properties"="insert_only")}}.  I just 
meant it may be useful to look at where they are referenced in 
{{SemanticAnalyzer}} to help with your implementation.

Basically, {{create transactional table}} should just be syntactic sugar 
for {{create table p tblproperties ("transactional"="true")}}
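
The "syntactic sugar" idea above can be sketched in plain Java with no Hive dependencies. The class {{CreateTableSugar}} and the method {{effectiveTableProperties}} are hypothetical names used only for illustration and are not Hive APIs; only the property key {{transactional}} comes from the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: models how a TRANSACTIONAL keyword
// could be folded into ordinary table properties.
final class CreateTableSugar {
    private CreateTableSugar() {}

    /**
     * Returns the table properties that would apply after desugaring.
     *
     * @param transactionalKeyword true if the statement used CREATE TRANSACTIONAL TABLE
     * @param userProps            properties given in the TBLPROPERTIES clause
     */
    static Map<String, String> effectiveTableProperties(
            boolean transactionalKeyword, Map<String, String> userProps) {
        // Copy so the caller's map is left untouched.
        Map<String, String> props = new LinkedHashMap<>(userProps);
        if (transactionalKeyword) {
            // Equivalent to the user writing tblproperties ("transactional"="true").
            props.put("transactional", "true");
        }
        return props;
    }
}
```

With this shape, a test that runs update/delete against the created table would fail whenever the keyword is parsed but the property is never set.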

> ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet 
> support
> -
>
> Key: HIVE-18453
> URL: https://issues.apache.org/jira/browse/HIVE-18453
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-18453.01.patch, HIVE-18453.02.patch, 
> HIVE-18453.03.patch, HIVE-18453.04.patch, HIVE-18453.05.patch, 
> HIVE-18453.06.patch
>
>
> The ACID table markers are currently done with TBLPROPERTIES which is 
> inherently fragile.
> The "create transactional table" offers a way to standardize the syntax and 
> allows for future compatibility changes to support Parquet ACIDv2 tables 
> along with ORC tables.
> The ACIDv2 design is format independent, with the ability to add new 
> vectorized input formats with no changes to the design.




[jira] [Commented] (HIVE-20428) HiveStreamingConnection should use addPartition if not exists API

2018-08-20 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586589#comment-16586589
 ] 

Eugene Koifman commented on HIVE-20428:
---

it won't currently, but perhaps it should/will
How much overhead is this?

> HiveStreamingConnection should use addPartition if not exists API
> -
>
> Key: HIVE-20428
> URL: https://issues.apache.org/jira/browse/HIVE-20428
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Priority: Major
>
> [https://github.com/apache/hive/blob/f280361374c6219d8734d5972c740d6d6c3fb7ef/streaming/src/java/org/apache/hive/streaming/HiveStreamingConnection.java#L379-L381]
>  
> catches AlreadyExistsException when adding partition. Instead use 
> add_partitions API with ifNotExists set to true. 
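
The difference between the two styles in the description can be sketched with an in-memory stand-in. {{PartitionCatalog}} and both of its methods are hypothetical and are not the metastore client API; the sketch only contrasts "catch the already-exists error" with "ask for an idempotent add".

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a partition catalog; not the metastore client API.
final class PartitionCatalog {
    private final Set<String> partitions = ConcurrentHashMap.newKeySet();

    /** Style 1: strict add that throws when the partition is already present. */
    void addPartition(String name) {
        if (!partitions.add(name)) {
            throw new IllegalStateException("partition already exists: " + name);
        }
    }

    /**
     * Style 2: idempotent add. Returns true if this call created the partition,
     * false if it was already there. No exception is raised for the common case.
     */
    boolean addPartitionIfNotExists(String name) {
        return partitions.add(name);
    }
}
```

A caller using the second style needs no try/catch around the add, which is the change the issue asks for.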




[jira] [Updated] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-17 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20410:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
 Release Note: n/a
   Status: Resolved  (was: Patch Available)

committed to master (4.0)
thanks Sergey for the review

> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20410.01.patch, HIVE-20410.02.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.
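
One way to picture the problem is a reader that picks the newest base_x without asking whether x belongs to an aborted write. The sketch below is a simplified model under that assumption; {{BaseSelector}} and {{newestUsableBase}} are hypothetical names, and this is not a description of what the attached patch actually does.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.Set;

// Hypothetical model of base-directory selection; not Hive's actual reader logic.
final class BaseSelector {
    private BaseSelector() {}

    /**
     * Picks the newest usable base_x directory.
     *
     * @param baseWriteIds    the x values of the base_x directories found on disk
     * @param abortedWriteIds write IDs known to belong to aborted transactions
     * @return the largest x that is not aborted, or empty if none qualifies
     */
    static OptionalLong newestUsableBase(List<Long> baseWriteIds, Set<Long> abortedWriteIds) {
        return baseWriteIds.stream()
                .mapToLong(Long::longValue)
                // A base written by an aborted INSERT OVERWRITE is skipped rather
                // than treated as if a compactor had produced it.
                .filter(x -> !abortedWriteIds.contains(x))
                .max();
    }
}
```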




[jira] [Commented] (HIVE-18294) add switch to make acid table the default

2018-08-17 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584376#comment-16584376
 ] 

Eugene Koifman commented on HIVE-18294:
---

OK, since I didn't even know about this table parameter, it's safe to assume 
all of my code is checking TableType and ignoring the table parameter.  I would 
vote for TableType if you are trying to standardize.  

> add switch to make acid table the default
> -
>
> Key: HIVE-18294
> URL: https://issues.apache.org/jira/browse/HIVE-18294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18294.01.patch, HIVE-18294.03.patch, 
> HIVE-18294.04.patch, HIVE-18294.05.patch
>
>
> it would be convenient for testing to have a switch that enables the behavior 
> where all suitable tables (currently ORC + not sorted) are 
> automatically created with transactional=true, i.e. full acid.




[jira] [Commented] (HIVE-18294) add switch to make acid table the default

2018-08-17 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584248#comment-16584248
 ] 

Eugene Koifman commented on HIVE-18294:
---

{noformat}
public enum TableType {
  MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW
}
{noformat}

I expected the table to be one of the above.  I'm not sure what EXTERNAL 
parameter indicates and how that differs from TableType.EXTERNAL_TABLE
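
To make the ambiguity concrete, here is a small model with both signals side by side. The enum mirrors the one quoted above; the class {{ExternalCheck}}, its methods, and the assumption that the EXTERNAL parameter holds "TRUE"/"FALSE" are illustrative only.

```java
import java.util.Map;

// Simplified model for illustration; the real metastore Table class is not used here.
final class ExternalCheck {
    // Mirrors the enum quoted in the comment above.
    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW }

    private ExternalCheck() {}

    /** Decision based only on the table type. */
    static boolean isExternalByType(TableType type) {
        return type == TableType.EXTERNAL_TABLE;
    }

    /** Decision based only on an "EXTERNAL" table parameter (assumed to hold "TRUE"/"FALSE"). */
    static boolean isExternalByParam(Map<String, String> params) {
        return "TRUE".equalsIgnoreCase(params.get("EXTERNAL"));
    }

    /** True when the two signals disagree, which is the ambiguity raised above. */
    static boolean signalsDisagree(TableType type, Map<String, String> params) {
        return isExternalByType(type) != isExternalByParam(params);
    }
}
```

Code that consults only one of the two checks behaves differently from code that consults the other whenever {{signalsDisagree}} is true.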

> add switch to make acid table the default
> -
>
> Key: HIVE-18294
> URL: https://issues.apache.org/jira/browse/HIVE-18294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18294.01.patch, HIVE-18294.03.patch, 
> HIVE-18294.04.patch, HIVE-18294.05.patch
>
>
> it would be convenient for testing to have a switch that enables the behavior 
> where all suitable tables (currently ORC + not sorted) are 
> automatically created with transactional=true, i.e. full acid.




[jira] [Updated] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-17 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20410:
--
Attachment: HIVE-20410.02.patch

> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20410.01.patch, HIVE-20410.02.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Commented] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-16 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583246#comment-16583246
 ] 

Eugene Koifman commented on HIVE-20410:
---

the modified JUnit test in TestTxnCommands does that (if I understand what you 
are asking)



> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20410.01.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Updated] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-16 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20410:
--
Status: Patch Available  (was: Open)

> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20410.01.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Commented] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-16 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583235#comment-16583235
 ] 

Eugene Koifman commented on HIVE-20410:
---

should've been caught in HIVE-17457 if it had been done right

> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20410.01.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Updated] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-16 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20410:
--
Attachment: HIVE-20410.01.patch

> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20410.01.patch
>
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Assigned] (HIVE-20410) aborted Insert Overwrite on transactional table causes "Not enough history available for..." error

2018-08-16 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20410:
-


> aborted Insert Overwrite on transactional table causes "Not enough history 
> available for..." error
> --
>
> Key: HIVE-20410
> URL: https://issues.apache.org/jira/browse/HIVE-20410
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> suppose 
> insert overwrite T values(1)
> is aborted.
> this creates a base_x directory (for insert-only transactional tables 
> currently and for full CRUD once 'rename' in the MoveTask is eliminated) but 
> subsequent read fails with "Not enough history available for..." error.
> The problem is that the logic to produce this exception finds this base_x but 
> treats it as if it was produced by a compactor, in which case the error 
> would've been appropriate.




[jira] [Updated] (HIVE-20409) Hive ACID: Update/delete/merge leave behind the staging directory

2018-08-16 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20409:
--
Component/s: Transactions

> Hive ACID: Update/delete/merge leave behind the staging directory
> -
>
> Key: HIVE-20409
> URL: https://issues.apache.org/jira/browse/HIVE-20409
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
> Environment: Hive-2.1,java-1.8
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20409.patch
>
>
> UpdateDeleteSemanticAnalyzer creates a query context while rewriting the 
> context which doesn't set hdfscleanup. As a result, Driver doesn't clear the 
> staging dir.




[jira] [Commented] (HIVE-20397) HiveStrictManagedMigration updates

2018-08-16 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582984#comment-16582984
 ] 

Eugene Koifman commented on HIVE-20397:
---

+1

> HiveStrictManagedMigration updates
> --
>
> Key: HIVE-20397
> URL: https://issues.apache.org/jira/browse/HIVE-20397
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20397.1.patch
>
>
> - Switch from using Driver instance to using metastore calls via 
> Hive.alterDatabase/Hive.alterTable
> - For tables converted from ORC to ACID tables, handle renaming of the files 
> - Fix error handling so utility does not terminate after the first error 
> encountered
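
The third bullet describes a common pattern: process every item, record failures, and report them at the end. A minimal sketch follows; {{ContinueOnError}} and {{processAll}} are hypothetical names and this is not the HiveStrictManagedMigration code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batch runner; not the HiveStrictManagedMigration code.
final class ContinueOnError {
    private ContinueOnError() {}

    /**
     * Applies the action to every item and records failures instead of stopping
     * at the first one. Returns the items that failed, in input order.
     */
    static <T> List<T> processAll(List<T> items, Consumer<T> action) {
        List<T> failed = new ArrayList<>();
        for (T item : items) {
            try {
                action.accept(item);
            } catch (RuntimeException e) {
                failed.add(item); // remember the failure and keep going
            }
        }
        return failed;
    }
}
```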




[jira] [Updated] (HIVE-18021) Insert overwrite on acid table with Union All optimizations

2018-08-15 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18021:
--
Attachment: HIVE-18021.not_atomic.patch

> Insert overwrite on acid table with Union All optimizations
> ---
>
> Key: HIVE-18021
> URL: https://issues.apache.org/jira/browse/HIVE-18021
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18021.not_atomic.patch
>
>
> This is a followup from HIVE-14988.
> T is unbucketed acid table
> {noformat}
> insert into T select a,b from S union all select a,b from S1
> {noformat}
> will create a separate subdirectory for each leg of the union in the target 
> table
> (automatically on Tez, with some props enabled on MR)
> Regular Insert will make each subdirectory be a delta_x_x_0, delta_x_x_1.  
> See HIVE-15899.
> There is no such suffix mechanism for base_x/.  
> Need to figure how this should work.
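
The naming gap described above can be shown with two small helpers. {{UnionDirNames}} is a hypothetical class, and the names follow the delta_x_x_N and base_x shapes from the description; any zero-padding used in real directory names is left out here.

```java
// Hypothetical naming helper; any zero-padding used by real directory names is omitted.
final class UnionDirNames {
    private UnionDirNames() {}

    /** Name for one leg of a union under a regular INSERT: delta_x_x_N. */
    static String deltaForLeg(long writeId, int leg) {
        return "delta_" + writeId + "_" + writeId + "_" + leg;
    }

    /** Name for INSERT OVERWRITE: base_x, which has no per-leg suffix. */
    static String base(long writeId) {
        return "base_" + writeId;
    }
}
```

Two legs of the same write get distinct delta names, but both would map to the same base_x name, which is the open question in the description.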




[jira] [Updated] (HIVE-18774) ACID: Use the _copy_N files copyNumber as the implicit statement-id

2018-08-15 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18774:
--
Attachment: HIVE-18774.03.wip.patch

> ACID: Use the _copy_N files copyNumber as the implicit statement-id
> ---
>
> Key: HIVE-18774
> URL: https://issues.apache.org/jira/browse/HIVE-18774
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
> Environment: if this is not done in 3.0 it cannot be done at all
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-18774.03.wip.patch
>
>
> When upgrading flat ORC files to ACID, use the _copy_N numbering as a 
> statement-id to avoid having to align the row numbering between _copy_1 and 
> _copy_2 files.
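
Extracting the copy number from a file name could look like the sketch below. {{CopyNumber}} is a hypothetical class, the example names such as 000000_0_copy_2 are assumed for illustration, and mapping a name without the suffix to 0 is also an assumption rather than something the issue states.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser; only the "_copy_N" suffix shape is taken from the issue text.
final class CopyNumber {
    private static final Pattern COPY_SUFFIX = Pattern.compile("_copy_(\\d+)$");

    private CopyNumber() {}

    /**
     * Returns N for names ending in _copy_N, and 0 for names without the suffix
     * (an assumption), so the result can serve as an implicit statement-id.
     */
    static int implicitStatementId(String fileName) {
        Matcher m = COPY_SUFFIX.matcher(fileName);
        return m.find() ? Integer.parseInt(m.group(1)) : 0;
    }
}
```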




[jira] [Comment Edited] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-15 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581555#comment-16581555
 ] 

Eugene Koifman edited comment on HIVE-19985 at 8/15/18 8:13 PM:


patch 5 includes a fix for a stupid bug in 
{{VectorizedOrcAcidRowBatchReader.copyFromBase()}} in the {{payloadCol}} 
calculation, which broke the non-LLAP path, plus some additional tests


was (Author: ekoifman):
patch 5 includes a stupid bug in 
{{VectorizedOrcAcidRowBatchReader.copyFromBase()}} wrt {{payloadCol}} 
calculation that broke non LLAP path and some additional tests

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch, 
> HIVE-19985.05.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.
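
The condition in the description reduces to a three-part predicate. The sketch below assumes the three inputs are already known for each ACID split; {{RowIdProjection}} and {{canSkipRowId}} are hypothetical names, not code from the patch.

```java
// Hypothetical predicate; the inputs are assumed to be known per ACID split.
final class RowIdProjection {
    private RowIdProjection() {}

    /**
     * Decides whether ROW__ID decoding can be skipped for a split.
     *
     * @param readOnlyQuery    true for SELECT, false for MERGE, UPDATE and DELETE
     * @param hasDeleteDeltas  true if delete deltas may apply to the split
     * @param hasAbortedWrites true if the split may contain rows from aborted transactions
     */
    static boolean canSkipRowId(boolean readOnlyQuery, boolean hasDeleteDeltas,
                                boolean hasAbortedWrites) {
        // All three conditions must hold; any write-style query still needs ROW__ID.
        return readOnlyQuery && !hasDeleteDeltas && !hasAbortedWrites;
    }
}
```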




[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-15 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581555#comment-16581555
 ] 

Eugene Koifman commented on HIVE-19985:
---

patch 5 includes a stupid bug in 
{{VectorizedOrcAcidRowBatchReader.copyFromBase()}} wrt {{payloadCol}} 
calculation that broke non LLAP path and some additional tests

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch, 
> HIVE-19985.05.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Updated] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-15 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19985:
--
Attachment: HIVE-19985.05.patch

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch, 
> HIVE-19985.05.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Updated] (HIVE-20392) make compaction atomic on S3

2018-08-14 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20392:
--
Attachment: HIVE-20392.01.patch

> make compaction atomic on S3
> 
>
> Key: HIVE-20392
> URL: https://issues.apache.org/jira/browse/HIVE-20392
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20392.01.patch
>
>





[jira] [Assigned] (HIVE-20392) make compaction atomic on S3

2018-08-14 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20392:
-


> make compaction atomic on S3
> 
>
> Key: HIVE-20392
> URL: https://issues.apache.org/jira/browse/HIVE-20392
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>





[jira] [Commented] (HIVE-20378) don't update stats during alter for txn table conversion

2018-08-14 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580523#comment-16580523
 ] 

Eugene Koifman commented on HIVE-20378:
---

If I start with a table with stats and convert it to transactional, this will set 
stats_accurate to false.  Am I reading this right?  Isn't it better to set 
DO_NOT_UPDATE_STATS?

> don't update stats during alter for txn table conversion
> 
>
> Key: HIVE-20378
> URL: https://issues.apache.org/jira/browse/HIVE-20378
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20378.01.patch, HIVE-20378.02.patch, 
> HIVE-20378.patch
>
>





[jira] [Commented] (HIVE-20343) Hive 3: CTAS does not respect transactional_properties

2018-08-14 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580367#comment-16580367
 ] 

Eugene Koifman commented on HIVE-20343:
---

+1

> Hive 3: CTAS does not respect transactional_properties
> --
>
> Key: HIVE-20343
> URL: https://issues.apache.org/jira/browse/HIVE-20343
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0
> Environment: hive-3
>Reporter: Rajkumar Singh
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20343.patch
>
>
> Steps to reproduce:
> {code}
> create table ctasexampleinsertonly stored as orc  TBLPROPERTIES 
> ("transactional_properties"="insert_only") as select * from testtable limit 1;
> {code}
> look for transactional_properties which is 'default' not the expected 
> "insert_only"
> {code}
>  describe formatted ctasexampleinsertonly
>  
> +------------------------------+------------------------------------------------------------------------+------------+
> | col_name                     | data_type                                                              | comment    |
> +------------------------------+------------------------------------------------------------------------+------------+
> | # col_name                   | data_type                                                              | comment    |
> | name                         | varchar(8)                                                             |            |
> | time                         | double                                                                 |            |
> |                              | NULL                                                                   | NULL       |
> | # Detailed Table Information | NULL                                                                   | NULL       |
> | Database:                    | default                                                                | NULL       |
> | OwnerType:                   | USER                                                                   | NULL       |
> | Owner:                       | hive                                                                   | NULL       |
> | CreateTime:                  | Wed Aug 08 21:35:15 UTC 2018                                           | NULL       |
> | LastAccessTime:              | UNKNOWN                                                                | NULL       |
> | Retention:                   | 0                                                                      | NULL       |
> | Location:                    | hdfs://xx:8020/warehouse/tablespace/managed/hive/ctasexampleinsertonly | NULL       |
> | Table Type:                  | MANAGED_TABLE                                                          | NULL       |
> | Table Parameters:            | NULL                                                                   | NULL       |
> |                              | COLUMN_STATS_ACCURATE                                                  | {}         |
> |                              | bucketing_version                                                      | 2          |
> |                              | numFiles                                                               | 1          |
> |                              | numRows                                                                | 1          |
> |                              | rawDataSize                                                            | 0          |
> |                              | totalSize                                                              | 754        |
> |                              | transactional                                                          | true       |
> |                              | transactional_properties                                               | default    |
> |                              | transient_lastDdlTime                                                  | 1533764115 |
> |                              | NULL                                                                   | NULL       |
> | # Storage Information        | NULL                                                                   | NULL       |
> | SerDe Library:               | org.apache.hadoop.hive.ql.io.orc.OrcSerde                              | NULL       |
> | InputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                        | NULL       |
> | OutputFormat:                | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                       | NULL       |
> | Compressed:                  | No                                                                     | NULL       |
> | Num Buckets:                 | -1                                                                     | NULL       |
> | Bucket Columns:              | []

[jira] [Updated] (HIVE-20372) WRTIE_SET typo in TxnHandler

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20372:
--
Component/s: Transactions

> WRTIE_SET typo in TxnHandler
> 
>
> Key: HIVE-20372
> URL: https://issues.apache.org/jira/browse/HIVE-20372
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore, Transactions
>Affects Versions: 3.1.0
>Reporter: Laszlo Bodor
>Priority: Trivial
>  Labels: Newbie, newbie, newbie++, newbiee
> Fix For: 4.0.0
>
>
> [https://github.com/prongs/apache-hive/blob/deabe59371e98a21f4c3a58a9d8da51e4632fca5/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L765]
> minor typo




[jira] [Updated] (HIVE-18444) when creating transactional table make sure location has no data

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18444:
--
Target Version/s: 4.0.0  (was: 3.0.0)

> when creating transactional table make sure location has no data
> 
>
> Key: HIVE-18444
> URL: https://issues.apache.org/jira/browse/HIVE-18444
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> If a user creates a new transactional table but sets its location to a place 
> that already has data, any number of things can break.  
> The data may not be in Acid format, or it may have been written by another 
> cluster, in which case its txnids won't make sense in the current cluster.  
> Once per-table writeIDs (HIVE-18192) are there, if the data was written by 
> another table, the writeIDs won't match.
> This could actually work if the data at the existing location was not written 
> by an acid write, but it would be safer/cleaner to just prevent this (at least 
> at first).




[jira] [Assigned] (HIVE-18444) when creating transactional table make sure location has no data

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-18444:
-

Assignee: Vaibhav Gumashta

> when creating transactional table make sure location has no data
> 
>
> Key: HIVE-18444
> URL: https://issues.apache.org/jira/browse/HIVE-18444
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> If a user creates a new transactional table but sets its location to a place 
> that already has data, any number of things can break.  
> The data may not be in Acid format, or it may have been written by another 
> cluster, in which case its txnids won't make sense in the current cluster.  
> Once per-table writeIDs (HIVE-18192) are there, if the data was written by 
> another table, the writeIDs won't match.
> This could actually work if the data at the existing location was not written 
> by an acid write, but it would be safer/cleaner to just prevent this (at least 
> at first).




[jira] [Assigned] (HIVE-19081) Add partition should prevent loading acid files

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-19081:
-

Assignee: Vaibhav Gumashta  (was: Eugene Koifman)

> Add partition should prevent loading acid files
> ---
>
> Key: HIVE-19081
> URL: https://issues.apache.org/jira/browse/HIVE-19081
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> similar to HIVE-19029
> {{Alter Table T add Partition ...}}, where T is acid, should check to make sure 
> the input files were not copied from another Acid table, i.e. make sure the 
> files don't have Acid metadata columns.
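
A minimal sketch of the kind of check described above, under stated assumptions: it only inspects a list of top-level field names that the caller has already read from a file's schema, and treats a file as Acid-written when all six wrapper fields of the Hive ACID ORC layout are present. The class and method names are hypothetical, and this is not the check Hive itself performs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class AcidSchemaCheck {
  // Top-level fields of the struct that Hive ACID writers wrap around each row;
  // "row" holds the user columns, the other five are metadata.
  private static final Set<String> ACID_FIELDS = new HashSet<>(Arrays.asList(
      "operation", "originalTransaction", "bucket", "rowId",
      "currentTransaction", "row"));

  /** Returns true if every ACID wrapper field appears among the top-level names. */
  static boolean looksLikeAcidFile(List<String> topLevelFieldNames) {
    return topLevelFieldNames.containsAll(ACID_FIELDS);
  }

  public static void main(String[] args) {
    // Plain (non-Acid) file: only user columns at the top level.
    System.out.println(looksLikeAcidFile(Arrays.asList("name", "time")));
    // File copied from an Acid table: wrapper fields at the top level.
    System.out.println(looksLikeAcidFile(Arrays.asList(
        "operation", "originalTransaction", "bucket", "rowId",
        "currentTransaction", "row")));
  }
}
```

An add-partition path could reject the input when this returns true for any file; how the field names are obtained from the file footer is left out of the sketch.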




[jira] [Updated] (HIVE-19081) Add partition should prevent loading acid files

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19081:
--
Target Version/s: 3.1.0  (was: 3.0.0)

> Add partition should prevent loading acid files
> ---
>
> Key: HIVE-19081
> URL: https://issues.apache.org/jira/browse/HIVE-19081
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> similar to HIVE-19029
> {{Alter Table T add Partition ...}}, where T is acid, should check to make sure 
> the input files were not copied from another Acid table, i.e. make sure the 
> files don't have Acid metadata columns.




[jira] [Commented] (HIVE-19115) Merge: Semijoin hints are dropped by the merge

2018-08-13 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578608#comment-16578608
 ] 

Eugene Koifman commented on HIVE-19115:
---

[~djaiswal] is this a dup?

> Merge: Semijoin hints are dropped by the merge
> --
>
> Key: HIVE-19115
> URL: https://issues.apache.org/jira/browse/HIVE-19115
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Transactions
>Reporter: Gopal V
>Assignee: Deepak Jaiswal
>Priority: Major
>
> {code}
> create table target stored as orc as select ss_ticket_number, ss_item_sk, 
> current_timestamp as `ts` from tpcds_bin_partitioned_orc_1000.store_sales;
> create table source stored as orc as select sr_ticket_number, sr_item_sk, 
> d_date from tpcds_bin_partitioned_orc_1000.store_returns join 
> tpcds_bin_partitioned_orc_1000.date_dim where d_date_sk = sr_returned_date_sk;
> merge /* +semi(T, sr_ticket_number, S, 1) */ into target T using (select 
> * from source where year(d_date) = 1998) S ON T.ss_ticket_number = 
> S.sr_ticket_number and sr_item_sk = ss_item_sk 
> when matched THEN UPDATE SET ts = current_timestamp
> when not matched and sr_item_sk is not null and sr_ticket_number is not null 
> THEN INSERT VALUES(S.sr_ticket_number, S.sr_item_sk, current_timestamp);
> {code}
> The semijoin hints are ignored and the code says 
> {code}
>  todo: do we care to preserve comments in original SQL?
> {code}
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java#L624
> in this case we do.
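
To make the problem concrete, here is a rough sketch of how a hint comment could be pulled out of the original statement so that a rewrite can carry it along. It uses a regular expression rather than Hive's parser, so it is an illustration only; the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HintExtractor {
  // Matches a block comment whose body starts with '+', allowing whitespace
  // before the '+', e.g. "/*+ semi(...) */" or "/* +semi(...) */".
  private static final Pattern HINT =
      Pattern.compile("/\\*\\s*\\+(.*?)\\*/", Pattern.DOTALL);

  /** Returns the text of the first hint comment, without the comment markers. */
  static Optional<String> firstHint(String sql) {
    Matcher m = HINT.matcher(sql);
    return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
  }

  public static void main(String[] args) {
    String merge = "merge /* +semi(T, sr_ticket_number, S, 1) */ into target T";
    System.out.println(firstHint(merge).orElse("<no hint>"));
    System.out.println(firstHint("select 1 /* plain comment */").orElse("<no hint>"));
  }
}
```

A regex like this does not understand string literals or nested comments, which is one reason a real fix would more likely work on the parsed statement.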




[jira] [Assigned] (HIVE-20324) change hive.compactor.max.num.delta default to 50

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20324:
-

Assignee: Eugene Koifman

> change hive.compactor.max.num.delta default to 50
> -
>
> Key: HIVE-20324
> URL: https://issues.apache.org/jira/browse/HIVE-20324
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> The current default is 500 - this is way too high.  OOM is likely at 50 or so.
> Need to update the default.




[jira] [Assigned] (HIVE-20324) change hive.compactor.max.num.delta default to 50

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20324:
-

Assignee: Vaibhav Gumashta  (was: Eugene Koifman)

> change hive.compactor.max.num.delta default to 50
> -
>
> Key: HIVE-20324
> URL: https://issues.apache.org/jira/browse/HIVE-20324
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> The current default is 500 - this is way too high.  OOM is likely at 50 or so.
> Need to update the default.




[jira] [Assigned] (HIVE-20234) Add an option to disable stats computation from Compactor

2018-08-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20234:
-

Assignee: Vaibhav Gumashta

> Add an option to disable stats computation from Compactor
> -
>
> Key: HIVE-20234
> URL: https://issues.apache.org/jira/browse/HIVE-20234
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> Currently {{Worker.StatsUpdater}} will run {{analyze table ... compute 
> statistics for columns ...}} at the end of each Major compaction to update 
> stats on columns that already have stats.
>  
> It would be useful to add a config option that allows better control over 
> this.  It could have 3 values: don't update col stats, update existing col 
> stats, update all col stats.
> Should this have the ability to update table level stats?  Is that needed given 
> HIVE-19532?
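
One way to picture the three-valued option is as an enum plus a small parser. This is only a sketch: the enum constants, the class name and the choice of default are assumptions made for illustration, and the issue does not name the property.

```java
import java.util.Locale;

final class CompactorStatsConfig {
  // The three behaviours proposed in the description (names are hypothetical).
  enum ColStatsMode { NONE, EXISTING, ALL }

  /** Parses a config value; a missing value falls back to the current behaviour. */
  static ColStatsMode parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      // Assumption: keep today's behaviour (update columns that already have stats).
      return ColStatsMode.EXISTING;
    }
    return ColStatsMode.valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(parse(null));
    System.out.println(parse("none"));
    System.out.println(parse(" all "));
  }
}
```

An unrecognized value makes `valueOf` throw `IllegalArgumentException`, which surfaces a misconfiguration instead of silently picking a mode.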




[jira] [Assigned] (HIVE-20369) TestPreUpgradeTool not run by ptest

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20369:
-


> TestPreUpgradeTool not run by ptest
> ---
>
> Key: HIVE-20369
> URL: https://issues.apache.org/jira/browse/HIVE-20369
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> TestPreUpgradeTool is not showing up in ptest runs
> probably because upgrade-acid module is disconnected from root pom
> how does standalone-metastore work?  it's also disconnected
> also, hive-upgrade jar is not showing up in tar with mvn package




[jira] [Resolved] (HIVE-19749) Acid V1 to V2 upgrade

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-19749.
---
   Resolution: Fixed
Fix Version/s: 4.0.0

> Acid V1 to V2 upgrade
> -
>
> Key: HIVE-19749
> URL: https://issues.apache.org/jira/browse/HIVE-19749
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 4.0.0
>
>
> umbrella jira




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch, HIVE-19800.07.patch, HIVE-19800.08.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Attachment: HIVE-19800.08.patch

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch, HIVE-19800.07.patch, HIVE-19800.08.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Status: Patch Available  (was: Open)

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch, HIVE-19800.07.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Attachment: HIVE-19800.07.patch

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch, HIVE-19800.07.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-11 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Status: Open  (was: Patch Available)

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch, HIVE-19800.07.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-10 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Attachment: HIVE-19800.06.patch

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch, 
> HIVE-19800.06.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-10 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Status: Patch Available  (was: Open)

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-10 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Attachment: HIVE-19800.05.patch

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch, HIVE-19800.05.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Commented] (HIVE-20354) Semijoin hints dont work with merge statements

2018-08-10 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576922#comment-16576922
 ] 

Eugene Koifman commented on HIVE-20354:
---

+1 patch 4

> Semijoin hints dont work with merge statements
> --
>
> Key: HIVE-20354
> URL: https://issues.apache.org/jira/browse/HIVE-20354
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20354.1.patch, HIVE-20354.2.patch, 
> HIVE-20354.3.patch, HIVE-20354.4.patch
>
>
> When a merge statement is rewritten, any comment in the query is ignored, 
> including comments that carry hints such as semijoin.
> If a hint is present, it should not be ignored.




[jira] [Updated] (HIVE-20354) Semijoin hints dont work with merge statements

2018-08-09 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20354:
--
Component/s: Transactions

> Semijoin hints dont work with merge statements
> --
>
> Key: HIVE-20354
> URL: https://issues.apache.org/jira/browse/HIVE-20354
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20354.1.patch
>
>
> When a merge statement is rewritten, any comment in the query is ignored, 
> including comments that carry hints such as semijoin.
> If a hint is present, it should not be ignored.




[jira] [Updated] (HIVE-20343) Hive 3: CTAS does not respect transactional_properties

2018-08-09 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20343:
--
Component/s: (was: Hive)
 Transactions

> Hive 3: CTAS does not respect transactional_properties
> --
>
> Key: HIVE-20343
> URL: https://issues.apache.org/jira/browse/HIVE-20343
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0
> Environment: hive-3
>Reporter: Rajkumar Singh
>Priority: Major
>
> Steps to reproduce:
> {code}
> create table ctasexampleinsertonly stored as orc  TBLPROPERTIES 
> ("transactional_properties"="insert_only") as select * from testtable limit 1;
> {code}
> Look for transactional_properties in the output below: it is 'default', not the 
> expected "insert_only".
> {code}
>  describe formatted ctasexampleinsertonly
>  
> +------------------------------+------------------------------------------------------------------------+------------+
> | col_name                     | data_type                                                              | comment    |
> +------------------------------+------------------------------------------------------------------------+------------+
> | # col_name                   | data_type                                                              | comment    |
> | name                         | varchar(8)                                                             |            |
> | time                         | double                                                                 |            |
> |                              | NULL                                                                   | NULL       |
> | # Detailed Table Information | NULL                                                                   | NULL       |
> | Database:                    | default                                                                | NULL       |
> | OwnerType:                   | USER                                                                   | NULL       |
> | Owner:                       | hive                                                                   | NULL       |
> | CreateTime:                  | Wed Aug 08 21:35:15 UTC 2018                                           | NULL       |
> | LastAccessTime:              | UNKNOWN                                                                | NULL       |
> | Retention:                   | 0                                                                      | NULL       |
> | Location:                    | hdfs://xx:8020/warehouse/tablespace/managed/hive/ctasexampleinsertonly | NULL       |
> | Table Type:                  | MANAGED_TABLE                                                          | NULL       |
> | Table Parameters:            | NULL                                                                   | NULL       |
> |                              | COLUMN_STATS_ACCURATE                                                  | {}         |
> |                              | bucketing_version                                                      | 2          |
> |                              | numFiles                                                               | 1          |
> |                              | numRows                                                                | 1          |
> |                              | rawDataSize                                                            | 0          |
> |                              | totalSize                                                              | 754        |
> |                              | transactional                                                          | true       |
> |                              | transactional_properties                                               | default    |
> |                              | transient_lastDdlTime                                                  | 1533764115 |
> |                              | NULL                                                                   | NULL       |
> | # Storage Information        | NULL                                                                   | NULL       |
> | SerDe Library:               | org.apache.hadoop.hive.ql.io.orc.OrcSerde                              | NULL       |
> | InputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                        | NULL       |
> | OutputFormat:                | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                       | NULL       |
> | Compressed:                  | No                                                                     | NULL       |
> | Num Buckets:                 | -1                                                                     | NULL       |
> | Bucket Columns:              | []                                                                     | NULL

[jira] [Commented] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-08-07 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572504#comment-16572504
 ] 

Eugene Koifman commented on HIVE-19800:
---

todo: BUG-107516

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Comment Edited] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files

2018-08-07 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572131#comment-16572131
 ] 

Eugene Koifman edited comment on HIVE-20327 at 8/7/18 7:01 PM:
---

patch 2 is a prototype and an unsuccessful attempt to repro this

Reader deltaReader = OrcFile.createReader(deltaFile, 
OrcFile.readerOptions(conf).maxLength(length));
recordReader = reader.rowsOptions(options, conf);

recordReader.hasNext() returns false when deltaFile is an empty file


was (Author: ekoifman):
patch 2 is a prototype and an unsuccessful attempt to repro this

> Compactor should gracefully handle 0 length files and invalid orc files
> ---
>
> Key: HIVE-20327
> URL: https://issues.apache.org/jira/browse/HIVE-20327
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20327.02.patch
>
>
> Older versions of Streaming API did not handle interrupts well and could 
> leave 0-length ORC files behind which cannot be read.
> These should be just skipped.
> Other cases of file where ORC Reader cannot be created
> 1. regular write (1 txn delta) where the client died and didn't properly 
> close the file - this delta should be aborted and never read
> 2. streaming ingest write (delta_x_y, x < y).  There should always be a side 
> file if the file was not closed properly. (though it may still indicate that 
> length is 0)
> If we check these cases and still can't create a reader, it should not 
> silently skip the file since the system thinks it contains at least some 
> committed data but the file is corrupted (and the side file doesn't point at 
> a valid footer) - we should never be in this situation and we should throw so 
> that the end user can try manual intervention (where the only option may be 
> deleting the file)
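
The cases listed above can be read as a small decision table. The sketch below is one possible reading, not the compactor's actual logic: the inputs (physical length, optional side-file length, whether the write was aborted, whether an ORC reader could be created) are simplifications, and all names are hypothetical.

```java
import java.util.OptionalLong;

final class DeltaFileTriage {
  enum Action { SKIP, READ, FAIL }

  /**
   * Decides what to do with one delta file.
   * sideFileLength is the committed length recorded by a streaming side file,
   * if one exists; otherwise the physical length is used.
   */
  static Action decide(long physicalLength, OptionalLong sideFileLength,
                       boolean writeAborted, boolean readerCreated) {
    if (writeAborted) {
      return Action.SKIP;   // an aborted delta should never be read
    }
    long committedLength = sideFileLength.orElse(physicalLength);
    if (committedLength == 0) {
      return Action.SKIP;   // 0-length leftovers are just skipped
    }
    if (readerCreated) {
      return Action.READ;
    }
    return Action.FAIL;     // committed data but unreadable file: throw, don't hide it
  }

  public static void main(String[] args) {
    System.out.println(decide(0, OptionalLong.empty(), false, false));    // SKIP
    System.out.println(decide(754, OptionalLong.of(754), false, true));   // READ
    System.out.println(decide(754, OptionalLong.empty(), false, false));  // FAIL
  }
}
```

The point of the last branch is that a file the system believes to hold committed data is surfaced as an error, so that someone can intervene manually.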




[jira] [Commented] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files

2018-08-07 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572131#comment-16572131
 ] 

Eugene Koifman commented on HIVE-20327:
---

patch 2 is a prototype and an unsuccessful attempt to repro this

> Compactor should gracefully handle 0 length files and invalid orc files
> ---
>
> Key: HIVE-20327
> URL: https://issues.apache.org/jira/browse/HIVE-20327
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20327.02.patch
>
>
> Older versions of Streaming API did not handle interrupts well and could 
> leave 0-length ORC files behind which cannot be read.
> These should be just skipped.
> Other cases of file where ORC Reader cannot be created
> 1. regular write (1 txn delta) where the client died and didn't properly 
> close the file - this delta should be aborted and never read
> 2. streaming ingest write (delta_x_y, x < y).  There should always be a side 
> file if the file was not closed properly. (though it may still indicate that 
> length is 0)
> If we check these cases and still can't create a reader, it should not 
> silently skip the file since the system thinks it contains at least some 
> committed data but the file is corrupted (and the side file doesn't point at 
> a valid footer) - we should never be in this situation and we should throw so 
> that the end user can try manual intervention (where the only option may be 
> deleting the file)




[jira] [Updated] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files

2018-08-07 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20327:
--
Attachment: HIVE-20327.02.patch

> Compactor should gracefully handle 0 length files and invalid orc files
> ---
>
> Key: HIVE-20327
> URL: https://issues.apache.org/jira/browse/HIVE-20327
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20327.02.patch
>
>
> Older versions of Streaming API did not handle interrupts well and could 
> leave 0-length ORC files behind which cannot be read.
> These should be just skipped.
> Other cases of file where ORC Reader cannot be created
> 1. regular write (1 txn delta) where the client died and didn't properly 
> close the file - this delta should be aborted and never read
> 2. streaming ingest write (delta_x_y, x < y).  There should always be a side 
> file if the file was not closed properly. (though it may still indicate that 
> length is 0)
> If we check these cases and still can't create a reader, it should not 
> silently skip the file since the system thinks it contains at least some 
> committed data but the file is corrupted (and the side file doesn't point at 
> a valid footer) - we should never be in this situation and we should throw so 
> that the end user can try manual intervention (where the only option may be 
> deleting the file)




[jira] [Commented] (HIVE-20332) Materialized views: Introduce heuristic on selectivity over ROW__ID to favour incremental rebuild

2018-08-07 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572085#comment-16572085
 ] 

Eugene Koifman commented on HIVE-20332:
---

HIVE-20313 should be considered (though hard to say how much effort this would 
be)

> Materialized views: Introduce heuristic on selectivity over ROW__ID to favour 
> incremental rebuild
> -
>
> Key: HIVE-20332
> URL: https://issues.apache.org/jira/browse/HIVE-20332
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, we do not expose stats over {{ROW__ID.writeId}} to the 
> optimizer. Even if we did, we always assume uniform distribution of the 
> column values, which can easily lead to overestimations on the number of rows 
> read when we filter on {{ROW__ID.writeId}} for materialized views (think 
> about a large transaction for MV creation and then small ones for incremental 
> maintenance). This overestimation can lead to incremental view maintenance 
> not being triggered as cost of the incremental plan is overestimated (we 
> think we will read more rows than we actually do). This could be fixed by 
> introducing histograms that reflect better the column values distribution.
> Till that moment, we will use a config variable that will set the selectivity 
> for filter condition on {{ROW__ID}} during the cost calculation. Setting 
> that variable to a low value will favour incremental rebuild over full 
> rebuild.
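
A small worked example of the overestimation, with made-up numbers: suppose the materialized view was created by one transaction that wrote 1,000,000 rows (writeId 1), followed by ten incremental transactions of 100 rows each (writeIds 2 to 11). A filter on writeId > 1 really reads 1,000 rows. The sketch below compares the uniform-distribution estimate with a fixed low selectivity; the class, the method names and the value 0.001 are assumptions for illustration, not the optimizer's code or the actual config default.

```java
final class RowIdSelectivity {
  /** Estimated row count for a filter with the given selectivity. */
  static long estimateRows(long totalRows, double selectivity) {
    return Math.round(totalRows * selectivity);
  }

  /**
   * Selectivity of "writeId > bound" if rows were spread evenly
   * over the writeId range [minWriteId, maxWriteId].
   */
  static double uniformSelectivity(long minWriteId, long maxWriteId, long bound) {
    long range = maxWriteId - minWriteId + 1;
    long matching = Math.max(0, maxWriteId - Math.max(bound, minWriteId - 1));
    return (double) matching / range;
  }

  public static void main(String[] args) {
    long totalRows = 1_000_000L + 10 * 100L;  // 1,001,000 rows in total
    double uniform = uniformSelectivity(1, 11, 1);
    System.out.println(estimateRows(totalRows, uniform));  // 910000
    System.out.println(estimateRows(totalRows, 0.001));    // 1001
  }
}
```

Under the uniform assumption the incremental plan appears to read about 910,000 rows instead of 1,000, which is why it can lose to a full rebuild on cost; a low configured selectivity brings the estimate close to the real figure.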




[jira] [Assigned] (HIVE-20327) Compactor should gracefully handle 0 length files and invalid orc files

2018-08-06 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20327:
-


> Compactor should gracefully handle 0 length files and invalid orc files
> ---
>
> Key: HIVE-20327
> URL: https://issues.apache.org/jira/browse/HIVE-20327
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> Older versions of the Streaming API did not handle interrupts well and could 
> leave behind 0-length ORC files, which cannot be read.
> These should simply be skipped.
> Other cases where an ORC Reader cannot be created for a file:
> 1. regular write (1 txn delta) where the client died and didn't properly 
> close the file - this delta should be aborted and never read
> 2. streaming ingest write (delta_x_y, x < y).  There should always be a side 
> file if the file was not closed properly (though it may still indicate that 
> the length is 0).
> If we check these cases and still can't create a reader, the compactor should 
> not silently skip the file, since the system thinks it contains at least some 
> committed data but the file is corrupted (and the side file doesn't point at 
> a valid footer). We should never be in this situation, and we should throw so 
> that the end user can try manual intervention (where the only option may be 
> deleting the file).
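
The cases above can be read as a small decision table. The following Java sketch is one possible interpretation, not the compactor's actual code: the class, enum and parameter names are hypothetical, and the inputs (file length, whether a reader could be created, whether the delta was aborted, the length recorded in the side file) are assumed to be already known to the caller.

```java
// Hypothetical decision sketch for a file the compactor encounters.
final class CompactorFileCheckSketch {

    enum Action { SKIP, READ, FAIL }

    /**
     * @param fileLength      physical length of the file in bytes
     * @param readerCreatable true if an ORC reader could be created for the file
     * @param deltaAborted    true if the delta belongs to an aborted transaction
     * @param sideFileLength  committed length recorded in the side file, or null if there is no side file
     */
    static Action decide(long fileLength, boolean readerCreatable,
                         boolean deltaAborted, Long sideFileLength) {
        if (fileLength == 0) {
            return Action.SKIP;   // 0-length leftover, nothing to read
        }
        if (deltaAborted) {
            return Action.SKIP;   // aborted delta must never be read
        }
        if (readerCreatable) {
            return Action.READ;   // normal case
        }
        if (sideFileLength != null && sideFileLength == 0L) {
            return Action.SKIP;   // side file says no data was committed
        }
        // The system believes committed data exists but the file is unreadable:
        // surface the problem instead of silently dropping data.
        return Action.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(0, false, false, null));    // SKIP
        System.out.println(decide(100, false, false, null));  // FAIL
    }
}
```

The point of returning FAIL rather than SKIP in the last branch is the one made in the description: skipping would hide the loss of data the system considers committed.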




[jira] [Commented] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId

2018-08-06 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570564#comment-16570564
 ] 

Eugene Koifman commented on HIVE-20291:
---

I think if there is a way to allocate a unique statement id for each writer, 
that is the best option.
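
As a rough sketch of what allocating a unique statement id per writer could look like, the Java below hands out ids from a shared counter, so an id is never reused within one allocator. The class name and the upper bound are hypothetical and not taken from the patch. It also assumes a single coordinator owns the allocator for a given write id, which the issue does not specify.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical allocator: each writer sharing a write id gets a distinct statement id.
final class StatementIdAllocatorSketch {

    private final AtomicInteger next = new AtomicInteger(0);
    private final int maxExclusive; // assumed upper bound on statement ids

    StatementIdAllocatorSketch(int maxExclusive) {
        this.maxExclusive = maxExclusive;
    }

    /** Returns the next unused statement id, or throws once the range is exhausted. */
    int allocate() {
        while (true) {
            int current = next.get();
            if (current >= maxExclusive) {
                throw new IllegalStateException("statement ids exhausted");
            }
            if (next.compareAndSet(current, current + 1)) {
                return current;
            }
        }
    }

    public static void main(String[] args) {
        StatementIdAllocatorSketch allocator = new StatementIdAllocatorSketch(4);
        for (int writer = 0; writer < 3; writer++) {
            System.out.println("writer " + writer + " -> statement id " + allocator.allocate());
        }
    }
}
```

Unlike a random choice, a counter cannot collide, and it fails loudly when the range runs out instead of silently reusing an id.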

> Allow HiveStreamingConnection to receive a WriteId
> --
>
> Key: HIVE-20291
> URL: https://issues.apache.org/jira/browse/HIVE-20291
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20291.1.patch, HIVE-20291.2.patch
>
>
> If the writeId is received externally, HiveStreamingConnection won't need to 
> open connections to the metastore. It won't be able to commit in this case 
> either, so the commit must be done by the entity passing the writeId.




[jira] [Updated] (HIVE-17683) Add explain locks command

2018-08-03 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17683:
--
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HIVE-17683-branch-3.patch, HIVE-17683.01.patch, 
> HIVE-17683.02.patch, HIVE-17683.03.patch, HIVE-17683.04.patch, 
> HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.
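
One way to picture the refactoring mentioned above is to separate computing the lock requests from acquiring them, so the same computation can feed both the query plan output and the runtime. The Java below is a hypothetical sketch only: none of these classes exist in Hive, and the mapping from reads and writes to lock types is deliberately simplified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: lock planning as a pure step that can be shown without running the query.
final class ExplainLocksSketch {

    enum LockType { SHARED_READ, SHARED_WRITE }

    static final class LockRequest {
        final String table;
        final LockType type;

        LockRequest(String table, LockType type) {
            this.table = table;
            this.type = type;
        }

        @Override
        public String toString() {
            return table + " -> " + type;
        }
    }

    /** Derives lock requests from the tables a query reads and writes (simplified mapping). */
    static List<LockRequest> planLocks(List<String> readTables, List<String> writeTables) {
        List<LockRequest> locks = new ArrayList<>();
        for (String table : readTables) {
            locks.add(new LockRequest(table, LockType.SHARED_READ));
        }
        for (String table : writeTables) {
            locks.add(new LockRequest(table, LockType.SHARED_WRITE));
        }
        return locks;
    }

    public static void main(String[] args) {
        List<String> reads = new ArrayList<>();
        reads.add("default.src");
        List<String> writes = new ArrayList<>();
        writes.add("default.target");
        // An explain-style command would print the plan; execution would acquire the same list.
        for (LockRequest lock : planLocks(reads, writes)) {
            System.out.println(lock);
        }
    }
}
```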




[jira] [Commented] (HIVE-17683) Add explain locks command

2018-08-03 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568960#comment-16568960
 ] 

Eugene Koifman commented on HIVE-17683:
---

committed to branch-3
thanks Igor for the contribution

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.patch, HIVE-17683.01.patch, 
> HIVE-17683.02.patch, HIVE-17683.03.patch, HIVE-17683.04.patch, 
> HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.




[jira] [Commented] (HIVE-20311) add txn stats checks to some more paths

2018-08-03 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568935#comment-16568935
 ] 

Eugene Koifman commented on HIVE-20311:
---

+1 pending tests

> add txn stats checks to some more paths
> ---
>
> Key: HIVE-20311
> URL: https://issues.apache.org/jira/browse/HIVE-20311
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20311.patch
>
>
> These were set to false in the original patch for no reason as far as I see.
> I later added notes but not TODOs to switch them over, so they remained as 
> non-txn.




[jira] [Updated] (HIVE-20313) consider making ROW__ID a 1st class object

2018-08-03 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20313:
--
Description: 
ROW_ID, which is a struct that represents a unique row ID within a partition of 
a full CRUD transactional table is currently modeled as a {{VirtualColumn}}.  
Acid metadata columns from which ROW_ID is built are actually stored in the 
data file.  

There is no end to special handling of acid metadata columns in the code to 
make this work.

Perhaps a better approach is to add a struct column to an acid table at creation 
time and make it a 1st class citizen visible in the metastore.  'select 
count(*) ' would need special handling to remove it.  There may need to be 
a way to make these columns read-only.

For data added via Load Data, Add Partition, etc. (i.e. original files in a CRUD 
table), the acid reader would have to fill in the values as it does today.

This would make schema evolution, PPD, projection pruning work seamlessly.
This should also make adding formats other than ORC in full CRUD tables easy.

This will likely be painful but should be investigated.



  was:
ROW__ID, which is a struct that represents a unique row ID within a partition 
of a full CRUD transactional table is currently modeled as a {{VirtualColumn}}. 
 Acid metadata columns from which ROW__ID is built are actually stored in the 
data file.  

There is no end to special handling of acid metadata columns in the code to 
make this work.

Perhaps a better approach is to add a struct column to an acid table at creation 
time and make it a 1st class citizen visible in the metastore.  'select 
count(*) ' would need special handling to remove it.  There may need to be 
a way to make these columns read-only.

For data added via Load Data, Add Partition, etc. (i.e. original files in a CRUD 
table), the acid reader would have to fill in the values as it does today.

This would make schema evolution, PPD, projection pruning work seamlessly.
This should also make adding formats other than ORC in full CRUD tables easy.

This will likely be painful but should be investigated.




> consider making ROW__ID a 1st class object
> --
>
> Key: HIVE-20313
> URL: https://issues.apache.org/jira/browse/HIVE-20313
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 0.11.0
>Reporter: Eugene Koifman
>Priority: Major
>
> ROW_ID, which is a struct that represents a unique row ID within a partition 
> of a full CRUD transactional table is currently modeled as a 
> {{VirtualColumn}}.  Acid metadata columns from which ROW_ID is built are 
> actually stored in the data file.  
> There is no end to special handling of acid metadata columns in the code to 
> make this work.
> Perhaps a better approach is to add a struct column to an acid table at 
> creation time and make it a 1st class citizen visible in the metastore.  
> 'select count(*) ' would need special handling to remove it.  There may 
> need to be a way to make these columns read-only.
> For data added via Load Data, Add Partition, etc. (i.e. original files in a 
> CRUD table), the acid reader would have to fill in the values as it does today.
> This would make schema evolution, PPD, projection pruning work seamlessly.
> This should also make adding formats other than ORC in full CRUD tables easy.
> This will likely be painful but should be investigated.




[jira] [Comment Edited] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-03 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568479#comment-16568479
 ] 

Eugene Koifman edited comment on HIVE-19985 at 8/3/18 5:04 PM:
---

[~gopalv], patch 4 includes LLAP handling
cc [~ashutoshc]

includes hive.optimize.acid.meta.columns option so this feature can be disabled


was (Author: ekoifman):
[~gopalv], patch 4 includes LLAP handling
cc [~ashutoshc]

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.
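
The condition described above can be sketched as a single predicate. This Java is an illustration under assumptions, not the patch itself: the class, enum and parameter names are hypothetical, the boolean inputs are assumed to be computed elsewhere per ACID split, and the `enabled` flag stands in for the hive.optimize.acid.meta.columns option mentioned in the comment. A SELECT that explicitly projects ROW__ID is not modelled here.

```java
// Hypothetical predicate: may the reader skip decoding the ROW__ID columns for this split?
final class RowIdDecodeSketch {

    enum QueryKind { SELECT, UPDATE, DELETE, MERGE }

    /**
     * @param kind               kind of query being executed
     * @param mayHaveAbortedTxns true if the split may contain rows from aborted transactions
     * @param hasDeleteDeltas    true if delete deltas apply to the split
     * @param enabled            stand-in for the option that switches the optimization on or off
     */
    static boolean canSkipRowIdDecoding(QueryKind kind, boolean mayHaveAbortedTxns,
                                        boolean hasDeleteDeltas, boolean enabled) {
        // UPDATE, DELETE and MERGE still need ROW__ID to identify the rows they change.
        return enabled
            && kind == QueryKind.SELECT
            && !mayHaveAbortedTxns
            && !hasDeleteDeltas;
    }

    public static void main(String[] args) {
        System.out.println(canSkipRowIdDecoding(QueryKind.SELECT, false, false, true)); // true
        System.out.println(canSkipRowIdDecoding(QueryKind.UPDATE, false, false, true)); // false
    }
}
```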




[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-03 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568479#comment-16568479
 ] 

Eugene Koifman commented on HIVE-19985:
---

[~gopalv], patch 4 includes LLAP handling
cc [~ashutoshc]

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Updated] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-03 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19985:
--
Attachment: HIVE-19985.04.patch

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Updated] (HIVE-20302) LLAP: non-vectorized execution in IO ignores virtual columns, including ROW__ID

2018-08-03 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20302:
--
Component/s: Transactions

> LLAP: non-vectorized execution in IO ignores virtual columns, including 
> ROW__ID
> ---
>
> Key: HIVE-20302
> URL: https://issues.apache.org/jira/browse/HIVE-20302
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20302.patch
>
>





[jira] [Updated] (HIVE-20293) Support Replication of ACID table truncate operation

2018-08-02 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20293:
--
Component/s: Transactions

> Support Replication of ACID table truncate operation
> 
>
> Key: HIVE-20293
> URL: https://issues.apache.org/jira/browse/HIVE-20293
> Project: Hive
>  Issue Type: Task
>  Components: repl, Transactions
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
>
> Support truncate acid table replication.
> 1. Write id allocation needs to be removed




[jira] [Commented] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId

2018-08-02 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16566979#comment-16566979
 ] 

Eugene Koifman commented on HIVE-20291:
---

Is there a design description somewhere that explains what this is for and how 
it will be used?
In particular, what is RandomStatementIdChooser for?  If a statement id is ever 
reused, this will lead to data loss.


> Allow HiveStreamingConnection to receive a WriteId
> --
>
> Key: HIVE-20291
> URL: https://issues.apache.org/jira/browse/HIVE-20291
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20291.1.patch
>
>
> If the writeId is received externally, HiveStreamingConnection won't need to 
> open connections to the metastore. It won't be able to commit in this case 
> either, so the commit must be done by the entity passing the writeId.




[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-08-01 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16565600#comment-16565600
 ] 

Eugene Koifman commented on HIVE-19985:
---

The logic for deciding whether to pass ROW__ID up is not changed by this patch.
Previously, we decoded the values from storage unconditionally and then the 
acid reader simply dropped them.  Now the patch goes one step further and 
doesn't decode them if they are not needed.  In short, this should have no 
functional impact above the reader.


> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-07-31 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564509#comment-16564509
 ] 

Eugene Koifman commented on HIVE-19985:
---

LLAP notes:
{{OrcEncodedDataReader}} does {{fileIncludes = 
includes.generateFileIncludes(fileSchema);}}

{{LlapRecordReader.next()}} gets a {{ColumnVectorBatch}}, wraps it in 
{{AcidWrapper}}, calls 
{{VectorizedOrcAcidRowBatchReader.setBaseAndInnerReader()}}, and then calls 
next() on the acid reader.

The acid reader is created in the {{LlapRecordReader}} constructor.
 

 

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Commented] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-07-31 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564133#comment-16564133
 ] 

Eugene Koifman commented on HIVE-19800:
---

{noformat}
2018-07-30T15:41:42,287 ERROR [main] metastore.HiveAlterHandler: Failed to alter table hive.default.nonacidnonbucket
2018-07-30T15:41:42,290 ERROR [main] metastore.RetryingHMSHandler: MetaException(message:Cannot change stats state for a transactional table without providing the transactional write state for verification (new write ID -1, valid write IDs null; current state {"BASIC_STATS":"true","COLUMN_STATS":{"a":"true","b":"true"}}; new state null)
        at org.apache.hadoop.hive.metastore.ObjectStore.alterTable(ObjectStore.java:4124)
        at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
        at com.sun.proxy.$Proxy35.alterTable(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTableUpdateTableColumnStats(HiveAlterHandler.java:857)
        at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:353)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:5107)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_req(HiveMetaStore.java:5055)
        at sun.reflect.GeneratedMethodAccessor77.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy37.alter_table_req(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:433)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:373)
        at sun.reflect.GeneratedMethodAccessor76.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
        at com.sun.proxy.$Proxy38.alter_table(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:652)
        at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:588)
        at org.apache.hadoop.hive.ql.util.UpgradeTool.alterTable(UpgradeTool.java:232)
        at org.apache.hadoop.hive.ql.util.UpgradeTool.processConversion(UpgradeTool.java:531)
        at org.apache.hadoop.hive.ql.util.UpgradeTool.performUpgradeInternal(UpgradeTool.java:211)
        at org.apache.hadoop.hive.ql.util.UpgradeTool.main(UpgradeTool.java:135)
        at org.apache.hadoop.hive.ql.util.TestUpgradeTool.testPostUpgrade(TestUpgradeTool.java:149)
{noformat}

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Updated] (HIVE-19800) Create separate submodules for pre and post upgrade and add rename file logic

2018-07-30 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19800:
--
Attachment: HIVE-19800.04.patch

> Create separate submodules for pre and post upgrade and add rename file logic
> -
>
> Key: HIVE-19800
> URL: https://issues.apache.org/jira/browse/HIVE-19800
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-19800.01.patch, HIVE-19800.02.patch, 
> HIVE-19800.03.patch, HIVE-19800.04.patch
>
>
> this is a followup to HIVE-19751 which includes HIVE-19751 since it hasn't 
> landed yet
> this includes file rename logic and HIVE-19750 since it hasn't landed yet 
> either
>  
> cc [~jdere]




[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-30 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562118#comment-16562118
 ] 

Eugene Koifman commented on HIVE-17683:
---

TestDbTxnManager2.testLockingOnInsertIntoNonNativeTables is a real failure

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, 
> HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, 
> HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.




[jira] [Commented] (HIVE-20248) clean up some TODOs after txn stats merge

2018-07-26 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559059#comment-16559059
 ] 

Eugene Koifman commented on HIVE-20248:
---

+1

> clean up some TODOs after txn stats merge
> -
>
> Key: HIVE-20248
> URL: https://issues.apache.org/jira/browse/HIVE-20248
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20248.patch
>
>





[jira] [Updated] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-07-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19985:
--
Attachment: HIVE-19985.01.patch

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-07-26 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558682#comment-16558682
 ] 

Eugene Koifman commented on HIVE-19985:
---

[~gopalv] could you review please

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch
>
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Commented] (HIVE-19532) merge master-txnstats branch

2018-07-24 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554880#comment-16554880
 ] 

Eugene Koifman commented on HIVE-19532:
---

+1

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.23.patch, HIVE-19532.25.patch, 
> HIVE-19532.26.patch
>
>





[jira] [Assigned] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-07-23 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-19985:
-

Assignee: Eugene Koifman  (was: Gopal V)

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
>
> For a base_n file there are no aborted transactions within the file and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.




[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17683:
--
Release Note: 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain#LanguageManualExplain-TneLOCKSClause

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.




[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553252#comment-16553252
 ] 

Eugene Koifman commented on HIVE-17683:
---

fixed

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553185#comment-16553185
 ] 

Eugene Koifman commented on HIVE-17683:
---

[~ikryvenko], could you please make a 3.x patch for this - I think it would be 
useful to users

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553179#comment-16553179
 ] 

Eugene Koifman commented on HIVE-17683:
---

committed to master

thanks Igor for the contribution

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17683:
--
Summary: Add explain locks  command  (was: Annotate Query Plan with 
locking information)

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17683) Annotate Query Plan with locking information

2018-07-23 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553072#comment-16553072
 ] 

Eugene Koifman commented on HIVE-17683:
---

+1

> Annotate Query Plan with locking information
> 
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-22 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to master

thanks Ashutosh for the review

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-22 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Fix Version/s: 4.0.0

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551059#comment-16551059
 ] 

Eugene Koifman commented on HIVE-20218:
---

[~ashutoshc] could you review please

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Attachment: (was: HIVE-20218.01.patch)

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Attachment: HIVE-20218.01.patch

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch, HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551054#comment-16551054
 ] 

Eugene Koifman commented on HIVE-20218:
---

current ticket fixes the JDBC API

need to check what beeline produces - HIVE-8244

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Status: Patch Available  (was: Open)

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20218:
--
Attachment: HIVE-20218.01.patch

> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20218.01.patch
>
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-20218) make sure Statement.executeUpdate() returns number of rows affected

2018-07-20 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20218:
-


> make sure Statement.executeUpdate() returns number of rows affected
> ---
>
> Key: HIVE-20218
> URL: https://issues.apache.org/jira/browse/HIVE-20218
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>
> HiveStatement and HivePreparedStatement currently return 0 in all cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20047) [phase 1.5] consider removing txnID argument for txn stats methods

2018-07-19 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549738#comment-16549738
 ] 

Eugene Koifman commented on HIVE-20047:
---

"it will not rely entirely" - should 'not' be 'now'?

> [phase 1.5] consider removing txnID argument for txn stats methods
> --
>
> Key: HIVE-20047
> URL: https://issues.apache.org/jira/browse/HIVE-20047
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20047.patch
>
>
> Followup from HIVE-19975.
> W.r.t. write IDs and txn IDs, stats validity check currently verifies one of 
> two things - that stats write ID is valid for query write ID list, or that 
> stats txn ID (derived from write ID) is the same as the query txn ID.
> I'm not sure the latter check is needed; removing it would allow us to make a 
> bunch of APIs a little bit simpler.
> [~ekoifman] do you have any feedback? Can any stats reader (e.g. compile) 
> observe stats written by the same txn; but in such manner that it doesn't 
> have the write ID of the same-txn stats writer, in its valid write ID list? 
> I'm assuming it's not possible, e.g. in multi statement txn each query would 
> have the previous same-txn writer for the same table in its valid write ID 
> list?
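
A minimal sketch of the two checks described above, to make the question concrete. Everything here is a simplified assumption: WriteIdSnapshot is a hypothetical stand-in for the query's valid write ID list (a high watermark plus a set of open or aborted IDs), and the method names are illustrative, not Hive's API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class StatsValiditySketch {
  /** Simplified, hypothetical stand-in for a query's valid write ID list. */
  static final class WriteIdSnapshot {
    final long highWatermark;
    final Set<Long> openOrAborted;

    WriteIdSnapshot(long highWatermark, Long... openOrAborted) {
      this.highWatermark = highWatermark;
      this.openOrAborted = new HashSet<>(Arrays.asList(openOrAborted));
    }

    /** A write ID is visible if it is at or below the watermark and not open/aborted. */
    boolean isValid(long writeId) {
      return writeId <= highWatermark && !openOrAborted.contains(writeId);
    }
  }

  /** The check as described in the issue: either condition is enough. */
  static boolean isValidWithTxnCheck(long statsWriteId, long statsTxnId,
                                     long queryTxnId, WriteIdSnapshot snapshot) {
    return snapshot.isValid(statsWriteId) || statsTxnId == queryTxnId;
  }

  /** The proposed simplification: rely on the write ID list alone. */
  static boolean isValidWriteIdOnly(long statsWriteId, WriteIdSnapshot snapshot) {
    return snapshot.isValid(statsWriteId);
  }
}
```

The two methods disagree only when the stats were written by the query's own transaction under a write ID that the snapshot does not treat as valid. Whether that situation can actually occur is exactly what the description asks.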



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20115) Acid tables should not use footer scan for analyze

2018-07-19 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549733#comment-16549733
 ] 

Eugene Koifman commented on HIVE-20115:
---

+1 in general.

A few nits:

Could you add some comment to BasicStatsNoJobTask that it cannot be used for 
full CRUD tables?

couple of other nits on RB

> Acid tables should not use footer scan for analyze
> --
>
> Key: HIVE-20115
> URL: https://issues.apache.org/jira/browse/HIVE-20115
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics, Transactions
>Affects Versions: 4.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20115.patch
>
>
> Discovered via incorrect stats in acid_no_buckets test on master-txnstats 
> branch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20115) Acid tables should not use footer scan for analyze

2018-07-19 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20115:
--
Component/s: Statistics

> Acid tables should not use footer scan for analyze
> --
>
> Key: HIVE-20115
> URL: https://issues.apache.org/jira/browse/HIVE-20115
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics, Transactions
>Affects Versions: 4.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20115.patch
>
>
> Discovered via incorrect stats in acid_no_buckets test on master-txnstats 
> branch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20198) Constant time table drops/renames

2018-07-18 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548628#comment-16548628
 ] 

Eugene Koifman commented on HIVE-20198:
---

could TBLS.TBL_ID be used as this ID?

Not strictly related, but it would be nice if Table object contained this 
TBL_ID as well.

> Constant time table drops/renames
> -
>
> Key: HIVE-20198
> URL: https://issues.apache.org/jira/browse/HIVE-20198
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 4.0.0
>Reporter: Alexander Kolbasov
>Priority: Major
>
> Currently table drops and table renames have O(P) performance (where P is the 
> number of partitions). When a managed table is deleted, the implementation 
> deletes table metadata and then deletes all partitions in HDFS. HDFS 
> operations are optimized and only do sequential deletes for partitions 
> outside of table prefix. This operation is O(P), where P is the number of 
> partitions. 
> Table rename goes through the list of partitions and modifies table name (and 
> potentially db name) in each partition. It also modifies each partition 
> location to match the new db/table name and renames directories (which is a 
> non-atomic and slow operation on S3). This is O(P) operation where P is the 
> number of partitions.
> Basic idea is to do the following:
> # Assign unique ID to each table
> # Create directory name based on unique ID rather than the name
> # Table rename then becomes metadata-only operation - there is no need to 
> change any location information.
> # Table drop can become an asynchronous operation where the table is marked 
> as "deleted". Subsequent public metadata APIs should skip such tables. A 
> background cleaner thread may then go and clean up directories.
> Since the table location is unique for each table, new tables will not reuse 
> existing locations. This change isn't compatible with the current behavior 
> where there is an assumption that table location is based on table name. We 
> can get around this by providing "opt-in" mechanism - special table property 
> that tells that the table can have such new behavior, so the improvement will 
> initially work for new tables created with this feature enabled. We may later 
> provide some tool to convert existing tables to the new scheme.
> One complication is there in case where impersonation is enabled - the FS 
> operations should be performed using client UGI rather than the server's, so the 
> cleaner thread should be able to use client UGIs.
> Initially we can punt on this and do standard table drops when impersonation 
> is enabled.
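
The basic idea in the description can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the metastore implementation: TableCatalogSketch, the /warehouse/tbl_<id> layout, and all method names are hypothetical, and a dropped table is modelled by removing its name mapping rather than by a "deleted" marker.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

final class TableCatalogSketch {
  private final Map<String, Long> idByName = new HashMap<>();
  private final Queue<String> pendingCleanup = new ArrayDeque<>();
  private long nextId = 1;

  /** Directory derived from the unique ID, never from the name (hypothetical layout). */
  static String locationOf(long tableId) {
    return "/warehouse/tbl_" + tableId;
  }

  long create(String name) {
    if (idByName.containsKey(name)) {
      throw new IllegalStateException("table exists: " + name);
    }
    long id = nextId++;
    idByName.put(name, id);
    return id;
  }

  /** Metadata-only rename: no partition or directory is touched, so cost does not depend on P. */
  void rename(String oldName, String newName) {
    if (idByName.containsKey(newName)) {
      throw new IllegalStateException("table exists: " + newName);
    }
    Long id = idByName.remove(oldName);
    if (id == null) {
      throw new IllegalArgumentException("no such table: " + oldName);
    }
    idByName.put(newName, id);
  }

  /** Drop hides the table immediately and queues its directory for a background cleaner. */
  void drop(String name) {
    Long id = idByName.remove(name);
    if (id == null) {
      throw new IllegalArgumentException("no such table: " + name);
    }
    pendingCleanup.add(locationOf(id));
  }

  /** Returns the table ID, or null if the table does not exist or was dropped. */
  Long lookup(String name) {
    return idByName.get(name);
  }

  /** What a cleaner thread would delete next, or null if nothing is pending. */
  String nextDirectoryToClean() {
    return pendingCleanup.poll();
  }
}
```

Because locations are keyed by ID, a new table that reuses an old name gets a fresh directory, which is the incompatibility with name-based locations that the description calls out.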



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (HIVE-18453) ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet support

2018-07-18 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548618#comment-16548618
 ] 

Eugene Koifman edited comment on HIVE-18453 at 7/19/18 12:33 AM:
-

[~ikryvenko],
I don't think this is going to work.  I don't think you are actually creating 
transactional tables in your tests.
for example {{update transactional_table_test set value='foo';}} in 
create_transactional.q would fail.

{noformat}
2018-07-18T17:25:35,181 ERROR [25cc35df-5e66-4d10-b31e-22a490cef829 main] parse.UpdateDeleteSemanticAnalyzer: org.apache.hadoop.hive.ql.parse.SemanticException: Attempt to do update or delete on table default.transactional_table_test that is not transactional
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2297)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2088)
{noformat}

Incidentally, in SemanticAnalyzer
  {{tblProps = addDefaultProperties(tblProps, isExt, storageFormat, 
dbDotTab, sortCols, isMaterialization, true);}}
why is the last param true?

I think you'd need to add another param to addDefaultProperties() to indicate 
that it's called because of "create transactional" so that this method acts as 
if CREATE_TABLES_AS_ACID and HIVE_CREATE_TABLES_AS_INSERT_ONLY are both true.


was (Author: ekoifman):
[~ikryvenko],
I don't think this is going to work.  I don't think you are actually creating 
transactional tables in your tests.
for example {{update transactional_table_test set value='foo';}} in 
create_transactional.q would fail.

Incidentally, in SemanticAnalyzer
  {{tblProps = addDefaultProperties(tblProps, isExt, storageFormat, 
dbDotTab, sortCols, isMaterialization, true);}}
why is the last param true?

I think you'd need to add another param to addDefaultProperties() to indicate 
that it's called because of "create transactional" so that this method acts as 
if CREATE_TABLES_AS_ACID and HIVE_CREATE_TABLES_AS_INSERT_ONLY are both true.

> ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet 
> support
> -
>
> Key: HIVE-18453
> URL: https://issues.apache.org/jira/browse/HIVE-18453
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-18453.01.patch, HIVE-18453.02.patch, 
> HIVE-18453.03.patch, HIVE-18453.04.patch, HIVE-18453.05.patch
>
>
> The ACID table markers are currently done with TBLPROPERTIES which is 
> inherently fragile.
> The "create transactional table" offers a way to standardize the syntax and 
> allows for future compatibility changes to support Parquet ACIDv2 tables 
> along with ORC tables.
> The ACIDv2 design is format independent, with the ability to add new 
> vectorized input formats with no changes to the design.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-18453) ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet support

2018-07-18 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548618#comment-16548618
 ] 

Eugene Koifman commented on HIVE-18453:
---

[~ikryvenko],
I don't think this is going to work.  I don't think you are actually creating 
transactional tables in your tests.
for example {{update transactional_table_test set value='foo';}} in 
create_transactional.q would fail.

Incidentally, in SemanticAnalyzer
  {{tblProps = addDefaultProperties(tblProps, isExt, storageFormat, 
dbDotTab, sortCols, isMaterialization, true);}}
why is the last param true?

I think you'd need to add another param to addDefaultProperties() to indicate 
that it's called because of "create transactional" so that this method acts as 
if CREATE_TABLES_AS_ACID and HIVE_CREATE_TABLES_AS_INSERT_ONLY are both true.

> ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet 
> support
> -
>
> Key: HIVE-18453
> URL: https://issues.apache.org/jira/browse/HIVE-18453
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Igor Kryvenko
>Priority: Major
> Attachments: HIVE-18453.01.patch, HIVE-18453.02.patch, 
> HIVE-18453.03.patch, HIVE-18453.04.patch, HIVE-18453.05.patch
>
>
> The ACID table markers are currently done with TBLPROPERTIES which is 
> inherently fragile.
> The "create transactional table" offers a way to standardize the syntax and 
> allows for future compatibility changes to support Parquet ACIDv2 tables 
> along with ORC tables.
> The ACIDv2 design is format independent, with the ability to add new 
> vectorized input formats with no changes to the design.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17683) Annotate Query Plan with locking information

2018-07-18 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548382#comment-16548382
 ] 

Eugene Koifman commented on HIVE-17683:
---

[~ikryvenko], 

this looks good in general.  One question:

Did you mean to add support for 'formatted' option?  In patch 4, adding 
'formatted' produces
{noformat}
{"LOCK INFORMATION:":"[]"}
{noformat}

{{ExplainTask.getLocks()}} has a bug

 
I think, {{explain locks drop table test_explain_locks}} produces 
Read/WriteEntity because this table doesn't exist at the time this command runs.

 

> Annotate Query Plan with locking information
> 
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore

2018-07-13 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20172:
--
Component/s: (was: Hive)
 Transactions

> StatsUpdater failed with GSS Exception while trying to connect to remote 
> metastore
> --
>
> Key: HIVE-20172
> URL: https://issues.apache.org/jira/browse/HIVE-20172
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.1
> Environment: Hive-1.2.1,Hive2.1,java8
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20172.patch
>
>
> StatsUpdater task failed with GSS Exception while trying to connect to remote 
> Metastore.
> {code}
> org.apache.thrift.transport.TTransportException: GSS initiate failed 
> at 
> org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
>  
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316) 
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>  
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>  
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>  
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>  
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>  
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)
>  
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
>  
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
>  
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)
>  
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)
>  
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)
>  
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)
>  
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526) 
> at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558) 
> at 
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533) 
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300)
>  
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265) 
> at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>  
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174) 
> ) 
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)
>  
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
>  
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
>  
> {code}
> since metastore client is running in HMS so there is no need to connect to 
> remote URI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (HIVE-17683) Annotate Query Plan with locking information

2018-07-12 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542380#comment-16542380
 ] 

Eugene Koifman edited comment on HIVE-17683 at 7/13/18 12:54 AM:
-

[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String
  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, 
t.getDbName(),
  t.getTableName()
  ));{noformat}
This may change how errors are surfaced - not sure it's a good idea.

 

Don't know if it's related to your changes but in explain_locks.q.out

{{explain locks drop table test_explain_locks}}

doesn't acquire any locks - this is odd - I'd expect X lock on the table for a 
drop command.

 

Why did you choose to output the data as JSON?  


was (Author: ekoifman):
[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String
  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, 
t.getDbName(),
  t.getTableName()
  ));{noformat}
This may change how errors are surfaced - not sure it's a good idea.

> Annotate Query Plan with locking information
> 
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17683) Annotate Query Plan with locking information

2018-07-12 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542380#comment-16542380
 ] 

Eugene Koifman commented on HIVE-17683:
---

[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String.format(
      "Lock type [%s] for Database.Table [%s.%s] is unknown",
      lockType, t.getDbName(), t.getTableName()));
{noformat}
This may change how errors are surfaced - not sure it's a good idea.
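
To illustrate the shape of the refactoring suggested in this comment, here is a minimal sketch in which one builder produces the lock components and both the execution path and the explain path consume it. Everything in it is hypothetical: {{TableRef}}, {{Access}}, {{LockKind}}, {{LockComponent}} and {{explainLocks}} are simplified stand-ins rather than Hive's real classes, and the access-to-lock mapping is illustrative only. The {{default}} branch keeps the error that the comment says was lost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LockRequestSketch {
  // Hypothetical stand-ins for Hive's entity and lock types.
  enum Access { UPDATE, DROP, UNKNOWN }
  enum LockKind { SHARED_READ, SHARED_WRITE, EXCLUSIVE }

  static final class TableRef {
    final String dbName;
    final String tableName;
    final Access access;

    TableRef(String dbName, String tableName, Access access) {
      this.dbName = dbName;
      this.tableName = tableName;
      this.access = access;
    }
  }

  static final class LockComponent {
    final String dbName;
    final String tableName;
    final LockKind kind;

    LockComponent(String dbName, String tableName, LockKind kind) {
      this.dbName = dbName;
      this.tableName = tableName;
      this.kind = kind;
    }

    @Override
    public String toString() {
      return kind + " " + dbName + "." + tableName;
    }
  }

  // Single place that decides which locks a query needs.
  static List<LockComponent> makeLockRequest(List<TableRef> inputs, List<TableRef> outputs) {
    List<LockComponent> components = new ArrayList<>();
    for (TableRef t : inputs) {
      // Simplification: every input is read under a shared lock.
      components.add(new LockComponent(t.dbName, t.tableName, LockKind.SHARED_READ));
    }
    for (TableRef t : outputs) {
      components.add(new LockComponent(t.dbName, t.tableName, kindFor(t)));
    }
    return components;
  }

  // Illustrative mapping; unknown access types fail loudly instead of being skipped.
  static LockKind kindFor(TableRef t) {
    switch (t.access) {
      case UPDATE:
        return LockKind.SHARED_WRITE;
      case DROP:
        return LockKind.EXCLUSIVE;
      default:
        throw new IllegalArgumentException(String.format(
            "Lock type [%s] for Database.Table [%s.%s] is unknown",
            t.access, t.dbName, t.tableName));
    }
  }

  // The explain path only renders what the shared builder produced.
  static List<String> explainLocks(List<TableRef> inputs, List<TableRef> outputs) {
    List<String> lines = new ArrayList<>();
    for (LockComponent c : makeLockRequest(inputs, outputs)) {
      lines.add(c.toString());
    }
    return lines;
  }

  public static void main(String[] args) {
    List<TableRef> inputs = new ArrayList<>();
    List<TableRef> outputs =
        Arrays.asList(new TableRef("default", "test_explain_locks", Access.DROP));
    for (String line : explainLocks(inputs, outputs)) {
      System.out.println(line);
    }
  }
}
```

With this split, the code that actually acquires locks would call the same {{makeLockRequest}} and submit the result, so the two paths cannot drift apart.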





[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored

2018-07-12 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19375:
--
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

committed to branch-3 and master

thanks Jason for the review

> Bad message: 'transactional'='false' is no longer a valid property and will 
> be ignored
> --
>
> Key: HIVE-19375
> URL: https://issues.apache.org/jira/browse/HIVE-19375
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HIVE-19375.01.patch
>
>
> from {{TransactionalValidationListener.handleCreateTableTransactionalProp()}}
> {noformat}
> if ("false".equalsIgnoreCase(transactional)) {
>   // just drop transactional=false.  For backward compatibility in case 
> someone has scripts
>   // with transactional=false
>   LOG.info("'transactional'='false' is no longer a valid property and 
> will be ignored: " +
> Warehouse.getQualifiedName(newTable));
>   return;
> }
> {noformat}
> this msg is misleading since with metastore.create.as.acid=true, setting 
> transactional=false is valid to make a flat table
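
As a rough illustration of the problem described above, the log text could depend on whether tables are created as ACID by default. This is only a sketch under stated assumptions, not the committed patch: {{messageFor}} and its {{createAsAcid}} parameter are hypothetical, and the alternative wording is invented for the example.

```java
import java.util.Optional;

public class TransactionalPropMessageSketch {
  // Hypothetical helper: choose the log text for an explicit transactional=false.
  // createAsAcid stands in for the value of metastore.create.as.acid.
  static Optional<String> messageFor(String transactional, boolean createAsAcid,
      String qualifiedName) {
    if (!"false".equalsIgnoreCase(transactional)) {
      return Optional.empty();
    }
    if (createAsAcid) {
      // Here transactional=false is meaningful: it requests a flat table.
      return Optional.of("'transactional'='false' requested, creating a flat table: "
          + qualifiedName);
    }
    // Original wording, which fits when the property has no effect.
    return Optional.of(
        "'transactional'='false' is no longer a valid property and will be ignored: "
            + qualifiedName);
  }

  public static void main(String[] args) {
    System.out.println(messageFor("false", true, "default.t").orElse("(no message)"));
    System.out.println(messageFor("false", false, "default.t").orElse("(no message)"));
  }
}
```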




[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored

2018-07-12 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19375:
--
Summary: Bad message: 'transactional'='false' is no longer a valid property 
and will be ignored  (was: "'transactional'='false' is no longer a valid 
property and will be ignored: )




