[jira] [Commented] (HIVE-18827) useless dynamic value exceptions strike back

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448569#comment-16448569
 ] 

Sergey Shelukhin commented on HIVE-18827:
-

+1. May not be worth committing to 3.0, though; we might not have another 
storage-api release before then.

> useless dynamic value exceptions strike back
> 
>
> Key: HIVE-18827
> URL: https://issues.apache.org/jira/browse/HIVE-18827
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-18827.1.patch, HIVE-18827.2.patch
>
>
> Looking at ~master, I can see tons of exceptions like this in LLAP log:
> {noformat}
> 2018-02-27T14:07:51,989  WARN [IO-Elevator-Thread-12 (1515669035295_0909_1_08_000117_0)] impl.RecordReaderImpl: NoDynamicValuesException when evaluating predicate. Skipping ORC PPD. Stats: numberOfValues: 9750
> intStatistics {
>   minimum: 11335
>   maximum: 560
>   sum: 27648854404
> }
> hasNull: true
>  Predicate: (BETWEEN ss_addr_sk DynamicValue(RS_27_customer_address_ca_address_sk_min) DynamicValue(RS_27_customer_address_ca_address_sk_max))
> org.apache.hadoop.hive.ql.plan.DynamicValue$NoDynamicValuesException: Value does not exist in registry: RS_27_customer_address_ca_address_sk_min
>   at org.apache.hadoop.hive.ql.exec.tez.DynamicValueRegistryTez.getValue(DynamicValueRegistryTez.java:77) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.plan.DynamicValue.getValue(DynamicValue.java:137) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.plan.DynamicValue.getJavaValue(DynamicValue.java:97) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.plan.DynamicValue.getLiteral(DynamicValue.java:93) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$PredicateLeafImpl.getLiteralList(SearchArgumentImpl.java:120) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.orc.impl.RecordReaderImpl.evaluatePredicateMinMax(RecordReaderImpl.java:553) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.orc.impl.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:463) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.orc.impl.RecordReaderImpl.evaluatePredicateProto(RecordReaderImpl.java:423) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.orc.impl.RecordReaderImpl$SargApplier.pickRowGroups(RecordReaderImpl.java:848) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.determineRgsToRead(OrcEncodedDataReader.java:835) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:335) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:276) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:273) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112]
>   at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112]
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965) ~[hadoop-common-3.0.0.3.0.0.0-776.jar:?]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:273) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:110) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.9.2-SNAPSHOT.jar:0.9.2-SNAPSHOT]
>   at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110) ~[hive-llap-server-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_112]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_112]
>   at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17647) DDLTask.generateAddMmTasks(Table tbl) and other random code should not start transactions

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17647:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-3. The remaining test failures appear to be 
spurious or flaky tests; the suspicious ones don't repro locally. 
Thanks for the review!

> DDLTask.generateAddMmTasks(Table tbl) and other random code should not start 
> transactions
> -
>
> Key: HIVE-17647
> URL: https://issues.apache.org/jira/browse/HIVE-17647
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Fix For: 3.0.0
>
> Attachments: HIVE-17647.01.patch, HIVE-17647.02.patch, 
> HIVE-17647.03.patch, HIVE-17647.04.patch, HIVE-17647.05.patch, 
> HIVE-17647.patch
>
>
> This method (and other places) have 
> {noformat}
>   if (txnManager.isTxnOpen()) {
>     mmWriteId = txnManager.getCurrentTxnId();
>   } else {
>     mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
>     txnManager.commitTxn();
>   }
> {noformat}
> This should throw if there is no open transaction.  It should never open one.
> In general the logic seems suspect.  It looks like the intent is to move all 
> existing files into a delta_x_x/ when a plain table is converted to an MM 
> table.  This seems like something that needs to be done under an Exclusive 
> lock to prevent concurrent Insert operations writing data under the 
> table/partition root.  But this is too late to acquire locks, which should be 
> done from Driver.acquireLocks() (or else have a deadlock detector, since 
> acquiring them here would break the all-or-nothing lock acquisition semantics 
> currently required w/o a deadlock detector).
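
The behavior requested above (fail when no transaction is open instead of opening one) could look roughly like the sketch below. TxnManagerLike is a hypothetical stand-in that borrows only the isTxnOpen() and getCurrentTxnId() calls quoted in the snippet; it is not Hive's actual transaction manager interface, and the exception type is an assumption.

```java
public class RequireOpenTxnSketch {

    /** Hypothetical stand-in for the transaction manager; not Hive's real interface. */
    public interface TxnManagerLike {
        boolean isTxnOpen();
        long getCurrentTxnId();
    }

    /** Returns the current transaction id, or throws instead of opening a new transaction. */
    public static long requireCurrentTxnId(TxnManagerLike txnManager) {
        if (!txnManager.isTxnOpen()) {
            // Assumption: an unchecked exception is acceptable at this call site.
            throw new IllegalStateException("No open transaction; the caller must open one first");
        }
        return txnManager.getCurrentTxnId();
    }

    /** Fixed-state manager used only to exercise the sketch. */
    public static TxnManagerLike fixed(final boolean open, final long id) {
        return new TxnManagerLike() {
            @Override public boolean isTxnOpen() { return open; }
            @Override public long getCurrentTxnId() { return id; }
        };
    }

    public static void main(String[] args) {
        System.out.println(requireCurrentTxnId(fixed(true, 42L)));
        try {
            requireCurrentTxnId(fixed(false, 0L));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape the decision to open a transaction (and to acquire locks) stays with the caller, which is the point the description makes.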





[jira] [Updated] (HIVE-17970) MM LOAD DATA with OVERWRITE doesn't use base_n directory concept

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17970:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to branches. Thanks for the review!

> MM LOAD DATA with OVERWRITE doesn't use base_n directory concept
> 
>
> Key: HIVE-17970
> URL: https://issues.apache.org/jira/browse/HIVE-17970
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Fix For: 3.0.0
>
> Attachments: HIVE-17970.01.patch, HIVE-17970.02.patch, 
> HIVE-17970.03.patch, HIVE-17970.04.patch, HIVE-17970.05.patch, 
> HIVE-17970.patch
>
>
> Judging by 
> {code:java}
> Hive.loadTable(Path loadPath, String tableName, LoadFileType loadFileType,
>     boolean isSrcLocal, boolean isSkewedStoreAsSubdir, boolean isAcid,
>     boolean hasFollowingStatsTask, Long txnId, int stmtId, boolean isMmTable)
> {code}
> LOAD DATA with OVERWRITE will delete all existing data and then write new 
> data into the table.  This logic makes sense for non-ACID tables, but for 
> ACID/MM tables it should work like an INSERT OVERWRITE statement and write 
> new data to base_n/.  This way the lock manager can either take an X lock for 
> IOW and thus block all readers, or run with SemiShared and let readers 
> continue, making the system more concurrent.
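
As a rough illustration of the proposal, the sketch below picks the target directory name for a load: an overwrite goes to base_n, a plain load to delta_n_n. The class and method names are hypothetical and are not part of Hive.loadTable; the names are left unpadded to match the base_n / delta_x_x notation used in these issues, and real on-disk names may be formatted differently.

```java
public class LoadTargetDirSketch {

    /** Directory an overwriting load would write to. */
    public static String baseDir(long writeId) {
        return "base_" + writeId;
    }

    /** Directory a non-overwriting load would write to. */
    public static String deltaDir(long writeId) {
        return "delta_" + writeId + "_" + writeId;
    }

    /** Chooses base_n for OVERWRITE and delta_n_n otherwise; no existing data is deleted. */
    public static String targetDir(boolean overwrite, long writeId) {
        return overwrite ? baseDir(writeId) : deltaDir(writeId);
    }

    public static void main(String[] args) {
        System.out.println(targetDir(true, 5));
        System.out.println(targetDir(false, 5));
    }
}
```

Because the overwrite only adds a new directory, readers that started earlier can keep reading the older directories, which is what allows the less restrictive lock.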





[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

Attachment: HIVE-19124.06.patch

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448714#comment-16448714
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Addressed the recent CR feedback

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448735#comment-16448735
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Cannot repro the only seemingly relevant test failure; it seems to be a setup 
issue.

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448789#comment-16448789
 ] 

Sergey Shelukhin commented on HIVE-19215:
-

Fixed the tests (prefix variables with the same names had different values in 
different files), moved even more code so no MM table related code remains in 
JavaUtils.

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Updated] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19215:

Attachment: HIVE-19215.03.patch

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448870#comment-16448870
 ] 

Sergey Shelukhin commented on HIVE-17657:
-

Fixed another NPE, rebased.
Also, I gave up on removing the ugly magic directory skipping; removing it 
breaks too much stuff. It seems to still work fine for MM tables.
Will file a follow-up JIRA.

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.
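
The remapping idea above (one new txn per archived directory, with a multi-txn delta such as delta_6_9 collapsed into a single delta_x_x) can be sketched as follows. This is only an illustration under stated assumptions: the class, the remap method and the LongSupplier that hands out new txn ids are hypothetical, base_x directories are not handled, and the "commit all of them at once" step is not modeled.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImportTxnRemapSketch {

    private static final Pattern DELTA = Pattern.compile("delta_(\\d+)_(\\d+)");

    /**
     * Maps each archived delta directory to a fresh single-txn name delta_x_x.
     * Directories are processed in order of their starting txn id, so the
     * relative order of the archived deltas is preserved in the new ids.
     */
    public static Map<String, String> remap(List<String> archivedDirs, LongSupplier newTxnIds) {
        List<String> sorted = new ArrayList<>(archivedDirs);
        sorted.sort(Comparator.comparingLong(ImportTxnRemapSketch::startTxn));
        Map<String, String> result = new LinkedHashMap<>();
        for (String dir : sorted) {
            long x = newTxnIds.getAsLong(); // stands in for opening a new txn
            result.put(dir, "delta_" + x + "_" + x);
        }
        return result;
    }

    /** Extracts the starting txn id from a delta_<start>_<end> name. */
    static long startTxn(String dir) {
        Matcher m = DELTA.matcher(dir);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a delta directory: " + dir);
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        AtomicLong next = new AtomicLong(100);
        List<String> archive = new ArrayList<>();
        archive.add("delta_6_9");
        archive.add("delta_3_3");
        System.out.println(remap(archive, next::getAndIncrement));
    }
}
```

The sketch relies on the stipulation from the description that a delta_6_9 in the archive implies no delta_6_6 or other obsolete delta is present; it does not check for that.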





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Attachment: HIVE-17657.03.patch

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Commented] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448879#comment-16448879
 ] 

Sergey Shelukhin commented on HIVE-19279:
-

cc [~thejas]

> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in copytask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests happens to be the "data" directory from 
> export, or some random partition directory ("foo=bar") that if not skipped 
> makes it into the real partition directory at the destination.
> It won't do that if any other files or directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.
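
To make the described behavior concrete, here is a small sketch of the listing rule using java.nio.file on a local temporary directory. It is not the CopyTask code: the class and method names are hypothetical, and whether the real code descends more than one level is not stated in this issue, so the sketch descends only once.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SingleDirSkipSketch {

    /**
     * Returns the names of the entries that would be copied from source.
     * If source holds exactly one entry and that entry is a directory, the
     * directory is skipped and its children are returned instead.
     */
    public static List<String> listForCopy(Path source) {
        List<Path> entries = children(source);
        if (entries.size() == 1 && Files.isDirectory(entries.get(0))) {
            entries = children(entries.get(0)); // the "magic" skip, one level only
        }
        return entries.stream()
                .map(p -> p.getFileName().toString())
                .sorted()
                .collect(Collectors.toList());
    }

    private static List<Path> children(Path dir) {
        try (Stream<Path> s = Files.list(dir)) {
            return s.collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temporary source tree (left on disk) and lists it; addSibling adds a second top-level entry. */
    public static List<String> demo(boolean addSibling) {
        try {
            Path src = Files.createTempDirectory("copy-src");
            Path data = Files.createDirectory(src.resolve("data"));
            Files.createFile(data.resolve("000000_0"));
            if (addSibling) {
                Files.createFile(src.resolve("_metadata"));
            }
            return listForCopy(src);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // the lone "data" directory is skipped
        System.out.println(demo(true));  // "data" has a sibling, so it is kept
    }
}
```

The two demo calls show why the rule is brittle: adding one unrelated file at the source changes what ends up at the destination.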





[jira] [Updated] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19279:

Description: 
Follow up from HIVE-17657.
Code exists in copytask that copies files (fancy that); however, when listing 
the files, if a single directory exists at the source with no other files, it 
will skip the directory and copy the files inside instead.
This directory in various tests is either the "data" directory from export, or 
some random partition directory ("foo=bar") that if not skipped makes it into 
the real partition directory at the destination.
The directory is not skipped if it's not by itself, i.e. any other files or 
directories are present.

This seems brittle. Caller of the CopyTask should specify exactly what it wants 
copied instead of relying on this behavior.

  was:
Follow up from HIVE-17657.
Code exists in copytask that copies files (fancy that); however, when listing 
the files, if a single directory exists at the source with no other files, it 
will skip the directory and copy the files inside instead.
This directory in various tests is either the "data" directory from export, or 
some random partition directory ("foo=bar") that if not skipped makes it into 
the real partition directory at the destination.
It won't do that if any other files or directories are present.

This seems brittle. Caller of the CopyTask should specify exactly what it wants 
copied instead of relying on this behavior.


> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in copytask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> The directory is not skipped if it's not by itself, i.e. any other files or 
> directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.





[jira] [Updated] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19279:

Description: 
Follow up from HIVE-17657.
Code exists in copytask that copies files (fancy that); however, when listing 
the files, if a single directory exists at the source with no other files, it 
will skip the directory and copy the files inside instead.
This directory in various tests is either the "data" directory from export, or 
some random partition directory ("foo=bar") that if not skipped makes it into 
the real partition directory at the destination.
It won't do that if any other files or directories are present.

This seems brittle. Caller of the CopyTask should specify exactly what it wants 
copied instead of relying on this behavior.

  was:
Follow up from HIVE-17657.
Code exists in copytask that copies files (fancy that); however, when listing 
the files, if a single directory exists at the source with no other files, it 
will skip the directory and copy the files inside instead.
This directory in various tests happens to be the "data" directory from export, 
or some random partition directory ("foo=bar") that if not skipped makes it 
into the real partition directory at the destination.
It won't do that if any other files or directories are present.

This seems brittle. Caller of the CopyTask should specify exactly what it wants 
copied instead of relying on this behavior.


> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in copytask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> It won't do that if any other files or directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.





[jira] [Commented] (HIVE-19171) Persist runtime statistics in metastore

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448912#comment-16448912
 ] 

Sergey Shelukhin commented on HIVE-19171:
-

[~ashutoshc] [~kgyrtkirk] this broke the build... please fix or revert :)

> Persist runtime statistics in metastore
> ---
>
> Key: HIVE-19171
> URL: https://issues.apache.org/jira/browse/HIVE-19171
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19171.01.patch, HIVE-19171.01wip01.patch, 
> HIVE-19171.01wip02.patch, HIVE-19171.01wip03.patch, HIVE-19171.02.patch, 
> HIVE-19171.03.patch
>
>






[jira] [Commented] (HIVE-19243) Upgrade hadoop.version to 3.1.0

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448941#comment-16448941
 ] 

Sergey Shelukhin commented on HIVE-19243:
-

Is this going to be committed to 3.0?

> Upgrade hadoop.version to 3.1.0
> ---
>
> Key: HIVE-19243
> URL: https://issues.apache.org/jira/browse/HIVE-19243
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-19243.01.patch
>
>
> Given that Hadoop 3.1.0 has been released, we need to upgrade hadoop.version 
> to 3.1.0. This change is required for HIVE-18037 since it depends on YARN 
> Service which had its first release in 3.1.0 (and is non-existent in 3.0.0).





[jira] [Commented] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448942#comment-16448942
 ] 

Sergey Shelukhin commented on HIVE-18037:
-

[~gsaha] can you update the RB? thnx. Looks like 19243 is in so it might be 
good to trigger the QA again, too

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Attachment: HIVE-17657.04.patch

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Updated] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19215:

Attachment: HIVE-19215.04.patch

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.04.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

Attachment: HIVE-19124.07.patch

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Assigned] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19281:
---


> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Attachment: HIVE-19281.patch

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Commented] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449006#comment-16449006
 ] 

Sergey Shelukhin commented on HIVE-19281:
-

The patch... I'd like to test on a cluster to see if everything else works; we 
never tried this on a secure cluster, and there were a number of minor 
setup/configuration issues.

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Status: Patch Available  (was: Open)

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Commented] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449008#comment-16449008
 ] 

Sergey Shelukhin commented on HIVE-19280:
-

+1 pending tests

> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19280.01.patch
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 





[jira] [Commented] (HIVE-19279) remove magic directory skipping from CopyTask

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449010#comment-16449010
 ] 

Sergey Shelukhin commented on HIVE-19279:
-

Yes, see description. I'm not sure what code relies on not skipping if there 
isn't a directory :)

> remove magic directory skipping from CopyTask
> -
>
> Key: HIVE-19279
> URL: https://issues.apache.org/jira/browse/HIVE-19279
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> Follow up from HIVE-17657.
> Code exists in CopyTask that copies files (fancy that); however, when listing 
> the files, if a single directory exists at the source with no other files, it 
> will skip the directory and copy the files inside instead.
> This directory in various tests is either the "data" directory from export, 
> or some random partition directory ("foo=bar") that if not skipped makes it 
> into the real partition directory at the destination.
> The directory is not skipped if it's not by itself, i.e. any other files or 
> directories are present.
> This seems brittle. Caller of the CopyTask should specify exactly what it 
> wants copied instead of relying on this behavior.
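
The behavior described above can be illustrated with a small sketch. This is not the CopyTask code; it is a hypothetical in-memory model (the names `Entry`, `list_with_magic_skip` and `list_explicit` are invented for illustration) that contrasts the implicit single-directory skip with a variant where the caller states what to unwrap.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical stand-in for a file system entry at the copy source."""
    name: str
    is_dir: bool = False
    children: List["Entry"] = field(default_factory=list)


def list_with_magic_skip(source: List[Entry]) -> List[str]:
    """Model of the described behavior: a directory that is alone at the
    source is replaced by its contents; otherwise entries are listed as-is."""
    if len(source) == 1 and source[0].is_dir:
        return [child.name for child in source[0].children]
    return [entry.name for entry in source]


def list_explicit(source: List[Entry], descend_into: Optional[str] = None) -> List[str]:
    """Assumed alternative: the caller names the directory to unwrap, if any."""
    if descend_into is None:
        return [entry.name for entry in source]
    for entry in source:
        if entry.is_dir and entry.name == descend_into:
            return [child.name for child in entry.children]
    raise ValueError(f"no directory named {descend_into!r} at the source")
```

With the implicit version, adding one unrelated file next to the "data" directory changes what gets copied; with the explicit version the result depends only on what the caller asked for.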





[jira] [Assigned] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18052:
---

Assignee: Sergey Shelukhin  (was: Steve Yeom)

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Updated] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18052:

Attachment: HIVE-18052.19.patch

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449078#comment-16449078
 ] 

Sergey Shelukhin commented on HIVE-18052:
-

Updating after recent fixes. We will focus on MiniLlapLocal driver only.

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Assigned] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19282:
---


> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Commented] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450346#comment-16450346
 ] 

Sergey Shelukhin commented on HIVE-18037:
-

+1 pending tests

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450354#comment-16450354
 ] 

Sergey Shelukhin commented on HIVE-17657:
-

Test failures are unrelated. [~ekoifman] can you take a look?

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.
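
A rough sketch of the remapping idea in the description, under stated assumptions: directory names use the simplified `delta_<start>_<end>` form written above (real directory names carry more detail), new txn ids come from a caller-supplied iterator, and the function name `remap_deltas` is hypothetical. It maps every archived delta, including a multi-txn one such as delta_6_9, to a single-txn delta_x_x, and rejects archives that violate the "no obsolete delta" stipulation.

```python
import re
from typing import Dict, Iterator, List

# Simplified naming taken from the issue text, e.g. "delta_6_9".
DELTA = re.compile(r"^delta_(\d+)_(\d+)$")


def remap_deltas(dirs: List[str], new_txn_ids: Iterator[int]) -> Dict[str, str]:
    """Map each archived delta directory to delta_x_x for a fresh txn id x."""
    ranges = []
    for name in dirs:
        match = DELTA.match(name)
        if match is None:
            raise ValueError(f"not a delta directory: {name}")
        ranges.append((int(match.group(1)), int(match.group(2)), name))
    ranges.sort()

    # Stipulation from the description: a delta such as delta_6_9 must not
    # coexist with an "obsolete" delta inside its range (e.g. delta_6_6).
    for (_, prev_end, prev_name), (start, _, name) in zip(ranges, ranges[1:]):
        if start <= prev_end:
            raise ValueError(f"{name} overlaps {prev_name}")

    mapping = {}
    for _, _, name in ranges:
        new_id = next(new_txn_ids)  # one new txn per archived directory
        mapping[name] = f"delta_{new_id}_{new_id}"
    return mapping
```

How the new txn ids are opened and committed together is exactly the open question in the description; the iterator here only stands in for that allocation.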





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450365#comment-16450365
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Well, it already needs to have access to the cluster to use the MR compactor 
job.
When metastore is separate from Hive, I think compactor should move to HS2 cc 
[~alangates]

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450368#comment-16450368
 ] 

Sergey Shelukhin commented on HIVE-19215:
-

The only test that looks relevant actually failed due to 
ConcurrentModificationException in Hadoop config. [~prasanth_j] can you take a 
look again? thnx

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.04.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450380#comment-16450380
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

See the comment on RB; we don't have the same input structures needed for this 
method. There are too many different classes that do not convert into each 
other... it's better to have a utility method that adjusts one property than to 
convert the existing structure back to thrift (or alternatively call the metastore 
again) and then rebuild the new one from scratch from thrift.

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450387#comment-16450387
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

There's tons of code in AcidUtils that does similar things based on different 
inputs (e.g. overloads if isAcidTable)... why is this not acceptable?
Do you want to modify a patch? I don't see a way to use this method without 
either copy-pasting the old method for different input structure, or making it 
so we do a root canal thru the ear...

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Comment Edited] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450387#comment-16450387
 ] 

Sergey Shelukhin edited comment on HIVE-19124 at 4/24/18 6:56 PM:
--

There's tons of code in AcidUtils that does similar things based on different 
inputs (e.g. overloads if isAcidTable)... why is this not acceptable?
Do you want to modify the latest patch? I don't see a way to use this method 
without either copy-pasting the old method for different input structure, or 
making it so we do a root canal thru the ear...


was (Author: sershe):
There's tons of code in AcidUtils that does similar things based on different 
inputs (e.g. overloads if isAcidTable)... why is this not acceptable?
Do you want to modify a patch? I don't see a way to use this method without 
either copy-pasting the old method for different input structure, or making it 
so we do a root canal thru the ear...

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16450393#comment-16450393
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Frankly, having to do this points at the need to get rid of all these classes 
and just have a single one, in a follow-up patch. For now, I'll take a look into 
the root canal thru the ear variant today.

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451247#comment-16451247
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

The patch currently extracts the write IDs that driver has actually used (and 
will continue doing so after the change).
The HWM from there can be recorded...

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Updated] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19282:

Attachment: HIVE-19282.patch

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.patch
>
>






[jira] [Updated] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19282:

Status: Patch Available  (was: Open)

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.patch
>
>






[jira] [Commented] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451262#comment-16451262
 ] 

Sergey Shelukhin commented on HIVE-19282:
-

[~prasanth_j] [~steveyeom2017] can you take a look? This also removes a bunch of 
3-line utility methods with generic names (getFinalDir, appendToSource, etc.) 
from FileSinkOperator to make it easier to understand what's actually going on 
logically.

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.patch
>
>






[jira] [Comment Edited] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451262#comment-16451262
 ] 

Sergey Shelukhin edited comment on HIVE-19282 at 4/24/18 9:40 PM:
--

[~prasanth_j] [~steveyeom2017] can you take a look? 
It basically changes FileSinkOperator to store components of the path instead 
of building/replacing whole paths, 
and changes MM utility methods to look for LB directories in the right place. 
It also removes a bunch of 3-line utility methods with generic names 
(getFinalDir, appendToSource, etc.) from FileSinkOperator to make it easier to 
understand what's actually going on logically.
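
To make the "store components of the path" idea concrete, here is a minimal sketch. It is not the FileSinkOperator code: the class `SinkPath`, its fields, and the example directory names are all hypothetical, and the ordering (delta directory above the LB directory) is an assumption read off the issue title.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SinkPath:
    """Hypothetical holder for the pieces of an output directory."""
    root: str                         # table or partition directory
    delta_dir: Optional[str] = None   # e.g. "delta_5_5"; None if not applicable
    lb_dir: Optional[str] = None      # list-bucketing subdirectory, e.g. "key=484"

    def full_dir(self) -> str:
        """Assemble the directory once, with the delta above the LB directory."""
        parts = [self.root]
        if self.delta_dir:
            parts.append(self.delta_dir)
        if self.lb_dir:
            parts.append(self.lb_dir)
        return posixpath.join(*parts)

    def with_lb_dir(self, lb_dir: str) -> "SinkPath":
        """Return a copy with the LB component set; other components are untouched."""
        return SinkPath(self.root, self.delta_dir, lb_dir)
```

Keeping the components separate means that choosing the LB directory late never requires parsing or replacing part of an already-built path string.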


was (Author: sershe):
[~prasanth_j] [~steveyeom2017] can you take a look? this also removes bunch of 
3-line utility methods with generic names (getFinalDir, appendToSource, etc) 
from FileSinkOperator to make it easier to understand what's actually going on 
logically

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.patch
>
>






[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

Attachment: HIVE-19124.08.patch

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451281#comment-16451281
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Updated the patch. I ended up splitting the metastore call path to allow the 
creation of the object without back-conversion.

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451413#comment-16451413
 ] 

Sergey Shelukhin commented on HIVE-19281:
-

[~jdere] can you take a look? tiny patch, seems to fix WM on a secure cluster 
that I have

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19281.patch
>
>






[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

Attachment: HIVE-19124.09.patch

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.09.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451457#comment-16451457
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

Updated

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.09.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2018-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451503#comment-16451503
 ] 

Sergey Shelukhin commented on HIVE-18052:
-

Partial summary for reference:

Results change:
groupby_rollup_empty
vector_groupby_sort_8
sysdb - very strange

Potentially bad changes:
union_fast_stats - strange stats changes (not disappearance)
autoColumnStats_10 - stats disappear (same for 
column_names_with_leading_and_trailing_spaces, columnstats_part_coltype)

Unclear plan changes:
parallel
schema_evol_stats
union_remove_26
vector_groupby_cube1

Semi clear plan changes:
metadata_only_queries - unclear plan change, probably because metadata-only is 
disabled for transactional tables? (need to double-check if this is by design) 
(probably same as insert_values_orig_table_use_metadata, 
metadata_only_queries_with_filters, stats_only_null, 
vector_annotate_stats_select)
multi_insert - unclear plan changes, we can reevaluate after the bugfix w/IOW 
(probably the same as multi_insert_lateral_view)

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.14.patch, HIVE-18052.15.patch, HIVE-18052.16.patch, 
> HIVE-18052.17.patch, HIVE-18052.18.patch, HIVE-18052.19.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>






[jira] [Commented] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452806#comment-16452806
 ] 

Sergey Shelukhin commented on HIVE-18037:
-

Reattaching the patch for HiveQA to run.

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch, HIVE-18037.05.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Updated] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18037:

Attachment: HIVE-18037.05.patch

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch, HIVE-18037.05.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Assigned] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18037:
---

Assignee: Sergey Shelukhin  (was: Gour Saha)

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch, HIVE-18037.05.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Assigned] (HIVE-18037) Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18037:
---

Assignee: Gour Saha  (was: Sergey Shelukhin)

> Migrate Slider LLAP package to YARN Service framework for Hadoop 3.x
> 
>
> Key: HIVE-18037
> URL: https://issues.apache.org/jira/browse/HIVE-18037
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18037.001.patch, HIVE-18037.002.patch, 
> HIVE-18037.003.patch, HIVE-18037.004.patch, HIVE-18037.05.patch
>
>
> Apache Slider has been migrated to Hadoop-3.x and is referred to as YARN 
> Service (YARN-4692). Most of the classic Slider features are now going to be 
> supported in a first-class manner by core YARN. It includes several new 
> features like a RESTful API. Command line equivalents of classic Slider are 
> supported by YARN Service as well.
> This jira will take care of all changes required to Slider LLAP packaging and 
> scripts to make it work against Hadoop 3.x.





[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Fix Version/s: 3.0.0

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19281.patch
>
>






[jira] [Updated] (HIVE-19281) incorrect protocol name for LLAP AM plugin

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19281:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed, thanks for the review!

> incorrect protocol name for LLAP AM plugin
> --
>
> Key: HIVE-19281
> URL: https://issues.apache.org/jira/browse/HIVE-19281
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19281.patch
>
>






[jira] [Updated] (HIVE-19280) Invalid error messages for UPDATE/DELETE on insert-only transactional tables

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19280:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master and branch-3. Thanks for the patch!

> Invalid error messages for UPDATE/DELETE on insert-only transactional tables
> 
>
> Key: HIVE-19280
> URL: https://issues.apache.org/jira/browse/HIVE-19280
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19280.01.patch
>
>
> UPDATE/DELETE on MM tables fails with 
> "FAILED: SemanticException Error 10297: Attempt to do update or delete on 
> table tpch.tbl_default_mm that is not transactional". 
> This is invalid since the MM table is transactional. 





[jira] [Updated] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19282:

Attachment: HIVE-19282.01.patch

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.patch
>
>






[jira] [Commented] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452861#comment-16452861
 ] 

Sergey Shelukhin commented on HIVE-19282:
-

[~prasanth_j] [~steveyeom2017] can you take a look?


> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.patch
>
>






[jira] [Updated] (HIVE-19215) JavaUtils.AnyIdDirFilter ignores base_n directories

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19215:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to branches. Thanks for the review!

> JavaUtils.AnyIdDirFilter ignores base_n directories
> ---
>
> Key: HIVE-19215
> URL: https://issues.apache.org/jira/browse/HIVE-19215
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19215.01.patch, HIVE-19215.02.patch, 
> HIVE-19215.03.patch, HIVE-19215.04.patch, HIVE-19215.patch
>
>
> cc [~sershe], [~steveyeom2017]





[jira] [Commented] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452948#comment-16452948
 ] 

Sergey Shelukhin commented on HIVE-19282:
-

See links, RB is linked

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.patch
>
>






[jira] [Assigned] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19312:
---


> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: HIVE-19312.patch

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: (was: HIVE-19312.patch)

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: HIVE-19312.patch

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Status: Patch Available  (was: Open)

[~ekoifman] [~steveyeom2017] can you take a look? small code change, plus test

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Commented] (HIVE-19310) Metastore: MetaStoreDirectSql.ensureDbInit has some slow DN calls which might need to be run only in test env

2018-04-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453325#comment-16453325
 ] 

Sergey Shelukhin commented on HIVE-19310:
-

It makes sense to make all these init calls in test only.

> Metastore: MetaStoreDirectSql.ensureDbInit has some slow DN calls which might 
> need to be run only in test env
> -
>
> Key: HIVE-19310
> URL: https://issues.apache.org/jira/browse/HIVE-19310
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19310.1.patch
>
>
> MetaStoreDirectSql.ensureDbInit has the following 2 calls which we have 
> observed taking a long time in our testing:
> {code}
> initQueries.add(pm.newQuery(MNotificationLog.class, "dbName == ''"));
> initQueries.add(pm.newQuery(MNotificationNextId.class, "nextEventId < -1"));
> {code}
> In a production environment, these tables should be initialized using 
> schematool, however in a test environment, these calls might be needed. 
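
One way to read the suggestion in the comment above, as a minimal sketch: gate the slow probe queries behind a test-environment flag. The class name, the method name, and the boolean flag are hypothetical and only illustrate the idea; the real MetaStoreDirectSql code and the HIVE-19310 patch may look quite different. The probes are represented as plain strings instead of DataNucleus queries so that the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the actual MetaStoreDirectSql code. */
final class EnsureDbInitSketch {

  /**
   * Returns descriptions of the probe queries that would be issued.
   * isTestEnv is an assumed flag; how Hive detects a test run is not shown here.
   */
  static List<String> probesToRun(boolean isTestEnv) {
    List<String> probes = new ArrayList<>();
    if (isTestEnv) {
      // In a test environment these probes might be needed; production
      // schemas are expected to be initialized with schematool instead.
      probes.add("MNotificationLog: dbName == ''");
      probes.add("MNotificationNextId: nextEventId < -1");
    }
    return probes;
  }

  public static void main(String[] args) {
    System.out.println("test env:       " + probesToRun(true));
    System.out.println("production env: " + probesToRun(false));
  }
}
```

With such a flag, a production metastore would skip both probes entirely, while tests would keep the current behavior.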





[jira] [Commented] (HIVE-6980) Drop table by using direct sql

2018-04-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453326#comment-16453326
 ] 

Sergey Shelukhin commented on HIVE-6980:


Hmm... isn't this going to mess with internal caches for datanucleus and 
potentially make objects invalid?

I wonder if this needs some concurrency tests where we open 2 DN sessions, one 
gets some tables/etc as objects, the other drops them, and we make sure the 
first one still works and also doesn't produce incorrect results on committing?


> Drop table by using direct sql
> --
>
> Key: HIVE-6980
> URL: https://issues.apache.org/jira/browse/HIVE-6980
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.12.0
>Reporter: Selina Zhang
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-6980.patch
>
>
> Dropping a table that has lots of partitions is slow. Even after applying the 
> patch from HIVE-6265, dropping a table still takes hours (100K+ partitions). 
> The fix comes in two parts:
> 1. Use directSQL to query the partitions' protect mode.
> The current implementation needs to transfer the Partition objects to the 
> client and check the protect mode for each partition. I'd like to move this 
> part of the logic to the metastore. The check will be done by direct SQL (if 
> direct SQL is disabled, the same logic is executed in the ObjectStore).
> 2. Use directSQL to drop the partitions of the table.
> There may be two solutions here:
>    1. Add "DELETE CASCADE" to the schema. This way we only need to delete 
>    entries from the partitions table using direct SQL. May need to change 
>    datanucleus.deletionPolicy = DataNucleus. 
>    2. Clean up the dependent tables by issuing DELETE statements. This also 
>    needs datanucleus.query.sql.allowAll to be turned on.
> Both of the above solutions should fix the problem. DELETE CASCADE requires 
> changing schemas and preparing upgrade scripts. The second solution adds 
> maintenance cost if new tables are added in future releases.
> Please advise. 





[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: HIVE-19312.patch

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: (was: HIVE-19312.patch)

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.patch
>
>






[jira] [Commented] (HIVE-19277) Active/Passive HA web endpoints does not allow cross origin requests

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454655#comment-16454655
 ] 

Sergey Shelukhin commented on HIVE-19277:
-

Nit: constants are duplicated in 2 files, and each is used once... while the 
values for the headers are not constants.
Perhaps constants could be stored in one place, or maybe all values should be 
constants, or there should be no constants.

+1

> Active/Passive HA web endpoints does not allow cross origin requests
> 
>
> Key: HIVE-19277
> URL: https://issues.apache.org/jira/browse/HIVE-19277
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-19277.1.patch
>
>
> CORS is not allowed with web endpoints added for active/passive HA. Enable 
> CORS by default for all web endpoints. 





[jira] [Assigned] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19324:
---


> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
>






[jira] [Updated] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19324:

Attachment: HIVE-19324.patch

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.patch
>
>






[jira] [Updated] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19324:

Status: Patch Available  (was: Open)

[~ashutoshc] can you take a look? a tiny patch that moves the "logical" error 
out of the try-catch block that wraps it into a silly generic error
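
The pattern described above can be sketched roughly as follows. This is not the HIVE-19324 patch: the class, the method names, and the messages are hypothetical, and the queue lookup is a stand-in for whatever remote call the real code makes. The point is only that a check thrown inside a broad try-catch gets wrapped into the generic error, while the same check placed after the block reaches the caller unchanged.

```java
import java.io.IOException;

/** Hypothetical sketch of the before/after shape; not the actual Hive code. */
final class QueueCheckSketch {

  /** Stand-in for a remote lookup that may fail for I/O reasons. */
  static String lookupQueue(String user) throws IOException {
    if (user.isEmpty()) {
      throw new IOException("cannot resolve queue for an empty user name");
    }
    return "queue_" + user;
  }

  /** Before: the logical error is thrown inside the try and gets wrapped. */
  static void checkWrapped(String user, String requested) {
    try {
      String allowed = lookupQueue(user);
      if (!allowed.equals(requested)) {
        throw new IllegalArgumentException(
            "User " + user + " may not use " + requested + "; expected " + allowed);
      }
    } catch (Exception e) {
      // The specific message above is buried under this generic one.
      throw new RuntimeException("Cannot check queue", e);
    }
  }

  /** After: only the lookup is guarded; the logical error is thrown as-is. */
  static void checkDirect(String user, String requested) {
    String allowed;
    try {
      allowed = lookupQueue(user);
    } catch (IOException e) {
      throw new RuntimeException("Cannot check queue", e);
    }
    if (!allowed.equals(requested)) {
      throw new IllegalArgumentException(
          "User " + user + " may not use " + requested + "; expected " + allowed);
    }
  }

  public static void main(String[] args) {
    try {
      checkWrapped("alice", "etl");
    } catch (RuntimeException e) {
      System.out.println("wrapped: " + e.getMessage());
    }
    try {
      checkDirect("alice", "etl");
    } catch (RuntimeException e) {
      System.out.println("direct:  " + e.getMessage());
    }
  }
}
```

In the second variant the user sees the message that names the offending queue, instead of having to dig it out of the cause chain.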

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.patch
>
>






[jira] [Commented] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454745#comment-16454745
 ] 

Sergey Shelukhin commented on HIVE-19124:
-

[~gopalv] now that the write ID stuff has been addressed, can you review the 
latest iterations of the diff?

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.09.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Commented] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454749#comment-16454749
 ] 

Sergey Shelukhin commented on HIVE-19282:
-

The test coverage is via the mm_all test: the skew_mm, skew_union_dp_mm, etc. sections.

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.patch
>
>






[jira] [Updated] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19324:

Attachment: HIVE-19324.patch

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.patch
>
>






[jira] [Updated] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19324:

Attachment: (was: HIVE-19324.patch)

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.patch
>
>






[jira] [Assigned] (HIVE-19326) union_fast_stats golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19326:
---


> union_fast_stats golden file has incorrect "accurate" stats
> ---
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Found when investigating results change after converting tables to MM, turns 
> out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I am assuming with metadata only optimization this can produce incorrect 
> results.





[jira] [Commented] (HIVE-19326) union_fast_stats golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454801#comment-16454801
 ] 

Sergey Shelukhin commented on HIVE-19326:
-

[~ashutoshc] [~prasanth_j] looks like a stats issue that may cause problems 
with metadata only queries.
Can you confirm the latter part (ie whether this is important if 
BASIC_STATS=true but numRows is wrong).

> union_fast_stats golden file has incorrect "accurate" stats
> ---
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Found when investigating results change after converting tables to MM, turns 
> out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I am assuming with metadata only optimization this can produce incorrect 
> results.





[jira] [Updated] (HIVE-19326) union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19326:

Summary: union_fast_stats MiniLlapLocal golden file has incorrect 
"accurate" stats  (was: union_fast_stats golden file has incorrect "accurate" 
stats)

> union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats
> -
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Found when investigating the results change after converting tables to MM, 
> turns out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I am assuming with metadata only optimization this can produce incorrect 
> results.





[jira] [Updated] (HIVE-19326) union_fast_stats golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19326:

Description: 
Found when investigating the results change after converting tables to MM, 
turns out the MM result is correct but the current one is not.
The test ends like so:
{noformat}
desc formatted small_alltypesorc_a;
ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
desc formatted small_alltypesorc_a;
insert into table small_alltypesorc_a select * from small_alltypesorc1a;
desc formatted small_alltypesorc_a;
{noformat}

The results from the descs in the golden file are:
{noformat}
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles                1
numRows                 5
...
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles                1
numRows                 15
...
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles                2
numRows                 20
{noformat}

Note how the result changes after analyze: the original numRows is inaccurate, 
but BASIC_STATS is set to true.

I am assuming with metadata only optimization this can produce incorrect 
results.

  was:
Found when investigating results change after converting tables to MM, turns 
out the MM result is correct but the current one is not.
The test ends like so:
{noformat}
desc formatted small_alltypesorc_a;
ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
desc formatted small_alltypesorc_a;
insert into table small_alltypesorc_a select * from small_alltypesorc1a;
desc formatted small_alltypesorc_a;
{noformat}

The results from the descs in the golden file are:
{noformat}
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles1   
numRows 5   
...
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles1   
numRows 15
...
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles2   
numRows 20  
{noformat}

Note the result change after analyze - the original nomRows is inaccurate, but  
BASIC_STATS is set to true.

I am assuming with metadata only optimization this can produce incorrect 
results.


> union_fast_stats golden file has incorrect "accurate" stats
> ---
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Found when investigating the results change after converting tables to MM, 
> turns out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I assume that with the metadata-only optimization this can produce incorrect 
> results.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19326) union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19326:

Priority: Blocker  (was: Major)

> union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats
> -
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Found when investigating the results change after converting tables to MM, 
> turns out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I assume that with the metadata-only optimization this can produce incorrect 
> results.





[jira] [Updated] (HIVE-19326) union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19326:

Fix Version/s: 3.0.0

> union_fast_stats MiniLlapLocal golden file has incorrect "accurate" stats
> -
>
> Key: HIVE-19326
> URL: https://issues.apache.org/jira/browse/HIVE-19326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Ashutosh Chauhan
>Priority: Blocker
> Fix For: 3.0.0
>
>
> Found when investigating the results change after converting tables to MM, 
> turns out the MM result is correct but the current one is not.
> The test ends like so:
> {noformat}
> desc formatted small_alltypesorc_a;
> ANALYZE TABLE small_alltypesorc_a COMPUTE STATISTICS;
> desc formatted small_alltypesorc_a;
> insert into table small_alltypesorc_a select * from small_alltypesorc1a;
> desc formatted small_alltypesorc_a;
> {noformat}
> The results from the descs in the golden file are:
> {noformat}
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 5
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                1
>   numRows                 15
> ...
>   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>   numFiles                2
>   numRows                 20
> {noformat}
> Note how the result changes after analyze: the original numRows is inaccurate, 
> but BASIC_STATS is set to true.
> I assume that with the metadata-only optimization this can produce incorrect 
> results.





[jira] [Assigned] (HIVE-19338) isExplicitAnalyze method may be incorrect in BasicStatsTask

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19338:
---


> isExplicitAnalyze method may be incorrect in BasicStatsTask
> ---
>
> Key: HIVE-19338
> URL: https://issues.apache.org/jira/browse/HIVE-19338
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> It relies on a specific ctor being used, however this ctor is used on 
> non-analyze paths too.





[jira] [Updated] (HIVE-19338) isExplicitAnalyze method may be incorrect in BasicStatsTask

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19338:

Attachment: HIVE-19338.patch

> isExplicitAnalyze method may be incorrect in BasicStatsTask
> ---
>
> Key: HIVE-19338
> URL: https://issues.apache.org/jira/browse/HIVE-19338
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19338.patch
>
>
> It relies on a specific ctor being used, however this ctor is used on 
> non-analyze paths too.





[jira] [Updated] (HIVE-19338) isExplicitAnalyze method may be incorrect in BasicStatsTask

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19338:

Status: Patch Available  (was: Open)

[~jcamachorodriguez] can you take a look? 
Found out while looking at something else.


> isExplicitAnalyze method may be incorrect in BasicStatsTask
> ---
>
> Key: HIVE-19338
> URL: https://issues.apache.org/jira/browse/HIVE-19338
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19338.patch
>
>
> It relies on a specific ctor being used, however this ctor is used on 
> non-analyze paths too.





[jira] [Updated] (HIVE-19124) implement a basic major compactor for MM tables

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19124:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the reviews!

> implement a basic major compactor for MM tables
> ---
>
> Key: HIVE-19124
> URL: https://issues.apache.org/jira/browse/HIVE-19124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Fix For: 3.0.0
>
> Attachments: HIVE-19124.01.patch, HIVE-19124.02.patch, 
> HIVE-19124.03.patch, HIVE-19124.03.patch, HIVE-19124.04.patch, 
> HIVE-19124.05.patch, HIVE-19124.06.patch, HIVE-19124.07.patch, 
> HIVE-19124.08.patch, HIVE-19124.09.patch, HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be 
> supported.





[jira] [Updated] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19312:

Attachment: HIVE-19312.01.patch

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.01.patch, HIVE-19312.patch
>
>






[jira] [Commented] (HIVE-19312) MM tables don't work with BucketizedHIF

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455604#comment-16455604
 ] 

Sergey Shelukhin commented on HIVE-19312:
-

I hate HiveQA.

> MM tables don't work with BucketizedHIF
> ---
>
> Key: HIVE-19312
> URL: https://issues.apache.org/jira/browse/HIVE-19312
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19312.01.patch, HIVE-19312.patch
>
>






[jira] [Updated] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19324:

Attachment: HIVE-19324.01.patch

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.01.patch, HIVE-19324.patch
>
>






[jira] [Commented] (HIVE-19324) improve YARN queue check error message in Tez pool

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455606#comment-16455606
 ] 

Sergey Shelukhin commented on HIVE-19324:
-

I hate HiveQA...

> improve YARN queue check error message in Tez pool
> --
>
> Key: HIVE-19324
> URL: https://issues.apache.org/jira/browse/HIVE-19324
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19324.01.patch, HIVE-19324.patch
>
>






[jira] [Commented] (HIVE-19327) qroupby_rollup_empty.q fails for insert-only transactional tables

2018-04-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455615#comment-16455615
 ] 

Sergey Shelukhin commented on HIVE-19327:
-

The result may be invalid if there are some invalid directories in the table 
(e.g. finalPaths is empty, but there are some aborted deltas). 
This will just add them and read them as a single directory.
See CombineHiveInputFormat for an example of what it does in this case:
{noformat}
  // If there are no inputs; the Execution engine skips the operator tree.
  // To prevent it from happening; an opaque ZeroRows input is added here - when needed.
  result.add(
      new HiveInputSplit(new NullRowsInputFormat.DummyInputSplit(paths[0]),
          ZeroRowsInputFormat.class.getName()));
{noformat}
That will ensure only one row actually gets produced.

> qroupby_rollup_empty.q fails for insert-only transactional tables
> -
>
> Key: HIVE-19327
> URL: https://issues.apache.org/jira/browse/HIVE-19327
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19327.01.patch
>
>






[jira] [Commented] (HIVE-19340) Disable timeout of transactions opened by replication task at target cluster

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456868#comment-16456868
 ] 

Sergey Shelukhin commented on HIVE-19340:
-

Never aborting a transaction seems dangerous. 
If it does get stuck forever, what is the user supposed to do with it? 
And if the user aborts it manually, it's the same as a timeout, just more 
aggravating for the user. You still have to handle the case where the user 
aborts it.

If it's ok to get stuck for a long time, why not just increase the heartbeat 
timeout for it?
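
The alternative suggested above could look roughly like this. A minimal Java 
sketch, not Hive's transaction handler: HeartbeatTimeoutSketch, its methods, 
and both timeout values are made up for illustration. Replication transactions 
get a longer but still finite heartbeat timeout, so a stuck one is eventually 
aborted.

```java
import java.time.Duration;

final class HeartbeatTimeoutSketch {
  // Hypothetical values; real settings would come from configuration.
  static final Duration DEFAULT_TIMEOUT = Duration.ofMinutes(5);
  static final Duration REPL_TIMEOUT = Duration.ofHours(24);

  // Picks the timeout by transaction origin instead of exempting
  // replication transactions from the timeout altogether.
  static Duration timeoutFor(boolean openedByReplication) {
    return openedByReplication ? REPL_TIMEOUT : DEFAULT_TIMEOUT;
  }

  // A transaction may be aborted once its last heartbeat is older than its timeout.
  static boolean shouldAbort(boolean openedByReplication, long lastHeartbeatMs, long nowMs) {
    return nowMs - lastHeartbeatMs > timeoutFor(openedByReplication).toMillis();
  }
}
```

The abort path stays the same for both kinds of transaction; only the 
threshold differs.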

> Disable timeout of transactions opened by replication task at target cluster
> 
>
> Key: HIVE-19340
> URL: https://issues.apache.org/jira/browse/HIVE-19340
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-19340.01.patch
>
>
> The transactions opened by applying EVENT_OPEN_TXN should never be aborted 
> automatically due to timeout. Aborting a transaction started by a replication 
> task may lead to an inconsistent state at the target, which needs additional 
> overhead to clean up. So, it is proposed to mark the transactions opened by 
> a replication task as special ones that shouldn't be aborted if the heartbeat 
> is lost. This helps ensure that all ABORT and COMMIT events will always find 
> the corresponding txn at the target to operate on.





[jira] [Commented] (HIVE-19340) Disable timeout of transactions opened by replication task at target cluster

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456869#comment-16456869
 ] 

Sergey Shelukhin commented on HIVE-19340:
-

cc [~ekoifman]

> Disable timeout of transactions opened by replication task at target cluster
> 
>
> Key: HIVE-19340
> URL: https://issues.apache.org/jira/browse/HIVE-19340
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-19340.01.patch
>
>
> The transactions opened by applying EVENT_OPEN_TXN should never be aborted 
> automatically due to timeout. Aborting a transaction started by a replication 
> task may lead to an inconsistent state at the target, which needs additional 
> overhead to clean up. So, it is proposed to mark the transactions opened by 
> a replication task as special ones that shouldn't be aborted if the heartbeat 
> is lost. This helps ensure that all ABORT and COMMIT events will always find 
> the corresponding txn at the target to operate on.





[jira] [Comment Edited] (HIVE-19340) Disable timeout of transactions opened by replication task at target cluster

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456868#comment-16456868
 ] 

Sergey Shelukhin edited comment on HIVE-19340 at 4/27/18 6:25 PM:
--

Never aborting a transaction seems dangerous. 
If it does get stuck forever, what is the user supposed to do with it? 
And if the user aborts it manually, it's the same as a timeout, just more 
aggravating for the user. You still have to handle the case where the user 
aborts it.

If it's ok to get stuck for a long time but not forever, why not just increase 
the heartbeat timeout for it?


was (Author: sershe):
Never aborting a transaction seems dangerous. 
If it does get stuck forever, what is the user supposed to do with it? 
And if the user aborts it manually, it's the same as a timeout, just more 
aggravating for the user. You still have to handle the case where the user 
aborts it.

If it's ok to get stuck for a long time, why not just increase the heartbeat 
timeout for it?

> Disable timeout of transactions opened by replication task at target cluster
> 
>
> Key: HIVE-19340
> URL: https://issues.apache.org/jira/browse/HIVE-19340
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-19340.01.patch
>
>
> The transactions opened by applying EVENT_OPEN_TXN should never be aborted 
> automatically due to timeout. Aborting a transaction started by a replication 
> task may lead to an inconsistent state at the target, which needs additional 
> overhead to clean up. So, it is proposed to mark the transactions opened by 
> a replication task as special ones that shouldn't be aborted if the heartbeat 
> is lost. This helps ensure that all ABORT and COMMIT events will always find 
> the corresponding txn at the target to operate on.





[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457008#comment-16457008
 ] 

Sergey Shelukhin commented on HIVE-17657:
-

Rebased the patch. I bet HiveQA will find a way to lose it somehow.

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.05.patch, 
> HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2018-04-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Attachment: HIVE-17657.05.patch

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: mm-gap-2
> Attachments: HIVE-17657.01.patch, HIVE-17657.02.patch, 
> HIVE-17657.03.patch, HIVE-17657.04.patch, HIVE-17657.05.patch, 
> HIVE-17657.patch
>
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Commented] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457047#comment-16457047
 ] 

Sergey Shelukhin commented on HIVE-19282:
-

Rebased the patch. [~prasanth_j] can you please review? thnx

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.02.patch, 
> HIVE-19282.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19282) don't nest delta directories inside LB directories for ACID tables

2018-04-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19282:

Attachment: HIVE-19282.02.patch

> don't nest delta directories inside LB directories for ACID tables
> --
>
> Key: HIVE-19282
> URL: https://issues.apache.org/jira/browse/HIVE-19282
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19282.01.patch, HIVE-19282.02.patch, 
> HIVE-19282.patch
>
>






[jira] [Commented] (HIVE-19327) qroupby_rollup_empty.q fails for insert-only transactional tables

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457053#comment-16457053
 ] 

Sergey Shelukhin commented on HIVE-19327:
-

[~steveyeom2017] for the case when there are no finalDirs and operators must 
run, it will not filter dirs, just return the original dirs as is. 
This will work fine if the original dir is empty (causing finalDirs to be null).
However, if there's something inside dirs that was excluded from finalDirs, it 
will be included and read, which should not happen. I think this condition 
needs to be propagated up and handled the same as the other case, by 
generating a custom 0-row split.
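
A rough sketch of the handling proposed above, in plain Java rather than 
against the Hive classes. EmptyInputSketch, chooseInputs, ZERO_ROW_SPLIT, and 
the string paths are hypothetical stand-ins; only the decision logic is the 
point. When no valid directories remain but operators must still run, the 
reader gets a single zero-row placeholder instead of falling back to the 
unfiltered directories.

```java
import java.util.Collections;
import java.util.List;

final class EmptyInputSketch {
  // Stand-in for a custom 0-row split; not a real Hive type.
  static final String ZERO_ROW_SPLIT = "<zero-row split>";

  // finalDirs: directories left after filtering out invalid ones
  //            (e.g. aborted deltas); may be null or empty.
  // mustRunOperators: whether the operator tree has to execute even with no input.
  static List<String> chooseInputs(List<String> finalDirs, boolean mustRunOperators) {
    if (finalDirs != null && !finalDirs.isEmpty()) {
      return finalDirs; // normal case: read only the valid directories
    }
    if (mustRunOperators) {
      // Never fall back to the original, unfiltered directories here:
      // they may contain data that was deliberately excluded.
      return Collections.singletonList(ZERO_ROW_SPLIT);
    }
    return Collections.emptyList();
  }
}
```

A test for this case would put only an aborted delta in the table directory 
and check that no rows from it are read.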


> qroupby_rollup_empty.q fails for insert-only transactional tables
> -
>
> Key: HIVE-19327
> URL: https://issues.apache.org/jira/browse/HIVE-19327
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19327.01.patch
>
>






[jira] [Commented] (HIVE-19327) qroupby_rollup_empty.q fails for insert-only transactional tables

2018-04-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457097#comment-16457097
 ] 

Sergey Shelukhin commented on HIVE-19327:
-

[~prasanth_j] please don't commit without a test for the above case :)

> qroupby_rollup_empty.q fails for insert-only transactional tables
> -
>
> Key: HIVE-19327
> URL: https://issues.apache.org/jira/browse/HIVE-19327
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19327.01.patch
>
>





