But until those transactions are closed you don’t know that they won’t write to partition B. After they write to A, they may choose to write to B and then commit. The compactor cannot make any assumptions about what sessions with open transactions will do in the future.
Alan.

> On Jul 28, 2016, at 09:19, Igor Kuzmenko <f1she...@gmail.com> wrote:
>
> But this minOpenTxn value isn't from the delta I want to compact. minOpenTxn
> can point to a transaction in partition A while partition B has deltas
> ready for compaction. If minOpenTxn is less than the txnIds in partition B's
> deltas, compaction won't happen. So an open transaction in partition A blocks
> compaction in partition B. That seems wrong to me.
>
> On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com> wrote:
> Hive is doing the right thing there, as it cannot compact the deltas into a
> base file while there are still open transactions in the delta. Storm should
> be committing on some frequency even if it doesn’t have enough data to commit.
>
> Alan.
>
> > On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > I did some research on that issue.
> > The problem is in the ValidCompactorTxnList::isTxnRangeValid method.
> >
> > Here's the code:
> >
> >     @Override
> >     public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
> >       if (highWatermark < minTxnId) {
> >         return RangeResponse.NONE;
> >       } else if (minOpenTxn < 0) {
> >         return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
> >       } else {
> >         return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
> >       }
> >     }
> >
> > In my case this method returned RangeResponse.NONE for most of the delta
> > files. With that value, a delta file isn't included in compaction.
> >
> > The last 'else' block compares minOpenTxn to maxTxnId and, if maxTxnId is
> > bigger, returns RangeResponse.NONE. That's a problem for me because I'm
> > using the Storm Hive Bolt. Hive Bolt gets a transaction and keeps it open
> > with heartbeats until there's data to commit.
> >
> > So if I get a transaction and keep it open, all compactions stop. Is this
> > incorrect Hive behavior, or should Storm close the transaction?
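[Editor's note: to make the behavior discussed above concrete, here is a standalone sketch of the same decision logic. The class name, constructor, and the example transaction ids are invented for illustration; only the body of isTxnRangeValid mirrors the quoted Hive code.]

```java
// Standalone sketch of the decision logic quoted from
// ValidCompactorTxnList::isTxnRangeValid. Class name, fields, and the
// sample txn ids below are simplified stand-ins, not Hive itself.
public class CompactorTxnCheck {
    enum RangeResponse { NONE, ALL }

    final long highWatermark;  // highest allocated transaction id
    final long minOpenTxn;     // lowest still-open transaction id, or -1 if none

    CompactorTxnCheck(long highWatermark, long minOpenTxn) {
        this.highWatermark = highWatermark;
        this.minOpenTxn = minOpenTxn;
    }

    RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
        if (highWatermark < minTxnId) {
            return RangeResponse.NONE;            // delta is entirely in the future
        } else if (minOpenTxn < 0) {
            // no open transactions anywhere: compact anything below the watermark
            return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
        } else {
            // an open transaction at or below maxTxnId blocks the whole delta,
            // even if that transaction writes to a different partition
            return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
        }
    }

    public static void main(String[] args) {
        // Suppose txn 71000000 is open (say, in partition A), watermark 72100000.
        CompactorTxnCheck txns = new CompactorTxnCheck(72100000L, 71000000L);
        // A partition-B delta such as delta_71741256_71741355 is then skipped:
        System.out.println(txns.isTxnRangeValid(71741256L, 71741355L)); // NONE
        // Once that transaction commits or aborts (minOpenTxn = -1), it compacts:
        System.out.println(new CompactorTxnCheck(72100000L, -1L)
                .isTxnRangeValid(71741256L, 71741355L)); // ALL
    }
}
```

Running this shows Igor's complaint in miniature: the open transaction's id, not its partition, is all the check sees.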
> >
> > On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > Thanks for the reply, Alan. My guess about Storm was wrong. Today I got
> > the same behavior with the Storm topology running.
> > Anyway, I'd like to know: how can I check that a transaction batch was
> > closed correctly?
> >
> > On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:
> > I don’t know the details of how the Storm application that streams into
> > Hive works, but this sounds like the transaction batches weren’t getting
> > closed. Compaction can’t happen until those batches are closed. Do you
> > know how you had Storm configured? Also, you might ask separately on the
> > Storm list to see if people have seen this issue before.
> >
> > Alan.
> >
> > > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > >
> > > One more thing: I'm using Apache Storm to stream data into Hive, and
> > > when I turned off the Storm topology, compactions started to work
> > > properly.
> > >
> > > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > > I'm using a Hive 1.2.1 transactional table, inserting data into it via
> > > the Hive Streaming API.
> > > After some time I expected compaction to start, but it didn't happen.
> > >
> > > Here's the part of the log which shows that the compactor Initiator
> > > thread doesn't see any delta files:
> > >
> > > 2016-07-26 18:06:52,459 INFO [Thread-8]: compactor.Initiator (Initiator.java:run(89)) - Checking to see if we should compact default.data_aaa.dt=20160726
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils (AcidUtils.java:getAcidState(432)) - in directory hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726 base = null deltas = 0
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0 threshold: 0.1 will major compact: false
> > >
> > > But that directory actually contains 23 items:
> > >
> > > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > > Found 23 items
> > > -rw-r--r--   3 storm hdfs   4 2016-07-26 17:20 /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:22 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:23 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:25 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:26 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:27 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:29 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:30 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:32 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:33 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:34 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:36 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:37 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:39 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:40 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:41 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:43 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:44 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:46 /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:47 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:48 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:50 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> > > drwxrwxrwx   - storm hdfs   0 2016-07-26 17:50 /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
> > >
> > > Full log here.
> > >
> > > What could go wrong?
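[Editor's note: the delta directories listed above encode their transaction range directly in the name, delta_&lt;minTxnId&gt;_&lt;maxTxnId&gt;, and that range is what gets handed to isTxnRangeValid. A minimal sketch of that parsing; the class and method names are invented for illustration, not Hive's actual AcidUtils API.]

```java
// Sketch: recover the txn range from a delta directory name of the form
// delta_<minTxnId>_<maxTxnId>, as seen in the listing above. This mimics
// the naming convention Hive uses when scanning a partition directory;
// the class itself is a simplified stand-in.
public class DeltaName {
    final long minTxnId;
    final long maxTxnId;

    DeltaName(long minTxnId, long maxTxnId) {
        this.minTxnId = minTxnId;
        this.maxTxnId = maxTxnId;
    }

    static DeltaName parse(String dirName) {
        if (!dirName.startsWith("delta_")) {
            throw new IllegalArgumentException("not a delta directory: " + dirName);
        }
        // "delta_71741256_71741355" -> ["delta", "71741256", "71741355"]
        String[] parts = dirName.split("_");
        return new DeltaName(Long.parseLong(parts[1]), Long.parseLong(parts[2]));
    }

    public static void main(String[] args) {
        DeltaName d = DeltaName.parse("delta_71741256_71741355");
        System.out.println(d.minTxnId + ".." + d.maxTxnId); // 71741256..71741355
    }
}
```

With a single open transaction id below 72070955, every delta in the listing parses to a range that fails the minOpenTxn &gt; maxTxnId test, which matches the "deltas = 0" the Initiator logs.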