But until those transactions are closed you don’t know that they won’t write to 
partition B.  After they write to A they may choose to write to B and then 
commit.  The compactor cannot make any assumptions about what sessions with 
open transactions will do in the future.
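
To make the effect concrete, here is a minimal, self-contained sketch of the range check quoted further down in this thread. It is not Hive's actual class: CompactorRangeSketch, its static method signature, and the highWatermark and minOpenTxn values are assumptions made up for illustration. Only the branch logic and the delta range 71741256..71741355 come from the thread.

```java
public class CompactorRangeSketch {
    enum RangeResponse { NONE, ALL }

    // Same branch logic as the isTxnRangeValid method quoted in this thread,
    // with the instance fields turned into parameters (an assumption made so
    // the sketch runs without Hive on the classpath).
    static RangeResponse isTxnRangeValid(long highWatermark, long minOpenTxn,
                                         long minTxnId, long maxTxnId) {
        if (highWatermark < minTxnId) {
            return RangeResponse.NONE;
        } else if (minOpenTxn < 0) {
            // No open transactions anywhere: only the high watermark matters.
            return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
        } else {
            // Some transaction is still open: only deltas entirely below it qualify.
            return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
        }
    }

    public static void main(String[] args) {
        long highWatermark = 72_100_000L; // hypothetical value
        long minTxnId = 71_741_256L;      // from delta_71741256_71741355
        long maxTxnId = 71_741_355L;

        // Hypothetical: an older transaction is still open (for example one
        // that has so far only written to another partition).
        System.out.println(isTxnRangeValid(highWatermark, 71_000_000L, minTxnId, maxTxnId)); // NONE

        // Once nothing is open (minOpenTxn < 0), the same delta is eligible.
        System.out.println(isTxnRangeValid(highWatermark, -1L, minTxnId, maxTxnId)); // ALL
    }
}
```

The check is global rather than per partition because, as said above, an open transaction may still write to any partition before it commits.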

Alan.

> On Jul 28, 2016, at 09:19, Igor Kuzmenko <f1she...@gmail.com> wrote:
> 
> But this minOpenTxn value doesn't come from the delta I want to compact. minOpenTxn 
> can point to a transaction in partition A while partition B has deltas 
> ready for compaction. If minOpenTxn is less than the txnIds in partition B's 
> deltas, compaction won't happen. So an open transaction in partition A blocks 
> compaction in partition B. That seems wrong to me.
> 
> On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com> wrote:
> Hive is doing the right thing there, as it cannot compact the deltas into a 
> base file while there are still open transactions in the delta.  Storm should 
> be committing on some frequency even if it doesn’t have enough data to commit.
> 
> Alan.
> 
> > On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > > I did some research on this issue.
> > > The problem is in the ValidCompactorTxnList::isTxnRangeValid method.
> > >
> > > Here's the code:
> > @Override
> > public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
> >   if (highWatermark < minTxnId) {
> >     return RangeResponse.NONE;
> >   } else if (minOpenTxn < 0) {
> > >     return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
> >   } else {
> >     return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
> >   }
> > }
> >
> > > In my case this method returned RangeResponse.NONE for most of the delta 
> > > files. With this value a delta file is not included in compaction.
> > >
> > > The last 'else' block compares minOpenTxn to maxTxnId and returns 
> > > RangeResponse.NONE unless minOpenTxn is greater than maxTxnId. That's a 
> > > problem for me because I'm using the Storm Hive Bolt. The Hive Bolt gets a 
> > > transaction and keeps it open with heartbeats until there's data to commit.
> > >
> > > So if I get a transaction and keep it open, all compactions will stop. Is 
> > > this incorrect Hive behavior, or should Storm close the transaction?
> >
> >
> >
> >
> > On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > > Thanks for the reply, Alan. My guess about Storm was wrong. Today I got the 
> > > same behavior with the Storm topology running.
> > > Anyway, I'd like to know: how can I check that a transaction batch was 
> > > closed correctly?
> >
> > On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:
> > I don’t know the details of how the storm application that streams into 
> > Hive works, but this sounds like the transaction batches weren’t getting 
> > closed.  Compaction can’t happen until those batches are closed.  Do you 
> > know how you had storm configured?  Also, you might ask separately on the 
> > storm list to see if people have seen this issue before.
> >
> > Alan.
> >
> > > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > >
> > > > One more thing. I'm using Apache Storm to stream data into Hive. When I 
> > > > turned off the Storm topology, compactions started to work properly.
> > >
> > > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > > > I'm using a Hive 1.2.1 transactional table and inserting data into it via 
> > > > the Hive Streaming API. After some time I expected compaction to start, but 
> > > > it didn't happen:
> > >
> > > > Here's part of the log, which shows that the compactor initiator thread 
> > > > doesn't see any delta files:
> > > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator 
> > > (Initiator.java:run(89)) - Checking to see if we should compact 
> > > default.data_aaa.dt=20160726
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils 
> > > (AcidUtils.java:getAcidState(432)) - in directory 
> > > hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
> > >  base = null deltas = 0
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator 
> > > (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 
> > > 0 threshold: 0.1 will major compact: false
> > >
> > > > But in that directory there are actually 23 files:
> > >
> > > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > > Found 23 items
> > > -rw-r--r--   3 storm hdfs          4 2016-07-26 17:20 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:22 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:23 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:25 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:26 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:27 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:29 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:30 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:32 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:33 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:34 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:36 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:37 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:39 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:40 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:41 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:43 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:44 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:46 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:47 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:48 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> > > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 
> > > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
> > >
> > > Full log here.
> > >
> > > > What could have gone wrong?
> > >
> >
> >
> >
> 
> 
