Re: Hive compaction didn't launch

2016-08-05 Thread Eugene Koifman

Re: Hive compaction didn't launch

2016-07-29 Thread Igor Kuzmenko
Here's how Storm works right now: after receiving a new message, Storm determines which partition it should be written to. Then it checks whether there is an open connection to that HiveEndPoint

Re: Hive compaction didn't launch

2016-07-28 Thread Eugene Koifman
I think Storm has some timeout parameter that will close the transaction if there are no events for a certain amount of time. How many transactions do you have per transaction batch? Perhaps making the batches smaller will make them close sooner. Eugene On 7/28/16, 3:59 PM, "Alan Gates"

Re: Hive compaction didn't launch

2016-07-28 Thread Alan Gates
But until those transactions are closed you don’t know that they won’t write to partition B. After they write to A they may choose to write to B and then commit. The compactor can not make any assumptions about what sessions with open transactions will do in the future. Alan. > On Jul 28,

Re: Hive compaction didn't launch

2016-07-28 Thread Igor Kuzmenko
But this *minOpenTxn* value isn't from the delta I want to compact. *minOpenTxn* can point to a transaction in partition *A* while in partition *B* there are deltas ready for compaction. If *minOpenTxn* is less than the txnIds in partition *B*'s deltas, compaction won't happen. So an open transaction in
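The behavior described above can be illustrated with a minimal sketch. This is a simplified model, not the Hive source: the class name, method name, and the `NO_OPEN_TXN` sentinel are all hypothetical, and the sketch assumes only what the message states, namely that a delta is skipped when the lowest open transaction id, tracked across all partitions, is not greater than the transaction ids in that delta.

```java
// Simplified model of the check described in this thread; NOT the actual Hive code.
// CompactionRangeSketch, isRangeCompactable and NO_OPEN_TXN are hypothetical names.
final class CompactionRangeSketch {
    // Sentinel meaning "no transaction is currently open anywhere".
    static final long NO_OPEN_TXN = -1L;

    /**
     * @param deltaMaxTxnId highest transaction id contained in the delta
     * @param minOpenTxn    lowest open transaction id across ALL partitions,
     *                      or NO_OPEN_TXN if nothing is open
     * @return true if the delta lies entirely below the lowest open transaction
     */
    static boolean isRangeCompactable(long deltaMaxTxnId, long minOpenTxn) {
        if (minOpenTxn == NO_OPEN_TXN) {
            return true;
        }
        return deltaMaxTxnId < minOpenTxn;
    }

    public static void main(String[] args) {
        // Assumed scenario: txn 100 was opened by a writer to partition A and never closed.
        long minOpenTxn = 100L;
        // A delta in partition B covering txns up to 250 is blocked by that open txn.
        System.out.println(isRangeCompactable(250L, minOpenTxn));
        // A delta whose txns are all below 100 is not affected.
        System.out.println(isRangeCompactable(90L, minOpenTxn));
        // Once every transaction is closed, the partition B delta becomes eligible.
        System.out.println(isRangeCompactable(250L, NO_OPEN_TXN));
    }
}
```

Because the check uses one global value rather than a per-partition one, a single long-lived open transaction is enough to hold back compaction of every delta written after it, in any partition.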

Re: Hive compaction didn't launch

2016-07-28 Thread Alan Gates
Hive is doing the right thing there, as it cannot compact the deltas into a base file while there are still open transactions in the delta. Storm should be committing on some frequency even if it doesn’t have enough data to commit. Alan. > On Jul 28, 2016, at 05:36, Igor Kuzmenko
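The suggestion to commit on some frequency, even without a full batch, could look roughly like the sketch below. It is an illustration only: `TimedBatchCommitter` and its methods are hypothetical names, not part of Storm or the Hive Streaming API, and the caller passes the clock value in explicitly so the timing logic is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "commit on size OR on elapsed time".
// None of these names come from Storm or Hive.
final class TimedBatchCommitter {
    private final int maxBatchSize;
    private final long maxIntervalMillis;
    private final List<String> pending = new ArrayList<>();
    private long lastCommitMillis;
    private int commits = 0;

    TimedBatchCommitter(int maxBatchSize, long maxIntervalMillis, long nowMillis) {
        this.maxBatchSize = maxBatchSize;
        this.maxIntervalMillis = maxIntervalMillis;
        this.lastCommitMillis = nowMillis;
    }

    // Called for every incoming record; commits as soon as the batch is full.
    void onRecord(String record, long nowMillis) {
        pending.add(record);
        if (pending.size() >= maxBatchSize) {
            commit(nowMillis);
        }
    }

    // Called periodically (for example from a timer), even when no data arrives,
    // so that an open transaction never outlives maxIntervalMillis.
    void onTick(long nowMillis) {
        if (nowMillis - lastCommitMillis >= maxIntervalMillis) {
            commit(nowMillis);
        }
    }

    private void commit(long nowMillis) {
        // A real writer would commit (and thereby close) its open transaction here.
        pending.clear();
        lastCommitMillis = nowMillis;
        commits++;
    }

    int commitCount() { return commits; }

    int pendingCount() { return pending.size(); }
}
```

With a batch size of 3 and an interval of 1000 ms, a single record followed by silence is still committed at the first tick that arrives 1000 ms or more after the previous commit, so the transaction does not stay open indefinitely.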

Re: Hive compaction didn't launch

2016-07-28 Thread Igor Kuzmenko
I did some research on that issue. The problem is in the ValidCompactorTxnList::isTxnRangeValid method. Here's the code: @Override public RangeResponse

Re: Hive compaction didn't launch

2016-07-27 Thread Igor Kuzmenko
Thanks for the reply, Alan. My guess about Storm was wrong. Today I got the same behavior with the Storm topology running. Anyway, I'd like to know how I can check that a transaction batch was closed correctly. On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates wrote: > I don’t know the

Re: Hive compaction didn't launch

2016-07-27 Thread Alan Gates
I don’t know the details of how the Storm application that streams into Hive works, but this sounds like the transaction batches weren’t getting closed. Compaction can’t happen until those batches are closed. Do you know how you had Storm configured? Also, you might ask separately on the

Re: Hive compaction didn't launch

2016-07-27 Thread Igor Kuzmenko
One more thing: I'm using Apache Storm to stream data into Hive, and when I turned off the Storm topology, compactions started to work properly. On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko wrote: > I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive >

Hive compaction didn't launch

2016-07-26 Thread Igor Kuzmenko
I'm using a Hive 1.2.1 transactional table and inserting data into it via the Hive Streaming API. After some time I expected compaction to start, but it didn't happen. Here's part of the log, which shows that the compactor initiator thread doesn't see any delta files: *2016-07-26 18:06:52,459 INFO [Thread-8]: