[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019669#comment-17019669 ]

Csaba Ringhofer commented on HIVE-21931:
----------------------------------------

Thanks, I am ok with HIVE-22554 as solution. I am closing this now, will reopen if it doesn't work for me.

> Slow compaction for tiny tables
> -------------------------------
>
>                 Key: HIVE-21931
>                 URL: https://issues.apache.org/jira/browse/HIVE-21931
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: compaction
>
> I observed the issue in Impala development environment when (major)
> compacting insert_only transactional tables in Hive. The compaction could
> take ~10 minutes even when it only had to merge 2 rows from 2 inserts. The
> actual work was done much earlier, the new base file was correctly written to
> HDFS, and Hive seemed to wait without doing any work.
> The compactions are started manually, hive.compactor.initiator.on=false to
> avoid "surprise compaction" during tests.
> {code}
> hive.compactor.abortedtxn.threshold=1000
> hive.compactor.check.interval=300s
> hive.compactor.cleaner.run.interval=5000ms
> hive.compactor.compact.insert.only=true
> hive.compactor.crud.query.based=false
> hive.compactor.delta.num.threshold=10
> hive.compactor.delta.pct.threshold=0.1
> hive.compactor.history.reaper.interval=2m
> hive.compactor.history.retention.attempted=2
> hive.compactor.history.retention.failed=3
> hive.compactor.history.retention.succeeded=3
> hive.compactor.initiator.failed.compacts.threshold=2
> hive.compactor.initiator.on=false
> hive.compactor.max.num.delta=500
> hive.compactor.worker.threads=4
> hive.compactor.worker.timeout=86400s
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019534#comment-17019534 ]

Laszlo Pinter commented on HIVE-21931:
--------------------------------------

[~csringhofer] It's a session property.
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019518#comment-17019518 ]

Csaba Ringhofer commented on HIVE-21931:
----------------------------------------

[~lpinter] Yes, that sounds good. I still don't understand why we need such a large default, though. Is it a session property, or do I have to set it at Sentry startup time?
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019501#comment-17019501 ]

Laszlo Pinter commented on HIVE-21931:
--------------------------------------

[~csringhofer] With HIVE-22554 you can configure the initial wait timeout to a much lower value, such as 2000 milliseconds. Does that satisfy your needs?
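
To illustrate why a lower wait value matters for tiny tables, here is a minimal sketch of a simplified model, not Hive's actual code: a blocking caller that sleeps a fixed interval and then checks whether the compaction has finished. The class name, the method name, and the 1.5 second compaction duration are assumptions made up for the example.

```java
final class FixedWaitLatency {
    /**
     * Time (in ms) at which a caller that sleeps waitMs between status checks
     * first notices that work taking workMs has finished. The caller always
     * sleeps at least once before its first check.
     */
    static long detectedAtMs(long workMs, long waitMs) {
        // Ceiling division: number of sleeps needed to get past workMs.
        long polls = Math.max(1, (workMs + waitMs - 1) / waitMs);
        return polls * waitMs;
    }

    public static void main(String[] args) {
        long workMs = 1_500; // assumed duration of a tiny compaction
        System.out.println(detectedAtMs(workMs, 300_000)); // 5 minute wait: prints 300000
        System.out.println(detectedAtMs(workMs, 2_000));   // 2000 ms wait: prints 2000
    }
}
```

Under this model the caller's observed latency is dominated by the wait interval rather than by the compaction itself whenever the work is much shorter than the interval.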
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019466#comment-17019466 ]

Csaba Ringhofer commented on HIVE-21931:
----------------------------------------

[~Rajkumar Singh] Sorry, I missed your comment for some reason.
[~lpinter] Yes, I do run it with " and wait". This is needed for the tests, as we specifically test if Impala can read a table after/during compaction.
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019458#comment-17019458 ]

Laszlo Pinter commented on HIVE-21931:
--------------------------------------

[~csringhofer] Did you run the compaction in blocking mode? HIVE-22554 provides a way to configure the wait timeout.
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907513#comment-16907513 ]

Rajkumar Singh commented on HIVE-21931:
---------------------------------------

Compaction should be affected by the wait time (https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java#L102) only in the case of a blocking compaction command (alter table compact 'major' and wait). If that is the case, then increasing the wait time exponentially would be a good idea.
[~csringhofer] Can you confirm that you are seeing this issue with the blocking compaction call?
[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables
[ https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906334#comment-16906334 ]

Peter Vary commented on HIVE-21931:
-----------------------------------

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java#L102]
We wait 5 minutes before starting the check. It would be good to use an exponentially increasing timeout instead.
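
The exponentially increasing timeout suggested in this comment could look roughly like the sketch below. It is not taken from AlterTableCompactOperation or from HIVE-22554; the class name, method name, and parameters are hypothetical, and the completion check is passed in as a plain `BooleanSupplier` so the example stays self-contained.

```java
import java.util.function.BooleanSupplier;

final class BackoffWait {
    /**
     * Polls done until it returns true or timeoutMs of sleeping has elapsed.
     * The sleep starts at initialMs and doubles after every unsuccessful
     * check, capped at maxMs. Returns true if completion was observed.
     */
    static boolean waitUntil(BooleanSupplier done, long initialMs, long maxMs, long timeoutMs) {
        long delay = initialMs;
        long waited = 0;
        while (true) {
            if (done.getAsBoolean()) {
                return true;          // finished: stop waiting immediately
            }
            if (waited >= timeoutMs) {
                return false;         // gave up after the overall timeout
            }
            long sleep = Math.min(delay, timeoutMs - waited);
            try {
                Thread.sleep(sleep);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
            waited += sleep;
            delay = Math.min(delay * 2, maxMs);     // exponential growth with a cap
        }
    }
}
```

With a small initial delay, a tiny compaction is noticed within the first few checks, while a long-running one is still polled no more often than once per `maxMs` after the delay has grown.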