[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019669#comment-17019669
 ] 

Csaba Ringhofer commented on HIVE-21931:


Thanks, I am OK with HIVE-22554 as the solution. I am closing this now and will 
reopen it if it doesn't work for me.

> Slow compaction for tiny tables
> ---
>
> Key: HIVE-21931
> URL: https://issues.apache.org/jira/browse/HIVE-21931
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Csaba Ringhofer
> Priority: Major
> Labels: compaction
>
> I observed the issue in the Impala development environment when running (major) 
> compaction on insert_only transactional tables in Hive. The compaction could 
> take ~10 minutes even when it only had to merge 2 rows from 2 inserts. The 
> actual work finished much earlier: the new base file was correctly written to 
> HDFS, and Hive then seemed to wait without doing any work.
> The compactions are started manually; hive.compactor.initiator.on=false is set to 
> avoid "surprise compaction" during tests.
> {code}
> hive.compactor.abortedtxn.threshold=1000
> hive.compactor.check.interval=300s
> hive.compactor.cleaner.run.interval=5000ms
> hive.compactor.compact.insert.only=true
> hive.compactor.crud.query.based=false
> hive.compactor.delta.num.threshold=10
> hive.compactor.delta.pct.threshold=0.1
> hive.compactor.history.reaper.interval=2m
> hive.compactor.history.retention.attempted=2
> hive.compactor.history.retention.failed=3
> hive.compactor.history.retention.succeeded=3
> hive.compactor.initiator.failed.compacts.threshold=2
> hive.compactor.initiator.on=false
> hive.compactor.max.num.delta=500
> hive.compactor.worker.threads=4
> hive.compactor.worker.timeout=86400s
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Laszlo Pinter (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019534#comment-17019534
 ] 

Laszlo Pinter commented on HIVE-21931:
--

[~csringhofer] It's a session property. 
 



[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019518#comment-17019518
 ] 

Csaba Ringhofer commented on HIVE-21931:


[~lpinter] Yes, that sounds good. I still don't understand why we need such 
a large default, though.
Is it a session property, or do I have to set it at Sentry startup time?



[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Laszlo Pinter (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019501#comment-17019501
 ] 

Laszlo Pinter commented on HIVE-21931:
--

[~csringhofer] With HIVE-22554 you can configure the initial wait timeout to a 
much lower value, such as 2000 milliseconds. Does that satisfy your needs? 



[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Csaba Ringhofer (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019466#comment-17019466
 ] 

Csaba Ringhofer commented on HIVE-21931:


[~Rajkumar Singh] Sorry, I missed your comment for some reason.
[~lpinter] Yes, I do run it with " and wait". This is needed for the tests, as 
we specifically test if Impala can read a table after/during compaction.



[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2020-01-20 Thread Laszlo Pinter (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019458#comment-17019458
 ] 

Laszlo Pinter commented on HIVE-21931:
--

[~csringhofer] Did you run the compaction in blocking mode?

HIVE-22554 provides a way to configure the wait timeout. 



[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2019-08-14 Thread Rajkumar Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907513#comment-16907513
 ] 

Rajkumar Singh commented on HIVE-21931:
---

Compaction should be affected by the wait time 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java#L102
 only in the case of a blocking compaction command (alter table compact 'major' 
and wait). If that is the case, then an exponentially increasing wait time would be a good idea.
[~csringhofer] Can you confirm that you are seeing this issue with the blocking 
compaction call?





--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-21931) Slow compaction for tiny tables

2019-08-13 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906334#comment-16906334
 ] 

Peter Vary commented on HIVE-21931:
---

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java#L102]

We wait 5 minutes before starting the check. It would be good to use an 
exponentially increasing timeout instead.
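
To illustrate the idea, here is a minimal sketch of an exponentially increasing wait. It is not the code in AlterTableCompactOperation; the class name BackoffWait, the method names, and the parameters are all hypothetical, and the "is the compaction finished" check is passed in as a plain BooleanSupplier.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hive: polls a condition with a delay
// that doubles after every unsuccessful check, up to a cap.
class BackoffWait {

    /** Delay before the next check for a 0-based attempt: initialMs * 2^attempt, capped at maxMs. */
    public static long delayForAttempt(long initialMs, long maxMs, int attempt) {
        if (initialMs <= 0 || maxMs < initialMs) {
            throw new IllegalArgumentException("need 0 < initialMs <= maxMs");
        }
        long delay = initialMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }

    /**
     * Checks isDone immediately, then sleeps with doubling delays between checks.
     * Returns true as soon as isDone reports true, or false once about
     * totalTimeoutMs milliseconds have been spent sleeping (or on interrupt).
     */
    public static boolean waitUntilDone(BooleanSupplier isDone, long initialMs,
                                        long maxMs, long totalTimeoutMs) {
        long waited = 0;
        for (int attempt = 0; waited < totalTimeoutMs; attempt++) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            // Never sleep past the overall timeout.
            long delay = Math.min(delayForAttempt(initialMs, maxMs, attempt),
                                  totalTimeoutMs - waited);
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
            waited += delay;
        }
        return isDone.getAsBoolean(); // one last check at the deadline
    }
}
```

With a small initial delay, a compaction of a tiny table would be noticed within a second or two, while long-running compactions would still be polled only rarely once the delay reaches the cap.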
