abstractdog commented on PR #6430:
URL: https://github.com/apache/hive/pull/6430#issuecomment-4242953563
@deniskuzZ: I tested this with docker llap + compaction:
```
export
POSTGRES_LOCAL_PATH=~/.m2/repository/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar
./build.sh -hive 4.3.0-SNAPSHOT -hadoop 3.4.1 -tez 0.10.5
./start-hive.sh --llap
docker compose --profile llap logs -f
# in beeline as you suggested:
set hive.mapred.mode=nonstrict;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- Check with update
drop table if exists test_major_minor;
CREATE TABLE test_major_minor (
name VARCHAR(50),
age TINYINT,
num_clicks BIGINT)
stored as orc TBLPROPERTIES ("transactional"="true",
'NO_AUTO_COMPACTION'='true');
insert into test_major_minor values ('amy', 35, 12341234);
insert into test_major_minor values ('bob', 66, 1234712348712);
insert into test_major_minor values ('cal', 21, 431);
insert into test_major_minor values ('fse', 28, 8456);
--before compaction
SELECT * FROM test_major_minor;
ALTER TABLE test_major_minor COMPACT 'major' ;
```
unfortunately, the full successful testing is blocked on
[HIVE-29566](https://issues.apache.org/jira/browse/HIVE-29566), but at least I
reached the `QueryCompactor` codepath with your patch, which is fine and
promising
however, there are some confusing things here, that need clarification
before moving on:
> HS2 uses a direct connection to the backend DB for Compaction
this is confusing, can you confirm if this is by design? does this mean that
on hive clusters, the backend RDBMS configs have to be replicated HS2 in order
to run query-based compaction?
depending on your answer, I can imagine 2 possible outcomes of this patch:
1. keep it as is, considering that CRUD major compaction is the primary
supported one (is it?)
2. remove CRUD compaction config with and all the postgres connection
configs from HS2 to simplify this patch
I'm fine with 1) if at least a comment in docker compose file can emphasize
that HS2 really needs postgres connection configs, otherwise, we cannot
guarantee that confusing folks in the future (like me now :) ) won't remove
this config
if we can agree on / clarify this, the patch is +1 from my side, as
including compaction is a great improvement for our wonderful docker-based
testing
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]