Please ensure hive.stats.autogather is enabled as well.
On Fri, Nov 10, 2023, 2:57 PM Denys Kuzmenko wrote:
> `hive.iceberg.stats.source` controls where the stats should be sourced
> from. When it's set to iceberg (default), we should go directly to iceberg
> and bypass HMS.
>
`hive.iceberg.stats.source` controls where the stats should be sourced from.
When it's set to iceberg (default), we should go directly to iceberg and bypass
HMS.
Can you please check this property? We need to ensure it is set to true.
set hive.compute.query.using.stats=true;
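Putting the thread's suggestions together, the relevant session settings can be inspected and set like this (a sketch for Hive 4.x; `hive.iceberg.stats.source` may not exist on older builds):

```sql
-- Inspect the current values first:
SET hive.compute.query.using.stats;
SET hive.stats.autogather;
SET hive.iceberg.stats.source;

-- Answer count(*) and similar aggregates from stats:
SET hive.compute.query.using.stats=true;
-- Gather stats automatically on writes:
SET hive.stats.autogather=true;
-- Read table stats directly from Iceberg metadata, bypassing HMS:
SET hive.iceberg.stats.source=iceberg;
```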
In addition, it looks like the table created by Spark has a lot of data. Can you
create a new table, insert just a few values with Spark, and then run count(*)
against it?
STEP1:
CREATE TABLE USING SPARK:
CREATE TABLE IF NOT EXISTS test.dwd.test_trade_table(
`uni_order_id` string,
`data_from` bigint,
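Since the DDL above is cut off in the archive, a minimal self-contained version of the suggested repro might look like this (the column list is reduced to the two columns shown above, and the INSERT values are illustrative, not from the thread):

```sql
-- Spark SQL: create a small Iceberg table and insert a few rows
CREATE TABLE IF NOT EXISTS test.dwd.test_trade_table (
  uni_order_id STRING,
  data_from    BIGINT
) USING iceberg;

INSERT INTO test.dwd.test_trade_table
VALUES ('order_1', 1), ('order_2', 2), ('order_3', 3);

-- Hive: with the stats settings enabled, this count(*) should be
-- answered from metadata rather than by scanning data files
SELECT COUNT(*) FROM test.dwd.test_trade_table;
```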
Could you please provide detailed steps to reproduce this issue? e.g. how do
you create the table?
Thanks,
Butao Zhang
Replied Message
From: lisoda
Date: 11/9/2023 14:25
Subject: Re: Re: Re: Hive's performance for querying the Iceberg table is very poor.
Incidentally, I'm using a COW table, so there are no delete files.
On 2023-11-09 10:57:35, "Butao Zhang" wrote:
Hi lisoda. You can check this ticket
https://issues.apache.org/jira/browse/HIVE-27347, which uses Iceberg basic
stats to optimize count(*) queries. Note: it doesn't take effect if there are
delete files.
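A quick way to check whether the HIVE-27347 optimization kicks in is to look at the query plan (a sketch; the exact plan output varies by Hive version, and the table name is the one from this thread):

```sql
SET hive.iceberg.stats.source=iceberg;
SET hive.compute.query.using.stats=true;

-- When count(*) is answered from Iceberg snapshot stats, the plan
-- degenerates to a simple fetch task with no table scan stage:
EXPLAIN SELECT COUNT(*) FROM test.dwd.test_trade_table;
```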
Thanks,
Butao Zhang
Replied Message
From: lisoda
Date: 11/9/2023 10:43
Hi.
I am testing with the HIVE-4.0.0-BETA-1 version, and I am using a
location_based_table.
So far I have found that HIVE still can't push some queries down to METADATA,
e.g. COUNT(*).
Is HIVE 4.0.0-BETA-1 still unable to support this query pushdown?
On 2023-10-24 17:41:20, "Ayush Saxena" wrote:
HIVE-27734 is in progress; as I see, we have a POC attached to the ticket, so
we should have it in 2-3 weeks, I believe.
> Also, after the release of 4.0.0, will we be able to do all TPCDS queries
on ICEBERG except for normal HIVE tables?
Yep, I believe most of the TPCDS queries would be supported.
Thanks.
I would like to know if Hive currently supports pushdown to Iceberg table
partitions under JOIN conditions.
Because I see HIVE-27734 is not yet complete, what is its progress so far?
Also, after the release of 4.0.0, will we be able to do all TPCDS queries on
ICEBERG except for normal HIVE tables?
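For context, the pattern being asked about is dynamic partition pruning under a join, where the partition filter on the Iceberg fact table is only known at runtime from the joined dimension rows (table and column names below are illustrative, not from the thread):

```sql
-- fact table partitioned by ds; the effective ds filter comes from dim
SELECT f.uni_order_id, d.ds
FROM   iceberg_fact f
JOIN   dim_dates d ON f.ds = d.ds
WHERE  d.is_holiday = true;

-- Without join pushdown (HIVE-27734), Hive scans every partition of
-- iceberg_fact; with it, only partitions matching the d.ds values
-- from the filtered dimension rows are read.
```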
Hi Lisoda,
The iceberg jar for hive 3.1.3 doesn't have a lot of changes; we did a
bunch of improvements on the 4.x line for Hive-Iceberg. You can give
iceberg a try on the 4.0.0-beta-1 release mentioned here [1]; we have a
bunch of improvements like vectorization and stuff like that. If you wanna
Too bad. Tencent Games used StarRocks with Apache Iceberg to power their
analytics.
https://medium.com/starrocks-engineering/tencent-games-inside-scoop-the-road-to-cloud-native-with-starrocks-d7dcb2438e25.
On Mon, Oct 23, 2023 at 10:55 AM lisoda wrote:
> We are not going to use starrocks.
>
We are not going to use starrocks.
MPP architecture databases have natural limitations, and StarRocks does not
necessarily perform better than Hive LLAP.
Replied Message
From: Albert Wong
Date: 10/24/2023 01:39
To: user@hive.apache.org
Subject: Re: Hive's
I would try http://starrocks.io. StarRocks is an MPP OLAP database that
can query Apache Iceberg and we can cache the data for faster performance.
We also have additional features like building materialized views that span
across Apache Iceberg, Apache Hudi and Apache Hive. Here is a video of
Hi Team.
I was recently testing Hive querying an Iceberg table, and I found that the
query performance is very poor, almost impossible to use in a production
environment. Also, JOIN conditions cannot be pushed down to the Iceberg
partitions.
I'm using the 1.3.1 Hive