[
https://issues.apache.org/jira/browse/PHOENIX-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17777167#comment-17777167
]
fanartoria commented on PHOENIX-7064:
-------------------------------------
[~vjasani]
Here is my test case:
{code:java}
table columns: 3 pks + 30 columns
indexes: 10 indexes
test data: 100,000 rows{code}
DDL: global index [^ddl-global.sql]; local index [^ddl-local.sql]
Data generator shell script (100,000 rows): [^gen-data.sh]
Test result summary
{code:bash}
# upsert data using psql.py
./bin/psql.py -t TESTTABLE_LOCAL -d ' ' test-data.csv
# local
## first upsert
CSV Upsert complete. 100000 rows upserted
Time: 76.36 sec(s)
## second upsert
CSV Upsert complete. 100000 rows upserted
Time: 158.516 sec(s)
# global
## first
CSV Upsert complete. 100000 rows upserted
Time: 46.15 sec(s)
## second
CSV Upsert complete. 100000 rows upserted
Time: 61.516 sec(s)
# local with test patch
## first
CSV Upsert complete. 100000 rows upserted
Time: 39.279 sec(s)
## second
CSV Upsert complete. 100000 rows upserted
Time: 55.167 sec(s)
{code}
In every configuration the second upsert is slower than the first, because
there is extra index prepare logic to process.
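For easier comparison, the timings above can be reduced to throughput and speedup figures. The following is only a minimal sketch of that arithmetic: the class and method names (UpsertTimings, rowsPerSec, speedup) are made up for illustration and are not part of Phoenix, and the numbers are copied from the psql.py output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UpsertTimings {
    // Rows per run, taken from the test setup (100,000 rows).
    static final long ROWS = 100_000;

    /** Rows written per second for a single run. */
    static double rowsPerSec(long rows, double seconds) {
        return rows / seconds;
    }

    /** How many times faster the candidate run is than the baseline run. */
    static double speedup(double baselineSec, double candidateSec) {
        return baselineSec / candidateSec;
    }

    public static void main(String[] args) {
        // {first upsert, second upsert} in seconds, as reported by psql.py.
        Map<String, double[]> runs = new LinkedHashMap<>();
        runs.put("local", new double[] {76.36, 158.516});
        runs.put("global", new double[] {46.15, 61.516});
        runs.put("local+patch", new double[] {39.279, 55.167});

        // Unpatched local index is the baseline for the speedup columns.
        double[] baseline = runs.get("local");
        for (Map.Entry<String, double[]> e : runs.entrySet()) {
            double[] t = e.getValue();
            System.out.printf(
                "%-12s first: %.0f rows/s, second: %.0f rows/s, vs local: %.2fx / %.2fx%n",
                e.getKey(),
                rowsPerSec(ROWS, t[0]), rowsPerSec(ROWS, t[1]),
                speedup(baseline[0], t[0]), speedup(baseline[1], t[1]));
        }
    }
}
```

By this arithmetic the test patch makes local-index upserts about 1.94x faster on the first run and about 2.87x faster on the second, which puts them slightly ahead of the global-index timings in both runs.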
> Prepare of local index mutations is extremely slow
> --------------------------------------------------
>
> Key: PHOENIX-7064
> URL: https://issues.apache.org/jira/browse/PHOENIX-7064
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.1.3
> Reporter: fanartoria
> Priority: Major
> Attachments: ddl-global.sql, ddl-local.sql, gen-data.sh,
> image-2023-10-09-17-29-47-856.png, image-2023-10-09-17-41-29-679.png
>
>
> When the data table has more than one index, preparing local index mutations
> is much slower than preparing global index mutations.
> Write performance should be better with local indexes.
> Here is the stack trace where most of the time is spent.
> !image-2023-10-09-17-29-47-856.png!
> It seems a LocalTableState object is created for each row when preparing
> index mutations.
> Compared with other ValueGetter implementations, LazyValueGetter may perform poorly.
> Why not use IndexMaintainer#createGetterFromKeyValues?
> Or combine the logic with the global index prepare path?