[
https://issues.apache.org/jira/browse/LUCENE-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394797#comment-15394797
]
Michael McCandless commented on LUCENE-7396:
--------------------------------------------
I added a System.out.println (sop) to report how long it takes to flush each field's points.
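A minimal sketch of this kind of per-field timing, for readers who want to reproduce the measurement. The class name `FlushTimer` and the empty `Runnable` are hypothetical stand-ins for the real per-field flush call; this is not the code that produced the logs below.

```java
class FlushTimer {
  /** Runs the task and returns the elapsed wall-clock time in milliseconds. */
  static double timeMsec(Runnable task) {
    long start = System.nanoTime();
    task.run();
    return (System.nanoTime() - start) / 1_000_000.0;
  }

  /** Builds a line in the same shape as the FIELD lines in the logs. */
  static String format(String field, double msec) {
    return "FIELD " + field + ": " + msec + " msec";
  }

  public static void main(String[] args) {
    // Placeholder task: in a real run this would flush one field's points.
    double msec = timeMsec(() -> { });
    System.out.println(format("surcharge", msec));
  }
}
```

`System.nanoTime()` is used rather than `System.currentTimeMillis()` because it is monotonic and meant for measuring elapsed time.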
Trunk:
{noformat}
127.0.0.1: IW 0 [2016-07-26T23:44:01.962Z; LuceneIndexing-1-thread-1]: 2047 msec to write docValues
127.0.0.1: FIELD surcharge: 2218.62956 msec
127.0.0.1: FIELD pick_up_lon: 1968.539463 msec
127.0.0.1: FIELD mta_tax: 1967.196068 msec
127.0.0.1: FIELD fare_amount: 2179.47771 msec
127.0.0.1: FIELD trip_distance: 2042.19908 msec
127.0.0.1: FIELD pick_up_lat: 2002.735026 msec
127.0.0.1: FIELD drop_off_date_time: 1315.330631 msec
127.0.0.1: FIELD tolls_amount: 1942.725832 msec
127.0.0.1: FIELD passenger_count: 797.681094 msec
127.0.0.1: FIELD drop_off_lon: 1937.206549 msec
127.0.0.1: FIELD total_amount: 1987.058896 msec
127.0.0.1: FIELD pick_up_date_time: 1296.005252 msec
127.0.0.1: FIELD tip_amount: 2063.410329 msec
127.0.0.1: FIELD drop_off_lat: 2045.020601 msec
127.0.0.1: IW 0 [2016-07-26T23:44:27.726Z; LuceneIndexing-1-thread-1]: 25764 msec to write points
{noformat}
and patch:
{noformat}
127.0.0.1: IW 0 [2016-07-26T23:49:54.494Z; LuceneIndexing-1-thread-1]: 2033 msec to write docValues
127.0.0.1: FIELD surcharge: 2137.926903 msec
127.0.0.1: FIELD pick_up_lon: 2511.391725 msec
127.0.0.1: FIELD mta_tax: 2144.822578 msec
127.0.0.1: FIELD fare_amount: 3232.977894 msec
127.0.0.1: FIELD trip_distance: 2545.771801 msec
127.0.0.1: FIELD pick_up_lat: 2939.796276 msec
127.0.0.1: FIELD drop_off_date_time: 1272.857191 msec
127.0.0.1: FIELD tolls_amount: 2042.863782 msec
127.0.0.1: FIELD passenger_count: 565.551751 msec
127.0.0.1: FIELD drop_off_lon: 2493.79608 msec
127.0.0.1: FIELD total_amount: 2596.043882 msec
127.0.0.1: FIELD pick_up_date_time: 1308.397927 msec
127.0.0.1: FIELD tip_amount: 2316.962831 msec
127.0.0.1: FIELD drop_off_lat: 2748.935673 msec
127.0.0.1: IW 0 [2016-07-26T23:50:25.415Z; LuceneIndexing-1-thread-1]: 30920 msec to write points
{noformat}
This is using 1 thread with a 1 GB IndexWriter (IW) RAM buffer.
Some fields take a similar time in both runs, but others differ sizably (e.g.
the lat/lon points, which are all slower with the patch) ... odd.
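To put numbers on the difference, the relative change per field can be computed directly from the two logs. The sketch below uses a handful of the values reported above; the class name `FlushDelta` and the choice of fields are only for illustration.

```java
import java.util.Locale;

class FlushDelta {
  /** Percentage change from trunk to patch; positive means the patch is slower. */
  static double pctChange(double trunkMsec, double patchMsec) {
    return 100.0 * (patchMsec - trunkMsec) / trunkMsec;
  }

  public static void main(String[] args) {
    // Values copied from the trunk and patch logs above.
    String[] fields = {"total (write points)", "pick_up_lat", "fare_amount", "passenger_count"};
    double[] trunk = {25764, 2002.735026, 2179.47771, 797.681094};
    double[] patch = {30920, 2939.796276, 3232.977894, 565.551751};
    for (int i = 0; i < fields.length; i++) {
      System.out.printf(Locale.ROOT, "%-22s %+.1f%%%n", fields[i], pctChange(trunk[i], patch[i]));
    }
  }
}
```

Overall, the time to write points goes from 25764 to 30920 msec, roughly 20% slower with the patch, while passenger_count is one of the few fields that gets faster.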
> Speed up flush of 1-dimension points
> ------------------------------------
>
> Key: LUCENE-7396
> URL: https://issues.apache.org/jira/browse/LUCENE-7396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7396.patch, LUCENE-7396.patch
>
>
> 1D points already have an optimized merge implementation which works when
> points come in order. So maybe we could make IndexWriter's PointValuesWriter
> sort before feeding the PointsFormat and somehow propagate the information to
> the PointsFormat?
> The benefit is that flushing could directly stream points to disk with little
> memory usage.
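The idea in the description can be sketched in plain Java as follows: sort the buffered 1D points by value, then hand them to the writer in order. This is only an illustration under stated assumptions. `SortedPointFlush` and `PointVisitor` are hypothetical names, values are modelled as `long` rather than encoded byte arrays, and none of this is Lucene's actual `PointValuesWriter` or `PointsFormat` API.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortedPointFlush {
  /** Hypothetical sink that receives points one at a time, in sorted order. */
  interface PointVisitor {
    void visit(int docID, long value);
  }

  /** Sorts buffered points by value (ties broken by docID) and streams them to the visitor. */
  static void flushSorted(int[] docIDs, long[] values, PointVisitor visitor) {
    // Sort an array of ordinals instead of moving the buffered data itself.
    Integer[] ord = new Integer[values.length];
    for (int i = 0; i < ord.length; i++) {
      ord[i] = i;
    }
    Arrays.sort(ord,
        Comparator.<Integer>comparingLong(i -> values[i]).thenComparingInt(i -> docIDs[i]));
    // The consumer now sees points in value order and can write them sequentially.
    for (int i : ord) {
      visitor.visit(docIDs[i], values[i]);
    }
  }

  public static void main(String[] args) {
    flushSorted(new int[] {0, 1, 2, 3}, new long[] {5, 1, 5, 3},
        (docID, value) -> System.out.println("doc=" + docID + " value=" + value));
  }
}
```

Because the consumer receives points already in order, it does not need to buffer and re-sort them itself, which is where the memory saving described above would come from.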
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)