[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740855#comment-16740855 ]
Ankit Jain edited comment on LUCENE-8635 at 1/12/19 12:07 AM:
--------------------------------------------------------------

The Excel sheet is pretty big, so I'm not sure pasting it here is a good idea. You have a good point about moving FSTs off-heap in the default codec, as we can always preload the mmap file during index open, as demonstrated [here|https://www.elastic.co/guide/en/elasticsearch/reference/master/_pre_loading_data_into_the_file_system_cache.html].

I ran the test suite and a couple of tests seem to fail, though they don't seem to have anything to do with my change:

[junit4] Tests with failures [seed: 1D3ADDF6AE377902]:
[junit4]   - org.apache.solr.cloud.autoscaling.ScheduledMaintenanceTriggerTest.testInactiveShardCleanup
[junit4]   - org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.testTrigger
[junit4]
[junit4]
[junit4] JVM J0: 1.40 .. 4359.18 = 4357.78s
[junit4] JVM J1: 1.40 .. 4359.35 = 4357.95s
[junit4] JVM J2: 1.40 .. 4359.30 = 4357.90s
[junit4] Execution time total: 1 hour 12 minutes 40 seconds
[junit4] Tests summary: 833 suites (7 ignored), 4024 tests, 2 failures, 286 ignored (153 assumptions)

Details for failing tests

NOTE: reproduce with: ant test -Dtestcase=ScheduledTriggerTest -Dtests.method=testTrigger -Dtests.seed=1D3ADDF6AE377902 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=mr-IN -Dtests.timezone=America/St_Lucia -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 9.03s J2 | ScheduledTriggerTest.testTrigger <<<
[junit4] > Throwable #1: java.lang.AssertionError: expected:<3> but was:<2>
[junit4] > at __randomizedtesting.SeedInfo.seed([1D3ADDF6AE377902:7EF1EB7437F80A2F]:0)
[junit4] > at org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.scheduledTriggerTest(ScheduledTriggerTest.java:113)
[junit4] > at org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.testTrigger(ScheduledTriggerTest.java:66)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:564)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:844)

NOTE: reproduce with: ant test -Dtestcase=ScheduledMaintenanceTriggerTest -Dtests.method=testInactiveShardCleanup -Dtests.seed=1D3ADDF6AE377902 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ha -Dtests.timezone=America/Nome -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 2.01s J0 | ScheduledMaintenanceTriggerTest.testInactiveShardCleanup <<<
at __randomizedtesting.SeedInfo.seed([1D3ADDF6AE377902:161D84CF745E09]:0)
[junit4] > at org.apache.solr.cloud.CloudTestUtils.waitForState(CloudTestUtils.java:70)
[junit4] > at org.apache.solr.cloud.autoscaling.ScheduledMaintenanceTriggerTest.testInactiveShardCleanup(ScheduledMaintenanceTriggerTest.java:167)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] > at java.base/java.lang.reflect.Method.invoke(Method.java:564)
[junit4] > at java.base/java.lang.Thread.run(Thread.java:844)
[junit4] > Caused by: java.util.concurrent.TimeoutException: last state: DocCollection(ScheduledMaintenanceTriggerTest_collection1//clusterstate.json/6)={

> Lazy loading Lucene FST offheap using mmap
> ------------------------------------------
>
>                 Key: LUCENE-8635
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8635
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/FSTs
>         Environment: I used the setup below for es_rally tests:
> single node i3.xlarge running ES 6.5
> es_rally was running on another i3.xlarge instance
>            Reporter: Ankit Jain
>            Priority: Major
>         Attachments: offheap.patch, rally_benchmark.xlsx
>
>
> Currently, the FST loads all the terms into heap memory during index open. This
> causes frequent JVM OOM issues if the term size gets big. A better way of
> doing this would be to lazily load the FST using mmap. That ensures only the
> required terms get loaded into memory.
>
> Lucene can expose an API for providing a list of fields to load terms offheap.
> I'm planning to take the following approach for this:
> # Add a boolean property fstOffHeap in FieldInfo
> # Pass the list of offheap fields to Lucene during index open (ALL can be a
> special keyword for loading ALL fields offheap)
> # Initialize the fstOffHeap property during Lucene index open
> # FieldReader invokes the default FST constructor or the OffHeap constructor
> based on the fstOffHeap field
>
> I created a patch (that loads all fields offheap), did some benchmarks using
> es_rally, and the results look good.
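
The approach in the issue description (a per-field flag that selects between loading onto the heap and lazy access through mmap) can be sketched roughly as below. This is only an illustration using plain java.nio, not the attached patch and not Lucene's API: the class name OffHeapChoiceSketch and the methods isOffHeap and open are hypothetical, and the file contents stand in for FST bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;

/** Hypothetical illustration only; none of these names exist in Lucene. */
public class OffHeapChoiceSketch {

    /** Special keyword from the proposal: treat every field as off-heap. */
    static final String ALL = "ALL";

    /** Stands in for the proposed per-field fstOffHeap flag. */
    static boolean isOffHeap(String field, Set<String> offHeapFields) {
        return offHeapFields.contains(ALL) || offHeapFields.contains(field);
    }

    /**
     * Returns a read-only view of the file's bytes.
     * Off-heap: the file is memory-mapped, so the OS pages data in only
     * when it is first touched. On-heap: the whole file is copied into a
     * byte[] up front.
     */
    static ByteBuffer open(Path file, boolean offHeap) throws IOException {
        if (offHeap) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // The mapping remains valid after the channel is closed.
                return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            }
        }
        return ByteBuffer.wrap(Files.readAllBytes(file)).asReadOnlyBuffer();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fst-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] {10, 20, 30, 40});

        // Assumed configuration: only the "title" field is off-heap.
        Set<String> offHeapFields = Set.of("title");
        for (String field : new String[] {"title", "body"}) {
            ByteBuffer data = open(file, isOffHeap(field, offHeapFields));
            System.out.println(field + ": direct=" + data.isDirect()
                    + ", byte at 2=" + data.get(2));
        }
    }
}
```

Both branches return the same ByteBuffer type, so a caller reads the data the same way in either mode; only the memory it occupies differs (Java heap versus file system cache).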