[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707411#comment-16707411 ]

Steph van Schalkwyk commented on CONNECTORS-1546:
-------------------------------------------------

That's in the codebase I sent to you. All removed.
Also, we don't need the ES version anymore, as that was the only thing it was used for.

> Optimize Elasticsearch performance by removing 'forcemerge'
> -----------------------------------------------------------
>
>                 Key: CONNECTORS-1546
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1546
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Elastic Search connector
>            Reporter: Hans Van Goethem
>            Assignee: Steph van Schalkwyk
>            Priority: Major
>             Fix For: ManifoldCF 2.12
>
>
> After crawling with ManifoldCF, forcemerge is applied to optimize the
> Elasticsearch index. This optimization makes Elasticsearch faster for
> read operations but not for write operations. On the contrary, performance on
> the write operations becomes worse after every forcemerge.
> Can you remove this forcemerge in ManifoldCF to optimize performance for
> recurrent crawling to Elasticsearch?
> If someone needs this forcemerge, it can be applied manually against
> Elasticsearch directly.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
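The issue description notes that a force merge can still be applied manually against Elasticsearch. A minimal sketch of what that might look like follows, assuming a node reachable at http://localhost:9200 and a hypothetical index named my-index. The helper forcemerge_request is illustrative only and is not part of ManifoldCF; it builds the POST request to the _forcemerge endpoint without sending it.

```python
import urllib.parse
import urllib.request

# Assumption: a local, unsecured Elasticsearch node.
ES_URL = "http://localhost:9200"


def forcemerge_request(index, max_num_segments=None, base_url=ES_URL):
    """Build (but do not send) a POST /<index>/_forcemerge request."""
    url = f"{base_url}/{urllib.parse.quote(index, safe='')}/_forcemerge"
    if max_num_segments is not None:
        # Optional query parameter: merge down to at most this many segments.
        url += "?" + urllib.parse.urlencode({"max_num_segments": max_num_segments})
    return urllib.request.Request(url=url, method="POST")


# "my-index" is a hypothetical index name used for illustration.
req = forcemerge_request("my-index", max_num_segments=1)

# To actually run the merge, send the request:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Keeping the merge as an explicit, operator-triggered call fits the point made in the thread: whether and when to merge is a decision for whoever runs the index, ideally once it is no longer being written to.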
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706804#comment-16706804 ]

Karl Wright commented on CONNECTORS-1546:
-----------------------------------------

Hi [~st...@remcam.net], can you let me know what happened to this? We're trying to get 2.12 ready for completion. Thanks!!
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672605#comment-16672605 ]

Karl Wright commented on CONNECTORS-1546:
-----------------------------------------

I didn't see a commit go by. Were you able to commit?
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672431#comment-16672431 ]

Steph van Schalkwyk commented on CONNECTORS-1546:
-------------------------------------------------

Removed.
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651950#comment-16651950 ]

Karl Wright commented on CONNECTORS-1546:
-----------------------------------------

I agree with your decision.
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651942#comment-16651942 ]

Steph van Schalkwyk commented on CONNECTORS-1546:
-------------------------------------------------

Hans is correct. I would remove it.
It can mess up merging later if not used correctly. It may also take a long time to complete.
I'm going to upload a patch or two soon and will remove it if you concur.

BTW, from the ES 6.4 doc:
"Force merge should only be called against *read-only indices*. Running force merge against a read-write index can cause very large segments to be produced (>5Gb per segment), and the merge policy *will never consider it for merging again until it mostly consists of deleted docs*. This can cause very large segments to remain in the shards."

But I agree. It isn't up to MCF to decide what to do, as it does impact ingesting.

Hans may want to try this before ingesting:

    PUT /_cluster/settings
    {"transient" : {"indices.store.throttle.type" : "none" }}

and after ingesting:

    PUT /_cluster/settings
    {"transient" : {"indices.store.throttle.type" : "merge" }}
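The two cluster-settings calls suggested in the comment above could be scripted roughly as follows. This is only a sketch: the node address http://localhost:9200 and the helper name throttle_request are assumptions for illustration, and the setting name is copied from the comment as written. That setting belongs to older Elasticsearch releases and may not be recognized by newer ones, so check the documentation for the version in use before sending anything.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node.
ES_URL = "http://localhost:9200"


def throttle_request(throttle_type, base_url=ES_URL):
    """Build (but do not send) the PUT /_cluster/settings request from the comment."""
    body = {"transient": {"indices.store.throttle.type": throttle_type}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Before ingesting: disable store throttling, as suggested in the comment.
before_ingest = throttle_request("none")
# After ingesting: restore throttling of merges.
after_ingest = throttle_request("merge")

# To actually apply a setting, send the request:
# with urllib.request.urlopen(before_ingest) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Because both values are set as transient, they would not survive a full cluster restart, which suits a temporary change made only for the duration of a crawl.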
[jira] [Commented] (CONNECTORS-1546) Optimize Elasticsearch performance by removing 'forcemerge'
[ https://issues.apache.org/jira/browse/CONNECTORS-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651761#comment-16651761 ]

Karl Wright commented on CONNECTORS-1546:
-----------------------------------------

Hi [~st...@remcam.net], can you comment on this?