[ https://issues.apache.org/jira/browse/UNOMI-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742909#comment-17742909 ]
Kevan Jahanshahi edited comment on UNOMI-784 at 7/13/23 6:51 PM:
-----------------------------------------------------------------

This is due to the HLRC socket timeout config.
We cannot use the Elasticsearch task system to handle this because it is not available in our client version: [https://github.com/elastic/elasticsearch/pull/58552]
It was introduced in HLRC version 8, and due to licensing we cannot upgrade.

This is still a problem: long-running processes should be better handled by Unomi, but that is a broader matter than just this isolated case.

Nonetheless, we have other options. This timeout is configurable in Unomi using:
ENV: *UNOMI_ELASTICSEARCH_CLIENT_SOCKET_TIMEOUT*
Currently it does not have any default value, so HLRC falls back to its own default of 30 sec. I raised the default to 80 sec.

I also implemented a test that checks that, even in case of a SocketTimeoutException, the data is correctly saved and consistent on the ES side, which finishes the requested operation anyway.

PR: [https://github.com/apache/unomi/pull/636]
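To illustrate the configuration described in the comment above, here is a minimal sketch of resolving the socket timeout from the environment with an 80 sec fallback. It is not the code from the PR: the class and method names are hypothetical, and it assumes the variable holds a value in milliseconds, which the comment does not state.

```java
import java.util.Map;

// Hypothetical helper, for illustration only; not taken from the Unomi code base.
final class SocketTimeoutConfig {

    // Name of the environment variable mentioned in the comment.
    static final String ENV_NAME = "UNOMI_ELASTICSEARCH_CLIENT_SOCKET_TIMEOUT";

    // Fallback of 80 sec, matching the new default described in the comment.
    // Assumption: the value is expressed in milliseconds.
    static final int DEFAULT_SOCKET_TIMEOUT_MS = 80_000;

    // Returns the configured timeout, or the fallback when the variable is
    // missing, empty, not a number, or not strictly positive.
    static int resolveSocketTimeoutMs(Map<String, String> env) {
        String raw = env.get(ENV_NAME);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_SOCKET_TIMEOUT_MS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_SOCKET_TIMEOUT_MS;
        } catch (NumberFormatException e) {
            return DEFAULT_SOCKET_TIMEOUT_MS;
        }
    }

    public static void main(String[] args) {
        int timeoutMs = resolveSocketTimeoutMs(System.getenv());
        // With the Elasticsearch low-level REST client, a value like this would
        // typically be applied through RestClientBuilder.setRequestConfigCallback,
        // calling setSocketTimeout(timeoutMs) on the RequestConfig.Builder.
        System.out.println("Socket timeout (ms): " + timeoutMs);
    }
}
```

Falling back on invalid input rather than failing keeps a mistyped variable from blocking startup; whether Unomi behaves this way is not stated in the comment.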
> Timeout on updateByQuery request such as scoring update
> -------------------------------------------------------
>
> Key: UNOMI-784
> URL: https://issues.apache.org/jira/browse/UNOMI-784
> Project: Apache Unomi
> Issue Type: Bug
> Components: unomi(-core)
> Affects Versions: unomi-2.3.0, unomi-1.9.0
> Reporter: Serge Huber
> Assignee: Kevan Jahanshahi
> Priority: Major
> Fix For: unomi-2.4.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Since we refactored updates that were previously done with scroll queries to
> use updateByQuery requests, on systems with large data sets we are now hitting
> timeouts while waiting for the updates to complete.
> Here is an example of such an error:
> {code}
> Error while executing in class loader
> java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-296 [ACTIVE]
>     at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:773) ~[!/:?]
>     at org.elasticsearch.client.RestClient.performRequest(RestClient.java:218) ~[!/:?]
>     at org.elasticsearch.client.RestClient.performRequest(RestClient.java:205) ~[!/:?]
>     at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1454) ~[!/:?]
>     at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1424) ~[!/:?]
>     at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1394) ~[!/:?]
>     at org.elasticsearch.client.RestHighLevelClient.updateByQuery(RestHighLevelClient.java:554) ~[!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl$10.execute(ElasticSearchPersistenceServiceImpl.java:1023) ~[!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl$10.execute(ElasticSearchPersistenceServiceImpl.java:1005) ~[!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl$InClassLoaderExecute.executeInClassLoader(ElasticSearchPersistenceServiceImpl.java:2267) [!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl$InClassLoaderExecute.catchingExecuteInClassLoader(ElasticSearchPersistenceServiceImpl.java:2278) [!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl.updateWithQueryAndScript(ElasticSearchPersistenceServiceImpl.java:1050) [!/:?]
>     at org.apache.unomi.persistence.elasticsearch.ElasticSearchPersistenceServiceImpl.updateWithQueryAndStoredScript(ElasticSearchPersistenceServiceImpl.java:1001) [!/:?]
>     at Proxy952471af_195a_464e_aca9_365c7d2c5bac.updateWithQueryAndStoredScript(Unknown Source) [?:?]
>     at org.apache.unomi.services.impl.segments.SegmentServiceImpl.updateExistingProfilesForScoring(SegmentServiceImpl.java:1192) [!/:?]
>     at org.apache.unomi.services.impl.segments.SegmentServiceImpl.recalculatePastEventConditions(SegmentServiceImpl.java:982) [!/:?]
>     at org.apache.unomi.services.impl.segments.SegmentServiceImpl$1.run(SegmentServiceImpl.java:1215) [!/:?]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at java.lang.Thread.run(Thread.java:829) [?:?]
> Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-296 [ACTIVE]
>     at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[?:?]
>     at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[?:?]
>     at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[?:?]
>     at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[?:?]
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[?:?]
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[?:?]
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[?:?]
>     at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[?:?]
>     at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[?:?]
>     at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[?:?]
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)