I divided the query into 1000 pieces and removed the parallel stream clause. It
seems to be working without timeouts so far; if it times out again, I guess I can
just divide it into even smaller pieces.
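For anyone curious, this is roughly how I generate the pieces from Java. The expression template, collection names, and the `{!hash}` partitioning filter below are placeholders standing in for my real query, so treat it as a sketch rather than my exact code:

```java
import java.util.ArrayList;
import java.util.List;

public class PieceGenerator {
    // Hypothetical per-piece expression template; the real stream is much
    // bigger. Each piece adds one extra filter query selecting a partition.
    static final String TEMPLATE =
        "update(DNM, batchSize=1000, search(sourceCollection, q=\"*:*\", "
      + "qt=\"/export\", fq=\"%s\", sort=\"id_str asc\", fl=\"id_str\"))";

    // Build one expression string per piece. Partitioning with {!hash} is
    // one possible way to split; ids could also be split by prefix or range.
    static List<String> pieces(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String fq = String.format("{!hash workers=%d worker=%d}", n, i);
            out.add(String.format(TEMPLATE, fq));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> p = pieces(1000);
        System.out.println(p.size());
        System.out.println(p.get(0));
    }
}
```

Then I just loop over the list and POST each expression to /stream in series.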

I tried to send all 1000 pieces in a “list” expression to be executed sequentially.
It didn’t work, but I was just curious whether it could handle such a large query 😃
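The shape I tried was roughly this (a sketch; the real per-piece streams are elided):

```
list(
    update(DNM, batchSize=1000, ...piece 1 stream...),
    update(DNM, batchSize=1000, ...piece 2 stream...),
    ...
    update(DNM, batchSize=1000, ...piece 1000 stream...)
)
```

With 1000 entries the resulting expression string was enormous, which is presumably why it fell over.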

Now I’m just generating expression strings from Java code and sending them one
by one. I tried to use SolrJ for this, but encountered a weird problem where
even the simplest expression (echo) stops working after a few iterations in a
loop. I’m guessing the underlying HttpClient is not closing connections in a
timely manner and is hitting the OS per-host connection limit. I asked a
separate question about this. I was following the example on Lucidworks:
https://lucidworks.com/post/streaming-expressions-in-solrj/
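For reference, this is roughly the loop body I had. The URL and collection are placeholders, and it needs a running Solr, so it’s only a sketch of what I was doing, not a known-good fix; I had hoped try-with-resources on the stream would be enough to release the connection on every iteration:

```java
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SendExpressions {
    // Send one streaming expression to the /stream handler and drain it.
    static void send(String expression) throws Exception {
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("expr", expression);
        params.set("qt", "/stream");
        // try-with-resources should close the stream (and ideally the
        // underlying connection) each iteration -- yet the loop still
        // stalled after a few iterations for me
        try (SolrStream stream =
                 new SolrStream("http://localhost:8983/solr/DNM", params)) {
            stream.open();
            Tuple tuple;
            while (!(tuple = stream.read()).EOF) {
                // consume each tuple as it arrives
            }
        }
    }
}
```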

I just modified my code to make regular REST calls via okhttp3. It’s a shame
that I couldn’t use SolrJ, since it truly streams every result one by one,
continuously; REST just returns a single large response at the very end of the
stream.

Thanks again for your help.

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: 02 March 2021 00:19
To: solr-user@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

Also the parallel function builds hash partitioning filters that could lead
to timeouts if they take too long to build. Try the query without the
parallel function if you're still getting timeouts when making the query
smaller.



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <joels...@gmail.com> wrote:

> The settings in your version are 30 seconds and 15 seconds for socket and
> connection timeouts.
>
> Typically timeouts occur because one or more shards in the query are idle
> beyond the timeout threshold. This happens because lots of data is being
> read from other shards.
>
> Breaking the query into small parts would be a good strategy.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uyil...@vivaldi.net.invalid>
> wrote:
>
>> Hello Mr. Bernstein,
>>
>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>> timeouts and they are bound to happen in such a large stream. Should I just
>> reduce the output of my search expressions?
>>
>> Maybe I can split my search results into ~100 parts and run the same
>> query 100 times in series. Each part would emit ~3M documents so they
>> should finish before timeout?
>>
>> Is this a reasonable solution?
>>
>> Btw how long is the default hard-coded timeout value? Because yesterday I
>> ran another query which took more than 1 hour without any timeouts and
>> finished successfully.
>>
>>
>> From: Joel Bernstein
>> Sent: 01 March 2021 23:03
>> To: solr-user@lucene.apache.org
>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>
>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>
>> https://issues.apache.org/jira/browse/SOLR-14672
>>
>> This unfortunately missed the 8.8 release and will be in 8.9.
>>
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <joels...@gmail.com> wrote:
>>
>> > What version are you using?
>> >
>> > Solr 8.7 has changes that caused these errors to hit the logs. These
>> used
>> > to be suppressed. This has been fixed in Solr 9.0 but it has not been
>> back
>> > ported to Solr 8.x.
>> >
>> > The errors are actually normal operational occurrences when doing joins
>> so
>> > should be suppressed in the logs and were before the specific release.
>> >
>> > It might make sense to do a release that specifically suppresses these
>> > errors without backporting the full Solr 9.0 changes which impact the
>> > memory footprint of export.
>> >
>> >
>> >
>> >
>> > Joel Bernstein
>> > http://joelsolr.blogspot.com/
>> >
>> >
>> > On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyil...@vivaldi.net.invalid
>> >
>> > wrote:
>> >
>> >> Hello all,
>> >>
>> >> I’m running a large streaming expression and feeding the result to
>> update
>> >> expression.
>> >>
>> >>  update(targetCollection, ...long running stream here...,
>> >>
>> >> I tried sending the exact same query multiple times; sometimes it works
>> >> and indexes some results before giving an exception, other times it fails
>> >> with an exception after 2 minutes.
>> >>
>> >> Response is like:
>> >> "EXCEPTION":"java.util.concurrent.ExecutionException:
>> >> java.io.IOException: params distrib=false&numWorkers=4.... and my long
>> >> stream expression
>> >>
>> >> Server log (short):
>> >> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 120000/120000
>> >> ms
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 120000/120000
>> >> ms
>> >>
>> >> I tried to increase the Jetty idle timeout value on the node which hosts
>> >> my target collection to something like an hour. It had no effect.
>> >>
>> >>
>> >> Server logs (long)
>> >> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>> >> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>> >> solr-01    |    at
>> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>> >> solr-01    |    at
>> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>> >> solr-01    |    at
>> >> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>> >> solr-01    |    at
>> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>> >> solr-01    |    at
>> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>> >> solr-01    |    at
>> >> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>> >> solr-01    |    at java.base/java.io
>> >> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>> >> solr-01    |    at
>> >> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>> >> solr-01    |    at
>> >> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>> >> solr-01    |    at
>> >> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>> >> solr-01    |    at
>> >> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>> >> solr-01    |    at
>> >> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>> >> solr-01    |    at
>> >>
>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>> >> --
>> >> solr-01    |    at org.eclipse.jetty.io
>> >> .FillInterest.fillable(FillInterest.java:103)
>> >> solr-01    |    at org.eclipse.jetty.io
>> >> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>> >> solr-01    |    at
>> >>
>> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>> >> solr-01    |    at java.base/java.lang.Thread.run(Thread.java:834)
>> >> solr-01    | Caused by: java.util.concurrent.TimeoutException: Idle
>> >> timeout expired: 120000/120000 ms
>> >> solr-01    |    at org.eclipse.jetty.io
>> >> .IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>> >> solr-01    |    at org.eclipse.jetty.io
>> >> .IdleTimeout.idleCheck(IdleTimeout.java:113)
>> >> solr-01    |    at
>> >>
>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>> >> solr-01    |    at
>> >> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> >> solr-01    |    at
>> >>
>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>> >> solr-01    |    at
>> >>
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>> >> solr-01    |    at
>> >>
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>> >> solr-01    |    ... 1 more
>> >>
>> >>
>> >> My expression, in case it helps. To summarize, it finds the document ids
>> >> which exist on sourceCollection but not on the target collection (DNM).
>> >> It joins on itself to duplicate some fields (I couldn’t find another way
>> >> to duplicate the value of a field into 2 fields), then sends the result
>> >> to update. The source collection has about 300M documents, 24GB heap, 2
>> >> shards, 2 replicas of each shard.
>> >>
>> >> update(
>> >>     DNM,
>> >>     batchSize=1000,
>> >>     parallel(
>> >>         WorkerCollection,
>> >>         leftOuterJoin(
>> >>             fetch(
>> >>                 sourceCollection,
>> >>                 complement(
>> >>                     search(
>> >>                         sourceCollection,
>> >>                         q="*:*",
>> >>                         qt="/export",
>> >>                         fq="...some filters...",
>> >>                         sort="id_str asc",
>> >>                         fl="id_str",
>> >>                         partitionKeys="id_str"
>> >>                     ),
>> >>                     search(
>> >>                         DNM,
>> >>                         q="*:*",
>> >>                         qt="/export",
>> >>                         sort="id_str asc",
>> >>                         fl="id_str",
>> >>                         partitionKeys="id_str"
>> >>                     ),
>> >>                     on="id_str"
>> >>                 ),
>> >>                 fl="...my many fields...",
>> >>                 on="id_str",
>> >>                 batchSize="1000"
>> >>             ),
>> >>             select(
>> >>                 fetch(
>> >>                     sourceCollection,
>> >>                     complement(
>> >>                         search(
>> >>                             sourceCollection,
>> >>                             q="*:*",
>> >>                             qt="/export",
>> >>                             fq="...some other filters...",
>> >>                             sort="id_str asc",
>> >>                             fl="id_str",
>> >>                             partitionKeys="id_str"
>> >>                         ),
>> >>                         search(
>> >>                             DNM,
>> >>                             q="*:*",
>> >>                             qt="/export",
>> >>                             sort="id_str asc",
>> >>                             fl="id_str",
>> >>                             partitionKeys="id_str"
>> >>                         ),
>> >>                         on="id_str"
>> >>                     ),
>> >>                     fl="...some other fields...",
>> >>                     on="id_str",
>> >>                     batchSize="1000"
>> >>                 ),
>> >>                 id_str, ..some other fields as...
>> >>             ),
>> >>             on="id_str"
>> >>         ),
>> >>         workers="4", sort="id_str asc"
>> >>     )
>> >> )
>> >>
>> >>
>> >>
>>
>>
