Re: IOException occured when talking to server

2020-08-17 Thread Odysci
Dominique,
Thanks, but I'm not sure the links you sent point to an actual solution.
The Nginx logs sometimes show a 499 return code, which is:
"499 Client Closed Request - used when the client has closed the request
before the server could send a response."

However, the timestamps of these log messages do not coincide with the
IOException, so I'm not sure they are related.
Reinaldo

On Mon, Aug 17, 2020 at 12:59 PM Dominique Bejean 
wrote:

> Hi,
>
> It looks like these issues:
> https://github.com/eclipse/jetty.project/issues/4883
> https://github.com/eclipse/jetty.project/issues/2571
>
> The Nginx server closed the connection. Any info in the nginx log?
>
> Dominique
>
> On Mon, Aug 17, 2020 at 17:33, Odysci wrote:
>
>> Hi,
>> thanks for the reply.
>> We're using solr 8.3.1, ZK 3.5.6
>> The stacktrace is below.
>> The address on the first line "http://192.168.15.10:888/solr/mycollection"
>> is the "server" address in my nginx configuration, which points to 2
>> upstream solr nodes. There were no other solr or ZK messages in the logs.
>>
>> StackTrace:
>> (Msg = IOException occured when talking to server at:
>> http://192.168.15.10:888/solr/mycollection)
>> org.apache.solr.client.solrj.SolrServerException: IOException occured
>> when talking to server at: http://192.168.15.10:888/solr/mycollection
>> at
>> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:418)
>> at
>> org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
>> at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
>> at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1035)
>> at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1051)
>> ... calls from our code
>> ... calls from our code
>> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>> at
>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>> at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>> at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>> at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.nio.channels.AsynchronousCloseException
>> at
>> org.eclipse.jetty.http2.client.http.HttpConnectionOverHTTP2.close(HttpConnectionOverHTTP2.java:144)
>> at
>> org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2.onClose(HttpClientTransportOverHTTP2.java:170)
>> at
>> org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2$SessionListenerPromise.onClose(HttpClientTransportOverHTTP2.java:232)
>> at org.eclipse.jetty.http2.api.Session$Listener.onClose(Session.java:206)
>> at
>> org.eclipse.jetty.http2.HTTP2Session.notifyClose(HTTP2Session.java:1153)
>> at org.eclipse.jetty.http2.HTTP2Session.onGoAway(HTTP2Session.java:438)
>> at
>> org.eclipse.jetty.http2.parser.Parser$Listener$Wrapper.onGoAway(Parser.java:392)
>> at
>> org.eclipse.jetty.http2.parser.BodyParser.notifyGoAway(BodyParser.java:187)
>> at
>> org.eclipse.jetty.http2.parser.GoAwayBodyParser.onGoAway(GoAwayBodyParser.java:169)
>> at
>> org.eclipse.jetty.http2.parser.GoAwayBodyParser.parse(GoAwayBodyParser.java:108)
>> at org.eclipse.jetty.http2.parser.Parser.parseBody(Parser.java:194)
>> at org.eclipse.jetty.http2.parser.Parser.parse(Parser.java:123)
>> at
>> org.eclipse.jetty.http2.HTTP2Connection$HTTP2Producer.produce(HTTP2Connection.java:248)
>> at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
>> at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
>> at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>> at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
>> at
>> org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:170)
>> at
>> org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:125)
>> at
>> org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:348)
>> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>> at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>> at
>> org.eclipse.jetty.util.thread.Invocable.invokeNonBlocking(Invocable.java:68)
>> at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.invokeTask(EatWhatYouKill.java:345)

Re: IOException occured when talking to server

2020-08-17 Thread Odysci
Hi,
thanks for the reply.
We're using solr 8.3.1, ZK 3.5.6
The stacktrace is below.
The address on the first line "http://192.168.15.10:888/solr/mycollection"
is the "server" address in my nginx configuration, which points to 2
upstream solr nodes. There were no other solr or ZK messages in the logs.

StackTrace:
(Msg = IOException occured when talking to server at:
http://192.168.15.10:888/solr/mycollection)
org.apache.solr.client.solrj.SolrServerException: IOException occured when
talking to server at: http://192.168.15.10:888/solr/mycollection
at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:418)
at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1035)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1051)
... calls from our code
... calls from our code
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.nio.channels.AsynchronousCloseException
at
org.eclipse.jetty.http2.client.http.HttpConnectionOverHTTP2.close(HttpConnectionOverHTTP2.java:144)
at
org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2.onClose(HttpClientTransportOverHTTP2.java:170)
at
org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2$SessionListenerPromise.onClose(HttpClientTransportOverHTTP2.java:232)
at org.eclipse.jetty.http2.api.Session$Listener.onClose(Session.java:206)
at org.eclipse.jetty.http2.HTTP2Session.notifyClose(HTTP2Session.java:1153)
at org.eclipse.jetty.http2.HTTP2Session.onGoAway(HTTP2Session.java:438)
at
org.eclipse.jetty.http2.parser.Parser$Listener$Wrapper.onGoAway(Parser.java:392)
at
org.eclipse.jetty.http2.parser.BodyParser.notifyGoAway(BodyParser.java:187)
at
org.eclipse.jetty.http2.parser.GoAwayBodyParser.onGoAway(GoAwayBodyParser.java:169)
at
org.eclipse.jetty.http2.parser.GoAwayBodyParser.parse(GoAwayBodyParser.java:108)
at org.eclipse.jetty.http2.parser.Parser.parseBody(Parser.java:194)
at org.eclipse.jetty.http2.parser.Parser.parse(Parser.java:123)
at
org.eclipse.jetty.http2.HTTP2Connection$HTTP2Producer.produce(HTTP2Connection.java:248)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
at org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:170)
at
org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:125)
at
org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:348)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at
org.eclipse.jetty.util.thread.Invocable.invokeNonBlocking(Invocable.java:68)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.invokeTask(EatWhatYouKill.java:345)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:300)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
... 3 more

-

I did consider using the SolrJ cloud or LB clients, but nginx gives me more
flexibility in controlling how the load balancing is done. I'm still
running experiments to see which one works best for me.
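
(For reference, the SolrJ cloud-client alternative I'm comparing against is
roughly the sketch below; the ZK addresses and collection name are
placeholders, not our real ones.)

// sketch only - class boilerplate omitted; ZK addresses and collection are placeholders
List<String> zkHosts = Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181");
CloudSolrClient cloudClient = new CloudSolrClient.Builder(zkHosts, Optional.empty()).build();
cloudClient.setDefaultCollection("mycollection");
QueryResponse rsp = cloudClient.query(new SolrQuery("*:*"));
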
In the meantime, if you have any clues about why I'm getting this
IOException, I'd appreciate it.
Thanks

Reinaldo



On Mon, Aug 17, 2020 at 10:59 AM Dominique Bejean 
wrote:

> Hi,
>
> Can you provide more information ?
> - Solr and ZK version
> - full error stacktrace generated by SolrJ
> - any concomitant and relevant information in solr nodes logs or zk logs
>
> Just to know, why not use a load balanced LBHttp... Solr Client ?
>
> Regards.
>
> Dominique
>
>
> On Mon, Aug 17, 2020 at 00:41, Odysci wrote:
>
> > Hi,
> >
> > We have a solrcloud setup with 2 solr nodes and 3 ZK instances. Until
> > recently I had my application server always call one of the solr nodes
> (via
> > solrJ), and

IOException occured when talking to server

2020-08-16 Thread Odysci
Hi,

We have a solrcloud setup with 2 solr nodes and 3 ZK instances. Until
recently I had my application server always call one of the solr nodes (via
solrJ), and it worked just fine.

In order to improve reliability I put an Nginx reverse-proxy load balancer
between my application server and the solr nodes. The response time
remained almost the same but we started getting the msg:

IOException occured when talking to server http://myserver

every minute or so (very randomly but consistently). Since our code will
just try again after a few milliseconds, the overall system continues to
work fine, despite the delay. I tried increasing all nginx related
timeout's to no avail.
I've searched for this message a lot and most replies seem to be related
to SSL. We are using Http2SolrClient but no SSL to Solr.
Can anyone shed any light on this?

Thanks!
Reinaldo


Re: Meow attacks

2020-07-28 Thread Odysci
Folks,
thanks for the replies. We do use VPCs in AWS and the ZK ports are only
open to the solr machines (also in the same VPC). We're using Solr 8.3 and
ZK 3.5.6
We will investigate the Kerberos authentication.
thanks

Reinaldo

On Tue, Jul 28, 2020 at 6:03 PM Jörn Franke  wrote:

> In addition to what has been said before (use private networks/firewall
> rules): activate Kerberos authentication so that only Solr hosts can write
> to ZK (the Solr client needs no write access) and use encryption where
> possible.
> Upgrade Solr to the latest version, use SSL, enable Kerberos, give
> clients no admin access on Solr (minimum privileges only!), use Solr
> whitelists to allow only clients that should access Solr, and enable the
> Java security manager (* to make it work with Kerberos auth you will need
> to wait for a newer Solr version).
>
> > On Jul 28, 2020, at 22:41, Odysci wrote:
> >
> > Folks,
> >
> > I suspect one of our Zookeeper installations on AWS was subject to a Meow
> > attack (
> >
> https://arstechnica.com/information-technology/2020/07/more-than-1000-databases-have-been-nuked-by-mystery-meow-attack/
> > )
> >
> > Basically, the configuration for one of our collections disappeared from
> > the Zookeeper tree (when looking at the Solr interface), and it left
> > several files ending in "-meow"
> > Before I realized it, I stopped and restarted the ZK and Solr machines
> (as
> > part of ubuntu updates), and when ZK didn't find the configuration for a
> > collection, it deleted the collection from Solr. At least that's what I
> > suspect happened.
> >
> > Fortunately it affected a very small index and we had backups. But it is
> > very worrisome.
> > Has anyone had any problems with this?
> > Is there any type of log that I can check to sort out how this happened?
> > The ZK log complained that the configs for the collection were not there,
> > but that's about it.
> >
> > and, is there a better way to protect against such attacks?
> > Thanks
> >
> > Reinaldo
>


Meow attacks

2020-07-28 Thread Odysci
Folks,

I suspect one of our Zookeeper installations on AWS was subject to a Meow
attack (
https://arstechnica.com/information-technology/2020/07/more-than-1000-databases-have-been-nuked-by-mystery-meow-attack/
)

Basically, the configuration for one of our collections disappeared from
the Zookeeper tree (when looking at the Solr interface), and it left
several files ending in "-meow".
Before I realized it, I stopped and restarted the ZK and Solr machines (as
part of Ubuntu updates), and when ZK didn't find the configuration for a
collection, it deleted the collection from Solr. At least that's what I
suspect happened.

Fortunately it affected a very small index and we had backups. But it is
very worrisome.
Has anyone had any problems with this?
Is there any type of log that I can check to sort out how this happened?
The ZK log complained that the configs for the collection were not there,
but that's about it.

And is there a better way to protect against such attacks?
Thanks

Reinaldo


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-07-14 Thread Odysci
Hi Erick,

I agree. The 300K docs in one search is an anomaly.
But we do use 'fq' to return a large number of docs for the purposes of
generating statistics for the whole index. We do use CursorMark extensively.
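
(The cursor loop we use is roughly the sketch below; the filter and field
names are placeholders, and "client" is the SolrJ client we already create.)

// sketch of our CursorMark paging loop (filter and field names are placeholders)
SolrQuery q = new SolrQuery("*:*");
q.addFilterQuery("stats_field:some_value");
q.setRows(1000);
q.setSort(SolrQuery.SortClause.asc("id"));      // cursor requires a sort including the uniqueKey
String cursor = CursorMarkParams.CURSOR_MARK_START;
while (true) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
    QueryResponse rsp = client.query(q);
    // ... accumulate statistics from rsp.getResults() ...
    String next = rsp.getNextCursorMark();
    if (next.equals(cursor)) break;             // no more documents
    cursor = next;
}
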
Thanks!

Reinaldo

On Tue, Jul 14, 2020 at 8:55 AM Erick Erickson 
wrote:

> I’d add that you’re abusing Solr horribly by returning 300K documents in a
> single go.
>
> Solr is built to return the top N docs where N is usually quite small, <
> 100. If you allow
> an unlimited number of docs to be returned, you’re simply kicking the can
> down
> the road, somebody will ask for 1,000,000 docs sometime and you’ll be back
> where
> you started.
>
> I _strongly_ recommend you do one of two things for such large result sets:
>
> 1> Use Streaming. Perhaps Streaming Expressions will do what you want
> without you having to process all those docs on the client if you’re
> doing some kind of analytics.
>
> 2> if you really, truly need all 300K docs, try getting them in chunks
>  using CursorMark.
>
> Best,
> Erick
>
> > On Jul 13, 2020, at 10:03 PM, Odysci  wrote:
> >
> > Shawn,
> >
> > thanks for the extra info.
> > The OOM errors were indeed because of heap space. In my case most of the
> GC
> > calls were not full GC. Only when heap was really near the top, a full GC
> > was done.
> > I'll try out your suggestion of increasing the G1 heap region size. I've
> > been using 4m, and from what you said, a 2m allocation would be
> considered
> > humongous. My test cases have a few allocations that are definitely
> bigger
> > than 2m (estimating based on the number of docs returned), but most of
> them
> > are not.
> >
> > When I was using maxRamMB, the size used was "compatible" with the size
> > values, assuming the avg 2KB docs that our index has.
> > As far as I could tell in my runs, removing maxRamMB did change the GC
> > behavior for the better. That is, now, heap goes up and down as expected,
> > and before (with maxRamMB) it seemed to increase continuously.
> > Thanks
> >
> > Reinaldo
> >
> > On Sun, Jul 12, 2020 at 1:02 AM Shawn Heisey 
> wrote:
> >
> >> On 6/25/2020 2:08 PM, Odysci wrote:
> >>> I have a solrcloud setup with 12GB heap and I've been trying to
> optimize
> >> it
> >>> to avoid OOM errors. My index has about 30million docs and about 80GB
> >>> total, 2 shards, 2 replicas.
> >>
> >> Have you seen the full OutOfMemoryError exception text?  OOME can be
> >> caused by problems that are not actually memory-related.  Unless the
> >> error specifically mentions "heap space" we might be chasing the wrong
> >> thing here.
> >>
> >>> When the queries return a smallish number of docs (say, below 1000),
> the
> >>> heap behavior seems "normal". Monitoring the gc log I see that young
> >>> generation grows then when GC kicks in, it goes considerably down. And
> >> the
> >>> old generation grows just a bit.
> >>>
> >>> However, at some point i have a query that returns over 300K docs (for
> a
> >>> total size of approximately 1GB). At this very point the OLD generation
> >>> size grows (almost by 2GB), and it remains high for all remaining time.
> >>> Even as new queries are executed, the OLD generation size does not go
> >> down,
> >>> despite multiple GC calls done afterwards.
> >>
> >> Assuming the OOME exceptions were indeed caused by running out of heap,
> >> then the following paragraphs will apply:
> >>
> >> G1 has this concept called "humongous allocations".  In order to reach
> >> this designation, a memory allocation must get to half of the G1 heap
> >> region size.  You have set this to 4 megabytes, so any allocation of 2
> >> megabytes or larger is humongous.  Humongous allocations bypass the new
> >> generation entirely and go directly into the old generation.  The max
> >> value that can be set for the G1 region size is 32MB.  If you increase
> >> the region size and the behavior changes, then humongous allocations
> >> could be something to investigate.
> >>
> >> In the versions of Java that I have used, humongous allocations can only
> >> be reclaimed as garbage by a full GC.  I do not know if Oracle has
> >> changed this so the smaller collections will do it or not.
> >>
> >> Were any of those multiple GCs a Full GC?  If they were, then there is
> >> probably little or no garbage to collect.  You've gotten a reply from
> >> "Zisis T." with some possible causes for this.  I do not have anything
> >> to add.
> >>
> >> I did not know about any problems with maxRamMB ... but if I were
> >> attempting to limit cache sizes, I would do so by the size values, not a
> >> specific RAM size.  The size values you have chosen (8192 and 16384)
> >> will most likely result in a total cache size well beyond the limits
> >> you've indicated with maxRamMB.  So if there are any bugs in the code
> >> with the maxRamMB parameter, you might end up using a LOT of memory that
> >> you didn't expect to be using.
> >>
> >> Thanks,
> >> Shawn
> >>
>
>


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-07-13 Thread Odysci
Shawn,

thanks for the extra info.
The OOM errors were indeed because of heap space. In my case most of the GC
calls were not full GC. Only when heap was really near the top, a full GC
was done.
I'll try out your suggestion of increasing the G1 heap region size. I've
been using 4m, and from what you said, a 2m allocation would be considered
humongous. My test cases have a few allocations that are definitely bigger
than 2m (estimating based on the number of docs returned), but most of them
are not.

When I was using maxRamMB, the size used was "compatible" with the size
values, assuming the avg 2KB docs that our index has.
As far as I could tell in my runs, removing maxRamMB did change the GC
behavior for the better. That is, now, heap goes up and down as expected,
and before (with maxRamMB) it seemed to increase continuously.
Thanks

Reinaldo

On Sun, Jul 12, 2020 at 1:02 AM Shawn Heisey  wrote:

> On 6/25/2020 2:08 PM, Odysci wrote:
> > I have a solrcloud setup with 12GB heap and I've been trying to optimize
> it
> > to avoid OOM errors. My index has about 30million docs and about 80GB
> > total, 2 shards, 2 replicas.
>
> Have you seen the full OutOfMemoryError exception text?  OOME can be
> caused by problems that are not actually memory-related.  Unless the
> error specifically mentions "heap space" we might be chasing the wrong
> thing here.
>
> > When the queries return a smallish number of docs (say, below 1000), the
> > heap behavior seems "normal". Monitoring the gc log I see that young
> > generation grows then when GC kicks in, it goes considerably down. And
> the
> > old generation grows just a bit.
> >
> > However, at some point i have a query that returns over 300K docs (for a
> > total size of approximately 1GB). At this very point the OLD generation
> > size grows (almost by 2GB), and it remains high for all remaining time.
> > Even as new queries are executed, the OLD generation size does not go
> down,
> > despite multiple GC calls done afterwards.
>
> Assuming the OOME exceptions were indeed caused by running out of heap,
> then the following paragraphs will apply:
>
> G1 has this concept called "humongous allocations".  In order to reach
> this designation, a memory allocation must get to half of the G1 heap
> region size.  You have set this to 4 megabytes, so any allocation of 2
> megabytes or larger is humongous.  Humongous allocations bypass the new
> generation entirely and go directly into the old generation.  The max
> value that can be set for the G1 region size is 32MB.  If you increase
> the region size and the behavior changes, then humongous allocations
> could be something to investigate.
>
> In the versions of Java that I have used, humongous allocations can only
> be reclaimed as garbage by a full GC.  I do not know if Oracle has
> changed this so the smaller collections will do it or not.
>
> Were any of those multiple GCs a Full GC?  If they were, then there is
> probably little or no garbage to collect.  You've gotten a reply from
> "Zisis T." with some possible causes for this.  I do not have anything
> to add.
>
> I did not know about any problems with maxRamMB ... but if I were
> attempting to limit cache sizes, I would do so by the size values, not a
> specific RAM size.  The size values you have chosen (8192 and 16384)
> will most likely result in a total cache size well beyond the limits
> you've indicated with maxRamMB.  So if there are any bugs in the code
> with the maxRamMB parameter, you might end up using a LOT of memory that
> you didn't expect to be using.
>
> Thanks,
> Shawn
>


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-27 Thread Odysci
Hi,

Just summarizing:
I've experimented with different sizes of filterCache and documentCache,
after removing any maxRamMB. Now the heap seems to behave as expected,
that is, it grows, then GC (not a full one) kicks in multiple times and keeps
the used heap under control. Eventually a full GC may kick in and the size
goes down a little more.

Previously, when I had maxRamMB specified, the heap would grow considerably
(for a search returning about 300K docs) and after that it would not go
down again (and those docs were never again requested). This did not work
well.

I looked at the heap dump and saw all the caches (filter, document, one of
each type per core), so if you have multiple shards you may have to be very
careful not to increase the cache sizes, because they apply to each core.

I still think there is something strange when a search returns a large
number of docs - the G1GC didn't seem to handle that very well in some
cases (when maxRamMB was specified), but that may be the symptom and not
the cause.
Thanks for the help.

Reinaldo

On Sat, Jun 27, 2020 at 4:29 AM Zisis T.  wrote:

> Hi Reinaldo,
>
> Glad that helped. I've had several sleepless nights with Solr clusters
> failing spectacularly in production due to that, but I still cannot say that
> the problem is completely gone.
>
> Did you check in the heap dump if you have cache memory leaks as described
> in https://issues.apache.org/jira/browse/SOLR-12743?
>
> Say you have 4 cache instances (filterCache, documentCache etc) per core
> and
> you have 5 Solr cores you should not see more than 20 CaffeineCache
> instances in your dump.
>
> Unfortunately I still cannot determine what exactly triggers this memory
> leak although since I removed the maxRAMMB setting I've not seen similar
> behavior for more than a month now in production.
>
> The weird thing is that I was running on Solr 7.5.0 for quite some time
> without any issues and it was at some point in time that those problems
> started appearing...
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-26 Thread Odysci
Thanks.
The heap dump indicated that most of the space was occupied by the caches
(filterCache and documentCache in my case).
I followed your suggestion of removing the limit on maxRAMMB on filterCache
and documentCache and decreasing the number of entries allowed.
It did have a significant impact on the used heap size. So I guess I have
to find the sweet spot between hit ratio and size.
Still, the old generation does not seem to fall significantly even if I
force a full GC (using jvisualvm).

Any other suggestions are welcome!
Thanks

Reinaldo

On Fri, Jun 26, 2020 at 5:05 AM Zisis T.  wrote:

> I have faced similar issues and the culprit was filterCache when using
> maxRAMMB. More specifically on a sharded Solr cluster with lots of faceting
> during search (which makes use of the filterCache in a distributed setting)
> I noticed that maxRAMMB value was not respected. I had a value of 300MB set
> but I witnessed an instance sized a couple of GBs in a heap dump at some
> point. The thing that I found was that because the keys of the Map
> (BooleanQuery or something, if I recall correctly) did not implement the
> Accountable interface, they were NOT taken into account when calculating the
> cache's size. But all that was on a 7.5 cluster using FastLRUCache.
>
> There's also https://issues.apache.org/jira/browse/SOLR-12743 on caches
> memory leak which does not seem to have been fixed yet although the trigger
> points of this memory leak are not clear. I've witnessed this as well on a
> 7.5 cluster with multiple (>10) filter cache objects for a single core each
> holding from a few MBs to GBs.
>
> Try to get a heap dump from your cluster, the truth is almost always hidden
> there.
>
> One workaround which seems to alleviate the problem is to check your running
> Solr cluster and see in reality how many cache entries actually give you a
> good hit ratio and get rid of the maxRAMMB attribute. Play only with the
> size.
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Solr heap Old generation grows and it is not recovered by G1GC

2020-06-25 Thread Odysci
Hi Furkan,

I'm using Solr 8.3.1 (with OpenJDK 11.0.7), with the following
cache settings:

[cache settings XML not preserved by the list archive]

Thanks
Reinaldo

On Thu, Jun 25, 2020 at 7:45 PM Furkan KAMACI 
wrote:

> Hi Reinaldo,
>
> Which version of Solr do you use and could you share your cache settings?
>
> On the other hand, did you check here:
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>
> Kind Regards,
> Furkan KAMACI
>
> On Thu, Jun 25, 2020 at 11:09 PM Odysci  wrote:
>
> > Hi,
> >
> > I have a solrcloud setup with 12GB heap and I've been trying to optimize
> it
> > to avoid OOM errors. My index has about 30million docs and about 80GB
> > total, 2 shards, 2 replicas.
> >
> > In my testing setup I submit multiple queries to solr (same node),
> > sequentially, and with no overlap between the documents returned in each
> > query (so docs do not need to be kept in cache)
> >
> > When the queries return a smallish number of docs (say, below 1000), the
> > heap behavior seems "normal". Monitoring the gc log I see that young
> > generation grows then when GC kicks in, it goes considerably down. And
> the
> > old generation grows just a bit.
> >
> > However, at some point i have a query that returns over 300K docs (for a
> > total size of approximately 1GB). At this very point the OLD generation
> > size grows (almost by 2GB), and it remains high for all remaining time.
> > Even as new queries are executed, the OLD generation size does not go
> down,
> > despite multiple GC calls done afterwards.
> >
> > Can anyone shed some light on this behavior?
> >
> > I'm using the following GC options:
> > GC_TUNE=" \
> >
> > -XX:+UseG1GC \
> >
> > -XX:+PerfDisableSharedMem \
> >
> > -XX:+ParallelRefProcEnabled \
> >
> > -XX:G1HeapRegionSize=4m \
> >
> > -XX:MaxGCPauseMillis=250 \
> >
> > -XX:InitiatingHeapOccupancyPercent=75 \
> >
> > -XX:+UseLargePages \
> >
> > -XX:+AggressiveOpts \
> >
> > "
> > Thanks
> > Reinaldo
> >
>


Solr heap Old generation grows and it is not recovered by G1GC

2020-06-25 Thread Odysci
Hi,

I have a solrcloud setup with 12GB heap and I've been trying to optimize it
to avoid OOM errors. My index has about 30million docs and about 80GB
total, 2 shards, 2 replicas.

In my testing setup I submit multiple queries to solr (same node),
sequentially, and with no overlap between the documents returned in each
query (so docs do not need to be kept in cache)

When the queries return a smallish number of docs (say, below 1000), the
heap behavior seems "normal". Monitoring the gc log I see that young
generation grows then when GC kicks in, it goes considerably down. And the
old generation grows just a bit.

However, at some point i have a query that returns over 300K docs (for a
total size of approximately 1GB). At this very point the OLD generation
size grows (almost by 2GB), and it remains high for all remaining time.
Even as new queries are executed, the OLD generation size does not go down,
despite multiple GC calls done afterwards.

Can anyone shed some light on this behavior?

I'm using the following GC options:
GC_TUNE=" \

-XX:+UseG1GC \

-XX:+PerfDisableSharedMem \

-XX:+ParallelRefProcEnabled \

-XX:G1HeapRegionSize=4m \

-XX:MaxGCPauseMillis=250 \

-XX:InitiatingHeapOccupancyPercent=75 \

-XX:+UseLargePages \

-XX:+AggressiveOpts \

"
Thanks
Reinaldo


Re: Solr caches per node or per core

2020-06-24 Thread Odysci
Thanks!

Reinaldo

On Wed, Jun 24, 2020 at 11:47 AM Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Reinaldo,
> It is per core. Single node can have cores from different collections,
> each configured differently. When you size caches from memory consumption
> point of view, you have to take into account how many cores will be placed
> on each node. Of course, you have to count replicas as well.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 24 Jun 2020, at 16:38, Odysci  wrote:
> >
> > Hi,
> >
> > I have a Solrcloud configuration with 2 nodes and 2 shards/2 replicas.
> > I configure the sizes of the solr caches on solrconfig.xml, which I
> > believe apply to nodes.
> >
> > But when I look at the caches in the Solr UI, they are shown per core
> > (e.g., shard1_replica_N1). Are the cache sizes defined in the
> > solrconfig.xml the total size (adding up the caches for all cores in the
> > node)? or are the cache sizes defined in the solrconfig.xm applied to
> each
> > core separately?
> > Thanks
> >
> > Reinaldo
>
>


Solr caches per node or per core

2020-06-24 Thread Odysci
Hi,

I have a Solrcloud configuration with 2 nodes and 2 shards/2 replicas.
I configure the sizes of the solr caches on solrconfig.xml, which I
believe apply to nodes.

But when I look at the caches in the Solr UI, they are shown per core
(e.g., shard1_replica_N1). Are the cache sizes defined in the
solrconfig.xml the total size (adding up the caches for all cores in the
node)? or are the cache sizes defined in the solrconfig.xm applied to each
core separately?
Thanks

Reinaldo


Re: replica deleted but directory remains

2020-06-24 Thread Odysci
Thanks!
Reinaldo

On Tue, Jun 23, 2020 at 6:27 PM Erick Erickson 
wrote:

> In a word, “yes”. What it looks like is that the information in
> Zookeeper has been updated to reflect the deletion. But since the
> node for some mysterious reason wasn’t available when the replica
> was deleted, the data couldn’t be removed.
>
> Best,
> Erick
>
> > On Jun 23, 2020, at 12:58 PM, Odysci  wrote:
> >
> > Hi,
> > I've got a solrcloud configuration with 2 shards and 2 replicas each.
> > For some unknown reason, one of the replicas was on "recovery" mode
> > forever, so I decided to create another replica, which went fine.
> > Then I proceeded to delete the old replica (using the Solr UI). After a
> > while the interface gave me a msg about not being able to connect to the
> > solr node. But once i refreshed it, the old replica was no longer showing
> > in the interface, and the new replica was active.
> > However, the directory in disk for the old replica is still there (and
> it's
> > size is larger than originally).
> > In a previous time when I did this in the exact the same way, the
> directory
> > was removed.
> >
> > My question is: can I manually delete the directory for the old replica?
> > Or is there a solr command that will do this cleanly?
> > Thanks
> >
> > Reinaldo
>
>


replica deleted but directory remains

2020-06-23 Thread Odysci
Hi,
I've got a solrcloud configuration with 2 shards and 2 replicas each.
For some unknown reason, one of the replicas was on "recovery" mode
forever, so I decided to create another replica, which went fine.
Then I proceeded to delete the old replica (using the Solr UI). After a
while the interface gave me a msg about not being able to connect to the
solr node. But once i refreshed it, the old replica was no longer showing
in the interface, and the new replica was active.
However, the directory on disk for the old replica is still there (and its
size is larger than it was originally).
In a previous time when I did this in the exact the same way, the directory
was removed.
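
(For reference, what I did through the UI corresponds roughly to the
Collections API calls below in SolrJ; the collection, shard and replica
names are placeholders.)

// sketch only - collection, shard and replica names are placeholders
CollectionAdminRequest.addReplicaToShard("mycollection", "shard1").process(client);
CollectionAdminRequest.deleteReplica("mycollection", "shard1", "core_node5").process(client);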

My question is: can I manually delete the directory for the old replica?
Or is there a solr command that will do this cleanly?
Thanks

Reinaldo


Re: search in solrcloud on replicas

2020-06-04 Thread Odysci
Erick,
thanks a lot, very clear.

Reinaldo

On Thu, Jun 4, 2020 at 8:37 PM Erick Erickson 
wrote:

> Close. Zookeeper is not involved in routing requests. Each Solr node
> queries Zookeeper to get the topology of the cluster, and thereafter
> Zookeeper will notify each node when the topology changes, i.e.
> a node goes up or down, a replica goes into recovery etc. Zookeeper
> does _not_ get involved in each request since each Solr node
> has all the information it needs to satisfy the request cached.
> This is a common misunderstanding.
>
> So, nodeA gets the topology of the cluster, including the IP addresses
> of each and every node in the cluster. Now you send a query directly to
> nodeA. There is an internal load balancer that routes that request to one
> of the nodes in the cluster, perhaps itself, perhaps nodeB, etc. That way,
> nodeA doesn’t do all the aggregating.
>
> Aggregating? Well, a top-level request comes in to nodeB. Let’s say
> rows=10. NodeB must send a sub-request to one replica of every shard
> and get the top 10 from each one. It then sorts the lists by whatever
> the sort criteria are and sends another request to each of the replicas
> queried in the first step to get the actual top 10 docs. Why the 2nd round
> trip? Well, imagine there are 100 shards (and I’ve seen more). If the
> sub-requests each returned the top 10 documents, there would be
> 1,000 documents fetched, 990 of which would be thrown away.
>
> Your setup has a single point of failure the way you have it set up now.
> Ideally, you have nodeA with one replica of each shard and nodeB also
> has one replica for each shard. So either one can go down and your system
> can still serve requests. However, since your app is sending the
> requests all to the same node, you don’t have that robustness; if that
> node goes down so does your entire application.
>
> You should be doing one of two things:
> 1> use a load balancer between your app and your Solr nodes
> or
> 2> have your app use SolrJ and CloudSolrClient. That class is “just
> another Solr node” as far as Zookeeper is concerned. It goes through
> the exact same process as a Solr node. When it starts, it gets a snapshot
> of the topology of the cluster and “does the right thing” with requests,
> including dealing with any changes to the topology, i.e. nodes
> stopping/starting, replicas going into recovery, new collections being
> added, etc.
>
> HTH,
> Erick
>
> > On Jun 4, 2020, at 7:11 PM, Odysci  wrote:
> >
> > Erick,
> > thanks for the reply.
> > Your last line puzzled me a bit. You wrote
> > *"The theory is that all the top-level requests shouldn’t be handled by
> the
> > same Solr instance if a client is directly using the http address of a
> > single node in the cluster for all requests."*
> >
> > We are using 2 machines (2 different IPs), 2 shards with 2 replicas each.
> > We have an application which sends all solr requests to the same http
> > address of our machine A. I assumed that Zookeeper would distribute the
> > requests among the nodes.
> > Is this not the right thing to do? Should I have the application
> alternate
> > the solr machine to send requests to?
> > Thanks
> >
> > Reinaldo
> >
> >
> > On Wed, May 27, 2020 at 12:37 PM Erick Erickson  >
> > wrote:
> >
> >> The base algorithm for searches picks out one replica from each
> >> shard in a round-robin fashion, without regard to whether it’s on
> >> the same machine or not.
> >>
> >> You can alter this behavior, see:
> >> https://lucene.apache.org/solr/guide/8_1/distributed-requests.html
> >>
> >> When you say “the exact same search”, it isn’t quite in the sense that
> >> it’s going to a different shard as evidenced by &DISTRIB=false being
> >> on the URL (I’d guess you already know that, but…). The top-level
> >> request _may_ be forwarded as is, there’s an internal load balancer
> >> that does this. The theory is that all the top-level requests shouldn’t
> >> be handled by the same Solr instance if a client is directly using
> >> the http address of a single node in the cluster for all requests.
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>
> >>> On May 27, 2020, at 11:12 AM, Odysci  wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a question regarding solrcloud searches on both replicas of an
> >> index.
> >>> I have a solrcloud setup with 2 physical machines (let's call them A
> and
> >>> B), and my index is divided into 2 shards, and 2

Re: search in solrcloud on replicas

2020-06-04 Thread Odysci
Erick,
thanks for the reply.
Your last line puzzled me a bit. You wrote
*"The theory is that all the top-level requests shouldn’t be handled by the
same Solr instance if a client is directly using the http address of a
single node in the cluster for all requests."*

We are using 2 machines (2 different IPs), 2 shards with 2 replicas each.
We have an application which sends all solr requests to the same http
address of our machine A. I assumed that Zookeeper would distribute the
requests among the nodes.
Is this not the right thing to do? Should I have the application alternate
the solr machine to send requests to?
Thanks

Reinaldo


On Wed, May 27, 2020 at 12:37 PM Erick Erickson 
wrote:

> The base algorithm for searches picks out one replica from each
> shard in a round-robin fashion, without regard to whether it’s on
> the same machine or not.
>
> You can alter this behavior, see:
> https://lucene.apache.org/solr/guide/8_1/distributed-requests.html
>
> When you say “the exact same search”, it isn’t quite in the sense that
> it’s going to a different shard as evidenced by &DISTRIB=false being
> on the URL (I’d guess you already know that, but…). The top-level
> request _may_ be forwarded as is, there’s an internal load balancer
> that does this. The theory is that all the top-level requests shouldn’t
> be handled by the same Solr instance if a client is directly using
> the http address of a single node in the cluster for all requests.
>
> Best,
> Erick
>
>
>
> > On May 27, 2020, at 11:12 AM, Odysci  wrote:
> >
> > Hi,
> >
> > I have a question regarding solrcloud searches on both replicas of an
> index.
> > I have a solrcloud setup with 2 physical machines (let's call them A and
> > B), and my index is divided into 2 shards, and 2 replicas, such that each
> > machine has a full copy of the index. My Zookeeper setup uses 3
> instances.
> > The nodes and replicas are as follows:
> > Machine A:
> >  core_node3 / shard1_replica_n1
> >  core_node7 / shard2_replica_n4
> > Machine B:
> >  core_node5 / shard1_replica_n2
> >  core_node8 / shard2_replica_n6
> >
> > I'm using solrJ and I create the solr client using
> Http2SolrClient.Builder
> > and the IP of machineA.
> >
> > Here is my question:
> > when I do a search (using solrJ) and I look at the search logs on both
> > machines, I see that the same search is being executed on both machines.
> > But if the full index is present on both machines, wouldn't it be enough
> > just to search on one of machines?
> > In fact, if I turn off machine B, the search returns the correct results
> > anyway.
> >
> > Thanks a lot.
> >
> > Reinaldo
>
>


question about setup for maximizing solr performance

2020-06-01 Thread Odysci
Hi,
I'm looking for some advice on improving the performance of our Solr setup,
in particular about the trade-offs between using larger machines vs. more,
smaller machines. Our full index has just over 100 million docs, and we do
almost all searches using fq's (with q=*:*) and facets. We are using solr
8.3.
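
(A typical query of ours looks roughly like the sketch below; field names
and values are placeholders, and "client" is the SolrJ client.)

// sketch of a typical query: match-all q, everything in fq, plus facets (names are placeholders)
SolrQuery q = new SolrQuery("*:*");
q.addFilterQuery("field1_name:(123 456)");
q.addFilterQuery("field2_name:some_value");
q.addFacetField("field3_name");
q.setRows(10);
QueryResponse rsp = client.query(q);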

Currently, I have a solrcloud setup with 2 physical machines (let's call
them A and B), and my index is divided into 2 shards, and 2 replicas, such
that each machine has a full copy of the index.
The nodes and replicas are as follows:
Machine A:
  core_node3 / shard1_replica_n1
  core_node7 / shard2_replica_n4
Machine B:
  core_node5 / shard1_replica_n2
  core_node8 / shard2_replica_n6

My Zookeeper setup uses 3 instances. It's also the case that most of the
searches we do, we have results returning from both shards (from the same
search).

My experiments indicate that our setup is cpu-bound.
Due to cost constraints, I could either double the CPU in each of the 2
machines, or make it a 4-machine setup (using current-size machines) with 2
shards and 4 replicas (or 4 shards with 4 replicas). I assume that keeping
the full index on all machines will allow all searches to be evenly
distributed.

Does anyone have any insights on what would be better for maximizing
throughput on multiple searches being done at the same time?
thanks!

Reinaldo


search in solrcloud on replicas

2020-05-27 Thread Odysci
Hi,

I have a question regarding solrcloud searches on both replicas of an index.
I have a solrcloud setup with 2 physical machines (let's call them A and
B), and my index is divided into 2 shards, and 2 replicas, such that each
machine has a full copy of the index. My Zookeeper setup uses 3 instances.
The nodes and replicas are as follows:
Machine A:
  core_node3 / shard1_replica_n1
  core_node7 / shard2_replica_n4
Machine B:
  core_node5 / shard1_replica_n2
  core_node8 / shard2_replica_n6

I'm using solrJ and I create the solr client using Http2SolrClient.Builder
and the IP of machineA.
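
(Roughly like the sketch below; the base URL and collection name are
placeholders.)

// sketch of how the client is created and used (base URL and collection are placeholders)
Http2SolrClient client = new Http2SolrClient.Builder("http://machineA:8983/solr").build();
QueryResponse rsp = client.query("mycollection", new SolrQuery("*:*"));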

Here is my question:
when I do a search (using solrJ) and I look at the search logs on both
machines, I see that the same search is being executed on both machines.
But if the full index is present on both machines, wouldn't it be enough
just to search on one of the machines?
In fact, if I turn off machine B, the search returns the correct results
anyway.

Thanks a lot.

Reinaldo


Re: Solr performance using fq with multiple values

2020-04-18 Thread Odysci
We don't use this field for general queries (q:*), only for fq and
faceting.
Do you think making it indexed="true" would make a difference in fq
performance?
Thanks

Reinaldo

On Sat, Apr 18, 2020 at 3:06 PM Sylvain James 
wrote:

> Hi Reinaldo,
>
> Involved fields should be indexed for better performance?
>
> <field ... stored="false" required="false" multiValued="false"
> docValues="true" />
>
> Sylvain
>
> On Sat, Apr 18, 2020 at 18:46, Odysci wrote:
>
> > Hi,
> >
> > We are seeing significant performance degradation on single queries that
> > use fq with multiple values as in:
> >
> > fq=field1_name:(V1 V2 V3 ...)
> >
> > If we use only one value in the fq (say only V1) we get Qtime = T ms
> > As we increase the number of values, say to 5 values, Qtime more than
> > triples, even if the number of results is small. In my tests I made sure
> > cache was not an issue and nothing else was using the cpu.
> >
> > We commonly need to use fq with multiple values (on the same field name,
> > which is normally a long).
> > Is this performance hit to be expected?
> > Is there a better way to do this?
> >
> > We use Solr Cloud 8.3, and the field that we use fq on is defined as:
> >
> > <field ... stored="false" required="false" multiValued="false"
> > docValues="true" />
> >
> > Thanks
> >
> > Reinaldo
> >
>


Solr performance using fq with multiple values

2020-04-18 Thread Odysci
Hi,

We are seeing significant performance degradation on single queries that
use fq with multiple values as in:

fq=field1_name:(V1 V2 V3 ...)

If we use only one value in the fq (say only V1) we get Qtime = T ms
As we increase the number of values, say to 5 values, Qtime more than
triples, even if the number of results is small. In my tests I made sure
cache was not an issue and nothing else was using the cpu.
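
(The comparison is essentially the sketch below; the field name and values
are placeholders, and "client" is our SolrJ client.)

// sketch of the single-value vs multi-value fq comparison (names/values are placeholders)
SolrQuery single = new SolrQuery("*:*");
single.addFilterQuery("field1_name:(100)");
System.out.println(client.query(single).getQTime());    // roughly T ms

SolrQuery multi = new SolrQuery("*:*");
multi.addFilterQuery("field1_name:(100 200 300 400 500)");
System.out.println(client.query(multi).getQTime());     // more than 3x T ms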

We commonly need to use fq with multiple values (on the same field name,
which is normally a long).
Is this performance hit to be expected?
Is there a better way to do this?

We use Solr Cloud 8.3, and the field that we use fq on is defined as:



Thanks

Reinaldo


Re: Search Performance and omitNorms

2019-12-05 Thread Odysci
Hi Erick,
thanks for the reply.
Just to follow up, I'm using "unified" highlighter (fastVector does not
work for my purposes). I search and highlight on a multivalued string
string field which contains small strings (usually less than 200 chars).
This multivalued field is subject to various processors (tokenizer, word
delimiter, stemming), and all termVectors, termPositions, termOffsets are
"true".
This is what I'm using:

-- schema --
[field and fieldType definitions stripped by the list archive]
-- schema --

And in the Java code I set the following params (considering the multivalued
field above is called "text_msearchp"):

SolrQuery solrQ = new SolrQuery();
solrQ.setFilterQueries( -- set some filters --);
solrQ.setStart(0);
solrQ.setRows( -- set max rows --);
solrQ.setQuery("text_msearchp" + ":(\"" + string_being_searched + "\")");
// activate highlighting
solrQ.setHighlight(true);
solrQ.setHighlightSnippets(500);   // normally this number is low

// set highlighter type
solrQ.setParam("hl.method", "unified");
// set the highlight field to be the same as the search field
solrQ.setParam("hl.fl", "text_msearchp");
// set the term that will generate the highlight
solrQ.setParam("hl.q", "text_msearchp" + ":(\"" + string_being_searched + "\")");



Still, my tests indicate a significant speed up using omitNorms="false".
Best,

Reinaldo

On Tue, Dec 3, 2019 at 6:35 PM Erick Erickson 
wrote:

> I suspect this is spurious. Norms are just an encoding
> of the length of a field, offhand I have no clue how having
> them (or not) would affect highlighting at all.
>
> Term _vectors_ OTOH could have a major impact. If
> FastVectorHighlighter is not used, the highlighter has
> to re-analyze the text in order to highlight, and if you’re
> highlighting in large text fields that can be very expensive.
>
> Norms aren’t relevant there….
>
> So let’s see the full highlighter configuration you have, along
> with the field definition for the field you’re highlighting on.
>
> Best,
> Erick
>
> > On Dec 3, 2019, at 4:27 PM, Odysci  wrote:
> >
> > I'm using solr-8.3.1 on a solrcloud set up with 2 solr nodes and 2 ZK
> nodes.
> > I was experiencing very slow search-with-highlighting on an index that had
> > 'omitNorms="true"' on all fields.
> > At the suggestion of a stackoverflow post, I changed all fields to be
> > 'omitNorms="false"' and the search-with-highlight time came down to about
> > 1/10th of what it was!!!
> >
> > This was a relatively small index and I had no issues with memory
> increase.
> > Now my question is whether I should expect the same speed up on regular
> > search calls, or search with only filters (no query)?
> > This would be on a different, much larger index - and I do not want to
> > incur the memory increase unless the search is significantly faster.
> > Does anyone have any experience in comparing search speed using
> "omitNorms"
> > true or false in regular search (non-highlight)?
> > Thanks!
> >
> > Reinaldo
>
>


Search Performance and omitNorms

2019-12-03 Thread Odysci
I'm using solr-8.3.1 on a solrcloud set up with 2 solr nodes and 2 ZK nodes.
I was experiencing very slow search-with-highlighting on an index that had
'omitNorms="true"' on all fields.
At the suggestion of a stackoverflow post, I changed all fields to be
'omitNorms="false"' and the search-with-highlight time came down to about
1/10th of what it was!!!

This was a relatively small index and I had no issues with memory increase.
Now my question is whether I should expect the same speed up on regular
search calls, or search with only filters (no query)?
This would be on a different, much larger index - and I do not want to incur
the memory increase unless the search is significantly faster.
Does anyone have any experience in comparing search speed using "omitNorms"
true or false in regular search (non-highlight)?
Thanks!

Reinaldo


Re: solr 8.3 indexing wrong values in some fields

2019-12-03 Thread Odysci
Hi Colvin,

I updated my setup to 8.3.1-RC2 and so far it seems to work. I've converted
my 7.7 index again and indexed a whole bunch of new docs and haven't
detected any memory corruption.
Thanks a lot!

On Mon, Dec 2, 2019 at 6:40 AM Colvin Cowie 
wrote:

> This sounds like https://issues.apache.org/jira/browse/SOLR-13963
> Solr 8.3.1 is likely to be available soon - RC2 is at
>
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.3.1-RC2-reva3d456fba2cd1b9892defbcf46a0eb4d4bb4d01f/solr/
> Re-index on it, and see if you still have issues.
>
> On Sun, 1 Dec 2019 at 17:35, Odysci  wrote:
>
> > Hi,
> > I have a solr cloud setup using solr 8.3 and zookeeper, which I recently
> > converted from solr 7.7. I converted the index using the index updater
> and
> > it all went fine. My index has about 40 million docs.
> > I used a separate program to check the values of all fields in the solr
> > docs, for consistency (e.g., fields which are supposed to have
> > only numbers, or only alpha chars, etc.). I ran this program immediately
> > after the index updating and it did not detect any problems.
> >
> > Then I started regular use of the system, indexing new documents, and I
> > noticed that some fields were getting the wrong values. For example, a
> > field which was supposed to be a string with only digits had a string
> > containing parts of another field name. It looked as if memory was
> getting
> > corrupted. There were no error msgs in the solr logs.
> > In other words, solr 8.3 seems to be indexing wrong values in some
> fields.
> > This happens very few times, but it's happening.
> > Has anyone seen this happening?
> > Thanks!
> >
> > Reinaldo
> >
>


solr 8.3 indexing wrong values in some fields

2019-12-01 Thread Odysci
Hi,
I have a solr cloud setup using solr 8.3 and zookeeper, which I recently
converted from solr 7.7. I converted the index using the index updater and
it all went fine. My index has about 40 million docs.
I used a separate program to check the values of all fields in the solr
docs, for consistency (e.g., fields which are supposed to have
only numbers, or only alpha chars, etc.). I ran this program immediately
after the index updating and it did not detect any problems.
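
(The check is conceptually like the sketch below; the field name and the
pattern are placeholders, not our actual code.)

// sketch of the per-field consistency check (field name and pattern are placeholders)
SolrQuery q = new SolrQuery("*:*");
q.setFields("id", "digits_only_field");
q.setRows(10000);
for (SolrDocument doc : client.query(q).getResults()) {
    Object v = doc.getFieldValue("digits_only_field");
    if (v != null && !v.toString().matches("[0-9]+")) {
        System.out.println("inconsistent value in doc " + doc.getFieldValue("id") + ": " + v);
    }
}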

Then I started regular use of the system, indexing new documents, and I
noticed that some fields were getting the wrong values. For example, a
field which was supposed to be a string with only digits had a string
containing parts of another field name. It looked as if memory was getting
corrupted. There were no error msgs in the solr logs.
In other words, solr 8.3 seems to be indexing wrong values in some fields.
This happens only rarely, but it is happening.
Has anyone seen this happening?
Thanks!

Reinaldo


Re: problem using Http2SolrClient with solr 8.3.0

2019-12-01 Thread Odysci
That worked. I included the dist/solrj-lib libs in my classpath, and
could make it work with Http2SolrClient. Thanks!

Still on a related topic, is the CloudHttp2SolrClient client fully stable?
(I'm using solr 7.7 and solr 8.3)
Thanks

On Thu, Nov 28, 2019 at 3:00 PM Shawn Heisey  wrote:

> On 11/28/2019 9:30 AM, Odysci wrote:
> > No, I did nothing specific to Jetty. Should I?
>
> The http/2 Solr client uses a different http client than the previous
> ones do.  It uses the client from Jetty, while the previous clients use
> the one from Apache.
>
> Achieving http/2 with the Apache client would have required using a beta
> release, while the Jetty client has had http/2 in a GA release for three
> years.
>
> The error message you're getting indicates that you have not included
> the Jetty client jar in your project.  Using a dependency manager should
> pull in all required dependencies.  If you're not using a dependency
> manager, you will find all the jars that you need in the dist/solrj-lib
> directory in the Solr download.
>
> Thanks,
> Shawn
>


Re: problem using Http2SolrClient with solr 8.3.0

2019-11-28 Thread Odysci
No, I did nothing specific to Jetty. Should I?
Thx

On Wed, Nov 27, 2019 at 6:54 PM Houston Putman 
wrote:

> Are you overriding the Jetty version in your application using SolrJ?
>
> On Wed, Nov 27, 2019 at 4:00 PM Odysci  wrote:
>
> > Hi,
> > I have a solr cloud setup using solr 8.3 and SolrJ, which works fine
> using
> > the HttpSolrClient as well as the CloudSolrClient. I use 2 solr nodes
> with
> > 3 Zookeeper nodes.
> > Recently I configured my machines to handle ssl, http/2 and then I tried
> > using in my java code the Http2SolrClient supported by SolrJ 8.3.0, but I
> > got the following error at run time upon instantiating the
> Http2SolrClient
> > object:
> >
> > Has anyone seen this problem?
> > Thanks
> > Reinaldo
> > ===
> >
> > Oops: NoClassDefFoundError
> > Unexpected error : Unexpected Error, caused by exception
> > NoClassDefFoundError: org/eclipse/jetty/client/api/Request
> >
> > play.exceptions.UnexpectedException: Unexpected Error
> > at play.jobs.Job.onException(Job.java:180)
> > at play.jobs.Job.call(Job.java:250)
> > at Invocation.Job(Play!)
> > Caused by: java.lang.NoClassDefFoundError:
> > org/eclipse/jetty/client/api/Request
> > at
> >
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient$AsyncTracker.<init>(Http2SolrClient.java:789)
> > at
> >
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient.<init>(Http2SolrClient.java:131)
> > at
> >
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:833)
> > ... more
> > Caused by: java.lang.ClassNotFoundException:
> > org.eclipse.jetty.client.api.Request
> > at
> >
> >
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> > at
> >
> >
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
> > ... 16 more
> > ==
> >
>


Re: problem using Http2SolrClient with solr 8.3.0

2019-11-28 Thread Odysci
I'm using OpenJDK 11

On Wed, Nov 27, 2019 at 7:12 PM Jörn Franke  wrote:

> Which JDK version? In this setting I would recommend JDK 11.
>
> > On Nov 27, 2019, at 22:00, Odysci wrote:
> >
> > Hi,
> > I have a solr cloud setup using solr 8.3 and SolrJ, which works fine
> using
> > the HttpSolrClient as well as the CloudSolrClient. I use 2 solr nodes
> with
> > 3 Zookeeper nodes.
> > Recently I configured my machines to handle ssl, http/2 and then I tried
> > using in my java code the Http2SolrClient supported by SolrJ 8.3.0, but I
> > got the following error at run time upon instantiating the
> Http2SolrClient
> > object:
> >
> > Has anyone seen this problem?
> > Thanks
> > Reinaldo
> > ===
> >
> > Oops: NoClassDefFoundError
> > Unexpected error : Unexpected Error, caused by exception
> > NoClassDefFoundError: org/eclipse/jetty/client/api/Request
> >
> > play.exceptions.UnexpectedException: Unexpected Error
> > at play.jobs.Job.onException(Job.java:180)
> > at play.jobs.Job.call(Job.java:250)
> > at Invocation.Job(Play!)
> > Caused by: java.lang.NoClassDefFoundError:
> > org/eclipse/jetty/client/api/Request
> > at
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient$AsyncTracker.<init>(Http2SolrClient.java:789)
> > at
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient.<init>(Http2SolrClient.java:131)
> > at
> >
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:833)
> > ... more
> > Caused by: java.lang.ClassNotFoundException:
> > org.eclipse.jetty.client.api.Request
> > at
> >
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> > at
> >
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> > at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
> > ... 16 more
> > ==
>


problem using Http2SolrClient with solr 8.3.0

2019-11-27 Thread Odysci
Hi,
I have a solr cloud setup using solr 8.3 and SolrJ, which works fine using
the HttpSolrClient as well as the CloudSolrClient. I use 2 solr nodes with
3 Zookeeper nodes.
Recently I configured my machines to handle SSL and HTTP/2, and then I tried
using in my java code the Http2SolrClient supported by SolrJ 8.3.0, but I
got the following error at run time upon instantiating the Http2SolrClient
object:

Has anyone seen this problem?
Thanks
Reinaldo
===

Oops: NoClassDefFoundError
Unexpected error : Unexpected Error, caused by exception
NoClassDefFoundError: org/eclipse/jetty/client/api/Request

play.exceptions.UnexpectedException: Unexpected Error
at play.jobs.Job.onException(Job.java:180)
at play.jobs.Job.call(Job.java:250)
at Invocation.Job(Play!)
Caused by: java.lang.NoClassDefFoundError:
org/eclipse/jetty/client/api/Request
at
org.apache.solr.client.solrj.impl.Http2SolrClient$AsyncTracker.<init>(Http2SolrClient.java:789)
at
org.apache.solr.client.solrj.impl.Http2SolrClient.<init>(Http2SolrClient.java:131)
at
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:833)
... more
Caused by: java.lang.ClassNotFoundException:
org.eclipse.jetty.client.api.Request
at
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 16 more
==