Re: SOLR 7.1 Top level element for similarity factory

2018-07-16 Thread Chris Hostetter


: So I have the following at the bottom of my schema.xml file
: 
: <similarity class="solr.ClassicSimilarityFactory"/>
: 
: The documentation says "top level element" - so should that actually be 
outside the schema tag?

No, the schema tag is the "root" level element; its direct children are 
the "top level elements"  

(the wording may not be the best possible, but the goal was to emphasize 
that to get the behavior you're looking for, you need to make sure you don't 
just add it to a single fieldType, or mistakenly put it inside one of the 
legacy <fields> or <types> blocks)


-Hoss
http://www.lucidworks.com/
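For reference, the placement Hoss describes would look like the sketch below. This is an illustration only: the ClassicSimilarityFactory class comes from Rick's question, and the surrounding field type is a placeholder.

```xml
<schema name="example" version="1.6">
  <!-- field and fieldType definitions ... -->
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
  </fieldType>

  <!-- "top level" element: a direct child of <schema>,
       not nested inside any single <fieldType> -->
  <similarity class="solr.ClassicSimilarityFactory"/>
</schema>
```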


SOLR 7.1 Top level element for similarity factory

2018-07-16 Thread Hodder, Rick
I'm using SOLR 7.1 and I'm trying to set the similarity factory back to 
ClassicSimilarityFactory so that it will behave like SOLR 6 or before.

In the document

https://lucene.apache.org/solr/guide/7_1/other-schema-elements.html#similarity

It says

This default behavior can be overridden by declaring a top level <similarity> 
element in your schema.xml, outside of any single field type. This similarity 
declaration can either refer directly to the name of a class with a no-argument 
constructor, such as in this example showing BM25Similarity:

<similarity class="solr.BM25SimilarityFactory"/>

So I have the following at the bottom of my schema.xml file

<similarity class="solr.ClassicSimilarityFactory"/>
The documentation says "top level element" - so should that actually be outside 
the schema tag?

Thanks,

Rick Hodder
Information Technology
Navigators Management Company, Inc.
83 Wooster Heights Road, 2nd Floor
Danbury, CT  06810
(475) 329-6251




Re: Preferred PHP Client Library

2018-07-16 Thread John Blythe
We have enjoyed using Solarium

On Mon, Jul 16, 2018 at 14:19 Zimmermann, Thomas 
wrote:

> Hi,
>
> We're in the midst of our first major Solr upgrade in years and are trying
> to run some cleanup across all of our client codebases. We're currently
> using the standard PHP Solr Extension when communicating with our cluster
> from our Wordpress installs. http://php.net/manual/en/book.solr.php
>
> Few questions.
>
> Should we have any concerns about communicating with a Solr 7 cloud from
> that client?
> Is anyone using another client they prefer? If so what are the benefits of
> switching to it?
>
> Thanks!
> TZ
>
-- 
John Blythe


RE: 7.3 appears to leak

2018-07-16 Thread Markus Jelsma
Hello Thomas,

To be absolutely sure you suffer from the same problem as one of our 
collections, can you confirm that your Solr cores are leaking a 
SolrIndexSearcher instance on each commit? If not, there may be a second 
problem.

Also, do you run any custom plugins or apply patches to your Solr instances? Or 
is your Solr a 100% official build?

Thanks,
Markus

 
 
-Original message-
> From:Thomas Scheffler 
> Sent: Monday 16th July 2018 13:39
> To: solr-user@lucene.apache.org
> Subject: Re: 7.3 appears to leak
> 
> Hi,
> 
> we noticed the same problems here in a rather small setup. 40.000 metadata 
> documents with nearly as much files that have „literal.*“ fields with it. 
> While 7.2.1 has brought some tika issues the real problems started to appear 
> with version 7.3.0 which are currently unresolved in 7.4.0. Memory 
> consumption is out-of-roof. Where previously 512MB heap was enough, now 6G 
> aren’t enough to index all files.
> 
> kind regards,
> 
> Thomas
> 
> > Am 04.07.2018 um 15:03 schrieb Markus Jelsma :
> > 
> > Hello Andrey,
> > 
> > I didn't think of that! I will try it when i have the courage again, 
> > probably next week or so.
> > 
> > Many thanks,
> > Markus
> > 
> > 
> > -Original message-
> >> From:Kydryavtsev Andrey 
> >> Sent: Wednesday 4th July 2018 14:48
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: 7.3 appears to leak
> >> 
> >> If it is not possible to find a resource leak by code analysis and there 
> >> is no better ideas, I can suggest a brute force approach:
> >> - Clone Solr's sources from appropriate branch 
> >> https://github.com/apache/lucene-solr/tree/branch_7_3
> >> - Log every searcher's holder increment/decrement operation in a way to 
> >> catch every caller name (use Thread.currentThread().getStackTrace() or 
> >> something) 
> >> https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
> >> - Build custom artefacts and upload them on prod
> >> - After memory leak happened - analyse logs to see what part of 
> >> functionality doesn't decrement searcher after counter was incremented. If 
> >> searchers are leaked - there should be such code I guess.
> >> 
> >> This is not something someone would like to do, but it is what it is.
> >> 
> >> 
> >> 
> >> Thank you,
> >> 
> >> Andrey Kudryavtsev
> >> 
> >> 
> >> 03.07.2018, 14:26, "Markus Jelsma" :
> >>> Hello Erick,
> >>> 
> >>> Even the silliest ideas may help us, but unfortunately this is not the 
> >>> case. All our Solr nodes run binaries from the same source from our 
> >>> central build server, with the same libraries thanks to provisioning. 
> >>> Only schema and config are different, but the <lib> directive is the 
> >>> same all over.
> >>> 
> >>> Are there any other ideas, speculations, whatever, on why only our main 
> >>> text collection leaks a SolrIndexSearcher instance on commit since 7.3.0 
> >>> and every version up?
> >>> 
> >>> Many thanks?
> >>> Markus
> >>> 
> >>> -Original message-
>   From:Erick Erickson 
>   Sent: Friday 29th June 2018 19:34
>   To: solr-user 
>   Subject: Re: 7.3 appears to leak
>  
>   This is truly puzzling then, I'm clueless. It's hard to imagine this
>   is lurking out there and nobody else notices, but you've eliminated
>   the custom code. And this is also very peculiar:
>  
>   * it occurs only in our main text search collection, all other
>   collections are unaffected;
>   * despite what i said earlier, it is so far unreproducible outside
>   production, even when mimicking production as good as we can;
>  
>   Here's a tedious idea. Restart Solr with the -v option, I _think_ that
>   shows you each and every jar file Solr loads. Is it "somehow" possible
>   that your main collection is loading some jar from somewhere that's
>   different than you expect? 'cause silly ideas like this are all I can
>   come up with.
>  
>   Erick
>  
>   On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
>    wrote:
>   > Hello Erick,
>   >
>   > The custom search handler doesn't interact with SolrIndexSearcher, 
>  this is really all it does:
>   >
>   >   public void handleRequestBody(SolrQueryRequest req, 
>  SolrQueryResponse rsp) throws Exception {
>   > super.handleRequestBody(req, rsp);
>   >
>   > if (rsp.getToLog().get("hits") instanceof Integer) {
>   >   rsp.addHttpHeader("X-Solr-Hits", 
>  String.valueOf((Integer)rsp.getToLog().get("hits")));
>   > }
>   > if (rsp.getToLog().get("hits") instanceof Long) {
>   >   rsp.addHttpHeader("X-Solr-Hits", 
>  String.valueOf((Long)rsp.getToLog().get("hits")));
>   > }
>   >   }
>   >
>   > I am not sure this qualifies as one more to go.
>   >
>   > Re: compiler warnings on resources, yes! This and tests failing due 
>  to resources leaks ha

Preferred PHP Client Library

2018-07-16 Thread Zimmermann, Thomas
Hi,

We're in the midst of our first major Solr upgrade in years and are trying to 
run some cleanup across all of our client codebases. We're currently using the 
standard PHP Solr Extension when communicating with our cluster from our 
Wordpress installs. http://php.net/manual/en/book.solr.php

Few questions.

Should we have any concerns about communicating with a Solr 7 cloud from that 
client?
Is anyone using another client they prefer? If so what are the benefits of 
switching to it?

Thanks!
TZ


Re: Hardware-Aware Solr Coud Sharding?

2018-07-16 Thread Michael Braun
Ended up working well with nodeset EMPTY and placing all replicas manually.
Thank you all for the assistance!

On Thu, Jun 14, 2018 at 9:28 AM, Jan Høydahl  wrote:

> You could also look into the Autoscaling stuff in 7.x which can be
> programmed to move shards around based on system load and HW specs on the
> various nodes, so in theory that framework (although still a bit unstable)
> will suggest moving some replicas from weak nodes over to more powerful
> ones. If you "overshard" your system, i.e. if you have three nodes, you
> create a collection with 9 shards, then there will be three shards per
> node, and Solr can suggest moving one of them off to anther server.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 12. jun. 2018 kl. 18:39 skrev Erick Erickson :
> >
> > In a mixed-hardware situation you can certainly place replicas as you
> > choose. Create a minimal collection or use the special nodeset EMPTY
> > and then place your replicas one-by-one.
> >
> > You can also consider "replica placement rules", see:
> > https://lucene.apache.org/solr/guide/6_6/rule-based-
> replica-placement.html.
> > I _think_ this would be a variant of "rack aware". In this case you'd
> > provide a "snitch" that says something about the hardware
> > characteristics and the rules you'd define would be sensitive to that.
> >
> > WARNING: haven't done this myself so don't have any examples to point
> to
> >
> > Best,
> > Erick
> >
> > On Tue, Jun 12, 2018 at 8:34 AM, Shawn Heisey 
> wrote:
> >> On 6/12/2018 9:12 AM, Michael Braun wrote:
> >>> The way to handle this right now looks to be running additional Solr
> >>> instances on nodes with increased resources to balance the load (so if
> the
> >>> machines are 1x, 1.5x, and 2x, run 2 instances, 3 instances, and 4
> >>> instances, respectively). Has anyone looked into other ways of handling
> >>> this that don't require the additional Solr instance deployments?
> >>
> >> Usually, no.  In most cases, you only want to run one Solr instance per
> >> server.  One Solr instance can handle many individual shard replicas.
> >> If there are more individual indexes on a Solr instance, then it is
> >> likely to be able to take advantage of additional system resources
> >> without running another Solr instance.
> >>
> >> The only time you should run multiple Solr instances is when the heap
> >> requirements for running the required indexes with one instance would be
> >> way too big.  Splitting the indexes between two instances with smaller
> >> heaps might end up with much better garbage collection efficiency.
> >>
> >> https://lucene.apache.org/solr/guide/7_3/taking-solr-to-
> production.html#running-multiple-solr-nodes-per-host
> >>
> >> Thanks,
> >> Shawn
> >>
>
>
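The manual-placement approach Michael used (and Erick described) can be sketched with the Collections API. The hostnames, ports, and collection name below are placeholders; adapt them to your cluster.

```shell
# Create the collection with no replicas at all (createNodeSet=EMPTY),
# oversharded so replicas can later be spread unevenly across mixed hardware.
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=9&replicationFactor=1&createNodeSet=EMPTY'

# Then place each replica explicitly on the node you want,
# e.g. more shards on the beefier machines.
curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1&node=bighost1:8983_solr'
curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard2&node=smallhost1:8983_solr'
```

These commands require a running SolrCloud cluster, so they are shown for illustration rather than as a runnable script.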


Re: SolrCloud and Kubernetes

2018-07-16 Thread ssivashunm
Hi Vincenzo,
  I used the repo, but I am encountering the following hurdles and trying to
solve them.
I increased the replicas to 3 for both Solr and ZooKeeper.
I don't want to expose the NodePort directly for inter-node communication,
hence I created a headless service and used the FQDN of the solr-ss-0 pod for
the solrHost. Solr started fine but still didn't recognize the other 2 Solr
replicas when I run solr status.

ZooKeeper has a different issue. When I use
zkHost=zookeeper-ss-0.zookeeper-discovery.default.svc.cluster.local, it's
failing to start and ending up in a crash loop.

Thanks!
Sundar



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr cloud in kubernetes

2018-07-16 Thread Paweł Ruciński
Hi,
I am trying to achieve the same thing: hosting Solr on k8s.
So far I have successfully created ZK as a StatefulSet (3 instances) with a
headless service. Apart from that, I created Deployment objects for the Solr
pods (again 3 instances). For each Solr pod I have manually created a
persistent volume.

Now I am wondering whether there is a way to move to a StatefulSet for the
Solr instances. I see a constraint: when a Solr pod dies, it loses its
core.properties file, as it is inside the Solr home directory. The Solr data
directory is a mounted persistent volume.

My question is: can I make Solr create core.properties files in a different
place than the Solr home directory (i.e. the Solr data directory)?

PS.
Your discussion was very informative for me, as I have just started with both
Solr and k8s.


Best regards,
Paweł Ruciński



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
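One setting that may help with the core.properties question (an assumption on my part, not something confirmed in this thread): solr.xml supports a coreRootDirectory option that tells Solr where to discover and create core directories, so it could point at the mounted persistent volume instead of the default Solr home. A minimal sketch, with the path as a placeholder:

```xml
<!-- solr.xml: coreRootDirectory moves core discovery (and thus the
     core.properties files) out of the default solr home directory -->
<solr>
  <str name="coreRootDirectory">/var/solr/data</str>
</solr>
```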


Re: terms present within fields

2018-07-16 Thread Vincenzo D'Amore
Ok, I got it, thank you very much.

On Mon, Jul 16, 2018 at 6:25 PM Erick Erickson 
wrote:

> Terms are already sorted when you use TermsComponent. So you fetch the
> first 1,000 from each
> field and compare... if you're starting with the same prefix for both
> fields the lists should be
> comparable in a straightforward manner.
>
> Best,
> Erick
>
> On Mon, Jul 16, 2018 at 9:10 AM, Vincenzo D'Amore 
> wrote:
> > Hi Alexandre, well... you're right. Sooner or later I had to create a
> > collection with synthetic data where run my test.
> >
> > Well I have SolrCloud, I'm curious, could you please suggest me an
> example
> > with the streaming expression you're talking?
> >
> > On Mon, Jul 16, 2018 at 4:50 PM Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> For the test, can't you just use synthetic data where you know the terms
> >> from the start?
> >>
> >> Otherwise maybe something from streaming expressions will help, but it
> >> needs SolrCloud.
> >>
> >> Regards,
> >> Alex
> >>
> >> On Mon, Jul 16, 2018, 10:22 AM Vincenzo D'Amore, 
> >> wrote:
> >>
> >> > Hi all,
> >> >
> >> > I have a question for you, Solr Gurus :)
> >> >
> >> > there is an index where there are two fields: short_title and
> long_title.
> >> > As the field names suggest, this two fields are very similar, the long
> >> > title has just more terms in it.
> >> >
> >> > So, looking at all the documents I have in the index, I would like to
> >> > extract all the terms that are present in the long_title title only.
> >> >
> >> > Could you suggest me, if it is possibile, how to figure out from this
> >> > problem?
> >> >
> >> > I've tried with the term component, and it should return all the terms
> >> > present in a field but what happens when I have millions of terms?
> >> >
> >> > I thought to use the termcomponent or luke, but the only doable way
> I've
> >> > found is download the entire list of terms present in both the fields
> and
> >> > remove a term that is present in both the lists.
> >> >
> >> > I need this because I would like to write a test that try few terms
> >> present
> >> > only in the long_title.
> >> >
> >> > Thanks for your time,
> >> > Vincenzo
> >> >
> >> > --
> >> > Vincenzo D'Amore
> >> >
> >>
> >
> >
> > --
> > Vincenzo D'Amore
>


-- 
Vincenzo D'Amore


Re: terms present within fields

2018-07-16 Thread Erick Erickson
Terms are already sorted when you use TermsComponent. So you fetch the
first 1,000 from each
field and compare... if you're starting with the same prefix for both
fields the lists should be
comparable in a straightforward manner.

Best,
Erick

On Mon, Jul 16, 2018 at 9:10 AM, Vincenzo D'Amore  wrote:
> Hi Alexandre, well... you're right. Sooner or later I had to create a
> collection with synthetic data where run my test.
>
> Well I have SolrCloud, I'm curious, could you please suggest me an example
> with the streaming expression you're talking?
>
> On Mon, Jul 16, 2018 at 4:50 PM Alexandre Rafalovitch 
> wrote:
>
>> For the test, can't you just use synthetic data where you know the terms
>> from the start?
>>
>> Otherwise maybe something from streaming expressions will help, but it
>> needs SolrCloud.
>>
>> Regards,
>> Alex
>>
>> On Mon, Jul 16, 2018, 10:22 AM Vincenzo D'Amore, 
>> wrote:
>>
>> > Hi all,
>> >
>> > I have a question for you, Solr Gurus :)
>> >
>> > there is an index where there are two fields: short_title and long_title.
>> > As the field names suggest, this two fields are very similar, the long
>> > title has just more terms in it.
>> >
>> > So, looking at all the documents I have in the index, I would like to
>> > extract all the terms that are present in the long_title title only.
>> >
>> > Could you suggest me, if it is possibile, how to figure out from this
>> > problem?
>> >
>> > I've tried with the term component, and it should return all the terms
>> > present in a field but what happens when I have millions of terms?
>> >
>> > I thought to use the termcomponent or luke, but the only doable way I've
>> > found is download the entire list of terms present in both the fields and
>> > remove a term that is present in both the lists.
>> >
>> > I need this because I would like to write a test that try few terms
>> present
>> > only in the long_title.
>> >
>> > Thanks for your time,
>> > Vincenzo
>> >
>> > --
>> > Vincenzo D'Amore
>> >
>>
>
>
> --
> Vincenzo D'Amore


Re: terms present within fields

2018-07-16 Thread Vincenzo D'Amore
Hi Alexandre, well... you're right. Sooner or later I had to create a
collection with synthetic data on which to run my tests.

Well, I have SolrCloud. I'm curious, could you please suggest an example
of the streaming expression you're talking about?

On Mon, Jul 16, 2018 at 4:50 PM Alexandre Rafalovitch 
wrote:

> For the test, can't you just use synthetic data where you know the terms
> from the start?
>
> Otherwise maybe something from streaming expressions will help, but it
> needs SolrCloud.
>
> Regards,
> Alex
>
> On Mon, Jul 16, 2018, 10:22 AM Vincenzo D'Amore, 
> wrote:
>
> > Hi all,
> >
> > I have a question for you, Solr Gurus :)
> >
> > there is an index where there are two fields: short_title and long_title.
> > As the field names suggest, this two fields are very similar, the long
> > title has just more terms in it.
> >
> > So, looking at all the documents I have in the index, I would like to
> > extract all the terms that are present in the long_title title only.
> >
> > Could you suggest me, if it is possibile, how to figure out from this
> > problem?
> >
> > I've tried with the term component, and it should return all the terms
> > present in a field but what happens when I have millions of terms?
> >
> > I thought to use the termcomponent or luke, but the only doable way I've
> > found is download the entire list of terms present in both the fields and
> > remove a term that is present in both the lists.
> >
> > I need this because I would like to write a test that try few terms
> present
> > only in the long_title.
> >
> > Thanks for your time,
> > Vincenzo
> >
> > --
> > Vincenzo D'Amore
> >
>


-- 
Vincenzo D'Amore


Re: terms present within fields

2018-07-16 Thread Vincenzo D'Amore
Thanks Erick, at first glance I didn't understand your suggestion.
But trying to sort the terms per index, it makes sense, absolutely makes sense
:)))
Thanks for the suggestion; adding the prefix is very easy to implement.



On Mon, Jul 16, 2018 at 4:34 PM Erick Erickson 
wrote:

> There's no real way I know of to do what you want except to use
> TermsComponent.
>
> Note that you don't have to extract all of them, just advance the two
> lists until you find
> enough terms in long_title that aren't in short_title, extract, say,
> 1,000 terms at a time.
>
> You can also start with various prefixes (even individual letters) to
> get some from
> different places. Basically you're paginating through terms by using
> terms.lower.
>
> Do note, though, that you get the _indexed_ term after all
> transformations, say lower
> casing, WordDelimiter(Graph)FilterFactory, stemming etc.
>
> Best,
> Erick
>
> On Mon, Jul 16, 2018 at 7:22 AM, Vincenzo D'Amore 
> wrote:
> > Hi all,
> >
> > I have a question for you, Solr Gurus :)
> >
> > there is an index where there are two fields: short_title and long_title.
> > As the field names suggest, this two fields are very similar, the long
> > title has just more terms in it.
> >
> > So, looking at all the documents I have in the index, I would like to
> > extract all the terms that are present in the long_title title only.
> >
> > Could you suggest me, if it is possibile, how to figure out from this
> > problem?
> >
> > I've tried with the term component, and it should return all the terms
> > present in a field but what happens when I have millions of terms?
> >
> > I thought to use the termcomponent or luke, but the only doable way I've
> > found is download the entire list of terms present in both the fields and
> > remove a term that is present in both the lists.
> >
> > I need this because I would like to write a test that try few terms
> present
> > only in the long_title.
> >
> > Thanks for your time,
> > Vincenzo
> >
> > --
> > Vincenzo D'Amore
>


-- 
Vincenzo D'Amore


Re: terms present within fields

2018-07-16 Thread Alexandre Rafalovitch
For the test, can't you just use synthetic data where you know the terms
from the start?

Otherwise maybe something from streaming expressions will help, but it
needs SolrCloud.

Regards,
Alex

On Mon, Jul 16, 2018, 10:22 AM Vincenzo D'Amore,  wrote:

> Hi all,
>
> I have a question for you, Solr Gurus :)
>
> there is an index where there are two fields: short_title and long_title.
> As the field names suggest, this two fields are very similar, the long
> title has just more terms in it.
>
> So, looking at all the documents I have in the index, I would like to
> extract all the terms that are present in the long_title title only.
>
> Could you suggest me, if it is possibile, how to figure out from this
> problem?
>
> I've tried with the term component, and it should return all the terms
> present in a field but what happens when I have millions of terms?
>
> I thought to use the termcomponent or luke, but the only doable way I've
> found is download the entire list of terms present in both the fields and
> remove a term that is present in both the lists.
>
> I need this because I would like to write a test that try few terms present
> only in the long_title.
>
> Thanks for your time,
> Vincenzo
>
> --
> Vincenzo D'Amore
>


Re: terms present within fields

2018-07-16 Thread Erick Erickson
There's no real way I know of to do what you want except to use TermsComponent.

Note that you don't have to extract all of them, just advance the two
lists until you find
enough terms in long_title that aren't in short_title, extract, say,
1,000 terms at a time.

You can also start with various prefixes (even individual letters) to
get some from
different places. Basically you're paginating through terms by using
terms.lower.

Do note, though, that you get the _indexed_ term after all
transformations, say lower
casing, WordDelimiter(Graph)FilterFactory, stemming etc.

Best,
Erick

On Mon, Jul 16, 2018 at 7:22 AM, Vincenzo D'Amore  wrote:
> Hi all,
>
> I have a question for you, Solr Gurus :)
>
> there is an index where there are two fields: short_title and long_title.
> As the field names suggest, this two fields are very similar, the long
> title has just more terms in it.
>
> So, looking at all the documents I have in the index, I would like to
> extract all the terms that are present in the long_title title only.
>
> Could you suggest me, if it is possibile, how to figure out from this
> problem?
>
> I've tried with the term component, and it should return all the terms
> present in a field but what happens when I have millions of terms?
>
> I thought to use the termcomponent or luke, but the only doable way I've
> found is download the entire list of terms present in both the fields and
> remove a term that is present in both the lists.
>
> I need this because I would like to write a test that try few terms present
> only in the long_title.
>
> Thanks for your time,
> Vincenzo
>
> --
> Vincenzo D'Amore
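The set-difference step Erick and Vincenzo are discussing (terms in long_title minus terms in short_title) is exactly what `comm` computes on two sorted lists. Once the TermsComponent pages have been dumped to files (TermsComponent returns terms in sorted order), the comparison can be sketched like this — the file names and sample terms are placeholders:

```shell
# Two sorted term lists, one per field, as if dumped from
# the TermsComponent (e.g. terms.fl=long_title / terms.fl=short_title).
printf 'alpha\nbeta\ngamma\n' > long_title_terms.txt
printf 'alpha\ngamma\n'       > short_title_terms.txt

# comm -23 prints lines unique to the first file:
# the terms present only in long_title.
comm -23 long_title_terms.txt short_title_terms.txt
```

On the sample data this prints `beta`, the only term unique to long_title.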


terms present within fields

2018-07-16 Thread Vincenzo D'Amore
Hi all,

I have a question for you, Solr Gurus :)

there is an index where there are two fields: short_title and long_title.
As the field names suggest, these two fields are very similar; the long
title just has more terms in it.

So, looking at all the documents I have in the index, I would like to
extract all the terms that are present in the long_title field only.

Could you suggest, if it is possible, how to work through this
problem?

I've tried with the terms component, and it should return all the terms
present in a field, but what happens when I have millions of terms?

I thought of using the TermsComponent or Luke, but the only doable way I've
found is to download the entire list of terms present in both fields and
remove the terms that are present in both lists.

I need this because I would like to write a test that tries a few terms present
only in the long_title.

Thanks for your time,
Vincenzo

-- 
Vincenzo D'Amore


Learning to rank

2018-07-16 Thread Akshay Patil
Hi

I am a student, and for my master's thesis I am working on Learning To Rank.
While researching it, I found the solution provided by Bloomberg. But I
would like to ask: with the example that you have provided, it always shows
a Bad Request error.

Do you have a running example of it, so I can adapt it to my application?

I am trying to use the example that you have provided on GitHub.

core: techproducts
traning_and_uploading_demo.py

It generates the training data, but I am getting a problem when uploading
the model. It shows a bad request error (empty request body). Please help
me out with this problem, so I will be able to adapt it to my application.

Best Regards!

Any help would be appreciated


Re: 7.3 appears to leak

2018-07-16 Thread Thomas Scheffler
Hi,

we noticed the same problems here in a rather small setup: 40,000 metadata 
documents with nearly as many files that have „literal.*“ fields. While 
7.2.1 brought some Tika issues, the real problems started to appear with 
version 7.3.0, and they are currently unresolved in 7.4.0. Memory consumption 
is through the roof. Where previously a 512MB heap was enough, now 6G isn't 
enough to index all files.

kind regards,

Thomas

> Am 04.07.2018 um 15:03 schrieb Markus Jelsma :
> 
> Hello Andrey,
> 
> I didn't think of that! I will try it when i have the courage again, probably 
> next week or so.
> 
> Many thanks,
> Markus
> 
> 
> -Original message-
>> From:Kydryavtsev Andrey 
>> Sent: Wednesday 4th July 2018 14:48
>> To: solr-user@lucene.apache.org
>> Subject: Re: 7.3 appears to leak
>> 
>> If it is not possible to find a resource leak by code analysis and there is 
>> no better ideas, I can suggest a brute force approach:
>> - Clone Solr's sources from appropriate branch 
>> https://github.com/apache/lucene-solr/tree/branch_7_3
>> - Log every searcher's holder increment/decrement operation in a way to 
>> catch every caller name (use Thread.currentThread().getStackTrace() or 
>> something) 
>> https://github.com/apache/lucene-solr/blob/branch_7_3/solr/core/src/java/org/apache/solr/util/RefCounted.java
>> - Build custom artefacts and upload them on prod
>> - After memory leak happened - analyse logs to see what part of 
>> functionality doesn't decrement searcher after counter was incremented. If 
>> searchers are leaked - there should be such code I guess.
>> 
>> This is not something someone would like to do, but it is what it is.
>> 
>> 
>> 
>> Thank you,
>> 
>> Andrey Kudryavtsev
>> 
>> 
>> 03.07.2018, 14:26, "Markus Jelsma" :
>>> Hello Erick,
>>> 
>>> Even the silliest ideas may help us, but unfortunately this is not the 
>>> case. All our Solr nodes run binaries from the same source from our central 
>>> build server, with the same libraries thanks to provisioning. Only schema 
> >>> and config are different, but the <lib> directive is the same all over.
>>> 
>>> Are there any other ideas, speculations, whatever, on why only our main 
>>> text collection leaks a SolrIndexSearcher instance on commit since 7.3.0 
>>> and every version up?
>>> 
>>> Many thanks?
>>> Markus
>>> 
>>> -Original message-
  From:Erick Erickson 
  Sent: Friday 29th June 2018 19:34
  To: solr-user 
  Subject: Re: 7.3 appears to leak
 
  This is truly puzzling then, I'm clueless. It's hard to imagine this
  is lurking out there and nobody else notices, but you've eliminated
  the custom code. And this is also very peculiar:
 
  * it occurs only in our main text search collection, all other
  collections are unaffected;
  * despite what i said earlier, it is so far unreproducible outside
  production, even when mimicking production as good as we can;
 
  Here's a tedious idea. Restart Solr with the -v option, I _think_ that
  shows you each and every jar file Solr loads. Is it "somehow" possible
  that your main collection is loading some jar from somewhere that's
  different than you expect? 'cause silly ideas like this are all I can
  come up with.
 
  Erick
 
  On Fri, Jun 29, 2018 at 9:56 AM, Markus Jelsma
   wrote:
  > Hello Erick,
  >
  > The custom search handler doesn't interact with SolrIndexSearcher, this 
 is really all it does:
  >
  >   public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse 
 rsp) throws Exception {
  > super.handleRequestBody(req, rsp);
  >
  > if (rsp.getToLog().get("hits") instanceof Integer) {
  >   rsp.addHttpHeader("X-Solr-Hits", 
 String.valueOf((Integer)rsp.getToLog().get("hits")));
  > }
  > if (rsp.getToLog().get("hits") instanceof Long) {
  >   rsp.addHttpHeader("X-Solr-Hits", 
 String.valueOf((Long)rsp.getToLog().get("hits")));
  > }
  >   }
  >
  > I am not sure this qualifies as one more to go.
  >
  > Re: compiler warnings on resources, yes! This and tests failing due to 
 resources leaks have always warned me when i forgot to release something 
 or decrement a reference. But except for the above method (and the token 
 filters which i really can't disable) are all that is left.
  >
  > I am quite desperate about this problem so although i am unwilling to 
 disable stuff, i can do it if i must. But i so reason, yet, to remove the 
 search handler or the token filter stuff, i mean, how could those leak a 
 SolrIndexSearcher?
  >
  > Let me know :)
  >
  > Many thanks!
  > Markus
  >
  > -Original message-
  >> From:Erick Erickson 
  >> Sent: Friday 29th June 2018 18:46
  >> To: solr-user 
  >> Subject: Re: 7.3 appears to leak
  >>
  >> bq. The only custom stuff

Re: Learning to rank - Bad Request

2018-07-16 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hi Akshay,

did you run solr enabling learning to rank? 

./bin/solr -e techproducts -Dsolr.ltr.enabled=true

if you don't pass -Dsolr.ltr.enabled=true ltr will not be available. 

Cheers,
Diego


From: solr-user@lucene.apache.org At: 07/16/18 09:00:39 To:
solr-user@lucene.apache.org
Subject: Re: Learning to rank - Bad Request

Hi,

I am using apache solr 7.4.0. I am trying to use learning to rank using the
python script and related data provided by the lucene. which can be found at
the Github

  
repository of the lucene solr.

I am using the standard core "techproducts" I didnt change the configuration
data and thus it shows the error of bad request while uploading the model to
the solr. 

It shows the error of  empty request body :- "unknown source"

please suggest me how to overcome the error.

Best Regards !

Akshay Patil


--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
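The flow Diego describes — start with the LTR plugin enabled, then upload artifacts over HTTP — looks roughly like this. The feature/model file paths are placeholders; the feature-store and model-store endpoints come from the Solr Reference Guide's Learning To Rank page:

```shell
# Start the techproducts example with the LTR plugin enabled
./bin/solr -e techproducts -Dsolr.ltr.enabled=true

# Upload features first, then the model. A bad @file path sends an
# empty request body, which is one way to get the "Bad Request" above.
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/feature-store' \
  --data-binary '@/path/myFeatures.json' -H 'Content-type:application/json'
curl -XPUT 'http://localhost:8983/solr/techproducts/schema/model-store' \
  --data-binary '@/path/myModel.json' -H 'Content-type:application/json'
```

These commands need a running Solr instance, so they are shown for illustration rather than as a runnable script.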




Re: Learning to rank - Bad Request

2018-07-16 Thread akshaypatil
Hi,

I am using apache solr 7.4.0. I am trying to use learning to rank using the
python script and related data provided by the lucene. which can be found at
the Github

  
repository of the lucene solr.

I am using the standard core "techproducts". I didn't change the configuration
data, and thus it shows the error of bad request while uploading the model to
Solr. 

It shows the error of an empty request body: "unknown source"

please suggest me how to overcome the error.

Best Regards !

Akshay Patil



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html