How can I enable scoring on a DocList rather than a DocSet

2021-02-25 Thread krishan goyal
Hi,

I want to match and score on a sorted DocList.

The use case is something like this

   - Cache sorted results (with scores) of certain queries in the
   queryCache (This is a DocList)
   - New queries are supersets of these cached queries and have dynamic
   scoring clauses
   - At runtime, I want to look up results in the queryCache and run matching
   and scoring on this list
  - This gives me a more dynamic (per-query) pre-sorted data set over which
  I can early-terminate sooner and more effectively, reducing latencies
  even further
  - In some cases, I can also use the cached score along with the new score
  and avoid recomputation.

The problem is that the Scorer interface currently requires a
DocIdSetIterator and can't work on top of a DocList.

So does that mean enabling this kind of optimisation requires different
Scorer & Weight interfaces, or is there something I can do using the
current interfaces themselves?
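
For context, a minimal sketch (untested, and not existing Solr code) of how a
DocList's ids could be exposed as a DocIdSetIterator so the standard Scorer
machinery could drive matching. It assumes the ids are first copied out of the
DocList and re-sorted ascending (a score-sorted DocList is not in doc-id
order), and note that DocList ids are index-wide while a Scorer iterates per
leaf, so they would also need mapping to per-segment ids:

import org.apache.lucene.search.DocIdSetIterator;

// Sketch only: wraps doc ids copied out of a DocList (via DocList.iterator())
// and re-sorted ascending, so they can be consumed like any other iterator.
public class DocListIdIterator extends DocIdSetIterator {
  private final int[] docs;  // ascending doc ids
  private int idx = -1;

  public DocListIdIterator(int[] sortedAscendingDocs) {
    this.docs = sortedAscendingDocs;
  }

  @Override
  public int docID() {
    if (idx < 0) return -1;
    return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
  }

  @Override
  public int nextDoc() {
    idx++;
    return docID();
  }

  @Override
  public int advance(int target) {
    // linear advance is enough for a sketch; binary search would be faster
    int doc = docID();
    while (doc < target) {
      doc = nextDoc();
    }
    return doc;
  }

  @Override
  public long cost() {
    return docs.length;
  }
}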


Re: How to use query function inside a function query in Solr LTR

2020-09-22 Thread krishan goyal
This is solved by using local parameters. So

{!func}sub(num_tokens_int,query({!dismax qf=field_name v=${text}}))

works
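
For reference, the full feature definition then becomes roughly (same
structure as before, just with the v= local parameter):

"name" : "no_of_extra_terms",
"class" : "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q": "{!func}sub(num_tokens_int,query({!dismax qf=field_name v=${text}}))"
},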


On Mon, Sep 21, 2020 at 7:43 PM krishan goyal  wrote:

> Hi,
>
> I have use cases of features which require a query function and some more
> math on top of the result of the query function
>
> E.g. of a feature: the number of extra terms in the document relative to the
> input text
>
> I am trying various ways of representing this feature but always getting
> an exception
> java.lang.RuntimeException: Exception from createWeight for SolrFeature
> . Failed to parse feature query.
>
>  Feature representations
> "name" : "no_of_extra_terms",
> "class" : "org.apache.solr.ltr.feature.SolrFeature",
> "params": {
> "q": "{!func}sub(num_tokens_int,query({!dismax
> qf=field_name}${text}))"
> },
>
> where num_tokens_int is a stored field which contains the number of tokens
> in the document
>
>
> Also, feature representation with just a query parser like
>
> "q": "{!dismax df=field_name}${text}"
>
> works, but I can't really get my desired feature representation without
> using it in a function query, where I want to operate on the result of this
> query to derive my actual feature.
>


How to use query function inside a function query in Solr LTR

2020-09-21 Thread krishan goyal
Hi,

I have use cases of features which require a query function and some more
math on top of the result of the query function

E.g. of a feature: the number of extra terms in the document relative to the
input text

I am trying various ways of representing this feature but always getting an
exception
java.lang.RuntimeException: Exception from createWeight for SolrFeature
. Failed to parse feature query.

 Feature representations
"name" : "no_of_extra_terms",
"class" : "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q": "{!func}sub(num_tokens_int,query({!dismax
qf=field_name}${text}))"
},

where num_tokens_int is a stored field which contains the number of tokens
in the document


Also, feature representation with just a query parser like

"q": "{!dismax df=field_name}${text}"

works, but I can't really get my desired feature representation without
using it in a function query, where I want to operate on the result of this
query to derive my actual feature.


Solr LTR Performance Issues

2020-09-21 Thread krishan goyal
I was observing a large degradation in performance when adding more features
to my Solr LTR model, even if the model complexity (number of trees, depth of
trees) remains the same. I am using the MultipleAdditiveTreesModel model.

Moreover, if model complexity increases while keeping the number of features
constant, performance degrades only slightly.

This seemed odd, as evaluating the model should have been much more
performance heavy than just looking up features, so I looked at the LTR code
to understand the cause. These are my findings in Solr 7.7.

Use case:

   - The features to my model are very dynamic and request dependent.
   - The features are mainly scoring features rather than filter/boolean
   features


Findings

   - The assumption was that features are computed only for the top N docs
   which need to be reranked by LTR
   - The problem starts in LTRRescorer.scoreFeatures().
  - This ends up calling SolrIndexSearcher.getProcessedFilter() for
  each top doc to be reranked and for each feature required.
  - Each feature is an individual query
  to SolrIndexSearcher.getProcessedFilter(), and each query is looked up /
  inserted into the filter cache in getPositiveDocSet().
  - The bulk of the cost (>90%) of LTRRescorer.scoreFeatures() is in the
  DefaultBulkScorer.scoreAll() method, which actually creates the doc set
  for these queries.
  - This ends up collecting all docs for the few features which are scoring
  features rather than filtering features.
  - Because features are dynamic, there is actually very little reuse
  of the filter cache beyond the ongoing request, so the doc bit set
  collection happens on almost every request.
   - We probably need to change SolrFeature.scorer() to
  - only operate on the docs required to be scored (see the sketch below)
  - utilise a cache, where applicable, for features which can be reused
  across requests
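
A rough sketch (mine, not actual LTR code) of the per-document advancing I
mean in the first point, assuming the reranked doc ids are visited in
ascending order:

import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Scorer;

// Sketch only: evaluate one feature's Scorer just for the docs being
// reranked, instead of letting a bulk scorer collect the feature's full
// doc set. Assumes docsToRerank is sorted in ascending doc-id order.
static float[] featureValuesForTopDocs(Scorer featureScorer, int[] docsToRerank)
    throws IOException {
  float[] values = new float[docsToRerank.length];
  DocIdSetIterator it = featureScorer.iterator();
  for (int i = 0; i < docsToRerank.length; i++) {
    int docId = docsToRerank[i];
    if (it.docID() < docId) {
      it.advance(docId);            // skip straight to the next candidate
    }
    values[i] = (it.docID() == docId)
        ? featureScorer.score()     // feature matched this doc
        : 0f;                       // feature default when it doesn't match
  }
  return values;
}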

Please let me know if this seems appropriate and valid, and I will file a JIRA
issue.


Re: Issues deploying LTR into SolrCloud

2020-09-21 Thread krishan goyal
Not sure how SolrCloud works, but if you're still facing issues, you can try this:

1. Deploy the features and models as _schema_feature-store.json
and _schema_model-store.json files in the right config set.
2. Either deploy to all nodes (works for me) or add these files
to confFiles in the /replication request handler.


On Wed, Aug 26, 2020 at 1:00 PM Dmitry Kan  wrote:

> Hello,
>
> Just noticed my numbering is off, should be:
>
> 1. Deploy a feature store from a JSON file to each collection.
> 2. Reload all collections as advised in the documentation:
>
> https://lucene.apache.org/solr/guide/7_5/learning-to-rank.html#applying-changes
> 3. Deploy the related model from a JSON file.
> 4. Reload all collections again.
>
>
> An update: applying this process twice I was able to fix the issue.
> However, it required "patching" individual collections, while reloading was
> done for all collections at once. I'm not sure this is very transparent to
> the user: maybe show the model deployment status per collection in the
> admin UI?
>
> Thanks,
>
> Dmitry
>
> On Tue, Aug 25, 2020 at 6:20 PM Dmitry Kan  wrote:
>
> > Hi,
> >
> > There is a recent thread "Replication of Solr Model and feature store" on
> > deploying LTR feature store and model into a master/slave Solr topology.
> >
> > I'm facing an issue of deploying into SolrCloud (solr 7.5.0), where
> > collections have shards with replicas. This is the process I've been
> > following:
> >
> > 1. Deploy a feature store from a JSON file to each collection.
> > 2. Reload all collections as advised in the documentation:
> >
> https://lucene.apache.org/solr/guide/7_5/learning-to-rank.html#applying-changes
> > 3. Deploy the related model from a JSON file.
> > 3. Reload all collections again.
> >
> >
> > The problem is that even after reloading the collections, shard replicas
> > continue to not have the model:
> >
> > Error from server at
> > http://server1:8983/solr/collection1_shard1_replica_n1: cannot find
> model
> > 'model_name'
> >
> > What is the proper way to address this issue and can it be potentially a
> > bug in SolrCloud?
> >
> > Is there any workaround I can try, like saving the feature store and
> model
> > JSON files into the collection config path and creating the SolrCloud
> from
> > there?
> >
> > Thanks,
> >
> > Dmitry
> >
> > --
> > Dmitry Kan
> > Luke Toolbox: http://github.com/DmitryKey/luke
> > Blog: http://dmitrykan.blogspot.com and https://medium.com/@dmitry.kan
> > Twitter: http://twitter.com/dmitrykan
> > SemanticAnalyzer: https://semanticanalyzer.info
> >
> >
>


Unable to get test cases running in IntelliJ via maven / ant

2020-09-10 Thread krishan goyal
Hi,

I downloaded the solr source from https://github.com/apache/lucene-solr and
checked out to branch_7_7

Configured intellij using the steps on
https://cwiki.apache.org/confluence/display/LUCENE/HowtoConfigureIntelliJ.
Configured the project SDK too as mentioned.

Facing the following problems

   - Unable to navigate from one class to another. "Find usages" of any
   public method doesn't return any result. Unable to open any class by
   searching for it. I have to navigate through the project structure to open
   any class.
   - Not getting the Run/Debug option on any test case.
   - Executing any pre-configured test configuration fails with the single
   error "Error Running . No junit.jar". I have added junit under
   Project Structure -> Modules.

I also tried setting it up via maven using steps in
https://github.com/apache/lucene-solr/blob/branch_7_7/dev-tools/maven/README.maven

   - ant generate-maven-artifacts
   - ant get-maven-poms
   - cd maven-build
   - mvn install -DskipTests

I don't think I can execute test cases from inside the maven-build folder
as they are all inside the target of their respective modules.

There are dependency errors in IntelliJ if I open either the root folder
(lucene-solr) or the sub folder maven-build in IntelliJ.

Root folder - "Package name org.apache.solr.ltr does not correspond to file
path test.org.apache.solr.ltr"
maven-build folder - Facing the same issues as with ant: unable to navigate /
find usages. I get the Run option, but the Run configuration is broken as it
says
"Class org.apache.solr.ltr.feature.TestEdisMaxSolrFeature not found in
module solr-ltr"

What am I doing wrong, and how do I run test cases like TestEdisMaxSolrFeature
in IntelliJ ?
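
As a possible workaround while the IDE setup is broken (not from the thread,
and assuming the standard lucene-solr 7.x ant build): a single test can
usually be run from the command line in its module, and re-running 'ant idea'
from the checkout root regenerates the IntelliJ project metadata, which is
often what is missing when navigation and Run configurations break:

# from the lucene-solr checkout root (branch_7_7)
ant idea                                   # regenerate IntelliJ project files

# run a single test from its module, e.g. the LTR contrib
cd solr/contrib/ltr
ant test -Dtestcase=TestEdisMaxSolrFeature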


Re: Creating a phrase match feature in LTR

2020-09-09 Thread krishan goyal
Hi,

Can anyone help me with this? I have been stuck on it for days.

On Tue, Sep 8, 2020 at 3:02 PM krishan goyal  wrote:

> Thanks Dmitry.
>
> Using
>  "q": "{!complexphrase inOrder=true}fieldName:${input}"
> works for single token queries but raises same exception when input is
> multi token
>
> Using
> "q": "{!complexphrase inOrder=true df=fieldName}${input}"
> works for all types of tokens but the scoring logic isn't the same as "pf"
> or as using the same query via query reranking -
> rqq: "{!complexphrase inOrder=true v=$v1}",
> v1: "query(fieldName:"some text"^1.0)",
>
> Eg:
> query: "nike red shoes"
> I expect the phrase score to be 0 if the tokens are not in order in the
> document or if any one token is absent in the document.
>
> This is the score returned based on the document and the type of reranking:
>
> Document            LTR score   Reranking score
> "nike red shoes"    3           3
> "nike caps"         1           0
> "nike shoes red"    3           0
>
> What is the cause for the LTR score to not match the query reranking score?
>
>
> On Fri, Aug 28, 2020 at 11:17 PM Dmitry Kan  wrote:
>
>> Hi Krishan,
>>
>> What if you remove the query() wrapping?
>>
>> {
>>   "name": "phraseMatch",
>>   "class": "org.apache.solr.ltr.feature.SolrFeature",
>>   "params": {
>> "q": "{!complexphrase inOrder=true}fieldName:${input}"
>>   },
>>   "store": "_DEFAULT_"
>> }
>>
>> or even:
>>
>> {
>>   "name": "phraseMatch",
>>   "class": "org.apache.solr.ltr.feature.SolrFeature",
>>   "params": {
>> "q": "{!complexphrase inOrder=true df=fieldName}${input}"
>>   },
>>   "store": "_DEFAULT_"
>> }
>>
>>
>> On Tue, Aug 25, 2020 at 9:59 AM krishan goyal 
>> wrote:
>>
>> > Hi,
>> >
>> > I am trying to create a phrase match feature (what "pf" does in
>> > dismax/edismax parsers)
>> >
>> > I've tried various ways to set it up
>> >
>> > {
>> >   "name": "phraseMatch",
>> >   "class": "org.apache.solr.ltr.feature.SolrFeature",
>> >   "params": {
>> > "q": "{!complexphrase inOrder=true}query(fieldName:${input})"
>> >   },
>> >   "store": "_DEFAULT_"
>> > }
>> >
>> > This fails with the exception
>> >
>> > Exception from createWeight for SolrFeature [name=phraseMatch,
>> > params={q={!complexphrase inOrder=true}query(fieldName:${input})}] null
>> >
>> > But similar query works when used in the query reranking construct with
>> > these params
>> >
>> > rqq: "{!complexphrase inOrder=true v=$v1}",
>> > v1: "query(fieldName:"some text"~2^1.0,0)",
>> >
>> > What is the problem in the LTR configuration for the feature ?
>> >
>>
>>
>> --
>> Dmitry Kan
>> Luke Toolbox: http://github.com/DmitryKey/luke
>> Blog: http://dmitrykan.blogspot.com
>> Twitter: http://twitter.com/dmitrykan
>> SemanticAnalyzer: https://semanticanalyzer.info
>>
>


Re: Creating a phrase match feature in LTR

2020-09-08 Thread krishan goyal
Thanks Dmitry.

Using
 "q": "{!complexphrase inOrder=true}fieldName:${input}"
works for single-token queries but raises the same exception when the
input is multi-token.

Using
"q": "{!complexphrase inOrder=true df=fieldName}${input}"
works for all types of tokens but the scoring logic isn't the same as "pf"
or as using the same query via query reranking -
rqq: "{!complexphrase inOrder=true v=$v1}",
v1: "query(fieldName:"some text"^1.0)",

Eg:
query: "nike red shoes"
I expect the phrase score to be 0 if the tokens are not in order in the
document or if any one token is absent in the document.

This is the score returned based on the document and the type of reranking:

Document            LTR score   Reranking score
"nike red shoes"    3           3
"nike caps"         1           0
"nike shoes red"    3           0

What is the cause for the LTR score to not match the query reranking score?
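
One thing worth noting (my assumption, not verified in this thread): the
reranking query quotes the phrase explicitly (query(fieldName:"some text")),
while the feature splices ${input} in unquoted, so the parser sees loose terms
rather than a phrase. A sketch of a feature definition that quotes the
substituted input:

{
  "name": "phraseMatch",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": {
    "q": "{!complexphrase inOrder=true df=fieldName}\"${input}\""
  },
  "store": "_DEFAULT_"
}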


On Fri, Aug 28, 2020 at 11:17 PM Dmitry Kan  wrote:

> Hi Krishan,
>
> What if you remove the query() wrapping?
>
> {
>   "name": "phraseMatch",
>   "class": "org.apache.solr.ltr.feature.SolrFeature",
>   "params": {
> "q": "{!complexphrase inOrder=true}fieldName:${input}"
>   },
>   "store": "_DEFAULT_"
> }
>
> or even:
>
> {
>   "name": "phraseMatch",
>   "class": "org.apache.solr.ltr.feature.SolrFeature",
>   "params": {
> "q": "{!complexphrase inOrder=true df=fieldName}${input}"
>   },
>   "store": "_DEFAULT_"
> }
>
>
> On Tue, Aug 25, 2020 at 9:59 AM krishan goyal 
> wrote:
>
> > Hi,
> >
> > I am trying to create a phrase match feature (what "pf" does in
> > dismax/edismax parsers)
> >
> > I've tried various ways to set it up
> >
> > {
> >   "name": "phraseMatch",
> >   "class": "org.apache.solr.ltr.feature.SolrFeature",
> >   "params": {
> > "q": "{!complexphrase inOrder=true}query(fieldName:${input})"
> >   },
> >   "store": "_DEFAULT_"
> > }
> >
> > This fails with the exception
> >
> > Exception from createWeight for SolrFeature [name=phraseMatch,
> > params={q={!complexphrase inOrder=true}query(fieldName:${input})}] null
> >
> > But similar query works when used in the query reranking construct with
> > these params
> >
> > rqq: "{!complexphrase inOrder=true v=$v1}",
> > v1: "query(fieldName:"some text"~2^1.0,0)",
> >
> > What is the problem in the LTR configuration for the feature ?
> >
>
>
> --
> Dmitry Kan
> Luke Toolbox: http://github.com/DmitryKey/luke
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan
> SemanticAnalyzer: https://semanticanalyzer.info
>


Creating a phrase match feature in LTR

2020-08-25 Thread krishan goyal
Hi,

I am trying to create a phrase match feature (what "pf" does in
dismax/edismax parsers)

I've tried various ways to set it up

{
  "name": "phraseMatch",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": {
"q": "{!complexphrase inOrder=true}query(fieldName:${input})"
  },
  "store": "_DEFAULT_"
}

This fails with the exception

Exception from createWeight for SolrFeature [name=phraseMatch,
params={q={!complexphrase inOrder=true}query(fieldName:${input})}] null

But similar query works when used in the query reranking construct with
these params

rqq: "{!complexphrase inOrder=true v=$v1}",
v1: "query(fieldName:"some text"~2^1.0,0)",

What is the problem in the LTR configuration for the feature ?


Re: Replication of Solr Model and feature store

2020-08-07 Thread krishan goyal
Hi Monica,

Replication is working fine for me. You just have to add the
_schema_feature-store.json and _schema_model-store.json to confFiles under
/replication in solrconfig.xml
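
A minimal sketch of what that can look like on the master side (the file
names besides the two store files are just illustrative):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">solrconfig.xml,managed-schema,_schema_feature-store.json,_schema_model-store.json</str>
  </lst>
</requestHandler>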

I think the issue you are seeing is that the model is referencing a
feature which is not present in the feature store, or the feature weights
for the model are incorrect. The issue in Solr is that it doesn't return
the right exception but throws a "model not found" exception instead.

Try these ways to fix it:
1. Verify the feature weights are < 1. I am not sure why having weights > 1 is
an issue, but apparently it is in some cases.
2. Verify all features used in the model file _schema_model-store.json are
actually present in the feature file _schema_feature-store.json.

Another issue with Solr LTR is that if you have a corrupt model/feature file,
you can't update/delete it via the API in some cases. You would need to
change the respective _schema_model-store.json
and _schema_feature-store.json files and reload the cores for the changes
to take effect.

Please try these and let me know if the issue still exists

On Thu, Aug 6, 2020 at 11:18 PM Monica Skidmore <
monica.skidm...@careerbuilder.com> wrote:

> I would be interested in the answer here, as well.  We're using LTR
> successfully on Solr 7.3 and Solr 8.3 in cloud mode, but we're struggling
> to load a simple, test model on 8.3 in master/slave mode.   The
> FeatureStore appears to load, but we're not sure it's loading correctly,
> either. Here are some details from the engineer on our team who is leading
> that effort:
>
> "I'm getting a ClassCastException when uploading a Model. Using the
> debugger, was able to see the line throwing the exception is:
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:488)
>
> Apparently it cannot find: org.apache.solr.ltr.model.LinearModel, although
> the features appear to be created without issues with the following class:
> org.apache.solr.ltr.feature.FieldValueFeature
>
> Another thing we were able to see is that the List features has a
> list of null elements, so that made us think there may be some issues when
> creating the instances of Feature.
>
> We had begun to believe this might be related to the fact that we are
> running Solr in Master/Slave config. Was LTR ever tested on non-cloud
> deployments??
>
> Any help is appreciated."
>
> Monica D Skidmore
> Lead Engineer, Core Search
>
>
>
> CareerBuilder.com <https://www.careerbuilder.com/> | Blog <
> https://www.careerbuilder.com/advice> | Press Room <
> https://press.careerbuilder.com/>
>
>
>
>
> On 7/24/20, 7:58 AM, "Christine Poerschke (BLOOMBERG/ LONDON)" <
> cpoersc...@bloomberg.net> wrote:
>
> Hi Krishan,
>
> Could you share what version of Solr you are using?
>
> And I wonder if the observed behaviour could be reproduced e.g. with
> the techproducts example, changes not applying after reload [1] sounds like
> a bug if so.
>
> Hope that helps.
>
> Regards,
>
> Christine
>
> [1]
> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#applying-changes
>
> From: solr-user@lucene.apache.org At: 07/22/20 14:00:59 To:
> solr-user@lucene.apache.org
> Subject: Re: Replication of Solr Model and feature store
>
> Adding more details here
>
> I need some help on how to enable the solr LTR model and features on
> all
> nodes of a solr cluster.
>
> I am unable to replicate the model and the feature store though from
> any
> master to its slaves with the replication API ? And unable to find any
> documentation for the same. Is replication possible?
>
> Without replication, would I have to individually update all nodes of a
> cluster ? Or can the feature and model files be read as a resource
> (like
> config or schema) so that I can replicate the file or add the file to
> my
> deployments.
>
>
> On Wed, Jul 22, 2020 at 5:53 PM krishan goyal 
> wrote:
>
> > Bump. Any one has an idea how to proceed here ?
> >
> > On Wed, Jul 8, 2020 at 5:41 PM krishan goyal 
> > wrote:
> >
> >> Hi,
> >>
> >> How do I enable replication of the model and feature store ?
> >>
> >> Thanks
> >> Krishan
> >>
> >
>
>
>
>


Re: Replication of Solr Model and feature store

2020-07-28 Thread krishan goyal
Hi Christine,

I am using Solr 7.7

I am able to get it replicated now. I didn't know that the feature and
model store are saved as files in the config structure. By providing
these file names in the /replication handler, I can replicate them.

I guess this is something that could be added to the LTR documentation.
I will try to raise a PR for this.


On Fri, Jul 24, 2020 at 5:28 PM Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> Hi Krishan,
>
> Could you share what version of Solr you are using?
>
> And I wonder if the observed behaviour could be reproduced e.g. with the
> techproducts example, changes not applying after reload [1] sounds like a
> bug if so.
>
> Hope that helps.
>
> Regards,
>
> Christine
>
> [1]
> https://lucene.apache.org/solr/guide/8_6/learning-to-rank.html#applying-changes
>
> From: solr-user@lucene.apache.org At: 07/22/20 14:00:59 To:
> solr-user@lucene.apache.org
> Subject: Re: Replication of Solr Model and feature store
>
> Adding more details here
>
> I need some help on how to enable the solr LTR model and features on all
> nodes of a solr cluster.
>
> I am unable to replicate the model and the feature store though from any
> master to its slaves with the replication API ? And unable to find any
> documentation for the same. Is replication possible?
>
> Without replication, would I have to individually update all nodes of a
> cluster ? Or can the feature and model files be read as a resource (like
> config or schema) so that I can replicate the file or add the file to my
> deployments.
>
>
> On Wed, Jul 22, 2020 at 5:53 PM krishan goyal 
> wrote:
>
> > Bump. Any one has an idea how to proceed here ?
> >
> > On Wed, Jul 8, 2020 at 5:41 PM krishan goyal 
> > wrote:
> >
> >> Hi,
> >>
> >> How do I enable replication of the model and feature store ?
> >>
> >> Thanks
> >> Krishan
> >>
> >
>
>
>


Re: Replication of Solr Model and feature store

2020-07-22 Thread krishan goyal
Adding more details here

I need some help on how to enable the Solr LTR model and features on all
nodes of a Solr cluster.

I am unable to replicate the model and the feature store from any
master to its slaves with the replication API, and I am unable to find any
documentation on this. Is replication possible?

Without replication, would I have to individually update all nodes of a
cluster? Or can the feature and model files be read as a resource (like
config or schema) so that I can replicate the files or add them to my
deployments?


On Wed, Jul 22, 2020 at 5:53 PM krishan goyal  wrote:

> Bump. Any one has an idea how to proceed here ?
>
> On Wed, Jul 8, 2020 at 5:41 PM krishan goyal 
> wrote:
>
>> Hi,
>>
>> How do I enable replication of the model and feature store ?
>>
>> Thanks
>> Krishan
>>
>


Re: Replication of Solr Model and feature store

2020-07-22 Thread krishan goyal
Bump. Does anyone have an idea of how to proceed here?

On Wed, Jul 8, 2020 at 5:41 PM krishan goyal  wrote:

> Hi,
>
> How do I enable replication of the model and feature store ?
>
> Thanks
> Krishan
>


Re: Solr fails to start with G1 GC

2020-07-16 Thread krishan goyal
The issue was figured out by starting Solr with the -f parameter, which
starts Solr in the foreground and prints any errors.

Got an error - "Conflicting collector combinations in option list; please
refer to the release notes for the combinations allowed"

It turns out the bin/solr script enables CMS by default, and I had to disable
that to resolve the conflict.
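
For anyone hitting the same thing, a sketch (my assumption, not something
covered in this thread) of doing this via GC_TUNE in solr.in.sh instead of -a,
since GC_TUNE replaces the default CMS flags rather than conflicting with
them; the heap size is illustrative:

# in solr.in.sh (GC options taken from this thread; heap size illustrative)
SOLR_HEAP=8g
GC_TUNE=" \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=500 \
  -XX:+UnlockExperimentalVMOptions \
  -XX:G1NewSizePercent=5 \
  -XX:G1MaxNewSizePercent=30 \
  -XX:G1HeapRegionSize=32m \
  -XX:InitiatingHeapOccupancyPercent=70 \
"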


On Wed, Jul 15, 2020 at 10:20 PM Walter Underwood 
wrote:

> I don’t see a heap size specified, so it is probably trying to run with
> a 512 Megabyte heap. That might just not work with the 32M region
> size.
>
> Here are the options we have been using for 3+ years on about 150 hosts.
>
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> "
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Jul 15, 2020, at 4:24 AM, krishan goyal 
> wrote:
> >
> > Hi,
> >
> > I am using Solr 7.7
> >
> > I am trying to start my solr server with G1 GC instead of the default CMS
> > but the solr service doesn't get up.
> >
> > The command I use to start solr is
> >
> > bin/solr start -p 25280 -a "-Dsolr.solr.home=
> > -Denable.slave=true -Denable.master=false -XX:+UseG1GC
> > -XX:MaxGCPauseMillis=500 -XX:+UnlockExperimentalVMOptions
> > -XX:G1MaxNewSizePercent=30 -XX:G1NewSizePercent=5
> -XX:G1HeapRegionSize=32M
> > -XX:InitiatingHeapOccupancyPercent=70"
> >
> > I have tried various permutations of the start command by dropping /
> adding
> > other parameters but it doesn't work. However starts up just fine with
> > just "-Dsolr.solr.home= -Denable.slave=true
> > -Denable.master=false" and starts up with the default CMS collector
> >
> > I don't get any useful error logs too. It waits for default 180 secs and
> > then prints
> >
> > Warning: Available entropy is low. As a result, use of the UUIDField,
> SSL,
> > or any other features that require
> > RNG might not work properly. To check for the amount of available
> entropy,
> > use 'cat /proc/sys/kernel/random/entropy_avail'.
> >
> > Waiting up to 180 seconds to see Solr running on port 25280 [|]  Still
> not
> > seeing Solr listening on 25280 after 180 seconds!
> > 2020-07-15 07:07:52.042 INFO  (coreCloseExecutor-60-thread-6) [
> > x:coreName] o.a.s.c.SolrCore [coreName]  CLOSING SolrCore
> > org.apache.solr.core.SolrCore@7cc638d8
> > 2020-07-15 07:07:52.099 INFO  (coreCloseExecutor-60-thread-6) [
> > x:coreName] o.a.s.m.SolrMetricManager Closing metric reporters for
> > registry=solr.core.coreName, tag=7cc638d8
> > 2020-07-15 07:07:52.100 INFO  (coreCloseExecutor-60-thread-6) [
> > x:coreName] o.a.s.m.r.SolrJmxReporter Closing reporter
> > [org.apache.solr.metrics.reporters.SolrJmxReporter@5216981f: rootName =
> > null, domain = solr.core.coreName, service url = null, agent id = null]
> for
> > registry solr.core.coreName /
> com.codahale.metrics.MetricRegistry@32988ddf
> > 2020-07-15 07:07:52.173 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.SolrMetricManager Closing metric reporters for
> registry=solr.node,
> > tag=null
> > 2020-07-15 07:07:52.173 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.r.SolrJmxReporter Closing reporter
> > [org.apache.solr.metrics.reporters.SolrJmxReporter@28952dea: rootName =
> > null, domain = solr.node, service url = null, agent id = null] for
> registry
> > solr.node / com.codahale.metrics.MetricRegistry@655f4a3f
> > 2020-07-15 07:07:52.175 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.jvm,
> > tag=null
> > 2020-07-15 07:07:52.175 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.r.SolrJmxReporter Closing reporter
> > [org.apache.solr.metrics.reporters.SolrJmxReporter@69c6161d: rootName =
> > null, domain = solr.jvm, service url = null, agent id = null] for
> registry
> > solr.jvm / com.codahale.metrics.MetricRegistry@1252ce77
> > 2020-07-15 07:07:52.176 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.SolrMetricManager Closing metric reporters for
> registry=solr.jetty,
> > tag=null
> > 2020-07-15 07:07:52.176 INFO  (ShutdownMonitor) [   ]
> > o.a.s.m.r.SolrJmxReporter Closing reporter
> > [org.apache.solr.metrics.reporters.SolrJmxReporter@3aefae67: rootName =
> > null, domain = solr.jetty, service url = null, agent id = null] for
> > registry solr.jetty / com.codahale.metrics.MetricRegistry@3a538ecd
>
>


Solr fails to start with G1 GC

2020-07-15 Thread krishan goyal
Hi,

I am using Solr 7.7

I am trying to start my Solr server with G1 GC instead of the default CMS,
but the Solr service doesn't come up.

The command I use to start solr is

bin/solr start -p 25280 -a "-Dsolr.solr.home=
-Denable.slave=true -Denable.master=false -XX:+UseG1GC
-XX:MaxGCPauseMillis=500 -XX:+UnlockExperimentalVMOptions
-XX:G1MaxNewSizePercent=30 -XX:G1NewSizePercent=5 -XX:G1HeapRegionSize=32M
-XX:InitiatingHeapOccupancyPercent=70"

I have tried various permutations of the start command by dropping / adding
other parameters, but it doesn't work. However, it starts up just fine with
just "-Dsolr.solr.home= -Denable.slave=true
-Denable.master=false", using the default CMS collector.

I don't get any useful error logs either. It waits for the default 180 secs and
then prints

Warning: Available entropy is low. As a result, use of the UUIDField, SSL,
or any other features that require
RNG might not work properly. To check for the amount of available entropy,
use 'cat /proc/sys/kernel/random/entropy_avail'.

Waiting up to 180 seconds to see Solr running on port 25280 [|]  Still not
seeing Solr listening on 25280 after 180 seconds!
2020-07-15 07:07:52.042 INFO  (coreCloseExecutor-60-thread-6) [
x:coreName] o.a.s.c.SolrCore [coreName]  CLOSING SolrCore
org.apache.solr.core.SolrCore@7cc638d8
2020-07-15 07:07:52.099 INFO  (coreCloseExecutor-60-thread-6) [
x:coreName] o.a.s.m.SolrMetricManager Closing metric reporters for
registry=solr.core.coreName, tag=7cc638d8
2020-07-15 07:07:52.100 INFO  (coreCloseExecutor-60-thread-6) [
x:coreName] o.a.s.m.r.SolrJmxReporter Closing reporter
[org.apache.solr.metrics.reporters.SolrJmxReporter@5216981f: rootName =
null, domain = solr.core.coreName, service url = null, agent id = null] for
registry solr.core.coreName / com.codahale.metrics.MetricRegistry@32988ddf
2020-07-15 07:07:52.173 INFO  (ShutdownMonitor) [   ]
o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.node,
tag=null
2020-07-15 07:07:52.173 INFO  (ShutdownMonitor) [   ]
o.a.s.m.r.SolrJmxReporter Closing reporter
[org.apache.solr.metrics.reporters.SolrJmxReporter@28952dea: rootName =
null, domain = solr.node, service url = null, agent id = null] for registry
solr.node / com.codahale.metrics.MetricRegistry@655f4a3f
2020-07-15 07:07:52.175 INFO  (ShutdownMonitor) [   ]
o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.jvm,
tag=null
2020-07-15 07:07:52.175 INFO  (ShutdownMonitor) [   ]
o.a.s.m.r.SolrJmxReporter Closing reporter
[org.apache.solr.metrics.reporters.SolrJmxReporter@69c6161d: rootName =
null, domain = solr.jvm, service url = null, agent id = null] for registry
solr.jvm / com.codahale.metrics.MetricRegistry@1252ce77
2020-07-15 07:07:52.176 INFO  (ShutdownMonitor) [   ]
o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.jetty,
tag=null
2020-07-15 07:07:52.176 INFO  (ShutdownMonitor) [   ]
o.a.s.m.r.SolrJmxReporter Closing reporter
[org.apache.solr.metrics.reporters.SolrJmxReporter@3aefae67: rootName =
null, domain = solr.jetty, service url = null, agent id = null] for
registry solr.jetty / com.codahale.metrics.MetricRegistry@3a538ecd


Replication of Solr Model and feature store

2020-07-08 Thread krishan goyal
Hi,

How do I enable replication of the model and feature store ?

Thanks
Krishan


LTR feature computation caching

2020-07-07 Thread krishan goyal
Hi,

I am adding a few features to my LTR model which re-use the same value across
different features.

For example, I have features that compare different similarities for each
document with the input text: "token1 token2 token3 token4"

My features are

   - No of common terms
   - No of common terms / Term count in document
   - Term count in document - No of common terms
   - 4 - No of common terms
   - Boolean feature : Is no of common terms == 3

As you can see, "No of common terms" is recomputed for each feature. The
feature cache caches values per feature and isn't helpful here.

Is there any way for "No of common terms" to be computed only once per
document and shared across all features for that document?


Solr boolean query with phrase match

2019-03-24 Thread krishan goyal
Hi,

I want to execute a solr query with boolean clauses using the eDismax Query
Parser.

But the phrase match is executed on the complete query and not on the
individual sub-queries which are created.

Is it possible to have both boolean conditions in the query and phrase matches?

Eg:
Query -
(gear AND cycle) OR (black AND cycle)

The parsed query for this is

"+((+(query:gear)~0.01 +(query:cycle)~0.01) (+(query:black)~0.01
+(query:cycle)~0.01)) (phrase:\"gear cycle black cycle\")~0.01"

As can be seen, the query conditions are as expected, but I want the phrase
match on "gear cycle" or "black cycle".

Using boost/bq will not solve the use case because I also want to define a
phrase slop, so that a phrase match for "black cycle" will also match
documents like "black colour cycle".

Is it possible to either
1. Apply the phrase match to the individual queries produced (see the sketch
below)?
2. Apply the phrase match on a different attribute than 'q'. As a
workaround I can create the individual phrases to be matched and supply
them to this attribute.
3. Or is there any other solution for this use case?
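
For illustration, a sketch of what option 1 could look like with nested
queries (an assumption on my part, not verified; "query" is the field name
from the parsed query above, and ps=2 just illustrates the phrase slop):

q=_query_:"{!edismax qf=query pf=query ps=2 mm=100% v='gear cycle'}" OR
  _query_:"{!edismax qf=query pf=query ps=2 mm=100% v='black cycle'}"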

Thanks
Krishan