This parameter refers to the Solr request, for example:
https://lucene.apache.org/solr/guide/7_0/result-grouping.html#grouping-by-query
Drupal should expose it in the API, I guess?
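For reference, a request using that parameter could look roughly like this (collection, field, and ranges are just illustrative, not from the original thread); each group.query produces one group of matching docs:

```
http://localhost:8983/solr/techproducts/select?q=memory&group=true&group.query=price:[0+TO+99.99]&group.query=price:[100+TO+*]
```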
Cheers,
diego
From: solr-user@lucene.apache.org At: 12/02/19 14:47:06To:
solr-user@lucene.apache.org
Hi Kamal,
You can use a MinMaxNormalizer [1], and get min and max from historical data.
For the original score there is no guarantee that the value will **always** be
between 0 and 1, but it should hold in the majority of cases. If the 0..1
constraint is not super strong I would rather use a
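A minimal sketch of the min-max idea in Python (the function name and the clipping behavior are my own illustration, not Solr's MinMaxNormalizer): scores outside the historical range get clipped, since new queries can produce values never seen before.

```python
def min_max_normalize(score, min_score, max_score):
    """Scale a raw score into [0, 1] using historical min/max.

    Scores outside the historical [min, max] range are clipped.
    """
    if max_score == min_score:
        return 0.0  # degenerate range: no spread to normalize over
    normalized = (score - min_score) / (max_score - min_score)
    return max(0.0, min(1.0, normalized))
```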
If you want a 'global' IDF across different fields, maybe one solution is to
use a copyField to copy all the fields into a common field (e.g., title, authors,
body, footer all copied into a field called text), and then you should be
able to use it with a function query or by implementing your
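The schema side of that copyField idea could look roughly like this (field names and types are just examples, adapt to your schema):

```xml
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="text"/>
<copyField source="authors" dest="text"/>
<copyField source="body" dest="text"/>
<copyField source="footer" dest="text"/>
```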
Hi all,
I just noticed this and I just wanted to share with you:
Full-text search is everywhere nowadays and FOSDEM 2019 will have a dedicated
devroom for search on Sunday the 3rd of February.
We would like to invite submissions of presentations from developers,
researchers, and users of
relevance.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
> > On Thu, Sep 27, 2018 at 1:39 PM Diego Ceccarelli (BLOOMBERG/ LONDON) <
> > dceccarel...@bloomberg.net> wrote:
> >
> > > Yeah, I think Kmeans might be a way to implement the
> threshold.
I would allow defining the strategy and selecting it from the request.
From: solr-user@lucene.apache.org At: 09/27/18 18:25:43To: Diego Ceccarelli
(BLOOMBERG/ LONDON ) , solr-user@lucene.apache.org
Subject: Re: solr and diversification
I've thought about this problem a littl
Hi,
I'm considering writing a component for diversifying the results. I know that
diversification can be achieved by using grouping, but I'm thinking about
something different and query biased.
The idea is to have something that gets applied after the normal retrieval and
selects the top k
Hi Akshay,
Did you run Solr with learning to rank enabled?
./bin/solr -e techproducts -Dsolr.ltr.enabled=true
If you don't pass -Dsolr.ltr.enabled=true, LTR will not be available.
Cheers,
Diego
From: solr-user@lucene.apache.org At: 07/16/18 09:00:39To:
solr-user@lucene.apache.org
Subject: Re:
Hello ilayaraja,
I think it would be good to move this discussion on the Jira item:
https://issues.apache.org/jira/browse/SOLR-8776?attachmentOrder=asc
You can add your comments there; on that page I also explained how it works.
On the performance you are right: at the moment it is slow.
Hello,
I'm not 100% sure, but I think that if you have multiple shards the number of
docs matched in each group is *not* guaranteed to be exact. Increasing the rows
will increase the amount of partial information that each shard sends to the
federator and make the number more precise.
For
I just updated the PR upstream - I still have to fix some things in
distributed mode, but the unit tests in non-distributed mode pass.
Hope this helps,
Diego
From: solr-user@lucene.apache.org At: 04/15/18 03:37:54To:
solr-user@lucene.apache.org
Subject: Re: Learning to Rank (LTR) with
Patch has not been merged yet, it is available here:
https://github.com/apache/lucene-solr/pull/162
You can try to apply the patch on the current master and see if it fixes the issue.
Please let us know if you have any questions.
Cheers,
Diego
From: solr-user@lucene.apache.org At: 04/05/18
I don't think you can define a docTransformer in the SolrConfig at the moment; I
agree it would be a cool feature.
Maybe one possibility could be to use the update request processors [1], and
precompute the fields at index time; it would be more expensive in disk space and
indexing time, but then it
Hi Rick,
I don't think the issue is BM25 vs TFIDF (the old similarity); it seems more
due to the "matching" logic.
You are asking to match:
"(Action AND Technical AND Temporaries AND t/a AND CTR AND Corporation)"
This (in theory) means that you want to retrieve **only** the documents that
A similar problem came out with learning to rank models, and was fixed by
https://issues.apache.org/jira/browse/SOLR-11250
Maybe it can be useful.
From: solr-user@lucene.apache.org At: 02/26/18 13:13:28To:
solr-user@lucene.apache.org
Subject: FileDictionaryFactory:- pick source file from
Hi all,
We would like to perform a benchmark of
https://issues.apache.org/jira/browse/SOLR-11831
The patch improves the performance of grouped queries asking only for one
result per group (i.e., group.limit=1).
I remember seeing a page showing a benchmark of the query performance on
Wikipedia,
ant -Dtests.slow=false
From: solr-user@lucene.apache.org At: 02/02/18 17:07:14To:
solr-user@lucene.apache.org
Subject: skip slow tests?
Hi *,
Some (slow) tests in Solr are annotated with @Slow. Is there a way to run ant
test skipping them?
thanks,
Diego
Hi Luigi, I don't know that part of Lucene well; I would check blog posts and
the code to understand if you can use NumericDocValues (my gut says yes).
Also, I don't know if it is important, but please note that if you index all
the documents at the beginning your scores will be different -
Hi Luigi,
What about using an updatable DocValue [1] for the field x? You could
initially set it to -1,
and then update it for the docs in step j. Range queries should still work
and the update should be fast.
Cheers
[1]
I think it really depends on the particular use case. Sometimes the absolute
score is a good feature, sometimes not.
If you are using the default BM25, I think that increasing the number of terms
in the query will increase the average document score in the results. So maybe I
would normalize the
In theory it should be possible if you are indexing the positions of the tokens
in your field,
but I am not aware of any Solr query that allows you to weight the matches
based on the position; does anyone know if it is possible?
From: solr-user@lucene.apache.org At: 01/29/18 11:25:36To:
Hi Zahid, if you want to allow searching only if the query is shorter than a
certain number of terms / characters, I would probably do it before calling Solr;
otherwise you could write a QueryParserPlugin (see [1]) and check
that the query is sound before processing it.
See also:
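A minimal sketch of the "check before calling Solr" option (the limits and function name are arbitrary examples, not anything Solr-specific):

```python
MAX_TERMS = 20   # example limits, tune for your use case
MAX_CHARS = 500

def is_query_allowed(query: str) -> bool:
    """Reject queries that are too long before they ever reach Solr."""
    if len(query) > MAX_CHARS:
        return False
    if len(query.split()) > MAX_TERMS:
        return False
    return True
```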
And you want to show the users only the Lucene documents that matched the
original query sent to Solr? (What if a Lucene document matches only part of
the query?)
From: solr-user@lucene.apache.org At: 01/23/18 13:55:46To: Diego Ceccarelli
(BLOOMBERG/ LONDON ) , solr-user
Rahul, can you provide more details on how you decide that the smaller Lucene
objects are part of the same Solr document?
From: solr-user@lucene.apache.org At: 01/23/18 09:59:17To:
solr-user@lucene.apache.org
Subject: Re: Using lucene to post-process Solr query results
Hi Rahul,
Looks like
Hi Fiz,
It is not possible at the moment; you will have to log the queries (from Solr,
or before you send them) and use external tools to do that.
There is a jira item on that if you are interested:
https://issues.apache.org/jira/browse/SOLR-10359
Diego
From: solr-user@lucene.apache.org At:
not the case then how to proceed with using fix in
>>> > master-solr-8776 with branch_6_6 can a new patch be created for this?
>>> >
>>> > Thank you,
>>> > Roopa
>>> >
>>> > On Mon, Dec 11, 2017 at 9:54 AM, Roopa Rao <roop..
I'm assuming that you are writing the cosine similarity and you have two
vectors containing the <term, tfidf> pairs. The two vectors could have
different sizes because they only contain the terms that have tfidf != 0.
If you want to compute cosine similarity between the two lists you just have
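A minimal sketch of cosine similarity over two such sparse vectors (representing them as Python term -> tfidf dicts is my own assumption, just for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term -> tfidf maps.

    The maps can have different sizes: terms missing from one map
    simply contribute 0 to the dot product.
    """
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```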
From: solr-user@lucene.apache.org At: 01/05/18 15:35:46To:
solr-user@lucene.apache.org
Subject: Re: Personalized search parameters
In particular we have to retrieve the documents with a normal search
followed by a result reranking phase where we calculate the cosine
similarity between the
Why do you want the personalization to happen in the Similarity?
The Similarity will score all the docs matching your query, so it has to be really
fast. Unless your personalization is very easy (e.g., tf/idf computed in a
different way based on the user) I would not put it there.
Did you consider
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(Exec
)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Unknown Source)
Best regards,
Dariusz Wojtas
On Thu, Dec 28, 2017 at 1:03 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net> wrote:
> Hello Dariusz,
>
> Can you look into the solr l
Hello Dariusz,
Can you look into the solr logs for a stack trace or ERROR logs?
From: solr-user@lucene.apache.org At: 12/27/17 19:01:29To:
solr-user@lucene.apache.org
Subject: SOLR 7.2 and LTR
Hi,
I am using SOLR 7.0 and use the ltr parser.
The configuration I use works nicely under SOLR
Hi Roopa,
If you look at the diff:
https://github.com/apache/lucene-solr/pull/162/files
I didn't change much in SolrIndexSearcher, you can try to skip the file when
applying the patch and redo the changes after.
Alternatively, the feature branch is available here:
Hello isspek,
Unfortunately no; it would be nice to patch RankLib to output the model in
JSON.
JFYI, I have a script to convert the XML into the JSON format:
https://github.com/bloomberg/lucene-solr/blob/ltr-demo-lucene-solr/py-solr-buzzwords/tree_model.py
Cheers,
Diego
From:
Hi all,
Yesterday Yahoo open sourced Vespa (i.e., "The open big data serving engine:
store, search, rank and organize big data at user serving time"). Looking at
the API, they provide search.
I did a quick search on the code for lucene, getting only 5 results.
Does anyone know more about the
https://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F
have a look also at the last post here: https://gist.github.com/nz/673027
I think there's a way to disallow delete by *:* in the solrconfig.xml but I
can't find it (I would take a look in the solrconfig just in
Hi Dariusz,
If you use *:* you'll rerank only the top N random documents; as Emir said,
that will probably not produce interesting results.
If you want to replace the original score, you can take a look at the learning
to rank module [1], that would allow you to reassign a
new score to the top
Hi,
Sorry for the delay, here are my replies:
1. I'm not yet a Spark user (but I'm working on that :))
2. I'm not sure I understand how you would use a feature that is not a float
in a model;
in my experience all the learning to rank methods train and predict from
a list of
floats.
Hi All,
At the moment RankQueries [1] are not supported when you perform grouping:
if you perform a ReRankQuery and ask for the groups, reranking will be ignored
in the scoring.
In SOLR-8776, I added support for ReRankQueries in grouping and I opened a PR
on github [2].
ReRankQueries are
Hi Jeffery,
I submitted a patch to the README of the learning to rank example folder,
trying to explain better how to produce a training set given a log with
interaction data.
Patch is available here: https://issues.apache.org/jira/browse/SOLR-9929
And you can see the new version of the
Hi David,
I implemented bm25f for Europeana on Solr 4.x a couple of years ago,
you can find it here:
https://github.com/europeana/contrib/tree/master/bm25f-ranking
maybe I should contribute it back..
Please do not hesitate to contact me if you need help :)
Cheers,
Diego
From: