Slow ReadProcessor read fields Warnings - Ideas to investigate?

2019-05-22 Thread David Winter
Hello User Group, we run Solr with HDFS and got a lot of the following warning: Slow ReadProcessor read fields took 15093ms (threshold=1ms); ack: seqno: 3 reply: SUCCESS reply: SUCCESS reply: SUCCESS downstreamAckTimeNanos: 798309 flag: 0 flag: 0 flag: 0, targets:

Re: Looking for design ideas

2018-03-18 Thread Rick Leir
e best so I >figured I >share it here and see what ideas others may have. > >I have a DB that hold documents (over 1 million and growing). This is >known as the "Public" DB that holds documents visible to all of my end >users. > >My application let users "ch

Re: Looking for design ideas

2018-03-18 Thread Rahul Singh
w to solve best so I figured I > share it here and see what ideas others may have. > > I have a DB that hold documents (over 1 million and growing). This is > known as the "Public" DB that holds documents visible to all of my end > users. > > My application let us

Looking for design ideas

2018-03-18 Thread Steven White
Hi everyone, I have a design problem that I'm not sure how best to solve, so I figured I'd share it here and see what ideas others may have. I have a DB that holds documents (over 1 million and growing). This is known as the "Public" DB that holds documents visible to all of my

Re: Different ideas for querying unique and non-unique records

2017-08-30 Thread Rick Leir
Susheel, Just a guess, but carrot2.org might be useful. But it might be overkill. Cheers -- Rick On August 30, 2017 7:40:08 AM MDT, Susheel Kumar <susheel2...@gmail.com> wrote: >Hello, > >I am looking for different ideas/suggestions to solve the use case am >working on.

Different ideas for querying unique and non-unique records

2017-08-30 Thread Susheel Kumar
Hello, I am looking for different ideas/suggestions to solve the use case I am working on. We have a couple of fields in the schema along with id, business_email and personal_email. We need to return all records based on unique business and personal emails. The criteria for unique records is either

Re: Ideas

2015-09-21 Thread Paul Libbrecht
going crazy. > > Basically someone is hitting start=15 + and rows=20. The start is crazy > large. > > And then they jump around. start=15 then start=213030 etc. > > Any ideas for how to stop this besides blocking these IPs? > > Sometimes it is Google doing it

Re: Ideas

2015-09-21 Thread DVT
start=15 + and rows=20. The start is crazy > large. > > And then they jump around. start=15 then start=213030 etc. > > Any ideas for how to stop this besides blocking these IPs? > > Sometimes it is Google doing it even though these search results are set > with N

Re: Ideas

2015-09-21 Thread Doug Turnbull
reads are > going crazy. > > Basically someone is hitting start=15 + and rows=20. The start is crazy > large. > > And then they jump around. start=15 then start=213030 etc. > > Any ideas for how to stop this besides blocking these IPs? > > Sometimes it is G

Ideas

2015-09-21 Thread William Bell
We have some Denial of service attacks on our web site. SOLR threads are going crazy. Basically someone is hitting start=15 + and rows=20. The start is crazy large. And then they jump around. start=15 then start=213030 etc. Any ideas for how to stop this besides blocking these IPs
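One way to blunt very deep paging without blocking IPs is to validate the paging parameters in the application layer before the request reaches Solr. The following is a minimal sketch under stated assumptions: the function name clamp_paging and the limits MAX_START and MAX_ROWS are hypothetical, and the right limits depend on the site.

```python
# Hypothetical limits; tune them to what real users actually need.
MAX_START = 10_000
MAX_ROWS = 100


def clamp_paging(start, rows, max_start=MAX_START, max_rows=MAX_ROWS):
    """Validate paging parameters before they are forwarded to Solr.

    Returns a (start, rows) tuple, or None when the request should be
    rejected (non-numeric, negative, or unreasonably deep start offset).
    """
    try:
        start = int(start)
        rows = int(rows)
    except (TypeError, ValueError):
        return None
    if start < 0 or rows < 0 or start > max_start:
        return None
    # Oversized page sizes are trimmed rather than rejected.
    return start, min(rows, max_rows)
```

A front end would call clamp_paging on the raw query-string values and answer with an error page when it returns None, so the expensive request never reaches Solr.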

Re: Ideas

2015-09-21 Thread Walter Underwood
are >> going crazy. >> >> Basically someone is hitting start=15 + and rows=20. The start is crazy >> large. >> >> And then they jump around. start=15 then start=213030 etc. >> >> Any ideas for how to stop this besides blocking these IPs? >

Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Ian Rose
Hi again, all - Since several people were kind enough to jump in to offer advice on this thread, I wanted to follow up in case anyone finds this useful in the future. *tl;dr: *Routing updates to a random Solr node (and then letting it forward the docs to where they need to go) is very expensive,

Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Shawn Heisey
On 11/7/2014 7:17 AM, Ian Rose wrote: *tl;dr: *Routing updates to a random Solr node (and then letting it forward the docs to where they need to go) is very expensive, more than I expected. Using a smart router that uses the cluster config to route documents directly to their shard results in

Re: Ideas for debugging poor SolrCloud scalability

2014-11-07 Thread Erick Erickson
Ian: Thanks much for the writeup! It's always good to have real-world documentation! Best, Erick On Fri, Nov 7, 2014 at 8:26 AM, Shawn Heisey apa...@elyograg.org wrote: On 11/7/2014 7:17 AM, Ian Rose wrote: *tl;dr: *Routing updates to a random Solr node (and then letting it forward the docs

Re: Ideas for debugging poor SolrCloud scalability

2014-11-01 Thread Ian Rose
Erick, Just to make sure I am thinking about this right: batching will certainly make a big difference in performance, but it should be more or less a constant factor no matter how many Solr nodes you are using, right? Right now in my load tests, I'm not actually that concerned about the

Re: Ideas for debugging poor SolrCloud scalability

2014-11-01 Thread Erick Erickson
bq: but it should be more or less a constant factor no matter how many Solr nodes you are using, right? Not really. You've stated that you're not driving Solr very hard in your tests. Therefore you're waiting on I/O. Therefore your tests just aren't going to scale linearly with the number of

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Ian Rose
Hi Erick - Thanks for the detailed response and apologies for my confusing terminology. I should have said WPS (writes per second) instead of QPS but I didn't want to introduce a weird new acronym since QPS is well known. Clearly a bad decision on my part. To clarify: I am doing *only* writes

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Erick Erickson
NP, just making sure. I suspect you'll get lots more bang for the buck, and results much more closely matching your expectations, if 1) you batch up a bunch of docs at once rather than sending them one at a time. That's probably the easiest thing to try. Sending docs one at a time is something of
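The batching advice can be sketched on the client side as a small generator that groups documents before they are sent. This is an illustration only: the name batched and the batch size are assumptions, and how each batch is posted to Solr is left to the caller.

```python
from itertools import islice


def batched(docs, size):
    """Yield successive lists of at most `size` documents from `docs`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Each yielded list would then go out as a single update request instead of one request per document.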

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Regarding batch indexing: When I send batches of 1000 docs to a standalone Solr server, the log file reports (1000 adds) in LogUpdateProcessor. But when I send them to the leader of a replicated index, the leader log file reports much smaller numbers, usually (12 adds). Why do the batches appear

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Erick Erickson
Internally, the docs are batched up into smaller buckets (10 as I remember) and forwarded to the correct shard leader. I suspect that's what you're seeing. Erick On Fri, Oct 31, 2014 at 12:20 PM, Peter Keegan peterlkee...@gmail.com wrote: Regarding batch indexing: When I send batches of 1000

Re: Ideas for debugging poor SolrCloud scalability

2014-10-31 Thread Peter Keegan
Yes, I was inadvertently sending them to a replica. When I sent them to the leader, the leader reported (1000 adds) and the replica reported only 1 add per document. So, it looks like the leader forwards the batched jobs individually to the replicas. On Fri, Oct 31, 2014 at 3:26 PM, Erick

Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
to get some ideas on where I should be looking to debug this. Apologies in advance for the length of this email; I'm trying to be comprehensive and provide all relevant information. Our setup: 1 load generating client - generates tiny, fake documents with unique IDs - performs only writes

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Shawn Heisey
On 10/30/2014 2:23 PM, Ian Rose wrote: My methodology is as follows. 1. Start up K Solr servers. 2. Remove all existing collections. 3. Create N collections, with numShards=K for each. 4. Start load testing. Every minute, print the number of successful updates and the number of failed

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
If you want to increase QPS, you should not be increasing numShards. You need to increase replicationFactor. When your numShards matches the number of servers, every single server will be doing part of the work for every query. I think this is true only for actual queries, right? I am

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Matt Hilt
If you are issuing writes to shard non-leaders, then there is a large overhead for the eventual redirect to the leader. I noticed a 3-5 times performance increase by making my write client leader aware. On Oct 30, 2014, at 2:56 PM, Ian Rose ianr...@fullstory.com wrote: If you want to

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Shawn Heisey
On 10/30/2014 2:56 PM, Ian Rose wrote: I think this is true only for actual queries, right? I am not issuing any queries, only writes (document inserts). In the case of writes, increasing the number of shards should increase my throughput (in ops/sec) more or less linearly, right? No, that

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Erick Erickson
Your indexing client, if written in SolrJ, should use CloudSolrServer which is, in Matt's terms, leader aware. It divides up the documents to be indexed into packets where each doc in the packet belongs on the same shard, and then sends the packet to the shard leader. This avoids a lot of
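For a client that is not written in Java, the packet idea can be sketched as grouping documents by target shard before sending. This is only an illustration of the grouping step, not Solr's routing: the default hash here is zlib.crc32 as a stand-in, whereas a real leader-aware client must use the same hash function and shard hash ranges that the cluster state defines. The names shard_for and group_by_shard are hypothetical.

```python
import zlib
from collections import defaultdict


def _default_hash(doc_id):
    # Stand-in hash for illustration only; NOT the hash Solr uses.
    return zlib.crc32(doc_id.encode("utf-8"))


def shard_for(doc_id, num_shards, hash_fn=None):
    """Map a document id to a shard index in [0, num_shards)."""
    return (hash_fn or _default_hash)(doc_id) % num_shards


def group_by_shard(docs, num_shards, hash_fn=None):
    """Group documents so each packet contains docs for one shard only."""
    groups = defaultdict(list)
    for doc in docs:
        groups[shard_for(doc["id"], num_shards, hash_fn)].append(doc)
    return dict(groups)
```

Each resulting group would be sent to the leader of its shard, which avoids the forwarding hop described in this thread.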

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Ian Rose
Thanks for the suggestions so far, all. 1) We are not using SolrJ on the client (not using Java at all) but I am working on writing a smart router so that we can always send to the correct node. I am certainly curious to see how that changes things. Nonetheless even with the overhead of extra

Re: Ideas for debugging poor SolrCloud scalability

2014-10-30 Thread Erick Erickson
I'm really confused: bq: I am not issuing any queries, only writes (document inserts) bq: It's clear that once the load test client has ~40 simulated users bq: A cluster of 3 shards over 3 Solr nodes *should* support a higher QPS than 2 shards over 2 Solr nodes, right QPS is usually used to

Spelling suggestions--any ideas?

2014-04-17 Thread Ed Smiley
Correctly spelled words are returning as not spelled correctly, with the original, correctly spelled word with a single oddball character appended as multiple suggestions... -- Ed Smiley, Senior Software Architect, eBooks ProQuest | 161 E Evelyn Ave| Mountain View, CA 94041 | USA | +1 650 475

Need ideas to perform historical search

2013-07-18 Thread SolrLover
of creating multiple records for a particular person again and again? -- View this message in context: http://lucene.472066.n3.nabble.com/Need-ideas-to-perform-historical-search-tp4078980.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Need ideas to perform historical search

2013-07-18 Thread Alexandre Rafalovitch
/ idea by which I can reduce the redundancy of creating multiple records for a particular person again and again?

ScorerDocQueue.java's downHeap showing up as frequent hotspot in profiling - ideas why?

2012-10-16 Thread Aaron Daubman
ms total time) ScorerDocQueue.topNextAndAdjustElsePop:120 (0ms self time, 308 ms total time) ScorerDocQueue.checkAdjustElsePop:135 (0ms self time, 111 ms total time) ScorerDocQueue.downHeap:212 (111ms self time, 111 ms total time) ---snip--- Any ideas

Any ideas on Solr 4.0 Release.

2012-07-05 Thread Sohail Aboobaker
Hi, Congratulations on Alpha release. I am wondering is there a ball park on final release for 4.0? Is it expected in August or Sep time frame or is it further away? We badly need some features included in this release. These are around grouped facet counts. We have limited use for Solr in our

RE: Any ideas on Solr 4.0 Release.

2012-07-05 Thread Steven A Rowe
- From: Sohail Aboobaker [mailto:sabooba...@gmail.com] Sent: Thursday, July 05, 2012 5:22 AM To: solr-user@lucene.apache.org Subject: Any ideas on Solr 4.0 Release. Hi, Congratulations on Alpha release. I am wondering is there a ball park on final release for 4.0? Is it expected in August or Sep

Re: Strange spikes in query response times...any ideas where else to look?

2012-06-29 Thread solr
: Thursday, June 28, 2012 9:20 PM Subject: RE: Strange spikes in query response times...any ideas where else to look? Michael, Thank you for responding...and for the excellent questions. 1) We have never seen this response time spike with a user-interactive search. However, in the span

Strange spikes in query response times...any ideas where else to look?

2012-06-28 Thread solr
Greetings all, We are working on building up a large Solr index for over 300 million records...and this is our first look at Solr. We are currently running a set of unique search queries against a single server (so no replication, no indexing going on at the same time, and no distributed

RE: Strange spikes in query response times...any ideas where else to look?

2012-06-28 Thread Michael Ryan
PM To: solr-user@lucene.apache.org Subject: Strange spikes in query response times...any ideas where else to look? Greetings all, We are working on building up a large Solr index for over 300 million records...and this is our first look at Solr. We are currently running a set of unique

RE: Strange spikes in query response times...any ideas where else to look?

2012-06-28 Thread solr
concurrently, or are you just using a single thread in JMeter? -Michael -Original Message- From: s...@isshomefront.com [mailto:s...@isshomefront.com] Sent: Thursday, June 28, 2012 7:56 PM To: solr-user@lucene.apache.org Subject: Strange spikes in query response times...any ideas where

Re: Strange spikes in query response times...any ideas where else to look?

2012-06-28 Thread Otis Gospodnetic
ideas where else to look? Michael, Thank you for responding...and for the excellent questions. 1) We have never seen this response time spike with a user-interactive search. However, in the span of about 40 minutes, which included about 82,000 queries, we only saw a handful of near-equally

RE: ideas for indexing large amount of pdf docs

2011-08-16 Thread Rode González
] Enviado el: lunes, 15 de agosto de 2011 14:54 Para: solr-user@lucene.apache.org Asunto: RE: ideas for indexing large amount of pdf docs Note on i: Solr replication provides pretty good clustering support out-of-the-box, including replication of multiple cores. Read the Wiki on replication

RE: ideas for indexing large amount of pdf docs

2011-08-15 Thread Jaeger, Jay - DOT
s ($average/request)\n; -Original Message- From: Rode Gonzalez (libnova) [mailto:r...@libnova.es] Sent: Saturday, August 13, 2011 3:50 AM To: solr-user@lucene.apache.org Subject: ideas for indexing large amount of pdf docs Hi all, I want to ask about the best way to implement

ideas for indexing large amount of pdf docs

2011-08-13 Thread Rode Gonzalez (libnova)
Hi all, I want to ask about the best way to implement a solution for indexing a large amount of pdf documents between 10-60 MB each one. 100 to 1000 users connected simultaneously. I actually have 1 core of solr 3.3.0 and it works fine for a few number of pdf docs but I'm afraid about the

Re: ideas for indexing large amount of pdf docs

2011-08-13 Thread Erick Erickson
Yeah, parsing PDF files can be pretty resource-intensive, so one solution is to offload it somewhere else. You can use the Tika libraries in SolrJ to parse the PDFs on as many clients as you want, just transmitting the results to Solr for indexing. How are all these docs being submitted? Is this

Re: ideas for indexing large amount of pdf docs

2011-08-13 Thread Rode Gonzalez (libnova)
this time all as possible when we entering in production time. Best, Rode. -Original Message- From: Erick Erickson erickerick...@gmail.com To: solr-user@lucene.apache.org Date: Sat, 13 Aug 2011 12:13:27 -0400 Subject: Re: ideas for indexing large amount of pdf docs Yeah, parsing PDF

Re: ideas for indexing large amount of pdf docs

2011-08-13 Thread Bill Bell
You could send PDF for processing using a queue solution like Amazon SQS. Kick off Amazon instances to process the queue. Once you process with Tika to text just send the update to Solr. Bill Bell Sent from mobile On Aug 13, 2011, at 10:13 AM, Erick Erickson erickerick...@gmail.com wrote:

Re: ideas for indexing large amount of pdf docs

2011-08-13 Thread Erick Erickson
To: solr-user@lucene.apache.org Date: Sat, 13 Aug 2011 12:13:27 -0400 Subject: Re: ideas for indexing large amount of pdf docs Yeah, parsing PDF files can be pretty resource-intensive, so one solution is to offload it somewhere else. You can use the Tika libraries in SolrJ to parse

Re: ideas for indexing large amount of pdf docs

2011-08-13 Thread Rode Gonzalez (libnova)
@lucene.apache.org Date: Sat, 13 Aug 2011 15:34:19 -0400 Subject: Re: ideas for indexing large amount of pdf docs Ahhh, ok, my reply was irrelevant G... Here's a good write-up on this problem: http://www.lucidimagination.com/content/scaling-lucene-and-solr

ideas for versioning query?

2011-08-01 Thread Mike Sokolov
A customer has an interesting problem: some documents will have multiple versions. In search results, only the most recent version of a given document should be shown. The trick is that each user has access to a different set of document versions, and each user should see only the most recent

Re: ideas for versioning query?

2011-08-01 Thread Tomás Fernández Löbbe
Hi Michael, I guess this could be solved using grouping as you said. Documents inside a group can be sorted on a field (in your case, the version field, see parameter group.sort), and you can show only the first one. It will be more complex to show facets (post grouping faceting is work in
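The grouping suggestion can be sketched as a set of request parameters. The parameters group, group.field, group.sort and group.limit are Solr's result-grouping parameters; the helper name latest_version_params and the field names doc_id and version are hypothetical placeholders for whatever the schema actually uses.

```python
from urllib.parse import urlencode


def latest_version_params(query, group_field, version_field):
    """Build parameters that return only the newest doc in each group."""
    return {
        "q": query,
        "group": "true",
        "group.field": group_field,
        # Sort inside each group so the highest version comes first.
        "group.sort": f"{version_field} desc",
        # Keep only the first document per group.
        "group.limit": 1,
    }


# Hypothetical field names, shown only to illustrate the query string.
query_string = urlencode(latest_version_params("title:solr", "doc_id", "version"))
```

Per-user access control would still have to be expressed separately, for example as a filter on the versions a given user may see.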

Re: ideas for versioning query?

2011-08-01 Thread Mike Sokolov
Thanks, Tomas. Yes we are planning to keep a current flag in the most current document. But there are cases where, for a given user, the most current document is not that one, because they only have access to some older documents. I took a look at http://wiki.apache.org/solr/FieldCollapsing

Re: ideas for versioning query?

2011-08-01 Thread Martijn v Groningen
Hi Mike, how many docs and groups do you have in your index? I think the group.sort option fits your requirements. If I remember correctly group.ngroups=true adds something like 30% extra time on top of the search request with grouping, but that was on my local test dataset (~30M docs, ~8000

Re: ideas for versioning query?

2011-08-01 Thread Mike Sokolov
I think a 30% increase is acceptable. Yes, I think we'll try it. Although our case is more like # groups ~ # documents / N, where N is a smallish number (~1-5?). We are planning for a variety of different index sizes, but aiming for a sweet spot around a few M docs. -Mike On 08/01/2011

Solr just 'hangs' under load test - ideas?

2011-06-29 Thread Bob Sandiford
necks were our apps, not Solr. What I'm benchmarking now is a descendent of that prototyping - a bit more complex on searches and more fields in the schema, but same basic search logic as far as SolrJ usage. Any ideas? What else to look at? Ringing any bells? I can send more details if anyone

Re: Solr just 'hangs' under load test - ideas?

2011-06-29 Thread Yonik Seeley
- a bit more complex on searches and more fields in the schema, but same basic search logic as far as SolrJ usage. Any ideas?  What else to look at?  Ringing any bells? I can send more details if anyone wants specifics... Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020

RE: Solr just 'hangs' under load test - ideas?

2011-06-29 Thread Bob Sandiford
12:18 PM To: solr-user@lucene.apache.org Subject: Re: Solr just 'hangs' under load test - ideas? Can you get a thread dump to see what is hanging? -Yonik http://www.lucidimagination.com On Wed, Jun 29, 2011 at 11:45 AM, Bob Sandiford bob.sandif...@sirsidynix.com wrote: Hi, all

Re: Ideas on how to implement sponsored results

2008-06-04 Thread Alexander Ramos Jardim
Cuong, I think you will need some manipulation beyond solr queries. You should separate the results by your site criteria after retrieving them. After that, you could cache the results on your application and randomize the lists every time you render a page. I don't know if solr has

Ideas on how to implement sponsored results

2008-06-03 Thread climbingrose
Hi all, I'm trying to implement sponsored results in Solr search results similar to that of Google. We index products from various sites and would like to allow certain sites to promote their products. My approach is to query a slave instance to get sponsored results for user queries in addition

Re: Ideas on how to implement sponsored results

2008-06-03 Thread Alexander Ramos Jardim
Cuong, I have implemented sponsored words for a client. I don't know if my work can help you, but I will describe it and let you decide. I have an index of product entries in which I created a field called sponsored words. What I do is boost this field, so when these words are matched
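Boosting a dedicated field can be sketched with the dismax qf parameter, where each entry is written as field^boost. The helper build_qf, the field names and the boost value below are assumptions for illustration, not the setup used in this thread.

```python
def build_qf(boosts):
    """Build a dismax `qf` value such as 'name sponsored_words^10'."""
    return " ".join(
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in boosts.items()
    )


# Hypothetical fields: matches in sponsored_words count ten times as much.
params = {
    "defType": "dismax",
    "q": "running shoes",
    "qf": build_qf({"name": 1, "sponsored_words": 10}),
}
```

With such a boost, documents whose sponsored-words field matches the query score higher and tend to rise to the top of the results.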

Re: Ideas on how to implement sponsored results

2008-06-03 Thread climbingrose
Hi Alexander, Thanks for your suggestion. I think my problem is a bit different from yours. We don't have any sponsored words but we have to retrieve sponsored results directly from the index. This is because a site can have 60,000 products which is hard to insert/update keywords. I can live with

JSON tokenizer? tagging ideas

2008-01-25 Thread Ryan McKinley
I've been struggling with how to get various bits of structured data into solr documents. In various projects I have tried various ideas, but none feel great. Take a simple example where I want a document field to be the list of linked data with name, ID, and path. I have tried things like

Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Kevin Holmes
I inherited an existing (working) solr indexing script that runs like this: Python script queries the mysql DB then calls bash script Bash script performs a curl POST submit to solr We're injecting about 1000 records / minute (constantly), frequently pushing the edge of our CPU / RAM

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Clay Webster
Condensing the loader into a single executable sounds right if you have performance problems. ;-) You could also try adding multiple docs in a single post if you notice your problems are with tcp setup time, though if you're doing localhost connections that should be minimal. If you're already

RE: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread David Whalen
To: solr-user@lucene.apache.org Subject: Re: Any clever ideas to inject into solr? Without http? Condensing the loader into a single executable sounds right if you have performance problems. ;-) You could also try adding multiple docs in a single post if you notice your problems are with tcp

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Tobin Cataldo
PROTECTED] 203-849-7240 -Original Message- From: Clay Webster [mailto:[EMAIL PROTECTED] Sent: Thursday, August 09, 2007 11:43 AM To: solr-user@lucene.apache.org Subject: Re: Any clever ideas to inject into solr? Without http? Condensing the loader into a single executable sounds right

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Brian Whitman
On Aug 9, 2007, at 11:12 AM, Kevin Holmes wrote: 2: Is there a way to inject into solr without using POST / curl / http? Check http://wiki.apache.org/solr/EmbeddedSolr There's examples in java and cocoa to use the DirectSolrConnection class, querying and updating solr w/o a web

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Clay Webster
Services, Inc. [EMAIL PROTECTED] 203-849-7240 -Original Message- From: Clay Webster [mailto:[EMAIL PROTECTED] Sent: Thursday, August 09, 2007 11:43 AM To: solr-user@lucene.apache.org Subject: Re: Any clever ideas to inject into solr? Without http? Condensing the loader

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Yonik Seeley
On 8/9/07, David Whalen [EMAIL PROTECTED] wrote: Plus, I have to believe there's a faster way to get documents into solr/lucene than using curl One issue with HTTP is latency. You can get around that by adding multiple documents per request, or by using multiple threads concurrently. You
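The second suggestion, sending from several threads at once, can be sketched with a thread pool. The function send_concurrently and its send argument are hypothetical: send stands for whatever callable posts one batch of documents to Solr and returns a result.

```python
from concurrent.futures import ThreadPoolExecutor


def send_concurrently(batches, send, workers=4):
    """Call `send(batch)` for every batch using a pool of worker threads.

    Results are returned in the same order as `batches`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches))
```

Because each request spends most of its time waiting on the network, a handful of threads can hide much of the per-request latency.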

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Yonik Seeley
On 8/9/07, Siegfried Goeschl [EMAIL PROTECTED] wrote: +) my colleague just finished a database import service running within the servlet container to avoid writing out the data to the file system and transmitting it over HTTP. Most people doing this read data out of the database and construct

RE: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Kevin Holmes
Is this a native feature, or do we need to get creative with scp from one server to the other? If it's a contention between search and indexing, separate them via a query-slave and an index-master. --cw

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Yonik Seeley
On 8/9/07, Kevin Holmes [EMAIL PROTECTED] wrote: Python script queries the mysql DB then calls bash script Bash script performs a curl POST submit to solr For the most up-to-date solr client for python, check out https://issues.apache.org/jira/browse/SOLR-216 -Yonik

RE: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Lance Norskog
Message- From: Kevin Holmes [mailto:[EMAIL PROTECTED] Sent: Thursday, August 09, 2007 8:13 AM To: solr-user@lucene.apache.org Subject: Any clever ideas to inject into solr? Without http? I inherited an existing (working) solr indexing script that runs like this: Python script queries

Re: Any clever ideas to inject into solr? Without http?

2007-08-09 Thread Norberto Meijome
On Thu, 9 Aug 2007 15:23:03 -0700 Lance Norskog [EMAIL PROTECTED] wrote: Underlying this all, you have a sneaky network performance problem. Your successive posts do not reuse a TCP socket. Obvious: re-opening a new socket each post takes time. Not obvious: your server has sockets building up

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-30 Thread Daniel Einspanjer
On 4/11/07, Chris Hostetter [EMAIL PROTECTED] wrote: : Not really. The explain scores aren't normalized and I also couldn't : find a way to get the explain data as anything other than a whitespace : formatted text blob from Solr. Keep in mind that they need confidence the default way Solr

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-09 Thread Sean Timm
Yes, for good (hopefully) or bad. -Sean Shridhar Venkatraman wrote on 5/7/2007, 12:37 AM: Interesting.. Surrogates can also bring the searcher's subjectivity (opinion and context) into it by the learning process ? shridhar Sean Timm wrote: It may not be easy or even possible

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-06 Thread Shridhar Venkatraman
Interesting.. Surrogates can also bring the searcher's subjectivity (opinion and context) into it by the learning process ? shridhar Sean Timm wrote: It may not be easy or even possible without major changes, but having global collection statistics would allow scores to be compared across

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-05 Thread Daniel Einspanjer
On 4/11/07, Chris Hostetter [EMAIL PROTECTED] wrote: A custom Similarity class with simplified tf, idf, and queryNorm functions might also help you get scores from the Explain method that are more easily manageable since you'll have predictable query structures hard coded into your application.

Re: Ideas for a relevance score that could be considered stable across multiple searches with the same query structure?

2007-05-05 Thread Sean Timm
It may not be easy or even possible without major changes, but having global collection statistics would allow scores to be compared across searchers. To do this, the master indexes would need to be able to communicate with each other. Another approach to merging across searchers is

Any Parm Substituion Ideas...

2007-04-10 Thread Jim Dow
I really like the flexibility of naming request handlers to append general constraints / filters. Has anyone spun thoughts around something like a solr.ParmSubstHandler or any way to pass maybe a special ps=0:discussions; ps=1:images; ps=2:false requestHandler name=partitioned

Re: Any Parm Substituion Ideas...

2007-04-10 Thread Chris Hostetter
I'm not certain that i understand exactly what you are describing, but there was some discussion a while back that may be similar... http://issues.apache.org/jira/browse/SOLR-109 ...there's not a lot in the issue itself, but the linked discussion may be fruitful for you. if what you are