Re: Multi index
On Sat, Nov 28, 2009 at 3:12 PM, Jörg Agatz wrote: > Hello users... > > At the moment I am testing multi-core Solr, but I can't search in more than one core directly. > > Is there a way to use multiple indexes, 3-5 indexes in one core, and search directly in > all of them? Or only in one? > > > You can search on all cores if schema.xml is the same. See http://wiki.apache.org/solr/DistributedSearch If schema.xml is different, you can search on one core only. You can denormalize and combine cores if you want to search on all of them. -- Regards, Shalin Shekhar Mangar.
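For reference, a rough illustration of the distributed-search approach Shalin points to: a client can fan one query out over several cores with the shards parameter, provided the cores share a schema. The host, port, and core names below are placeholders, not taken from this thread; a minimal PHP sketch:

<?php
// Sketch only: distributed search across cores that share the same
// schema.xml, using Solr's shards parameter. Host, port, and core
// names are placeholders.
$base   = 'http://localhost:8983/solr/core0/select';
$shards = 'localhost:8983/solr/core0,localhost:8983/solr/core1,localhost:8983/solr/core2';

$params = http_build_query(array(
    'q'      => 'test',
    'shards' => $shards,   // fan the query out to every listed core
    'rows'   => 10,
    'wt'     => 'json',
));

$response = json_decode(file_get_contents($base . '?' . $params), true);
foreach ($response['response']['docs'] as $doc) {
    echo $doc['id'], "\n";
}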
Re: Multi Index
On Sun, Nov 29, 2009 at 11:26 AM, Bhuvi HN wrote: > Hi all, > I need to use a single Solr instance for multiple indexes. Please let me > know if this is possible to do. > See http://wiki.apache.org/solr/MultipleIndexes As a best practice, consider de-normalizing your data, if possible. -- Regards, Shalin Shekhar Mangar.
Multi Index
Hi all, I need to use a single Solr instance for multiple indexes. Please let me know if this is possible to do. Regards, Bhuvi
Multi index
Hello users... At the moment I am testing multi-core Solr, but I can't search in more than one core directly. Is there a way to use multiple indexes, 3-5 indexes in one core, and search directly in all of them? Or only in one? It is really important for my project. Thanks, King
Re: Multi-index Design
Matt Weber wrote: http://wiki.apache.org/solr/MultipleIndexes Thanks, Matt. Your explanation and the pointer to the Wiki have clarified things for me. Michael Ludwig
RE: Multi-index Design
That's how we do it at Orbitz. We use a "type" field to separate content, review, and promotional information in one single index, and then we use the last-components to plug these data together. The only thing that we haven't yet tested is the scalability of this model, since our data is small. Thanks, Kalyan Manepalli -----Original Message----- From: Chris Masters [mailto:roti...@yahoo.com] Sent: Tuesday, May 05, 2009 10:00 AM To: solr-user@lucene.apache.org Subject: Multi-index Design Hi All, I'm [still!] evaluating Solr and setting up a PoC. The requirements are to index the following objects: - people - name, status, date added, address, profile, other people-specific fields like group... - organisations - name, status, date added, address, profile, other organisation-specific fields like size... - products - name, status, date added, profile, other product-specific fields like product groups... AND...I need to isolate indexes to a number of dynamic domains (customerA, customerB...) that will grow over time. So, my initial thoughts are to do the following: - flatten the searchable objects as much as I can - use a type field to distinguish - into a single index - use a multi-core approach to segregate domains of data So, a couple of questions on this: 1) Is this approach/design sensible and do others use it? 2) By flattening the data we will only index common fields; is it unreasonable to do a second database search and union the results when doing advanced searches on non-indexed fields? Do others do this? 3) I've read that I can dynamically add a new core - this fits well with the ability to dynamically add new domains; how scalable is this approach? Would it be unreasonable to have 20-30 dynamically created cores? I guess, redundancy aside and given our one-core-per-domain approach, we could easily spill onto other physical servers without the need for replication? Thanks again for your help! rotis
Re: Multi-index Design
1 - A field called "type", probably a string field, in which you index values such as "people", "organization", or "product". 2 - Yes, for each document you are indexing, you will include its type, i.e. "person". 3, 4, 5 - You would have a core for each domain. Each domain will then have its own index that contains documents of all types. See http://wiki.apache.org/solr/MultipleIndexes . Thanks, Matt Weber On May 5, 2009, at 11:14 AM, Michael Ludwig wrote: Chris Masters wrote: - flatten the searchable objects as much as I can - use a type field to distinguish - into a single index - use a multi-core approach to segregate domains of data Some newbie questions: (1) What is a "type field"? Is it to designate different types of documents, e.g. product descriptions and forum postings? (2) Would I include such a "type field" in the data I send to the update facility and maybe configure Solr to take special action depending on the value of the update field? (3) Like, write the processing results to a domain dedicated to that type of data that I could limit my search to, as per Otis' post? (4) And is that what's called a "core" here? (5) Or, failing (3), and lumping everything together in one search domain (core?), would I use that "type field" to limit my search to a particular type of data? Michael Ludwig
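To make the "type field" idea concrete: each document is indexed with one extra string field (called type here), and a search that should only see one kind of document adds a filter query on that field. Host, core, and field names below are illustrative, not from this thread; a minimal PHP sketch:

<?php
// Sketch: restrict a search to one document type in a single flat index
// by filtering on a "type" string field. Host, core, and field names
// are illustrative.
$base = 'http://localhost:8983/solr/customerA/select';

$params = http_build_query(array(
    'q'  => 'name:smith',   // the user's query against the shared fields
    'fq' => 'type:person',  // only documents that were indexed with type=person
    'wt' => 'json',
));

$result = json_decode(file_get_contents($base . '?' . $params), true);
echo $result['response']['numFound'], " matching people\n";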
Re: Multi-index Design
Chris Masters wrote: - flatten the searchable objects as much as I can - use a type field to distinguish - into a single index - use a multi-core approach to segregate domains of data Some newbie questions: (1) What is a "type field"? Is it to designate different types of documents, e.g. product descriptions and forum postings? (2) Would I include such a "type field" in the data I send to the update facility and maybe configure Solr to take special action depending on the value of the update field? (3) Like, write the processing results to a domain dedicated to that type of data that I could limit my search to, as per Otis' post? (4) And is that what's called a "core" here? (5) Or, failing (3), and lumping everything together in one search domain (core?), would I use that "type field" to limit my search to a particular type of data? Michael Ludwig
Re: Multi-index Design
Chris, 1) I'd put different types of data in different cores/instances, unless you really need to search them all together. By using only common attributes you are kind of killing the richness of the data and your ability to do something useful with it. 2) I'd triple-check the "do a second database search and union the results when doing advanced searches on non-indexed fields" part if you are dealing with a non-trivial query rate. 3) Some people have thousands of Solr cores. Not sure on how many machines, but it's all a function of data size, hardware specs, query complexity and rate. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ----- Original Message ----- > From: Chris Masters > To: solr-user@lucene.apache.org > Sent: Tuesday, May 5, 2009 10:59:40 AM > Subject: Multi-index Design > > > Hi All, > > I'm [still!] evaluating Solr and setting up a PoC. The requirements are to > index > the following objects: > > - people - name, status, date added, address, profile, other people-specific > fields like group... > - organisations - name, status, date added, address, profile, other > organisation-specific fields like size... > - products - name, status, date added, profile, other product-specific > fields > like product groups... > > AND...I need to isolate indexes to a number of dynamic domains (customerA, > customerB...) that will grow over time. > > So, my initial thoughts are to do the following: > > - flatten the searchable objects as much as I can - use a type field to > distinguish - into a single index > - use a multi-core approach to segregate domains of data > > So, a couple of questions on this: > > 1) Is this approach/design sensible and do others use it? > > 2) By flattening the data we will only index common fields; is it > unreasonable > to do a second database search and union the results when doing advanced > searches on non-indexed fields? Do others do this? > > 3) I've read that I can dynamically add a new core - this fits well with the > ability to dynamically add new domains; how scalable is this approach? Would > it > be unreasonable to have 20-30 dynamically created cores? I guess, redundancy > aside and given our one-core-per-domain approach, we could easily spill onto > other physical servers without the need for replication? > > Thanks again for your help! > rotis
Re: Multi-index Design
More precisely, we use a single core, flat schema, with a type field. wunder On 5/5/09 8:48 AM, "Walter Underwood" wrote: > That is how we do it at Netflix. --wunder > > On 5/5/09 7:59 AM, "Chris Masters" wrote: > >> 1) Is this approach/design sensible and do others use it? >
Re: Multi-index Design
That is how we do it at Netflix. --wunder On 5/5/09 7:59 AM, "Chris Masters" wrote: > 1) Is this approach/design sensible and do others use it?
Multi-index Design
Hi All, I'm [still!] evaluating Solr and setting up a PoC. The requirements are to index the following objects: - people - name, status, date added, address, profile, other people-specific fields like group... - organisations - name, status, date added, address, profile, other organisation-specific fields like size... - products - name, status, date added, profile, other product-specific fields like product groups... AND...I need to isolate indexes to a number of dynamic domains (customerA, customerB...) that will grow over time. So, my initial thoughts are to do the following: - flatten the searchable objects as much as I can - use a type field to distinguish - into a single index - use a multi-core approach to segregate domains of data So, a couple of questions on this: 1) Is this approach/design sensible and do others use it? 2) By flattening the data we will only index common fields; is it unreasonable to do a second database search and union the results when doing advanced searches on non-indexed fields? Do others do this? 3) I've read that I can dynamically add a new core - this fits well with the ability to dynamically add new domains; how scalable is this approach? Would it be unreasonable to have 20-30 dynamically created cores? I guess, redundancy aside and given our one-core-per-domain approach, we could easily spill onto other physical servers without the need for replication? Thanks again for your help! rotis
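On question 2, one common way to combine a flattened Solr index with richer criteria is to run both searches and merge on a shared primary key. The sketch below is only an illustration: db_search_ids() is a hypothetical helper that returns matching ids from the database for the advanced, non-indexed criteria, and whether you union or intersect the two id sets depends on the semantics you want.

<?php
// Sketch: fetch matching ids from Solr, fetch matching ids from the
// database for the non-indexed criteria, and combine the two sets.
// db_search_ids() is a hypothetical helper, not a real API.
$params = http_build_query(array(
    'q'    => 'type:organisation AND name:acme',
    'fl'   => 'id',
    'rows' => 1000,
    'wt'   => 'json',
));
$solr     = json_decode(file_get_contents('http://localhost:8983/solr/customerA/select?' . $params), true);
$solr_ids = array();
foreach ($solr['response']['docs'] as $doc) {
    $solr_ids[] = $doc['id'];
}

$db_ids = db_search_ids("size > 500");   // hypothetical: advanced, non-indexed field

$union        = array_unique(array_merge($solr_ids, $db_ids));
$intersection = array_intersect($solr_ids, $db_ids);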
Re: Multi-index searches
Kirk Beers wrote: Kirk Beers wrote: Hi, I am interested in using Solr and I ran the tutorial, but I was wondering if it supports multi-index searching? Kirk Allow me to clear that up! I would like to have the documents of 2 indices returned at once. Does Solr support that? Or will it only return the documents of one index at a time? One index at a time... ryan
Re: Multi-index searches
Kirk Beers wrote: Hi, I am interested in using Solr and I ran the tutorial, but I was wondering if it supports multi-index searching? Kirk Allow me to clear that up! I would like to have the documents of 2 indices returned at once. Does Solr support that? Or will it only return the documents of one index at a time? Kirk
Multi-index searches
Hi, I am interested in using Solr and I ran the tutorial, but I was wondering if it supports multi-index searching? Kirk
Re: Question: Pagination with multi index box
On 14-May-07, at 10:05 PM, James liu wrote: 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: I'm not ignoring it: I'm implying that the above is the correct descending score-sorted order. You have to perform that sort manually. I mean the merged results (from 60 partitions) sorted manually, not Solr's sort. Every result from each box has already been sorted by score. Yep, me too. So it will not be sorted by score correctly. > > And if the user clicks page 2, how do we show data? > > Does p1 start from 10, or do we query the other partitions? Assemble results 1 through 20, then display 11-20 to the user. For example, I want to query "solr". p1 has 100 results whose scores are bigger than 80; p2 has 100 results whose scores are smaller than 20. So if I use rows=10, the scores are not correct. If I want to guarantee 10 pages sorted correctly by score, I have to get 100 results (rows=100) from every box, merge the results, sort them, and finally take the top 100. But it will be very slow. I don't know how other search engines solve it? Maybe they do not sort by score very correctly. Hmm, I feel as though we are going in circles. If you want to cache the top 100 documents for a query, there is essentially no efficient means of accumulating these results in one request--as you note, to be sure of having the top 100 documents, 100 documents from each partition must be requested. Your options are essentially: 1) request a smaller number of documents, and accept some inaccuracies (for instance, if you request 10 docs, then the first page is guaranteed to be correct, but page 10 probably won't be quite right) 2) request a smaller number of documents and attempt to assemble the top 100 docs. If you can't, then request more documents from the partitions that were exhausted soonest. Keep in mind also that the scores across independent Solr partitions are comparable, but not exact, due to idf differences. The relative exactitude of page 10 results might not be too important. -Mike
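A rough sketch of option 2, under stated assumptions: fetch_from_partition() is a hypothetical helper that queries one partition and returns rows of the form array('id' => ..., 'score' => ...) already sorted by score; partition hosts, the query, and the batch sizes are placeholders.

<?php
// Sketch of option 2: ask each partition for a small batch, merge by
// score, and make a second request only to partitions that were
// "exhausted" inside the merged top-N. fetch_from_partition() is a
// hypothetical helper returning score-sorted rows for one partition.
function cmp_score_desc($a, $b) {
    if ($a['score'] == $b['score']) return 0;
    return ($a['score'] > $b['score']) ? -1 : 1;
}

function merge_by_score($perPartition) {
    $all = array();
    foreach ($perPartition as $pid => $docs) {
        foreach ($docs as $d) {
            $d['partition'] = $pid;
            $all[] = $d;
        }
    }
    usort($all, 'cmp_score_desc');
    return $all;
}

$partitions = array('p1' => 'http://p1:8983/solr' /* ... up to p60 ... */);
$wanted = 100;  // documents to guarantee (10 pages of 10)
$batch  = 20;   // first-round request size per partition

$results = array();
foreach ($partitions as $pid => $url) {
    $results[$pid] = fetch_from_partition($url, 'solr', $batch);
}
$top = array_slice(merge_by_score($results), 0, $wanted);

// Count how many of each partition's docs survived into the top slice.
$used = array();
foreach ($top as $d) {
    $used[$d['partition']] = isset($used[$d['partition']]) ? $used[$d['partition']] + 1 : 1;
}

// A partition that contributed its whole batch may still be holding
// better documents, so request a bigger batch from it and re-merge.
foreach ($partitions as $pid => $url) {
    if (isset($used[$pid]) && $used[$pid] == $batch) {
        $results[$pid] = fetch_from_partition($url, 'solr', $wanted);
    }
}
$top = array_slice(merge_by_score($results), 0, $wanted);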
Re: Question: Pagination with multi index box
maybe full-text search sort correct not very import. 2007/5/15, James liu <[EMAIL PROTECTED]>: 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: > > On 14-May-07, at 8:55 PM, James liu wrote: > > > thks for your detail answer. > > > > but u ignore "sorted by score" > > > > p1, p2,p1,p1,p3,p4,p1,p1 > > > > maybe their max score is lower than from p19,p20. > > > > I'm not ignoring it: I'm implying that the above is the correct > descending score-sorted order. You have to perform that sort manually. i mean merged results(from 60 p) and sort it, not solr's sort. every result from box have been sorted by score. > so it will not sorted by score correctly. > > > > and if user click page 2 to see, how to show data? > > > > p1 start from 10 or query other partitions? > > Assemble results 1 through 20, then display 11-20 to the user. for example, i wanna query "solr" p1 have 100 results which score is bigger than 80 p2 have 100 results which score is smaller than 20 so if i use rows=10, score not correct. if i wanna promise 10 pages which sort by score correctly. so i have to get 100(rows=100) results from every box. and merge results, sort it, finallay get top 100 results. but it will very slow. i don't know other search how to solve it? maybe they not sort by score very correctly. -Mike > > > > > 2007/5/15, Mike Klaas <[EMAIL PROTECTED] >: > >> > >> On 14-May-07, at 6:49 PM, James liu wrote: > >> > >> > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: > >> >> > >> >> On 14-May-07, at 1:35 AM, James liu wrote: > >> >> > >> >> When you get up to 60 partitions, you should make it a multi stage > >> >> process. Assuming your partitions are disjoint and evenly > >> >> distributed, estimate the number of documents that will appear > >> in the > >> >> final result from each. > >> > > >> > > >> > yes, partitions distrbuted. > >> > > >> > > >> > Double or triple that (and put a minimum > >> >> threshold), try to assemble the number of documents you > >> require, and > >> >> if one partition "runs out" of docs before it is done, request > >> a new > >> >> round. > >> > > >> > > >> > i dont' know what u mean "runs out" > >> > >> Say you request 5 docs from each of 60 partitions, and are interested > > >> in docs 1-10. If, sorted by score, the docs come from: > >> > >> p1, p2, p1, p1, p3, p4, p1, p1 > >> > >> Then p1 has "run out" at n=8, and there is no way to be sure if the > >> remaining two needed docs come from p1 or somewhere else. So you > >> have to now request at least two additional documents from p1. > >> > >> > one user request will generate 60 partitions request. > >> > > >> > they work in parallel。 > >> > > >> > so i don't know every partion's status before they done. > >> > >> Normally, you would wait for them to finish, and execute a subsequent > > >> request if more docs are needed. > >> > >> -Mike > > > > > > > > > > -- > > regards > > jl > > -- regards jl -- regards jl
Re: Question: Pagination with multi index box
2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 8:55 PM, James liu wrote: > thks for your detail answer. > > but u ignore "sorted by score" > > p1, p2,p1,p1,p3,p4,p1,p1 > > maybe their max score is lower than from p19,p20. > I'm not ignoring it: I'm implying that the above is the correct descending score-sorted order. You have to perform that sort manually. i mean merged results(from 60 p) and sort it, not solr's sort. every result from box have been sorted by score. so it will not sorted by score correctly. > > and if user click page 2 to see, how to show data? > > p1 start from 10 or query other partitions? Assemble results 1 through 20, then display 11-20 to the user. for example, i wanna query "solr" p1 have 100 results which score is bigger than 80 p2 have 100 results which score is smaller than 20 so if i use rows=10, score not correct. if i wanna promise 10 pages which sort by score correctly. so i have to get 100(rows=100) results from every box. and merge results, sort it, finallay get top 100 results. but it will very slow. i don't know other search how to solve it? maybe they not sort by score very correctly. -Mike > > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: >> >> On 14-May-07, at 6:49 PM, James liu wrote: >> >> > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: >> >> >> >> On 14-May-07, at 1:35 AM, James liu wrote: >> >> >> >> When you get up to 60 partitions, you should make it a multi stage >> >> process. Assuming your partitions are disjoint and evenly >> >> distributed, estimate the number of documents that will appear >> in the >> >> final result from each. >> > >> > >> > yes, partitions distrbuted. >> > >> > >> > Double or triple that (and put a minimum >> >> threshold), try to assemble the number of documents you >> require, and >> >> if one partition "runs out" of docs before it is done, request >> a new >> >> round. >> > >> > >> > i dont' know what u mean "runs out" >> >> Say you request 5 docs from each of 60 partitions, and are interested >> in docs 1-10. If, sorted by score, the docs come from: >> >> p1, p2, p1, p1, p3, p4, p1, p1 >> >> Then p1 has "run out" at n=8, and there is no way to be sure if the >> remaining two needed docs come from p1 or somewhere else. So you >> have to now request at least two additional documents from p1. >> >> > one user request will generate 60 partitions request. >> > >> > they work in parallel。 >> > >> > so i don't know every partion's status before they done. >> >> Normally, you would wait for them to finish, and execute a subsequent >> request if more docs are needed. >> >> -Mike > > > > > -- > regards > jl -- regards jl
Re: Question: Pagination with multi index box
On 14-May-07, at 8:55 PM, James liu wrote: thks for your detail answer. but u ignore "sorted by score" p1, p2,p1,p1,p3,p4,p1,p1 maybe their max score is lower than from p19,p20. I'm not ignoring it: I'm implying that the above is the correct descending score-sorted order. You have to perform that sort manually. so it will not sorted by score correctly. and if user click page 2 to see, how to show data? p1 start from 10 or query other partitions? Assemble results 1 through 20, then display 11-20 to the user. -Mike 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 6:49 PM, James liu wrote: > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: >> >> On 14-May-07, at 1:35 AM, James liu wrote: >> >> When you get up to 60 partitions, you should make it a multi stage >> process. Assuming your partitions are disjoint and evenly >> distributed, estimate the number of documents that will appear in the >> final result from each. > > > yes, partitions distrbuted. > > > Double or triple that (and put a minimum >> threshold), try to assemble the number of documents you require, and >> if one partition "runs out" of docs before it is done, request a new >> round. > > > i dont' know what u mean "runs out" Say you request 5 docs from each of 60 partitions, and are interested in docs 1-10. If, sorted by score, the docs come from: p1, p2, p1, p1, p3, p4, p1, p1 Then p1 has "run out" at n=8, and there is no way to be sure if the remaining two needed docs come from p1 or somewhere else. So you have to now request at least two additional documents from p1. > one user request will generate 60 partitions request. > > they work in parallel。 > > so i don't know every partion's status before they done. Normally, you would wait for them to finish, and execute a subsequent request if more docs are needed. -Mike -- regards jl
Re: Question: Pagination with multi index box
for example, i wanna query "lucene", it's numFound is 234300. and results should sorted by score. if u do, how to pagination and sort it's score? 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 7:15 PM, James liu wrote: > if i set rows=(page-1)*10,,,it will lose more result which fits query. > > how to set start when pagination. I'm not sure I understand the question. When combining results from partitions, you can't use startAt. if not use startAt, how to define rows to keep user can find results? You must always assemble the docs from 0 to N for each partition (whether through one request or multiple). if rows bigger it will slow, if smaller it will lose data and sort score not correctly. -Mike > > > 2007/5/15, James liu <[EMAIL PROTECTED]>: >> >> >> >> 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: >> > >> > On 14-May-07, at 1:35 AM, James liu wrote: >> > >> > > if use multi index box, how to pagination with sort by score >> > > correctly? >> > > >> > > for example, i wanna query "search" with 60 index box and sort by >> > > score. >> > > >> > > i don't know the num found from every index box which have >> different >> > > content. >> > > >> > > if promise 10 page with sort score correctly, i think solr 's >> start >> > > is 0, >> > > and rows is 100.(10 result per page) >> > > >> > > 60*100=6000, sort it and get top 100 to cache. >> > >> > > it is very slove although it promise 10 page with sort score >> > > correctly. >> > >> > With few index partitions, you it is sufficient to ask for startAt >> > +numNeeded docs from each partition and sort globally. Normally if >> > you wanted 10 for the first page, you would ask for 10 from each >> > server and cache the remainder. It is better to ask for more later >> > if the user asks for page ten. >> > >> > >> > When you get up to 60 partitions, you should make it a multi stage >> > process. Assuming your partitions are disjoint and evenly >> > distributed, estimate the number of documents that will appear >> in the >> > final result from each. >> >> >> yes, partitions distrbuted. >> >> >> Double or triple that (and put a minimum >> > threshold), try to assemble the number of documents you require, >> and >> > if one partition "runs out" of docs before it is done, request a >> new >> > round. >> >> >> i dont' know what u mean "runs out" >> >> one user request will generate 60 partitions request. >> >> they work in parallel。 >> >> so i don't know every partion's status before they done. >> >> >> To promise 10 page result sorted by score correctly, the only way >> seems to >> get 100 results(rows=100) from each partitioin. but it very slow. >> >> now i wanna find a way to get result sorted by score correctly and >> search >> fast. >> >> >> -Mike >> > >> >> Thks Mike. But it not i want. >> >> >> -- >> regards >> jl > > > > > -- > regards > jl -- regards jl
Re: Question: Pagination with multi index box
Thanks for your detailed answer, but you ignore "sorted by score". Take p1, p2, p1, p1, p3, p4, p1, p1: maybe their max scores are lower than those from p19, p20, so it will not be sorted by score correctly. And if the user clicks page 2, how do we show the data? Does p1 start from 10, or do we query the other partitions? 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 6:49 PM, James liu wrote: > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: >> >> On 14-May-07, at 1:35 AM, James liu wrote: >> >> When you get up to 60 partitions, you should make it a multi stage >> process. Assuming your partitions are disjoint and evenly >> distributed, estimate the number of documents that will appear in the >> final result from each. > > > yes, partitions distrbuted. > > > Double or triple that (and put a minimum >> threshold), try to assemble the number of documents you require, and >> if one partition "runs out" of docs before it is done, request a new >> round. > > > i dont' know what u mean "runs out" Say you request 5 docs from each of 60 partitions, and are interested in docs 1-10. If, sorted by score, the docs come from: p1, p2, p1, p1, p3, p4, p1, p1 Then p1 has "run out" at n=8, and there is no way to be sure if the remaining two needed docs come from p1 or somewhere else. So you have to now request at least two additional documents from p1. > one user request will generate 60 partitions request. > > they work in parallel。 > > so i don't know every partion's status before they done. Normally, you would wait for them to finish, and execute a subsequent request if more docs are needed. -Mike -- regards jl
Re: Question: Pagination with multi index box
On 14-May-07, at 7:15 PM, James liu wrote: if i set rows=(page-1)*10,,,it will lose more result which fits query. how to set start when pagination. I'm not sure I understand the question. When combining results from partitions, you can't use startAt. You must always assemble the docs from 0 to N for each partition (whether through one request or multiple). -Mike 2007/5/15, James liu <[EMAIL PROTECTED]>: 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: > > On 14-May-07, at 1:35 AM, James liu wrote: > > > if use multi index box, how to pagination with sort by score > > correctly? > > > > for example, i wanna query "search" with 60 index box and sort by > > score. > > > > i don't know the num found from every index box which have different > > content. > > > > if promise 10 page with sort score correctly, i think solr 's start > > is 0, > > and rows is 100.(10 result per page) > > > > 60*100=6000, sort it and get top 100 to cache. > > > it is very slove although it promise 10 page with sort score > > correctly. > > With few index partitions, you it is sufficient to ask for startAt > +numNeeded docs from each partition and sort globally. Normally if > you wanted 10 for the first page, you would ask for 10 from each > server and cache the remainder. It is better to ask for more later > if the user asks for page ten. > > > When you get up to 60 partitions, you should make it a multi stage > process. Assuming your partitions are disjoint and evenly > distributed, estimate the number of documents that will appear in the > final result from each. yes, partitions distrbuted. Double or triple that (and put a minimum > threshold), try to assemble the number of documents you require, and > if one partition "runs out" of docs before it is done, request a new > round. i dont' know what u mean "runs out" one user request will generate 60 partitions request. they work in parallel。 so i don't know every partion's status before they done. To promise 10 page result sorted by score correctly, the only way seems to get 100 results(rows=100) from each partitioin. but it very slow. now i wanna find a way to get result sorted by score correctly and search fast. -Mike > Thks Mike. But it not i want. -- regards jl -- regards jl
Re: Question: Pagination with multi index box
On 14-May-07, at 6:49 PM, James liu wrote: 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 1:35 AM, James liu wrote: When you get up to 60 partitions, you should make it a multi stage process. Assuming your partitions are disjoint and evenly distributed, estimate the number of documents that will appear in the final result from each. yes, partitions distrbuted. Double or triple that (and put a minimum threshold), try to assemble the number of documents you require, and if one partition "runs out" of docs before it is done, request a new round. i dont' know what u mean "runs out" Say you request 5 docs from each of 60 partitions, and are interested in docs 1-10. If, sorted by score, the docs come from: p1, p2, p1, p1, p3, p4, p1, p1 Then p1 has "run out" at n=8, and there is no way to be sure if the remaining two needed docs come from p1 or somewhere else. So you have to now request at least two additional documents from p1. one user request will generate 60 partitions request. they work in parallel。 so i don't know every partion's status before they done. Normally, you would wait for them to finish, and execute a subsequent request if more docs are needed. -Mike
Re: Question: Pagination with multi index box
if i set rows=(page-1)*10,,,it will lose more result which fits query. how to set start when pagination. 2007/5/15, James liu <[EMAIL PROTECTED]>: 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: > > On 14-May-07, at 1:35 AM, James liu wrote: > > > if use multi index box, how to pagination with sort by score > > correctly? > > > > for example, i wanna query "search" with 60 index box and sort by > > score. > > > > i don't know the num found from every index box which have different > > content. > > > > if promise 10 page with sort score correctly, i think solr 's start > > is 0, > > and rows is 100.(10 result per page) > > > > 60*100=6000, sort it and get top 100 to cache. > > > it is very slove although it promise 10 page with sort score > > correctly. > > With few index partitions, you it is sufficient to ask for startAt > +numNeeded docs from each partition and sort globally. Normally if > you wanted 10 for the first page, you would ask for 10 from each > server and cache the remainder. It is better to ask for more later > if the user asks for page ten. > > > When you get up to 60 partitions, you should make it a multi stage > process. Assuming your partitions are disjoint and evenly > distributed, estimate the number of documents that will appear in the > final result from each. yes, partitions distrbuted. Double or triple that (and put a minimum > threshold), try to assemble the number of documents you require, and > if one partition "runs out" of docs before it is done, request a new > round. i dont' know what u mean "runs out" one user request will generate 60 partitions request. they work in parallel。 so i don't know every partion's status before they done. To promise 10 page result sorted by score correctly, the only way seems to get 100 results(rows=100) from each partitioin. but it very slow. now i wanna find a way to get result sorted by score correctly and search fast. -Mike > Thks Mike. But it not i want. -- regards jl -- regards jl
Re: Question: Pagination with multi index box
2007/5/15, Mike Klaas <[EMAIL PROTECTED]>: On 14-May-07, at 1:35 AM, James liu wrote: > if use multi index box, how to pagination with sort by score > correctly? > > for example, i wanna query "search" with 60 index box and sort by > score. > > i don't know the num found from every index box which have different > content. > > if promise 10 page with sort score correctly, i think solr 's start > is 0, > and rows is 100.(10 result per page) > > 60*100=6000, sort it and get top 100 to cache. > it is very slove although it promise 10 page with sort score > correctly. With few index partitions, you it is sufficient to ask for startAt +numNeeded docs from each partition and sort globally. Normally if you wanted 10 for the first page, you would ask for 10 from each server and cache the remainder. It is better to ask for more later if the user asks for page ten. When you get up to 60 partitions, you should make it a multi stage process. Assuming your partitions are disjoint and evenly distributed, estimate the number of documents that will appear in the final result from each. yes, partitions distrbuted. Double or triple that (and put a minimum threshold), try to assemble the number of documents you require, and if one partition "runs out" of docs before it is done, request a new round. i dont' know what u mean "runs out" one user request will generate 60 partitions request. they work in parallel。 so i don't know every partion's status before they done. To promise 10 page result sorted by score correctly, the only way seems to get 100 results(rows=100) from each partitioin. but it very slow. now i wanna find a way to get result sorted by score correctly and search fast. -Mike Thks Mike. But it not i want. -- regards jl
Re: Question: Pagination with multi index box
On 14-May-07, at 1:35 AM, James liu wrote: If I use multiple index boxes, how do I paginate with correct sorting by score? For example, I want to query "search" across 60 index boxes and sort by score. I don't know the numFound from every index box, since they have different content. To guarantee 10 pages sorted correctly by score, I think Solr's start has to be 0 and rows has to be 100 (10 results per page). 60*100=6000 documents, sort them and cache the top 100. It is very slow, although it guarantees 10 pages sorted correctly by score. With few index partitions, it is sufficient to ask for startAt+numNeeded docs from each partition and sort globally. Normally if you wanted 10 for the first page, you would ask for 10 from each server and cache the remainder. It is better to ask for more later if the user asks for page ten. When you get up to 60 partitions, you should make it a multi-stage process. Assuming your partitions are disjoint and evenly distributed, estimate the number of documents that will appear in the final result from each. Double or triple that (and put a minimum threshold), try to assemble the number of documents you require, and if one partition "runs out" of docs before it is done, request a new round. -Mike
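As a concrete sketch of the simple few-partition case Mike describes: to serve page p (1-based, perPage results per page), each partition is asked for its first p*perPage documents (start always 0), the responses are merged and re-sorted by score, and the slice for the requested page is taken from the merged list. Host names and the query are placeholders, and the per-partition requests are shown sequentially only for brevity.

<?php
// Sketch: global pagination over score-sorted partitions. Each
// partition must return docs 0..(page*perPage); only after merging can
// the requested page be sliced out. Hosts and query are placeholders.
$partitions = array(
    'http://p1:8983/solr/select',
    'http://p2:8983/solr/select',
    'http://p3:8983/solr/select',
);
$query   = 'search';
$page    = 2;    // 1-based page number requested by the user
$perPage = 10;

$needed = $page * $perPage;   // docs to fetch from EACH partition
$all    = array();
foreach ($partitions as $url) {
    $params = http_build_query(array(
        'q' => $query, 'start' => 0, 'rows' => $needed,
        'fl' => 'id,score', 'wt' => 'json',
    ));
    $resp = json_decode(file_get_contents($url . '?' . $params), true);
    $all  = array_merge($all, $resp['response']['docs']);
}

function cmp_score_desc($a, $b) {
    if ($a['score'] == $b['score']) return 0;
    return ($a['score'] > $b['score']) ? -1 : 1;
}
usort($all, 'cmp_score_desc');

// Documents 11-20 (for page 2); cache the remainder if page 3 is likely.
$pageDocs = array_slice($all, ($page - 1) * $perPage, $perPage);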
Question: Pagination with multi index box
If I use multiple index boxes, how do I paginate with correct sorting by score? For example, I want to query "search" across 60 index boxes and sort by score. I don't know the numFound from every index box, since they have different content. To guarantee 10 pages sorted correctly by score, I think Solr's start has to be 0 and rows has to be 100 (10 results per page). 60*100=6000 documents, sort them and cache the top 100. It is very slow, although it guarantees 10 pages sorted correctly by score. Any idea how to fix it? Fast and correct. -- regards jl
Re: Question to php to do with multi index
i think curl_multi is slow. thks, i will try. 2007/4/27, Michael Kimsal <[EMAIL PROTECTED]>: The curl_multi is probably the most effective way, using straight PHP. Another option would be to spawn several jobs, assuming unix/linux, and wait for them to get done. It doesn't give you very good error handling (well, none at all actually!) but would let you run multiple indexing jobs at once. Visit http://us.php.net/shell_exec and look at the 'class exec' contributed note about halfway down the page. It'll give you an idea of how to easily spawn multiple jobs. If you're using PHP5, the proc_open function may be another way to go. proc_open was available in 4, but there were a number of extra parameters and controls made available in 5. http://us.php.net/manual/en/function.proc-open.php An adventurous soul could combine the two concepts in to one class to manage pipes communication between multiple child processes effectively. On 4/26/07, James liu <[EMAIL PROTECTED]> wrote: > > php not support multi thread,,,and how can u solve with multi index in > parallel? > > now i use curl_multi > > maybe more effect way i don't know,,,so if u know, tell me. thks. > > > -- > regards > jl > -- Michael Kimsal http://webdevradio.com -- regards jl
Re: Question to php to do with multi index
The curl_multi is probably the most effective way, using straight PHP. Another option would be to spawn several jobs, assuming unix/linux, and wait for them to get done. It doesn't give you very good error handling (well, none at all actually!) but would let you run multiple indexing jobs at once. Visit http://us.php.net/shell_exec and look at the 'class exec' contributed note about halfway down the page. It'll give you an idea of how to easily spawn multiple jobs. If you're using PHP5, the proc_open function may be another way to go. proc_open was available in 4, but there were a number of extra parameters and controls made available in 5. http://us.php.net/manual/en/function.proc-open.php An adventurous soul could combine the two concepts in to one class to manage pipes communication between multiple child processes effectively. On 4/26/07, James liu <[EMAIL PROTECTED]> wrote: php not support multi thread,,,and how can u solve with multi index in parallel? now i use curl_multi maybe more effect way i don't know,,,so if u know, tell me. thks. -- regards jl -- Michael Kimsal http://webdevradio.com
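For reference, a minimal curl_multi sketch of the kind Michael and James are discussing: several requests (here, document posts to different Solr instances) are opened at once and the script waits until all of them complete. URLs and the XML payload are placeholders, and error handling is omitted.

<?php
// Sketch: run several Solr update requests in parallel with curl_multi.
// URLs and the XML payload are placeholders; error handling omitted.
$urls = array(
    'http://solr1:8983/solr/update',
    'http://solr2:8983/solr/update',
);
$docXml = '<add><doc><field name="id">1</field></doc></add>';

$mh      = curl_multi_init();
$handles = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $docXml);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: text/xml'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers until every one has finished.
$running = 0;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);   // avoid busy-waiting
} while ($running > 0);

foreach ($handles as $ch) {
    echo curl_multi_getcontent($ch), "\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);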
Question to php to do with multi index
PHP does not support multi-threading, so how can you do multi-index work in parallel? Right now I use curl_multi. Maybe there is a more effective way that I don't know, so if you know one, please tell me. Thanks. -- regards jl
Re: Does solr support Multi index and return by score and datetime
2007/4/5, Otis Gospodnetic <[EMAIL PROTECTED]>: How to cache results? Put them in a cache like memcached, for example, keyed off of query (can't exceed 250 bytes in the case of memcached, so you'll want to pack that query, perhaps use its MD5 as the cache key) Yes,i use memcached and key is md5 query. thk ur advice. I decrease count of documents because of ram is only 1g. I think master use tomcat which use 20 solr instance. and slaveA and SlaveB have 10 solr instance. Web Server use lighttpd+php+memcached. It is my design. but not test. Maybe u can show me ur experience. Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: James liu <[EMAIL PROTECTED] > To: solr-user@lucene.apache.org Sent: Thursday, April 5, 2007 1:57:07 AM Subject: Re: Does solr support Multi index and return by score and datetime 2007/4/5, Otis Gospodnetic <[EMAIL PROTECTED]>: > > James, > > It looks like people already answered your questions. > Split your big index. > Put it on multiple servers. > Put Solr on each of those servers. > Write an application that searches multiple Solr instances in parallel. > Get N results from each, combine them, order by score. How to cache its result? I hesitate that It will cache many data. As far as I know, this is the best you can do with what is available from > Solr today. > For anything else, you'll have to roll up your sleeves and dig into the > code. > > Good luck! > > Otis > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . > Simpy -- http://www.simpy.com/ - Tag - Search - Share > > - Original Message > From: James liu <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, April 5, 2007 1:18:30 AM > Subject: Re: Does solr support Multi index and return by score and > datetime > > Anyone have problem like this and how to solve it? > > > > > 2007/4/5, James liu <[EMAIL PROTECTED]>: > > > > > > > > 2007/4/5, Mike Klaas < [EMAIL PROTECTED]>: > > > > > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > > > > > > > I think it is part of full-text search. > > > > > > > > I think query slavers and combin result by score should be the part > of > > > solr. > > > > > > > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > > > > but i wanna use solr and i like it. > > > > > > > > Now i wanna find a good method to solve it by using solr and less > > > > coding.(More code will cost more time to write and test.) > > > > > > I agree that it would be an excellent addition to Solr, but it is a > > > major undertaking, and so I wouldn't wait around for it if it is > > > important to you. Solr devs have code to write and test too :). > > > > > > > > > If you document > > > > > > > distribution is uniform random, then the norms converge to > > > > > > > approximately equal values anyway. > > > > > > > > > > > > I don't know it. > > > > > > > > I don't know why u say "document distribution". Does it mean if i > > > write code > > > > independently, i will consider it? > > > > > > One of the complexities of queries multiple remote Solr/lucene > > > instances is that the scores are not directly comparable as the term > > > idf scores will be different. However, in practical situations, this > > > can be glossed over. > > > > > > This is the basic algorithm for single-pass querying multiple solr > > > slaves. Say you want results N to N + M (e.g 10 to 20). > > > > > > 1. query each solr instance independently for N+M documents for the > > > given query. 
This should be done asynchronously (or you could spawn a > > > thread per server). > > > 2. wait for all responses (or for a certain timeout) > > > 3. put all returned documents into an array, and reverse sort by score > > > 4. select documents [N, N+M) from this array. > > > > > > This is a relatively simple task. It gets more complicated once > > > multiple passes, idf compensation, deduplication, etc. are added. > > > > > > -Mike > > > > > > > Thks Mike. > > > > I find it more complicate than i think. > > > > Is it the only way to solve my problem: > > > > I have a project, it have 100g data, now i have 3-4 server for solr. > > > > > > > > > > > > > > -- > > regards > > jl > > > > > -- > regards > jl > > > > -- regards jl -- regards jl
Re: Does solr support Multi index and return by score and datetime
How to cache results? Put them in a cache like memcached, for example, keyed off of query (can't exceed 250 bytes in the case of memcached, so you'll want to pack that query, perhaps use its MD5 as the cache key) Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: James liu <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, April 5, 2007 1:57:07 AM Subject: Re: Does solr support Multi index and return by score and datetime 2007/4/5, Otis Gospodnetic <[EMAIL PROTECTED]>: > > James, > > It looks like people already answered your questions. > Split your big index. > Put it on multiple servers. > Put Solr on each of those servers. > Write an application that searches multiple Solr instances in parallel. > Get N results from each, combine them, order by score. How to cache its result? I hesitate that It will cache many data. As far as I know, this is the best you can do with what is available from > Solr today. > For anything else, you'll have to roll up your sleeves and dig into the > code. > > Good luck! > > Otis > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . > Simpy -- http://www.simpy.com/ - Tag - Search - Share > > - Original Message > From: James liu <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, April 5, 2007 1:18:30 AM > Subject: Re: Does solr support Multi index and return by score and > datetime > > Anyone have problem like this and how to solve it? > > > > > 2007/4/5, James liu <[EMAIL PROTECTED]>: > > > > > > > > 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > > > > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > > > > > > > I think it is part of full-text search. > > > > > > > > I think query slavers and combin result by score should be the part > of > > > solr. > > > > > > > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > > > > but i wanna use solr and i like it. > > > > > > > > Now i wanna find a good method to solve it by using solr and less > > > > coding.(More code will cost more time to write and test.) > > > > > > I agree that it would be an excellent addition to Solr, but it is a > > > major undertaking, and so I wouldn't wait around for it if it is > > > important to you. Solr devs have code to write and test too :). > > > > > > > > > If you document > > > > > > > distribution is uniform random, then the norms converge to > > > > > > > approximately equal values anyway. > > > > > > > > > > > > I don't know it. > > > > > > > > I don't know why u say "document distribution". Does it mean if i > > > write code > > > > independently, i will consider it? > > > > > > One of the complexities of queries multiple remote Solr/lucene > > > instances is that the scores are not directly comparable as the term > > > idf scores will be different. However, in practical situations, this > > > can be glossed over. > > > > > > This is the basic algorithm for single-pass querying multiple solr > > > slaves. Say you want results N to N + M (e.g 10 to 20). > > > > > > 1. query each solr instance independently for N+M documents for the > > > given query. This should be done asynchronously (or you could spawn a > > > thread per server). > > > 2. wait for all responses (or for a certain timeout) > > > 3. put all returned documents into an array, and reverse sort by score > > > 4. select documents [N, N+M) from this array. > > > > > > This is a relatively simple task. 
It gets more complicated once > > > multiple passes, idf compensation, deduplication, etc. are added. > > > > > > -Mike > > > > > > > Thks Mike. > > > > I find it more complicate than i think. > > > > Is it the only way to solve my problem: > > > > I have a project, it have 100g data, now i have 3-4 server for solr. > > > > > > > > > > > > > > -- > > regards > > jl > > > > > -- > regards > jl > > > > -- regards jl
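A small sketch of the caching idea Otis describes, using the pecl Memcache client: the cache key is the MD5 of the query string, which keeps it well under memcached's 250-byte key limit. Host, port, and expiry are placeholders, and fetch_and_merge_results() is a hypothetical helper for the multi-instance query-and-merge step.

<?php
// Sketch: cache merged search results in memcached, keyed on the MD5
// of the query string (memcached keys must stay under 250 bytes).
// Host, port, and TTL are placeholders; fetch_and_merge_results() is
// a hypothetical helper that queries the Solr instances and merges
// the responses by score.
$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$query = 'solr multi index';
$key   = 'results_' . md5($query);   // short, fixed-length cache key

$results = $memcache->get($key);
if ($results === false) {
    $results = fetch_and_merge_results($query);   // hypothetical
    $memcache->set($key, $results, 0, 300);       // cache for 5 minutes
}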
Re: Does solr support Multi index and return by score and datetime
2007/4/5, Otis Gospodnetic <[EMAIL PROTECTED]>: James, It looks like people already answered your questions. Split your big index. Put it on multiple servers. Put Solr on each of those servers. Write an application that searches multiple Solr instances in parallel. Get N results from each, combine them, order by score. How to cache its result? I hesitate that It will cache many data. As far as I know, this is the best you can do with what is available from Solr today. For anything else, you'll have to roll up your sleeves and dig into the code. Good luck! Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: James liu <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, April 5, 2007 1:18:30 AM Subject: Re: Does solr support Multi index and return by score and datetime Anyone have problem like this and how to solve it? 2007/4/5, James liu <[EMAIL PROTECTED]>: > > > > 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > > > > > I think it is part of full-text search. > > > > > > I think query slavers and combin result by score should be the part of > > solr. > > > > > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > > > but i wanna use solr and i like it. > > > > > > Now i wanna find a good method to solve it by using solr and less > > > coding.(More code will cost more time to write and test.) > > > > I agree that it would be an excellent addition to Solr, but it is a > > major undertaking, and so I wouldn't wait around for it if it is > > important to you. Solr devs have code to write and test too :). > > > > > > > If you document > > > > > > distribution is uniform random, then the norms converge to > > > > > > approximately equal values anyway. > > > > > > > > > > I don't know it. > > > > > > I don't know why u say "document distribution". Does it mean if i > > write code > > > independently, i will consider it? > > > > One of the complexities of queries multiple remote Solr/lucene > > instances is that the scores are not directly comparable as the term > > idf scores will be different. However, in practical situations, this > > can be glossed over. > > > > This is the basic algorithm for single-pass querying multiple solr > > slaves. Say you want results N to N + M (e.g 10 to 20). > > > > 1. query each solr instance independently for N+M documents for the > > given query. This should be done asynchronously (or you could spawn a > > thread per server). > > 2. wait for all responses (or for a certain timeout) > > 3. put all returned documents into an array, and reverse sort by score > > 4. select documents [N, N+M) from this array. > > > > This is a relatively simple task. It gets more complicated once > > multiple passes, idf compensation, deduplication, etc. are added. > > > > -Mike > > > > Thks Mike. > > I find it more complicate than i think. > > Is it the only way to solve my problem: > > I have a project, it have 100g data, now i have 3-4 server for solr. > > > > > > > -- > regards > jl -- regards jl -- regards jl
Re: Does solr support Multi index and return by score and datetime
Anyone have problem like this and how to solve it? 2007/4/5, James liu <[EMAIL PROTECTED]>: 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > > > I think it is part of full-text search. > > > > I think query slavers and combin result by score should be the part of > solr. > > > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > > but i wanna use solr and i like it. > > > > Now i wanna find a good method to solve it by using solr and less > > coding.(More code will cost more time to write and test.) > > I agree that it would be an excellent addition to Solr, but it is a > major undertaking, and so I wouldn't wait around for it if it is > important to you. Solr devs have code to write and test too :). > > > > > If you document > > > > > distribution is uniform random, then the norms converge to > > > > > approximately equal values anyway. > > > > > > > > I don't know it. > > > > I don't know why u say "document distribution". Does it mean if i > write code > > independently, i will consider it? > > One of the complexities of queries multiple remote Solr/lucene > instances is that the scores are not directly comparable as the term > idf scores will be different. However, in practical situations, this > can be glossed over. > > This is the basic algorithm for single-pass querying multiple solr > slaves. Say you want results N to N + M (e.g 10 to 20). > > 1. query each solr instance independently for N+M documents for the > given query. This should be done asynchronously (or you could spawn a > thread per server). > 2. wait for all responses (or for a certain timeout) > 3. put all returned documents into an array, and reverse sort by score > 4. select documents [N, N+M) from this array. > > This is a relatively simple task. It gets more complicated once > multiple passes, idf compensation, deduplication, etc. are added. > > -Mike > Thks Mike. I find it more complicate than i think. Is it the only way to solve my problem: I have a project, it have 100g data, now i have 3-4 server for solr. -- regards jl -- regards jl
Re: Does solr support Multi index and return by score and datetime
James, It looks like people already answered your questions. Split your big index. Put it on multiple servers. Put Solr on each of those servers. Write an application that searches multiple Solr instances in parallel. Get N results from each, combine them, order by score. As far as I know, this is the best you can do with what is available from Solr today. For anything else, you'll have to roll up your sleeves and dig into the code. Good luck! Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: James liu <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, April 5, 2007 1:18:30 AM Subject: Re: Does solr support Multi index and return by score and datetime Anyone have problem like this and how to solve it? 2007/4/5, James liu <[EMAIL PROTECTED]>: > > > > 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > > > > > I think it is part of full-text search. > > > > > > I think query slavers and combin result by score should be the part of > > solr. > > > > > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > > > but i wanna use solr and i like it. > > > > > > Now i wanna find a good method to solve it by using solr and less > > > coding.(More code will cost more time to write and test.) > > > > I agree that it would be an excellent addition to Solr, but it is a > > major undertaking, and so I wouldn't wait around for it if it is > > important to you. Solr devs have code to write and test too :). > > > > > > > If you document > > > > > > distribution is uniform random, then the norms converge to > > > > > > approximately equal values anyway. > > > > > > > > > > I don't know it. > > > > > > I don't know why u say "document distribution". Does it mean if i > > write code > > > independently, i will consider it? > > > > One of the complexities of queries multiple remote Solr/lucene > > instances is that the scores are not directly comparable as the term > > idf scores will be different. However, in practical situations, this > > can be glossed over. > > > > This is the basic algorithm for single-pass querying multiple solr > > slaves. Say you want results N to N + M (e.g 10 to 20). > > > > 1. query each solr instance independently for N+M documents for the > > given query. This should be done asynchronously (or you could spawn a > > thread per server). > > 2. wait for all responses (or for a certain timeout) > > 3. put all returned documents into an array, and reverse sort by score > > 4. select documents [N, N+M) from this array. > > > > This is a relatively simple task. It gets more complicated once > > multiple passes, idf compensation, deduplication, etc. are added. > > > > -Mike > > > > Thks Mike. > > I find it more complicate than i think. > > Is it the only way to solve my problem: > > I have a project, it have 100g data, now i have 3-4 server for solr. > > > > > > > -- > regards > jl -- regards jl
Re: Does solr support Multi index and return by score and datetime
2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > I think it is part of full-text search. > > I think query slavers and combin result by score should be the part of solr. > > I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations > but i wanna use solr and i like it. > > Now i wanna find a good method to solve it by using solr and less > coding.(More code will cost more time to write and test.) I agree that it would be an excellent addition to Solr, but it is a major undertaking, and so I wouldn't wait around for it if it is important to you. Solr devs have code to write and test too :). > > > If you document > > > > distribution is uniform random, then the norms converge to > > > > approximately equal values anyway. > > > > > > I don't know it. > > I don't know why u say "document distribution". Does it mean if i write code > independently, i will consider it? One of the complexities of queries multiple remote Solr/lucene instances is that the scores are not directly comparable as the term idf scores will be different. However, in practical situations, this can be glossed over. This is the basic algorithm for single-pass querying multiple solr slaves. Say you want results N to N + M (e.g 10 to 20). 1. query each solr instance independently for N+M documents for the given query. This should be done asynchronously (or you could spawn a thread per server). 2. wait for all responses (or for a certain timeout) 3. put all returned documents into an array, and reverse sort by score 4. select documents [N, N+M) from this array. This is a relatively simple task. It gets more complicated once multiple passes, idf compensation, deduplication, etc. are added. -Mike Thks Mike. I find it more complicate than i think. Is it the only way to solve my problem: I have a project, it have 100g data, now i have 3-4 server for solr. -- regards jl
Re: Does solr support Multi index and return by score and datetime
On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > I think it is part of full-text search. I think query slavers and combin result by score should be the part of solr. I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations but i wanna use solr and i like it. Now i wanna find a good method to solve it by using solr and less coding.(More code will cost more time to write and test.) I agree that it would be an excellent addition to Solr, but it is a major undertaking, and so I wouldn't wait around for it if it is important to you. Solr devs have code to write and test too :). > > If you document > > > distribution is uniform random, then the norms converge to > > > approximately equal values anyway. > > > > I don't know it. I don't know why u say "document distribution". Does it mean if i write code independently, i will consider it? One of the complexities of queries multiple remote Solr/lucene instances is that the scores are not directly comparable as the term idf scores will be different. However, in practical situations, this can be glossed over. This is the basic algorithm for single-pass querying multiple solr slaves. Say you want results N to N + M (e.g 10 to 20). 1. query each solr instance independently for N+M documents for the given query. This should be done asynchronously (or you could spawn a thread per server). 2. wait for all responses (or for a certain timeout) 3. put all returned documents into an array, and reverse sort by score 4. select documents [N, N+M) from this array. This is a relatively simple task. It gets more complicated once multiple passes, idf compensation, deduplication, etc. are added. -Mike
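A condensed sketch of the four steps Mike lists, under stated assumptions: the slave URLs and query are placeholders, the parallel fan-out reuses curl_multi (one way to satisfy the asynchronous requirement in step 1), and CURLOPT_TIMEOUT stands in for the per-response timeout of step 2.

<?php
// Sketch of the single-pass merge: ask every slave for the first N+M
// docs, wait for the responses, sort everything by score, then keep
// documents [N, N+M). Slave URLs and the query are placeholders.
$slaves = array('http://slave1:8983/solr/select', 'http://slave2:8983/solr/select');
$query  = 'solr';
$n      = 10;   // want results N..N+M, e.g. 10..20
$m      = 10;

// 1. Query each instance for N+M docs, in parallel, with a timeout (2.).
$mh = curl_multi_init();
$handles = array();
$params  = http_build_query(array(
    'q' => $query, 'start' => 0, 'rows' => $n + $m,
    'fl' => 'id,score', 'wt' => 'json',
));
foreach ($slaves as $url) {
    $ch = curl_init($url . '?' . $params);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);   // crude per-request timeout
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}
$running = 0;
do { curl_multi_exec($mh, $running); curl_multi_select($mh); } while ($running > 0);

// 3. Put all returned documents into one array and reverse-sort by score.
$all = array();
foreach ($handles as $ch) {
    $resp = json_decode(curl_multi_getcontent($ch), true);
    if (isset($resp['response']['docs'])) {
        $all = array_merge($all, $resp['response']['docs']);
    }
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);

function cmp_score_desc($a, $b) {
    if ($a['score'] == $b['score']) return 0;
    return ($a['score'] > $b['score']) ? -1 : 1;
}
usort($all, 'cmp_score_desc');

// 4. Select documents [N, N+M) from the merged, score-sorted array.
$window = array_slice($all, $n, $m);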
Re: Does solr support Multi index and return by score and datetime
2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > > That means now i can' solve it with solr? > > > > Not out-of-the-box, no. But you can certainly query your slaves > > independently can combine based on score. > > I think it is part of full-text search. I think query slavers and combin result by score should be the part of solr. I find it http://dev.lucene-ws.net/wiki/MultiIndexOperations but i wanna use solr and i like it. Now i wanna find a good method to solve it by using solr and less coding.(More code will cost more time to write and test.) > If you document > > distribution is uniform random, then the norms converge to > > approximately equal values anyway. > > I don't know it. I don't know why u say "document distribution". Does it mean if i write code independently, i will consider it? I'm afraid I didn't understand either of these comments. -Mike -- regards jl
Re: Does solr support Multi index and return by score and datetime
On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: > > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > > That means now i can' solve it with solr? > > Not out-of-the-box, no. But you can certainly query your slaves > independently can combine based on score. I think it is part of full-text search. If you document > distribution is uniform random, then the norms converge to > approximately equal values anyway. I don't know it. I'm afraid I didn't understand either of these comments. -Mike
Re: Does solr support Multi index and return by score and datetime
2007/4/5, Mike Klaas <[EMAIL PROTECTED]>: On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > That means I can't solve it with Solr now? Not out-of-the-box, no. But you can certainly query your slaves independently and combine based on score. I think it is part of full-text search. If your document distribution is uniform random, then the norms converge to approximately equal values anyway. I don't know about that. -Mike -- regards jl
Re: Does solr support Multi index and return by score and datetime
On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: That means I can't solve it with Solr now? Not out-of-the-box, no. But you can certainly query your slaves independently and combine based on score. If your document distribution is uniform random, then the norms converge to approximately equal values anyway. -Mike
Re: Does solr support Multi index and return by score and datetime
That means I can't solve it with Solr now? 2007/4/4, Yonik Seeley <[EMAIL PROTECTED]>: On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: > I found it at http://wiki.apache.org/solr/FederatedSearch That was design brainstorming. Nothing there has been implemented, and it's not currently at the top of my personal todo list. -Yonik -- regards jl
Re: Does solr support Multi index and return by score and datetime
On 4/4/07, James liu <[EMAIL PROTECTED]> wrote: I found it at http://wiki.apache.org/solr/FederatedSearch That was design brainstorming. Nothing there has been implemented, and it's not currently at the top of my personal todo list. -Yonik
Re: Does solr support Multi index and return by score and datetime
I found it at http://wiki.apache.org/solr/FederatedSearch. Can we use it, and how? 2007/4/4, James liu <[EMAIL PROTECTED]>: I have a project with 100 GB of data, and now I have 3-4 servers for Solr. I want to use multiple Solr instances to decrease indexing time, but how do I search using Solr if Solr does not support multiple indexes? -- regards jl -- regards jl
Does solr support Multi index and return by score and datetime
I have a project with 100 GB of data, and now I have 3-4 servers for Solr. I want to use multiple Solr instances to decrease indexing time, but how do I search using Solr if Solr does not support multiple indexes? -- regards jl