Re: Shard Keys and Distributed Search

2013-06-02 Thread Niran Fajemisin
no distributed search is performed...at least from my limited observation. Thanks again for your response. Cheers. > > From: Daniel Collins >To: Solr User >Sent: Saturday, June 1, 2013 4:09 AM >Subject: Re: Shard Keys and Distributed Search >

Re: Shard Keys and Distributed Search

2013-06-02 Thread Lance Norskog
Distributed search does the actual search twice: once to get the scores and again to fetch the documents with the top N scores. This algorithm does not play well with "deep searches". On 06/02/2013 07:32 PM, Niran Fajemisin wrote: Thanks Daniel. That's exactly what I thought
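
The two passes described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Solr's implementation: the in-memory SHARDS data and the function names shard_top_ids and distributed_search are hypothetical. The first pass collects only (score, id) pairs from every shard; the second pass fetches stored fields for the ids that made the final page.

```python
import heapq

# Hypothetical in-memory "shards": document id -> (score, stored fields).
SHARDS = [
    {"a1": (0.9, "doc a1"), "a2": (0.4, "doc a2"), "a3": (0.1, "doc a3")},
    {"b1": (0.8, "doc b1"), "b2": (0.7, "doc b2")},
    {"c1": (0.5, "doc c1"), "c2": (0.3, "doc c2")},
]


def shard_top_ids(shard, limit):
    """Pass 1 on one shard: best `limit` (score, id) pairs, no stored fields."""
    scored = ((score, doc_id) for doc_id, (score, _) in shard.items())
    return heapq.nlargest(limit, scored)


def distributed_search(shards, start, rows):
    """Merge per-shard candidates, then fetch documents for the requested page."""
    limit = start + rows  # each shard has to return this many candidates
    candidates = []
    for index, shard in enumerate(shards):
        for score, doc_id in shard_top_ids(shard, limit):
            candidates.append((score, doc_id, index))
    merged = heapq.nlargest(limit, candidates)
    page = merged[start:start + rows]
    # Pass 2: go back to the owning shard only for the documents on the page.
    return [(doc_id, shards[index][doc_id][1]) for _, doc_id, index in page]


print(distributed_search(SHARDS, start=0, rows=3))
```

Because every shard has to return start + rows candidates in the first pass, the work grows with the page offset, which is why this scheme copes badly with deep searches.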

Distributed Search and the Stale Check

2013-02-25 Thread Ryan Zezeski
Hello Solr Users, I just wrote up a piece about some work I did recently to improve the throughput of distributed search. http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html The short of it is that the stale check in Apache's HTTP Client used by SolrJ can add a l

Re: Slow performance on distributed search

2013-03-26 Thread Otis Gospodnetic
ontains no documents. > > I can tell that merging results is the bottleneck here, but I couldn't find > a way to fix it. Please let me know if you guys have any suggestions. Thanks > in advance! > > Best Regards, > Qun > > > > -- > View this message in con

Re: Slow performance on distributed search

2013-03-26 Thread qungg
.nabble.com/Slow-performance-on-distributed-search-tp4051434p4051439.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Slow performance on distributed search

2013-03-26 Thread Michael Ryan
-user@lucene.apache.org Subject: Slow performance on distributed search Hi, I have 40 shards running on 48 core machine with 256GB RAM (The data is about 40 GB). I am using legacy distributed method as setup. So I have one additional shard with no data. Queries would go to this shard and the shard

RE: Slow performance on distributed search

2013-03-26 Thread qungg
troller is getting 100,010*40 rows of data, therefore merging is taking a long time. I have not tried solr cloud, does any one know the performance of query large start row on solr cloud? -- View this message in context: http://lucene.472066.n3.nabble.com/Slow-performance-on-distributed
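
The 100,010*40 figure follows from each shard returning start + rows entries to the controller. A small worked calculation, assuming that every shard has at least that many matches (the function name is made up for illustration):

```python
def rows_merged_by_controller(num_shards, start, rows):
    """Entries the controller must merge when every shard returns start + rows."""
    return num_shards * (start + rows)


# Numbers from the thread: 40 shards, start=100,000, rows=10.
deep_page = rows_merged_by_controller(40, 100_000, 10)
first_page = rows_merged_by_controller(40, 0, 10)
print(deep_page)   # 4000400
print(first_page)  # 400
```

So the deep page makes the controller merge roughly ten thousand times as many entries as the first page, even though the client still receives only 10 rows.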

Re: Slow performance on distributed search

2013-03-26 Thread Joel Bernstein
ng time. > > I have not tried solr cloud, does anyone know the performance of query > large start row on solr cloud? > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Slow-performance-on-distributed-search-tp4051434p4051492.html

Re: Slow performance on distributed search

2013-03-26 Thread Walter Underwood
Why on earth are you starting at row 100,000? What use case is that? --wunder On Mar 26, 2013, at 11:55 AM, qungg wrote: > for start=100,000&rows=10. even though each individual shard takes only < 10ms > to query, the merging process done by the controller would take about a minute. > > By looking

Re: Slow performance on distributed search

2013-03-26 Thread Jack Krupansky
(You mean, other than "deep paging".) -- Jack Krupansky -Original Message- From: Walter Underwood Sent: Tuesday, March 26, 2013 3:47 PM To: solr-user@lucene.apache.org Subject: Re: Slow performance on distributed search Why on earth are you starting at row 100,000? What u

Re: Slow performance on distributed search

2013-03-26 Thread Walter Underwood
, 2013, at 1:14 PM, Jack Krupansky wrote: > (You mean, other than "deep paging".) > > -- Jack Krupansky > > -Original Message- From: Walter Underwood > Sent: Tuesday, March 26, 2013 3:47 PM > To: solr-user@lucene.apache.org > Subject: Re: Slow performance

Re: Slow performance on distributed search

2013-03-26 Thread Michael Della Bitta
From: Walter Underwood >> Sent: Tuesday, March 26, 2013 3:47 PM >> To: solr-user@lucene.apache.org >> Subject: Re: Slow performance on distributed search >> >> Why on earth are you starting at row 100,000? What use case is that? --wunder >> >> On

RE: Slow performance on distributed search

2013-03-26 Thread Michael Ryan
olr-user@lucene.apache.org Subject: RE: Slow performance on distributed search for start=100,000&rows=10. even though each individual shard takes only < 10ms to query, the merging process done by the controller would take about a minute. By looking at logs, each shard is giving the controller shard

Re: Slow performance on distributed search

2013-03-26 Thread Erick Erickson
ssage- > From: qungg [mailto:qzheng1...@gmail.com] > Sent: Tuesday, March 26, 2013 2:55 PM > To: solr-user@lucene.apache.org > Subject: RE: Slow performance on distributed search > > for start=100,000&rows=10. even though each individual shard takes only < 10ms > to qu

Re: Slow qTime for distributed search

2013-04-08 Thread Manuel Le Normand
After taking a look at what I wrote earlier, I will try to rephrase it more clearly. It seems that sharding my collection into many shards slowed it down unreasonably, and I'm trying to investigate why. First, I created "collection1" - 4 shards*replicationFactor=1 collection on 2 servers. Second I

Re: Slow qTime for distributed search

2013-04-08 Thread Shawn Heisey
On 4/8/2013 12:19 PM, Manuel Le Normand wrote: It seems that sharding my collection into many shards slowed it down unreasonably, and I'm trying to investigate why. First, I created "collection1" - 4 shards*replicationFactor=1 collection on 2 servers. Second I created "collection2" - 48 shards*replic

Re: Slow qTime for distributed search

2013-04-09 Thread Manuel Le Normand
Thanks for replying. My config: - 40 dedicated servers, dual-core each - Running Tomcat servlet on Linux - 12 GB RAM per server, split evenly between OS and Solr - Complex queries (up to 30 conditions on different fields), 1 qps rate Sharding my index was done for two reasons, based

Re: Slow qTime for distributed search

2013-04-09 Thread Shawn Heisey
On 4/9/2013 2:10 PM, Manuel Le Normand wrote: Thanks for replying. My config: - 40 dedicated servers, dual-core each - Running Tomcat servlet on Linux - 12 GB RAM per server, split evenly between OS and Solr - Complex queries (up to 30 conditions on different fields), 1 qps rate

Re: Slow qTime for distributed search

2013-04-09 Thread Furkan KAMACI
Hi Shawn; You say that: *... your documents are about 50KB each. That would translate to an index that's at least 25GB* I know we cannot give an exact size, but what is the approximate ratio of document size / index size in your experience? 2013/4/9 Shawn Heisey > On 4/9/2013 2:

Re: Slow qTime for distributed search

2013-04-09 Thread Shawn Heisey
On 4/9/2013 3:50 PM, Furkan KAMACI wrote: Hi Shawn; You say that: *... your documents are about 50KB each. That would translate to an index that's at least 25GB* I know we cannot give an exact size, but what is the approximate ratio of document size / index size in your experience

Re: Slow qTime for distributed search

2013-04-11 Thread Manuel Le Normand
Hi, We have different working hours, sorry for the reply delay. Your assumed numbers are right, about 25-30Kb per doc. giving a total of 15G per shard, there are two shards per server (+2 slaves that should do no work normally). An average query has about 30 conditions (OR AND mixed), most of them

Re: Slow qTime for distributed search

2013-04-12 Thread Furkan KAMACI
Manuel Le Normand, I am sorry but I want to learn something. You said you have 40 dedicated servers. What is your total document count, total document size, and total shard size? 2013/4/11 Manuel Le Normand > Hi, > We have different working hours, sorry for the reply delay. Your assumed > number

Re: Solr 4.6.0: DocValues (distributed search)

2014-01-08 Thread Shawn Heisey
On 1/8/2014 11:24 AM, ku3ia wrote: Hi! https://issues.apache.org/jira/browse/SOLR-3855 Description It would be nice if Solr supported DocValues: for ID fields (fewer disk seeks when running distributed search), Is docValues support complete for distributed search? For ID fields? P.S. I'm

Re: Solr 4.6.0: DocValues (distributed search)

2014-01-09 Thread ku3ia
Today I set up a simple SolrCloud with two shards. It seems to behave the same. When I'm debugging a distributed search I can't hit a breakpoint in the Lucene codec file, but when I'm using faceted search everything looks fine - the debugger stops. Can anyone help me with my question? Thanks.

Re: Solr 4.6.0: DocValues (distributed search)

2014-01-10 Thread Manuel Le Normand
In short, when running a distributed search every shard runs the query separately. Each shard's collector returns the topN (rows param) internal docId's of the matching documents. These topN docId's are converted to their uniqueKey in the BinaryResponseWriter and sent to the fro

Re: Solr 4.6.0: DocValues (distributed search)

2014-01-10 Thread ku3ia
Manuel Le Normand wrote > In short, when running a distributed search every shard runs the query > separately. Each shard's collector returns the topN (rows param) internal > docId's of the matching documents. > > These topN docId's are converted to their uniqueKey

Re: Facet pivot and distributed search

2014-02-06 Thread Shalin Shekhar Mangar
Yes this is an open issue. https://issues.apache.org/jira/browse/SOLR-2894 On Fri, Feb 7, 2014 at 1:13 PM, Geert Van Huychem wrote: > Hi > > I'm using Solr 4.5 in a multi-core environment. > > I've setup > - one core per documenttype: text, rss, tweet and external documents. > - one distrib core

Re: Facet pivot and distributed search

2014-02-07 Thread Geert Van Huychem
Thx! Geert Van Huychem IT Consultant iFrameWorx BVBA Mobile: +32 497 27 69 03 E-mail: ge...@iframeworx.be Site: http://www.iframeworx.be LinkedIn: http://www.linkedin.com/in/geertvanhuychem On Fri, Feb 7, 2014 at 8:55 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > Yes this is a o

Re: Facet pivot and distributed search

2014-02-07 Thread Trey Grainger
FYI, the last distributed pivot facet patch functionally works, but there are some sub-optimal data structures being used and some unnecessary duplicate processing of values. As a result, we found that for certain worst-case scenarios (i.e. data is not randomly distributed across Solr cores and req

Re: solr distributed search don't work

2011-08-19 Thread olivier sallou
Hi, I do not use spell but I use distributed search, using qt=spell is correct, should not use qt=\spell. For "shards", I specify it in solrconfig directly, not in url, but should work the same. Maybe an issue in your spell request handler. 2011/8/19 Li Li > hi all, > I

Re: solr distributed search don't work

2011-08-19 Thread Li Li
could you please show me your configuration in solrconfig.xml? On Fri, Aug 19, 2011 at 5:31 PM, olivier sallou wrote: > Hi, > I do not use spell but I use distributed search, using qt=spell is correct, > should not use qt=\spell. > For "shards", I specify it in solrconfi

Re: solr distributed search don't work

2011-09-01 Thread olivier sallou
I do not use spell but I use distributed search, using qt=spell is > correct, > > should not use qt=\spell. > > For "shards", I specify it in solrconfig directly, not in url, but should > > work the same. > > Maybe an issue in your spell request handler. > >

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Dmitry Kan
1.89:8080/solr30 > - At another server there is an additional "common" application with > shards parameter: > > > explicit > 192.168.1.85:8080/solr1,192.168.1.85:8080/solr2,..., > 192.168.1.89:8080/solr30 > 10 > > > - schema and solrconfig are identic

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Artem Lokotosh
92.168.1.86:8080/solr7,http://192.168.1.86:8080/solr8,..., >> http://192.168.1.86:8080/solr12 >> ... >> 5) http://192.168.1.89:8080/solr25,http://192.168.1.89:8080/solr26,..., >> http://192.168.1.89:8080/solr30 >> - At another server there is an additional "common

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Dmitry Kan
> >> -XX:MaxHeapFreeRatio=25 > >> -verbose:gc > >> -XX:+PrintGCTimeStamps > >> -Xloggc:/opt/search/tomcat/logs/gc.log > >> > >> Our search schema is: > >> - 5 servers with configuration above; > >> - one tomcat6 application on each

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Artem Lokotosh
> If the response time from each shard shows decent figures, then aggregator > seems to be a bottleneck. Do you btw have a lot of concurrent users? For now it is not a problem, but we expect from 1K to 10K concurrent users and maybe more. On Wed, Nov 23, 2011 at 4:43 PM, Dmitry Kan wrote: >

Re: Huge Performance: Solr distributed search

2011-11-23 Thread Robert Stewart
If you request 1000 docs from each shard, then the aggregator is really fetching 30,000 total documents, which it must then merge (re-sort results, and take the top 1000 to return to the client). It's possible that the Solr merging implementation needs to be optimized, but it does not seem like it could be that slow. H
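
The merge step described above (30 shards, 1000 rows each, top 1000 returned) can be approximated with a k-way merge of already-sorted lists. This is a rough sketch for reasoning about the cost, not the Solr aggregator code; merge_shard_results and the (score, id) hit layout are assumptions made for the example.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, rows):
    """K-way merge of per-shard hit lists, each sorted by descending score.

    Every hit is a (score, doc_id) tuple; only the first `rows` merged
    hits are materialised.
    """
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, rows))


# Synthetic data: 30 shards, 1000 hits per shard, scores descending per shard.
shards = [
    [(1000 - i + s / 100, "s%d-d%d" % (s, i)) for i in range(1000)]
    for s in range(30)
]
top = merge_shard_results(shards, 1000)
print(len(top), sum(len(hits) for hits in shards))
```

A merge like this over 30,000 small tuples is cheap in memory, which is consistent with the doubt expressed above that the merge alone explains the slowness; transferring and parsing 30,000 rows is a separate cost that this sketch does not model.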

Re: Huge Performance: Solr distributed search

2011-11-24 Thread Artem Lokotosh
>> Can you merge, e.g. 3 shards together or is it much effort for your >> team? > Yes, we can merge. We'll try to do this and review how it works. Merging does not help :( I've tried to merge two shards into one and three shards into one, but the results are similar to the results of the first configuration with 30 shar

Re: Huge Performance: Solr distributed search

2011-11-24 Thread Artem Lokotosh
>How big are the documents you return (how many fields, avg KB per doc, etc.)? I have the following schema in my solr configuration 27M–30M docs and 12-15 GB for each shard, 0.5KB per doc >Does performance get much better if you only request top 100, or top 10 documents instead of top 1000?

Re: Huge Performance: Solr distributed search

2011-11-24 Thread Mark Miller
On Thu, Nov 24, 2011 at 12:09 PM, Artem Lokotosh wrote: > >How big are the documents you return (how many fields, avg KB per doc, > etc.)? > I have a following schema in my solr configuration <field name="field1" type="text" indexed="true" stored="false"/> <field name="field2" type="text" indexed="true" stored

Re: Huge Performance: Solr distributed search

2011-11-25 Thread Dmitry Kan
45 000 000 per shard approx, Tomcat, caching was tweaked in solrconfig and shard given 12GB of RAM max. <filterCache class="solr.FastLRUCache" size="1200" initialSize="1200" autowarmCount="128"/> true 50 200 In your case I would first check if the network throu

Re: Huge Performance: Solr distributed search

2011-11-25 Thread Artem Lokotosh
On 11/25/2011 3:13 AM, Mark Miller wrote: When you search each shard, are you positive that you are using all of the same parameters? You are sure you are hitting request handlers that are configured exactly the same and sending exactly the same queries? In my experience, the overhead for dist

Re: Huge Performance: Solr distributed search

2011-11-25 Thread Mikhail Garber
="shards">192.168.1.85:8080/solr1,192.168.1.85:8080/solr2,...,192.168.1.89:8080/solr30 > 10 > > > - schema and solrconfig are identical for all shards, for first shard > see attach; > - on these servers are only search, indexation is on another > (optimized to 2 se

Re: Huge Performance: Solr distributed search

2011-11-28 Thread Artem Lokotosh
Hi all again. Thanks to all for your replies. This weekend I ran some interesting tests, and I would like to share them with you. First of all I ran a speed test of my HDD: root@LSolr:~# hdparm -t /dev/sda9 /dev/sda9: Timing buffered disk reads: 146 MB in 3.01 seconds = 48.54 MB/se

Re: Huge Performance: Solr distributed search

2011-11-28 Thread Artem Lokotosh
The problem has been resolved. My disk subsystem was the bottleneck for quick search. I put my indexes into RAM and I see very nice QTimes :) Sorry for taking your time, guys. On Mon, Nov 28, 2011 at 4:02 PM, Artem Lokotosh wrote: > Hi all again. Thanks to all for your replies. > > On this weekend I'd made som

Re: Huge Performance: Solr distributed search

2011-12-02 Thread Tom Gullo
Interesting info. You should look into using Solid State Drives. I moved my search engine to SSD and saw dramatic improvements. -- View this message in context: http://lucene.472066.n3.nabble.com/Huge-Performance-Solr-distributed-search-tp3530627p346.html Sent from the Solr - User

Re: Poor performance on distributed search

2011-12-16 Thread Erick Erickson
u, Dec 15, 2011 at 5:00 PM, ku3ia wrote: > Hi, all! > > I have a problem with distributed search. I downloaded one shard from my > production. It has: > * ~29M docs > * 11 fields > * ~105M terms > * size of shard is: 13GB > On production there are nearly 30 identical shards. I split

Re: Poor performance on distributed search

2011-12-16 Thread ku3ia
the second - 100%. Drive read speed starts at 3-5 MB/s and falls to 500-700 KB/s in both tests. Do you have any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3592364.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-16 Thread Erick Erickson
e: 301 secs > --- solr --- > Queries processed: 15 > Queries cancelled: 35 > Average QTime is: 52775.7 ms > Average RTime is: 53.2667 sec(s) > Size of data-dir is: 212978 bytes > > In first test disk usage by nmon: ~30-40% and in the second - 100%. Drive > read speed start

Re: Poor performance on distributed search

2011-12-16 Thread ku3ia
27.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4 This request handler is defined at shard1's solrconfig. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3592734.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-16 Thread Erick Erickson
> >     >       explicit >       10 >       name="shards">127.0.0.1:8080/solr/shard1,127.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4 >     >     > > This request handler is defined at shard1's solrconfig. > > --

Re: Poor performance on distributed search

2011-12-17 Thread ku3ia
my shards onto it. The QTime/RTime was magnificent. But the problem is, I don't have much RAM for it :( Could it be my huge index, the terms, too many queries per minute, the row count, or something wrong in my configs? And can you please check the drive speed on your indexes?

Re: Poor performance on distributed search

2011-12-18 Thread Erick Erickson
nto it. The QTime/RTime was magnificent. But > the problem is, I don't have much RAM for it :( > > Could it be my huge index, the terms, too many queries per minute, the > row count, or something wrong in my configs? And can you please check the > drive speed on your indexes? Is it the same? > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3594683.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-19 Thread ku3ia
ined at query filed or score and pull result to the user? -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3597893.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-19 Thread Erick Erickson
g >>>extra work. > Yeah, on my production I have 5 servers and 6 shards (big shards) on each. > But I tried to use only one shard for each server (five shards in total) but > the results weren't good. > >>>Although there's one other possibility: By returning 2,000 ro

Re: Poor performance on distributed search

2011-12-19 Thread ku3ia
(2000/4=500) and sends to each shard queries with rows=500, but not rows=2000, so finally, summary after merging and sorting I'll have 2000 rows (maybe less), but not 8000... That was my question. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3599636.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-19 Thread Darren Govoni
have 2000 rows (maybe less), but not 8000... That was my question. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3599636.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Poor performance on distributed search

2011-12-19 Thread Michael Ryan
I had a similar requirement in my project, where a user might ask for up to 3000 results. What I did was change SolrIndexSearcher.doc(int, Set) to retrieve the unique key from the field cache instead of retrieving it as a stored field from disk. This resulted in a massive speed improvement for t

RE: Poor performance on distributed search

2011-12-19 Thread ku3ia
f I recall correctly). > > -Michael > Thanks Michael, I'll try it. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3599752.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-20 Thread Tomas Zerolo
On Mon, Dec 19, 2011 at 01:32:22PM -0800, ku3ia wrote: > >>Uhm, either I misunderstand your question or you're doing > >>a lot of extra work for nothing > > >>The whole point of sharding it exactly to collect the top N docs > >>from each shard and merge them into a single result [...] > >>

Re: Poor performance on distributed search

2011-12-20 Thread ku3ia
t's true. But from the other side, the second month I'm trying to resolve this problem on my huge data, so I'm attempting to raise any ideas, maybe, even these ideas are incorrect. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-dis

Re: Poor performance on distributed search

2011-12-20 Thread Chris Hostetter
: For example I have 4 shards. Finally, I need 2000 docs. Now, when I'm using : &shards=127.0.0.1:8080/solr/shard1,127.0.0.1:8080/solr/shard2,127.0.0.1:8080/solr/shard3,127.0.0.1:8080/solr/shard4 : Solr gets 2000 docs from each shard (shard1,2,3,4, summary we have 8000 : docs) merge and sort it,
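
The question quoted above is whether each of the 4 shards could be asked for 2000/4 = 500 rows instead of 2000. A scaled-down sketch (4 results wanted from 4 shards, with made-up scores and a hypothetical top_n helper) shows why that is not safe in general: when the best matches are concentrated on one shard, asking each shard for only N/shards rows drops documents that belong in the global top N.

```python
import heapq


def top_n(shards, per_shard, n):
    """Take `per_shard` best scores from every shard, then the global best `n`."""
    candidates = []
    for scores in shards:
        candidates.extend(heapq.nlargest(per_shard, scores))
    return heapq.nlargest(n, candidates)


# Made-up scores: the four best documents all live on the first shard.
skewed_shards = [[10, 9, 8, 7], [3, 2], [1], [0]]

full = top_n(skewed_shards, per_shard=4, n=4)   # ask every shard for N rows
split = top_n(skewed_shards, per_shard=1, n=4)  # ask every shard for N/4 rows
print(full)   # [10, 9, 8, 7]
print(split)  # [10, 3, 1, 0]
```

The split request is only equivalent when matches happen to be spread evenly across shards by score, which the coordinator cannot know in advance; that is why each shard is asked for the full row count.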

Re: Poor performance on distributed search

2011-12-20 Thread Chris Hostetter
: So why do you have this 2,000 requirement in the first : place? This really sounds like an XY problem. I would really suggest re-visiting this question. No single user is going to look at 2000 docs on a single page, and in your previous email you said there was a requirement to ask solr for 2

RE: Poor performance on distributed search

2011-12-20 Thread Chris Hostetter
: improvement for these requests (like 10x if I recall correctly). There was another discussion about this idea for phase #1 of distributed search a while back... http://osdir.com/ml/solr-user.lucene.apache.org/2011-07/msg00812.html ...but no one worked up a generalized patch - part of the issue is

Re: Solr Distributed Search vs Hadoop

2011-12-20 Thread Ted Dunning
to have a very very huge set > of data. > In a way that for sure we will need many servers (tens or hundreds of > servers). > We will also need failover. > Now the question is, if we should use Hadoop or using Solr Distributed > Search > with shards would be enough

Re: Solr Distributed Search vs Hadoop

2011-12-20 Thread Alireza Salimi
also need failover. > > Now the question is, if we should use Hadoop or using Solr Distributed > > Search > > with shards would be enough? > > > > I've read lots of articles like: > > http://www.lucidimagination.com/content/scaling-lucene-and-solr > > http://

Re: Solr Distributed Search vs Hadoop

2011-12-20 Thread Ted Dunning
ote: > > > > > Hi, > > > > > > I have a basic question, let's say we're going to have a very very huge > > set > > > of data. > > > In a way that for sure we will need many servers (tens or hundreds of > > > servers). >

RE: Poor performance on distributed search

2011-12-21 Thread ku3ia
e on any keyword the situations is similar. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3605074.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Distributed Search vs Hadoop

2011-12-23 Thread Nick Vincent
For data of this size you may want to look at something like Apache Cassandra, which is made specifically to handle data at this kind of scale across many machines. You can still use Hadoop to analyse and transform the data in a performant manner, however it's probably best to do some research on

RE: Poor performance on distributed search

2011-12-28 Thread ku3ia
72066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3616192.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Poor performance on distributed search

2011-12-28 Thread Yonik Seeley
on? For the first phase in a distributed search, Solr must return the top N ids (in your case 200). It currently does this by loading stored fields, which is slow. A better approach is to store the "id" field as a column stride field. https://issues.apache.org/jira/browse/

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Lance Norskog
Here is an example of schema design: a PDF file of 5MB might have maybe 50k of actual text. The Solr ExtractingRequestHandler will find that text and only index that. If you set the field to stored=true, the 5mb will be saved. If stored=false, the PDF is not saved. Instead, you would store a link to

Re: Solr Distributed Search vs Hadoop

2011-12-28 Thread Ted Dunning
This copying is a bit overstated here because of the way that small segments are merged into larger segments. Those larger segments are then copied much less often than the smaller ones. While you can wind up with lots of copying in certain extreme cases, it is quite rare. In particular, if you

Re: Does Distributed Search support {!boost }?

2011-02-09 Thread Yonik Seeley
On Tue, Feb 8, 2011 at 9:02 PM, Andy wrote: > Is it possible to do a query like {!boost b=log(popularity)}foo over sharded > indexes? Yep, that should work fine. -Yonik http://lucidimagination.com
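
For anyone trying the boosted query above against several shards, the curly braces and parentheses need URL-encoding. A minimal sketch using only the Python standard library; the shard addresses and the boosted_query_string helper are placeholders invented for this example, while the {!boost b=log(popularity)}foo query and the shards parameter come from the threads above.

```python
from urllib.parse import parse_qs, urlencode


def boosted_query_string(user_query, boost_function, shards):
    """Build a URL-encoded query string for a boosted, sharded request."""
    params = {
        "q": "{!boost b=%s}%s" % (boost_function, user_query),
        "shards": ",".join(shards),  # comma-separated shard addresses
    }
    return urlencode(params)


# Placeholder shard addresses, not taken from the thread.
example_shards = ["host1:8983/solr", "host2:8983/solr"]
query_string = boosted_query_string("foo", "log(popularity)", example_shards)
print(query_string)
print(parse_qs(query_string)["q"][0])
```

Decoding the string with parse_qs is a quick way to confirm that the local-params prefix survives encoding unchanged before it is appended to a select URL.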

Re: Spellcheck with Distributed Search (sharding).

2013-10-23 Thread Luis Cappa Banda
More info: When executing the Query to a single Solr server it works: http://solr1:8080/events/data/suggest?q=m&wt=json { responseHeader: { status: 0, QTime: 1 }, response: { numFo

Re: Spellcheck with Distributed Search (sharding).

2013-10-24 Thread Luis Cappa Banda
Any idea? 2013/10/23 Luis Cappa Banda > More info: > > When executing the Query to a single Solr server it works: > http://solr1:8080/events/data/suggest?q=m&wt=json > > { > >- responseHeader: >{ > - status: 0,

RE: Spellcheck with Distributed Search (sharding).

2013-10-24 Thread Dyer, James
24, 2013 6:22 AM To: solr-user@lucene.apache.org Subject: Re: Spellcheck with Distributed Search (sharding). Any idea? 2013/10/23 Luis Cappa Banda > More info: > > When executing the Query to a single Solr server it works: > http://solr1:8080/events/data/suggest?q=m&wt=json<http://solr

Re: Spellcheck with Distributed Search (sharding).

2013-10-24 Thread Luis Cappa Banda
James Dyer > Ingram Content Group > (615) 213-4311 > > > -Original Message- > From: Luis Cappa Banda [mailto:luisca...@gmail.com] > Sent: Thursday, October 24, 2013 6:22 AM > To: solr-user@lucene.apache.org > Subject: Re: Spellcheck with Distributed Search (sharding).

Re: Poor performance on distributed search

2013-12-16 Thread ku3ia
cycle in writeDocs method. Am I right? Can you advise something in this >> situation? > > For the first phase in a distributed search, Solr must return the top > N ids (in your case 200). It currently does this by loading stored > fields, which is slow. A better approach is

Re: Poor performance on distributed search

2013-12-16 Thread ku3ia
Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p4106968.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Performance degradation with distributed search

2012-02-04 Thread Yonik Seeley
On Sat, Feb 4, 2012 at 1:20 AM, XJ wrote: > When I look into details (slow queries), I found some real issues that I > need help with. For example, a query which takes 200ms with geo sharding, > now timeout (>2000ms) with distributed search. And each shard query > (isShard=tr

Re: Performance degradation with distributed search

2012-02-06 Thread oleole
r? XJ -- View this message in context: http://lucene.472066.n3.nabble.com/Performance-degradation-with-distributed-search-tp3715060p3720739.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Performance degradation with distributed search

2012-02-06 Thread XJ
BTW we just upgraded to Solr 3.5 from Solr 1.4. That's why we want to explore the improvements/new features of distributed search. On Mon, Feb 6, 2012 at 12:30 PM, oleole wrote: > Yonik, > > Thanks for your reply. Yeah that's the first thing I tried (adding fsv=true > to

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
On Mon, Feb 6, 2012 at 3:30 PM, oleole wrote: > Thanks for your reply. Yeah that's the first thing I tried (adding fsv=true > to the query) and it surprised me too. Could it be because we're using many > complex sortings (20 sortings with dismax, and, or...). Is there anything that can be > optimized? Looks lik

Re: Performance degradation with distributed search

2012-02-06 Thread XJ
hm.. just looked at the log only 112 matched, and start=0, rows=30 On Mon, Feb 6, 2012 at 1:33 PM, Yonik Seeley wrote: > On Mon, Feb 6, 2012 at 3:30 PM, oleole wrote: > > Thanks for your reply. Yeah that's the first thing I tried (adding > fsv=true > > to the query) and it surprised me too. Coul

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
On Mon, Feb 6, 2012 at 5:35 PM, XJ wrote: > hm.. just looked at the log only 112 matched, and start=0, rows=30 Are any of the sort criteria sort-by-function with anything complex (like an embedded relevance query)? -Yonik lucidimagination.com > > On Mon, Feb 6, 2012 at 1:33 PM, Yonik Seeley >

Re: Performance degradation with distributed search

2012-02-06 Thread XJ
Yes as I mentioned in previous email, we do dismax queries(with different mm values), solr function queries (map, etc) math calculations (sum, product, log). I understand those are expensive. But worst case it should only double the time not going from 200ms to 1200ms right? XJ On Mon, Feb 6, 201

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
On Mon, Feb 6, 2012 at 5:53 PM, XJ wrote: > Yes as I mentioned in previous email, we do dismax queries(with different mm > values), solr function queries (map, etc) math calculations (sum, product, > log). I understand those are expensive. But worst case it should only double > the time not going

Re: Performance degradation with distributed search

2012-02-06 Thread XJ
Yonik, thanks for your explanation. I've created a ticket here https://issues.apache.org/jira/browse/SOLR-3104 On Mon, Feb 6, 2012 at 4:28 PM, Yonik Seeley wrote: > On Mon, Feb 6, 2012 at 6:16 PM, XJ wrote: > > Sorry I didn't make this clear. Yeah we use dismax in main query, as > well as > > in

Re: HTTP Auth and Distributed Search?

2012-04-26 Thread Mark Miller
On Apr 26, 2012, at 5:25 PM, Michael Della Bitta wrote: > Hi, > > I'm wondering if there's any way to use container-based HTTP auth and > Distributed Search configured in the SearchHandler that I haven't > discovered aside from writing my own shard handl

Re: HTTP Auth and Distributed Search?

2012-04-26 Thread Michael Della Bitta
> > I'm wondering if there's any way to use container-based HTTP auth and > > Distributed Search configured in the SearchHandler that I haven't > > discovered aside from writing my own shard handler implementation. > > > > Thanks, > > >

Re: HTTP Auth and Distributed Search?

2012-04-26 Thread Lance Norskog
> Michael > > On Thu, 2012-04-26 at 17:55 -0400, Mark Miller wrote: >> On Apr 26, 2012, at 5:25 PM, Michael Della Bitta wrote: >> >> > Hi, >> > >> > I'm wondering if there's any way to use container-based HTTP auth and >> >

Re: HTTP Auth and Distributed Search?

2012-04-27 Thread Mark Miller
>> >>> Hi, >>> >>> I'm wondering if there's any way to use container-based HTTP auth and >>> Distributed Search configured in the SearchHandler that I haven't >>> discovered aside from writing my own shard handler implementation. >>

Re: HTTP Auth and Distributed Search?

2012-04-27 Thread Michael Della Bitta
> Sure, open a JIRA issue and let's get it done. Done: https://issues.apache.org/jira/browse/SOLR-3421 Thanks, Michael

QueryElevationComponent not working in Distributed Search

2012-10-04 Thread vasokan
fix for the issue I have mentioned above is present in my version. 2. Does the problem of elevating in distributed search still exist? It would be a great help if anyone could share their ideas with me. Thank you, Vinoth Asokan -- View this message in context: http://lucene.472066.n3

Elevation with distributed search causes NPE

2020-07-15 Thread Marc Linden
Hi all, I'm facing the problem that Solr is throwing a NullPointerException when performing a distributed search with multiple shards having elevation configured where one or more shards do have elevated results but others do not. We are using Solr 8.2 and have the QueryElevationComp

Re: Doing SpellCheck in distributed search

2009-10-07 Thread Shalin Shekhar Mangar
On Wed, Oct 7, 2009 at 2:14 PM, balaji.a wrote: > > Hi All, > I am trying to get spell check suggestions in my distributed search query > using shards. SpellCheckComponent does not support distributed search yet. There is an issue open with a patch. If you decide to use, do let

Re: Doing SpellCheck in distributed search

2009-10-07 Thread balaji.a
false false 1 spellcheck Shalin Shekhar Mangar wrote: > > On Wed, Oct 7, 2009 at 2:14 PM, balaji.a wrote: > >> >> Hi All, >> I am trying to get spell check suggestions in my distributed sear

Re: Doing SpellCheck in distributed search

2009-10-07 Thread balaji.a
> > false > > false > > 1 > > > spellcheck > > > > > > Shalin Shekhar Mangar wrote: >> >> On Wed, Oct 7, 2009 at 2:14 PM, balaji.a wrote: >> >>>
