Re: Solr Shard - Strange results
So are we the only ones who never got sharding working with multi-cores? Bummer... Hopefully someone else will chime in with an answer. --Tony -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Shard-Strange-results-tp496373p832863.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Shard - Strange results
I know this post is old but did you ever get a resolution to this problem? I am running into the exact same issue. I even switched my id from "text" to "string" and reindexed as that was the last suggestion and still no resolution. --Tony
Re: Solr Shard - Strange results
I'm not quite sure how that would make a difference... From my most recent testing, it seems that the problem is related to the shards feature adding "ids=[...]" to one of the queries. However, I will give it a try.

Yao Ge wrote:
> Maybe you want to try with docNumber field type as "string" and see if it
> would make a difference.
Re: Solr Shard - Strange results
Maybe you want to try with the docNumber field type as "string" and see if it would make a difference.
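One way the field type could matter is sketched below. It is an assumption for illustration, not a diagnosis of this thread's problem: if a "text" type splits an id such as ABC-1353 into separate tokens at index time, a lookup for the whole id may match nothing, while a "string" type keeps the value as a single term. The analyzer here is a toy, not Solr's analysis chain, and `index_terms` and `exact_lookup` are hypothetical names.

```python
import re


def index_terms(value, field_type):
    """Return the terms a toy index would store for one field value."""
    if field_type == "string":
        # Untokenized: the whole value is one term.
        return {value}
    # Toy "text" analysis: split on non-alphanumerics, then lowercase.
    return {tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", value) if tok}


def exact_lookup(doc_id, field_type):
    """True if looking up the raw id finds the document indexed under it."""
    return doc_id in index_terms(doc_id, field_type)


for field_type in ("string", "text"):
    print(field_type, exact_lookup("ABC-1353", field_type))
```

In this toy model an id made of a single lowercase-safe token (for example "55") would still be found under "text", while hyphenated ids would not.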
Re: Solr Shard - Strange results
I'm not quite sure what logs you are talking about, but in the tomcat/logs/catalina.out logs I found the following [note: I can't copy/paste, so I am typing up a summary]:

I execute the command:
localhost:8080/bravo/select?q=fred&rows=102&start=0&shards=localhost:8080/alpha,localhost:8080/bravo

In this example, alpha has 27 instances of "fred", while bravo has 0.

Then in catalina.out:

- There is the request for the command I sent, shards parameter and all. It has the proper queryString.
- Then I see the two requests sent to the shards, alpha and bravo. These two requests weave between each other until they are finished:
  INFO: REQUEST URI =/alpha/select
  INFO: REQUEST URI =/bravo/select
  The parameters have changed to:
  wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0
- Then 2 INFOs scroll across:
  INFO: [] webapp=/bravo path=/select params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0} hits=0 status=0 QTime=1
  INFO: [] webapp=/alpha path=/select params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0} hits=27 status=0 QTime=1
  **Note, hits=27
- Then I see some octet-streams being transferred, with status 200, so those are OK.
- Then I see something peculiar. It calls alpha with the following parameters:
  wt=javabin&version=2.2&ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55&q=fred&rows=102&parameter=isShard=true&start=0

Performing this query on my own (without the wt=javabin) gives me numFound=2, the result set I get back from the overarching query. Changing it to rows=10, it gives me numFound=2, and 2 <doc>'s. This is not the strange functionality I was seeing with the overarching query and the mismatched "numFound" and <doc>'s.

This does beg the question: why did it add "ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55" to the query? They are in the format that would be under docNumber, if that helps. Any thoughts? I will do some research on those particular ID-numbered docs in the meantime.

Here's the configuration information. I only posted the differences from the default files in solr/example/solr/conf:

[solrconfig.xml]
${solr.data.dir:/data/indices/bravo/solr/data
/data/indices/bravo/solr/conf/data-config.xml

[schema.xml]
docNumber
column2

[data-config.xml]

Yonik Seeley-2 wrote:
> Yes, but is the same fieldname and FieldType used for both indexes?
> (that's sort of a requirement)
>
> You might also try looking at the logs for the exact requests that
> were sent to each shard as part of the distributed request, and
> manually sending those requests and inspecting the results. That
> should tell you if the shard requests or responses are weird, or if
> it's the top-level combining logic that's causing this.
>
> -Yonik
> http://www.lucidimagination.com
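Read as a two-step exchange, the log summary above suggests where the ids= parameter comes from: each shard is first asked only for the key field and score, and the documents picked from those lists are then requested by id. The sketch below models that reading. The function names are hypothetical, the parameter sets are copied from the summary rather than from Solr's source, the field list is written as Solr's `fl` parameter, and sending `start + rows` to each shard is an assumption.

```python
def phase_one_params(q, rows, start, key_field):
    """Per-shard request that asks only for the key field and the score."""
    return {
        "q": q,
        "fl": f"{key_field},score",
        "fsv": "true",
        # Assumption: each shard is asked for enough rows to cover the page.
        "rows": start + rows,
        "start": 0,
        "isShard": "true",
    }


def phase_two_params(q, ids):
    """Follow-up request that fetches the chosen documents by id."""
    return {"q": q, "ids": ",".join(ids), "isShard": "true"}


first = phase_one_params("fred", rows=102, start=0, key_field="docNumber")
second = phase_two_params("fred", ["ABC-1353", "ABC-408", "ABC-1355"])
print(first["fl"])
print(second["ids"])
```

Under this reading, the ids= request is expected behaviour rather than a fault in itself; what matters is whether each listed id can be found again on the shard that reported it.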
Re: Solr Shard - Strange results
On Fri, May 15, 2009 at 4:11 PM, CB-PO wrote:
> Yeah, the first thing I thought of was that perhaps there was something wrong
> with the uniqueKey and they were clashing between the indexes, however upon
> visual inspection of the data the field we are using as the unique key in
> each of the indexes is grossly different between the two databases, so there
> is no chance of them clashing.

Yes, but is the same fieldname and FieldType used for both indexes?
(that's sort of a requirement)

You might also try looking at the logs for the exact requests that were sent to each shard as part of the distributed request, and manually sending those requests and inspecting the results. That should tell you if the shard requests or responses are weird, or if it's the top-level combining logic that's causing this.

-Yonik
http://www.lucidimagination.com
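The suggestion to resend each shard request by hand can be sketched as a small URL builder. This is a minimal illustration, not a Solr tool: `shard_request_url` is a hypothetical helper, the hosts are the two instances named in this thread, and the parameter values are placeholders to be replaced with whatever the shard log lines actually show. Nothing is sent over the network; the URLs are only printed.

```python
from urllib.parse import urlencode


def shard_request_url(shard, params):
    """Build a URL for sending one shard's logged request by hand."""
    # Drop wt=javabin so the shard answers in its default readable format.
    readable = {k: v for k, v in params.items() if k != "wt"}
    return f"http://{shard}/select?{urlencode(readable)}"


# Placeholder parameters; copy the real ones from the shard log lines.
logged = {"wt": "javabin", "q": "fred", "rows": 10, "start": 0}

for shard in ("localhost:8080/alpha", "localhost:8080/bravo"):
    print(shard_request_url(shard, logged))
```

Comparing the per-shard responses with the combined one separates a shard-level problem from a merge-level one.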
Re: Solr Shard - Strange results
Yeah, the first thing I thought of was that perhaps there was something wrong with the uniqueKey and they were clashing between the indexes. However, upon visual inspection of the data, the field we are using as the unique key in each of the indexes is grossly different between the two databases, so there is no chance of them clashing.

Unfortunately, I cannot provide the data in order to reproduce; however, I will try to produce a set of sample data that will reproduce the problem. I must add that when we were testing the shard feature on smaller sets of data ( < 100,000 docs per index ), we did not notice this issue, but when we fully filled each index ( > 1,000,000 docs per index ), the issue became more apparent. This is not to say that the issue wasn't there before; we just never noticed it.

On Monday, I will provide some configuration information and see if that helps.

Yonik Seeley-2 wrote:
> Certainly does seem strange.
> Do you have the same uniqueKeyField in both indexes?
> Any way you can provide some configuration and some data to reproduce
> this?
>
> -Yonik
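Why a key clash between indexes would matter can be pictured with a toy merge. The sketch below is not Solr's merge code. It only assumes that a combiner keyed on the unique key keeps one entry per key and lowers the total for each duplicate it meets; `merge_by_key` and the sample ids are hypothetical.

```python
def merge_by_key(shard_results):
    """Merge per-shard (key, score) lists, keeping one entry per key."""
    total = sum(len(hits) for hits in shard_results.values())
    seen = {}
    for shard, hits in shard_results.items():
        for key, score in hits:
            if key in seen:
                # Same key already reported by another shard: drop it
                # and count one fewer result overall.
                total -= 1
                continue
            seen[key] = (shard, score)
    ranked = sorted(seen.items(), key=lambda item: item[1][1], reverse=True)
    return total, [key for key, _ in ranked]


shards = {
    "alpha": [("ABC-1", 0.9), ("ABC-2", 0.5)],
    "bravo": [("ABC-1", 0.8), ("XYZ-7", 0.7)],
}
total, keys = merge_by_key(shards)
print(total, keys)  # 3 ['ABC-1', 'XYZ-7', 'ABC-2']
```

In a model like this, the reported total depends on how many entries the combiner has looked at, which is one reason clashing keys are worth ruling out first.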
Re: Solr Shard - Strange results
Certainly does seem strange.
Do you have the same uniqueKeyField in both indexes?
Any way you can provide some configuration and some data to reproduce this?

-Yonik

On Fri, May 15, 2009 at 10:40 AM, CB-PO wrote:
> The problem is that when we use shards, the "numFound" is acting very
> strangely. [...]
>
> Has anyone seen this issue before?
Solr Shard - Strange results
Hello,
What we have done is created multiple Solr instances on the same server, where each instance is created with the DataImportHandler from a different DB. The information on each DB is similar, so the schemas for each instance are pretty much the same. Our goal is to use the shards feature to combine the results into a single table.

The problem is that when we use shards, "numFound" is acting very strangely. Here are some examples:

2 Solr instances:
localhost:8080/alpha/
localhost:8080/bravo/

Let's say I'm searching for the term "fred". If I do:

localhost:8080/alpha/select?q=fred&rows=10&start=0
I get numFound="0". That's fine.

localhost:8080/bravo/select?q=fred&rows=10&start=0
I get: [...] followed by 10 <doc>'s. This is also fine.

When I do these [same result for both]:
localhost:8080/alpha/select?q=fred&rows=10&start=0&shards=localhost:8080/alpha,localhost:8080/bravo
localhost:8080/bravo/select?q=fred&rows=10&start=0&shards=localhost:8080/alpha,localhost:8080/bravo

I get: [...] followed by 1 <doc>.

So... something weird happened... There should be 27 results, but even if it thought there were only 18 results, it should have displayed 10 of them.

Alright, so I tried:

localhost:8080/alpha/select?q=fred&rows=1&start=0&shards=localhost:8080/alpha,localhost:8080/bravo
localhost:8080/bravo/select?q=fred&rows=1&start=0&shards=localhost:8080/alpha,localhost:8080/bravo

I got: [...] followed by 1 <doc>.

Seems to be working alright with this... But let's try...

localhost:8080/alpha/select?q=fred&rows=1&start=1&shards=localhost:8080/alpha,localhost:8080/bravo
localhost:8080/bravo/select?q=fred&rows=1&start=1&shards=localhost:8080/alpha,localhost:8080/bravo

I got: [...] with no <doc>'s... wtf?

I continued this up to start=10, and numFound decreased by 1 every time, with no more <doc>'s.
So I changed it to rows=100&start=0 and I got: <result ... numFound="2" start="0"> followed by 2 <doc>'s.

This issue is happening with multiple search queries; however, with some other search queries, it works fine and returns the proper number for numFound, and however many <doc>'s there are supposed to be.

Has anyone seen this issue before?
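The expectation behind the post above is that a page should hold `rows` documents whenever at least `start + rows` matches exist. It can be written as a small check against each response. This is only a sketch of that arithmetic; `expected_doc_count` is a hypothetical helper, not part of Solr.

```python
def expected_doc_count(num_found, start, rows):
    """Number of documents a paged response should contain."""
    return max(0, min(rows, num_found - start))


# 27 real matches, or the 18 the sharded query appeared to report:
print(expected_doc_count(27, start=0, rows=10))  # 10
print(expected_doc_count(18, start=0, rows=10))  # 10
# The rows=100 case, where numFound="2" came back with 2 documents:
print(expected_doc_count(2, start=0, rows=100))  # 2
```

A response whose document count differs from this value for its own reported numFound is internally inconsistent, which is the symptom described in the rows=10 case.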