Re: Raw query parameters
On 4/28/2014 7:54 PM, Xavier Morera wrote:
> Would anyone be so kind to explain what are the "Raw query parameters" in Solr's admin UI. I can't find an explanation in either the reference guide nor wiki nor web search.

The query API supports a lot more parameters than are shown on the admin UI. For instance, if you are doing a faceted search, there are only boxes for facet.query, facet.field, and facet.prefix ... but faceted search supports a lot more parameters (like facet.method, facet.limit, facet.mincount, facet.sort, etc). Raw Query Parameters gives you a way to use the entire query API, not just the few things that have UI input boxes.

Thanks,
Shawn
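To make the box's behavior concrete, here is a minimal sketch in plain Python: whatever you type into Raw Query Parameters is simply appended to the request as extra query-API parameters. The facet parameter names below are real Solr parameters; the field name and values are invented for illustration.

```python
from urllib.parse import urlencode

# Parameters the admin UI has dedicated boxes for:
ui_params = {"q": "*:*", "facet": "true", "facet.field": "category"}

# Anything typed into "Raw Query Parameters" is appended as-is:
raw_query_parameters = "facet.limit=5&facet.mincount=2&facet.sort=count"

full_query_string = urlencode(ui_params) + "&" + raw_query_parameters
print(full_query_string)
```

The same effect can be had by editing the request URL by hand; the box just saves you that step.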
Re: Delete fields from document using a wildcard
Not out of the box, as far as I know. A custom UpdateRequestProcessor could possibly do some sort of expansion of the field name by verifying the actual schema. Not sure if the API supports that level of flexibility. Or, for the latest Solr, you can request the list of known field names via REST and do client-side expansion instead.

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Tue, Apr 29, 2014 at 12:20 AM, Costi Muraru wrote:
> Hi guys,
>
> Would it be possible, using Atomic Updates in SOLR4, to remove all fields matching a pattern? For instance something like:
>
> 100
> <field name="*_name_i" update="set" null="true">
>
> Or something similar to remove certain fields in all documents.
>
> Thanks,
> Costi
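The client-side expansion Alex describes could look something like the sketch below: fetch the schema's field list, expand the wildcard locally, then build one atomic update that nulls every matching field. The field list here is a hypothetical, hard-coded stand-in for a GET to the Schema API; the document id and field names are made up.

```python
import fnmatch
import json

# Assumed stand-in for the response of GET /solr/<core>/schema/fields:
known_fields = ["id", "first_name_i", "last_name_i", "title_s"]
pattern = "*_name_i"

# Expand the wildcard client-side against the known field names:
matching = fnmatch.filter(known_fields, pattern)

# Build a single atomic update that sets each matching field to null:
update = {"id": "100"}
update.update({f: {"set": None} for f in matching})

print(json.dumps([update]))
```

Posting that JSON to the update handler would then remove the matched fields from document 100, one document at a time.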
Re: saving user actions on item in solr for later retrieval
1. Might be too expensive in terms of commits and the performance of refreshing the index every time.
3. Have you looked at external fields, custom components, etc.? For example:
http://www.slideshare.net/lucenerevolution/potter-timothy-boosting-documents-in-solr
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-td4040200.html (past discussion that seems relevant)

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Re: Selectively hiding SOLR facets.
Yes, but with my query country:"USA" it is returning languages belonging to countries other than USA. Is there any way I can avoid having such languages appear in my facet filters?

--
View this message in context: http://lucene.472066.n3.nabble.com/Selectively-hiding-SOLR-facets-tp4132770p4133638.html
Sent from the Solr - User mailing list archive at Nabble.com.
Raw query parameters
Hi,

Would anyone be so kind as to explain what the "Raw query parameters" in Solr's admin UI are? I can't find an explanation in the reference guide, the wiki, or a web search. A bit confused about what it is actually for.

Thanks in advance,
Xavier

--
*Xavier Morera*
email: xav...@familiamorera.com
CR: +(506) 8849 8866
US: +1 (305) 600 4919
skype: xmorera
Indexing an array of maps get transformed to a map
Our team is upgrading to Solr 4.7.0 and running into an issue with indexing an array of map objects. I understand that it makes no sense to index an array of map objects in Solr, but I want to figure out why certain error outputs are coming out of the Solr box.

We have a document structure that goes something like:

{ id: 1234, url: abcd, modules: [ { id: 1, name: a} ] }

When this goes through SolrJ, I receive this error:

[http-bio-8080-exec-9] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:org.apache.solr.common.SolrException: Can't use SignatureUpdateProcessor with partial update request containing signature field: url
at org.apache.solr.update.processor.SignatureUpdateProcessorFactory$SignatureUpdateProcessor.processAdd(SignatureUpdateProcessorFactory.java:159)
at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)

For some reason, when the SignatureUpdateProcessorFactory receives the update command, the Solr document has become:

{ id: 1234, url: abcd, modules: {id: 1, name: a} }

Then the processor thinks I'm sending a partial update, when I'm trying to index a full document. :/ When I trace the code, I can see that I'm creating a SolrInputDocument with key 'modules' and value '[ { id: 1, name: 1} ]'. But when I call SolrJ to add to Solr, the document values are transformed... Does anyone know why this is happening?

--
Jinsu Oh
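For anyone hitting the same error: a rough illustration (plain Python, not Solr's actual code) of the heuristic that seems to be involved. Solr treats a field whose value is a map as an atomic-update instruction, so a list-of-maps that has been collapsed into a single map is indistinguishable from a partial update. The document values here mirror the ones in the post.

```python
# Simplified stand-in for Solr's "is this a partial update?" check:
# a field value that is a map is read as atomic-update commands.
def looks_like_partial_update(doc):
    return any(isinstance(v, dict) for v in doc.values())

full_doc = {"id": 1234, "url": "abcd", "modules": [{"id": 1, "name": "a"}]}
collapsed = {"id": 1234, "url": "abcd", "modules": {"id": 1, "name": "a"}}

print(looks_like_partial_update(full_doc))   # list value: a full document
print(looks_like_partial_update(collapsed))  # map value: read as partial update
```

So whatever collapses the single-element list to a bare map upstream is what triggers the SignatureUpdateProcessor complaint.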
Re: Issue with SpanQuery
Adding positionIncrementGap="1" to the fields worked for me. I didn't re-index all the existing docs, so it works only for future documents.
Re: Issue with SpanQuery
Hi Vijay,

It is an index-time setting, so yes, a Solr restart and re-indexing are required. A small test case would be handy.
Re: Issue with SpanQuery
I tried testing with positionIncrementGap but that didn't work. The values I passed for it were 0, 1, 4, and 100. Reindexing also didn't help.
Re: Issue with SpanQuery
Thanks Ahmet, I'll give that a try. Do I need to re-index to add/update positionIncrementGap?
Re: Issue with SpanQuery
Hi,

I would add positionIncrementGap to the fieldType definitions and experiment with different values: 0, 1 and 100.

Same with OrderLineType too.
Re: Issue with SpanQuery
Hey Ahmet,

Here is the field def -
RE: how to write my first solr query
Hello,

Thank you! I will try out what you suggested and post back once I know more.

Yes, given things like:

cat foo bar
house foo bar
foo bar

I want to know when the term "foo bar" (but not the prefix cases I specify) exists in my documents.

Thanks!
Evan

--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133601.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issue with SpanQuery
Hi,

Can you paste your field definition of BookingRecordId and OrderLineType? It could be something related to positionIncrementGap.

Ahmet
Re: How to get a list of currently executing queries?
No, though one could write a custom SearchComponent, I imagine. Not terribly useful for most situations where queries typically run for only a few milliseconds, but

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Thu, Apr 17, 2014 at 7:34 AM, Nikhil Chhaochharia wrote:
> Hello,
>
> Is there some way of getting a list of all queries that are currently executing? Something similar to 'show full processlist' in MySQL.
>
> Thanks,
> Nikhil
Re: Issue with SpanQuery
Facing the same problem!! I have noticed it works fine as long as you're looking up the first index position.

Anyone faced a similar problem before?
RE: spellcheck.q and local parameters
Thanks James, I was afraid of that. The problem is that spellcheck.q is not always provided by the users, and therefore it gives wrong suggestions. I'll just turn off spellcheck by default.

Cheers,
Jeroen
RE: spellcheck.q and local parameters
spellcheck.q is supposed to take a list of raw query terms, so what you're trying to do in your example won't work. What you should do instead is space-delimit the actual query terms that exist in "qq" (and nothing else) and use that for your value of spellcheck.q.

James Dyer
Ingram Content Group
(615) 213-4311
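In request-building terms, James's advice amounts to something like the sketch below: instead of referencing $qq via local params (which spellcheck.q does not resolve), send the raw user terms themselves, space-delimited, on each request. The parameter names are real Solr parameters; the query text is invented.

```python
from urllib.parse import urlencode

qq = "jaguar speeed"  # the raw user query, as it arrives from the client

params = {
    # q may use local-param tricks for boosting...
    "q": '_query_:"{!edismax qf=$qfQuery v=$qq}"',
    "qq": qq,
    "spellcheck": "true",
    # ...but spellcheck.q gets the bare, space-delimited terms only:
    "spellcheck.q": " ".join(qq.split()),
}
print(urlencode(params))
```

This keeps the boosting machinery in q while the spellchecker sees only the terms the user actually typed.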
RE: how to write my first solr query
Hi Evan,

If I understand correctly, a document has to have at least one "foo bar" without having "cat" in front. A solution would be to use a combination of the ShingleFilterFactory and query for at least one occurrence of "foo bar" using the termfreq function.

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter
https://cwiki.apache.org/confluence/display/solr/Function+Queries

The number of shingles depends on how many terms are in the query and how many terms cannot be prefixed. It might be easier to just retrieve all the documents which contain the phrase and process the results outside of Solr. If you could shed some more light on what you are trying to accomplish, maybe we can help you find an even better solution to fit your problem.

Jeroen

-----Original Message-----
From: Evan Smith [mailto:e...@wingonwing.com]
Sent: Monday, April 28, 2014 19:20
To: solr-user@lucene.apache.org
Subject: Re: how to write my first solr query

Hello,

Here is a better use case. Documents A, B, C, and D:

A: "dear foo bar hello"
B: "dear cat foo bar hello"
C: "dear cat foo bar hello foo bar"
D: "dear car foo bar"

I have a dictionary of items outside of Solr: "foo bar" and "cat foo bar". Associated with each item is the set of "suffixes of that item", so I know that "foo bar" has "cat foo bar" as a "suffix". I would like to search my corpus of documents A, B, C and D and just get the documents that contain "foo bar" and not the ones that contain "cat foo bar". So if I searched on "foo bar" but not "cat foo bar", I want to get documents A, C, D, but not B, which does not have just "foo bar" but has "cat foo bar". I am OK with C as it has a "foo bar" that is not prefixed with "cat". Does this make sense? I see that ("foo bar" and not "cat foo bar") would not work, as it would miss document C. Or at least I think it would.

Evan

--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133537.html
Sent from the Solr - User mailing list archive at Nabble.com.
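The shingle/termfreq idea can be checked client-side first. Below is a minimal Python sketch, using Evan's four example documents, where plain substring-phrase counting stands in for what termfreq would return on a shingled field: a document matches when it has at least one "foo bar" occurrence that is not part of the longer entry "cat foo bar".

```python
# Count non-overlapping-by-position occurrences of a token phrase,
# the way termfreq would count a shingle on an indexed field.
def count_phrase(text, phrase):
    tokens, p = text.split(), phrase.split()
    return sum(tokens[i:i + len(p)] == p for i in range(len(tokens)))

docs = {
    "A": "dear foo bar hello",
    "B": "dear cat foo bar hello",
    "C": "dear cat foo bar hello foo bar",
    "D": "dear car foo bar",
}

# Match iff there are more "foo bar" occurrences than "cat foo bar"
# occurrences, i.e. at least one "foo bar" without the "cat" prefix.
matches = [d for d, text in docs.items()
           if count_phrase(text, "foo bar") > count_phrase(text, "cat foo bar")]
print(matches)  # -> ['A', 'C', 'D']
```

Note that C matches even though it also contains "cat foo bar", which is exactly the case a plain ("foo bar" AND NOT "cat foo bar") query would miss.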
spellcheck.q and local parameters
Hi,

I'm having some trouble using the spellcheck.q parameter. The user's query is defined in the qq parameter and the q parameter contains several other parameters for boosting. I would like to use the qq parameter as a default for spellcheck.q. I tried several ways of adding the qq parameter in the spellcheck.q parameter, but it doesn't seem to work. Is this at all possible, or do I need to write a custom QueryConverter?

This is the configuration:

_query_:"{!edismax qf=$qfQuery pf=$pfQuery bq=$boostQuery bf=$boostFunction v=$qq}"
{!v=$qq}

I haven't included all the variables, because they seem unnecessary.

Regards,
Jeroen
Issue with SpanQuery
I have been working on SpanQuery for some time now to look up multivalued fields and found one more issue.

A document has the following lookup fields, among others:

"BookingRecordId": [ "100268421", "190131", "8263325" ],
"OrderLineType": [ "13", "1", "11" ],

Here is the query I construct:

val q1 = new SpanTermQuery(new Term("BookingRecordId", "100268421"))
val q2 = new SpanTermQuery(new Term("OrderLineType", "13"))
val q2m = new FieldMaskingSpanQuery(q2, "BookingRecordId")
val sp = Array[SpanQuery](q1, q2m)
val q = new SpanNearQuery(sp, -1, false)

A query to find the element at the first index position works fine:

{!span} BookingRecordId:100268421 +OrderLineType:13

but a query to find the element at the third index position doesn't return any result:

{!span} BookingRecordId:8263325 +OrderLineType:11

If I increase the slop to 4 then it returns the correct result, but it also matches BookingRecordId:100268421 with OrderLineType:11, which is incorrect.

I thought SpanQuery worked for any multiValued field size. Any ideas how I can fix this?

Thanks,
-Vijay
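The replies above suggest experimenting with positionIncrementGap. For intuition about why the gap matters, here is a rough sketch of where each value's first token lands in a multivalued field; this is a simplification of Lucene's position assignment, not its actual code, assuming the gap is added between consecutive values. The FieldMaskingSpanQuery pattern relies on parallel values landing on identical positions in both fields.

```python
# First-token position of each value in a multivalued field, given
# how many tokens each value analyzes to and the positionIncrementGap.
def first_positions(token_counts, gap):
    out, pos = [], 0
    for n in token_counts:
        out.append(pos)
        pos += n + gap  # advance past this value's tokens, then the gap
    return out

# Three single-token values, like BookingRecordId / OrderLineType:
print(first_positions([1, 1, 1], 0))    # -> [0, 1, 2]
print(first_positions([1, 1, 1], 100))  # -> [0, 101, 202]

# If one field's values ever analyze to a different token count, the
# two fields' positions drift apart, and a masked span query that
# requires matching positions stops pairing parallel values:
print(first_positions([2, 1, 1], 0))    # -> [0, 2, 3]
```

With a large gap, accidental near-matches across neighboring values (like the slop-4 false positive above) also become much less likely, since parallel values sit far apart unless they truly align.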
saving user actions on item in solr for later retrieval
Hi,

We are using Solr in a production system for around ~500 users and we have around ~1 queries per day. Our users' search topics are mostly static and repeat themselves over time.

We have in our system an option to specify a "specific search subject" (we also call it a "specific information need") and most of our users are using this option. We keep in our system logs each query and document retrieved for each "information need", and the user can also give feedback on whether a document is relevant for his "information need". We also have a special query expansion technique and a diversity algorithm based on MMR.

We want to use this information from the logs as a data set for training our ranking system and performing "Learning To Rank" for each "information need" or cluster of "information needs". We also want to give the user the option to filter by "relevant" and "read" based on his actions\friends' actions on the same topic. When he runs a query again, or a similar one, he can skip already-read documents. That's an important requirement for our users.

We are thinking about two possibilities for implementing it:

1. Updating each item in Solr and creating 2 fields named "read" and "relevant". Each field is a multivalued field with the corresponding label of the "information need". When the user reads a document, an update is sent to Solr and the "read" field gets a label with the "information need" the user is working on. This will cause an update each time an item is read by a user (still nothing compared to the new items coming in each day), and we are saving information that "belongs" to the application in Solr, which may be the wrong architecture.

2. Saving the information in a DB and then performing filtering on the retrieved results. This option is much more complicated (we now have "fields" that aren't in Solr and the user uses them for search). We won't get facets, autocomplete and other nice stuff that a regular field in Solr can have.
There are also costs in performance; we can't easily retrieve "give me the top 10 documents that answer the query and are unread for this information need"; and there is more complicated code to maintain.

3. Do you have more ideas?

Which of those options is better? Thanks in advance!

-- View this message in context: http://lucene.472066.n3.nabble.com/saving-user-actions-on-item-in-solr-for-later-retrieval-tp4133558.html Sent from the Solr - User mailing list archive at Nabble.com.
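Option 1 maps naturally onto Solr 4 atomic updates with a multivalued field: appending the information-need label is an atomic "add" operation. A minimal sketch (the field name "read" follows the proposal above; the document id and label values are invented):

```python
import json

def mark_read(doc_id, information_need):
    # Build an atomic-update document: "add" appends one more value to the
    # multivalued "read" field without reindexing the rest of the document.
    return [{"id": doc_id, "read": {"add": information_need}}]

# This payload would be POSTed to /update with Content-Type: application/json
payload = json.dumps(mark_read("doc-123", "need-42"))
print(payload)
```

Filtering then becomes an ordinary filter query, e.g. fq=-read:need-42 to hide already-read documents for that information need.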
Re: Stemming not working with wildcard search
Hi Ahmet,

Thanks for your prompt response! I have added the filters you specified, but it's still not working. Below is the field query analyzer

http://localhost:8080/solr/master/select?q=page_title_t:*products*
http://localhost:8080/solr/master/select?q=page_title_t:*product*

Please let me know if I am doing anything wrong.

Thanks,
G. Naresh Kumar

-- View this message in context: http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382p4133556.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: zkCli zkhost parameter
I did, but it looks like I mixed in the chroot too after every entry rather than once at the very end (thanks to David Smiley for catching that). I'll try again and update if it's still a problem. Thanks! -Scott On Sat, Apr 26, 2014 at 1:08 PM, Mark Miller wrote: > Have you tried a comma-separated list or are you going by documentation? > It should work. > -- > Mark Miller > about.me/markrmiller > > On April 26, 2014 at 1:03:25 PM, Scott Stults ( > sstu...@opensourceconnections.com) wrote: > > It looks like this only takes a single host as its value, whereas the > zkHost environment variable for Solr takes a comma-separated list. > Shouldn't the client also take a comma-separated list? > > k/r, > Scott > -- Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC | 434.409.2780 http://www.opensourceconnections.com
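For reference, the comma-separated form Mark describes, with the chroot appearing once at the very end, would look something like this with zkcli.sh (hosts, chroot, paths, and config name are only examples):

```
./zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181/solr \
    -cmd upconfig -confdir ./conf -confname myconf
```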
Re: Wildcard search not working with search term having special characters and digits
Thanks Jack for the prompt response! So is there any solution to make this scenario work? Or does wildcard simply not work with special characters and numerics?

Thanks, G. Naresh Kumar -- View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133554.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to write my first solr query
Hello,

Here is a better use case. Documents A, B, C, and D:

A: "dear foo bar hello"
B: "dear cat foo bar hello"
C: "dear cat foo bar hello foo bar"
D: "dear car foo bar"

I have a dictionary of items outside of Solr: "foo bar" and "cat foo bar". Associated with each item is the set of "suffixes of that item", so I know that "foo bar" has "cat foo bar" as a "suffix".

I would like to search my corpus of documents A, B, C and D and get just the documents that contain "foo bar", and not the ones that only contain "cat foo bar". So if I searched on "foo bar" but not "cat foo bar" I want to get documents A, C and D, but not B, which does not have just "foo bar" but has "cat foo bar". I am OK with C as it has a "foo bar" that is not prefixed with "cat". Does this make sense?

I see that ("foo bar" AND NOT "cat foo bar") would not work, as it would miss document C. Or at least I think it would.

Evan

-- View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133537.html Sent from the Solr - User mailing list archive at Nabble.com.
Delete fields from document using a wildcard
Hi guys,

Would it be possible, using Atomic Updates in Solr 4, to remove all fields matching a pattern? For instance something like:

<doc>
  <field name="id">100</field>
  <field name="*_name_i" update="set" null="true" />
</doc>

Or something similar to remove certain fields in all documents.

Thanks,
Costi
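The client-side expansion suggested in the thread can be sketched in a few lines: fetch the known field names (e.g. from the /schema/fields REST endpoint), expand the pattern locally, and send an atomic update that sets each matched field to null. The helper below shows only the expansion step; expand_update and the field list are illustrative, not a Solr API:

```python
import fnmatch

def expand_update(doc_id, pattern, known_fields):
    # Expand a wildcard field pattern against the schema's field names and
    # build an atomic update that nulls (i.e. removes) each matched field.
    update = {"id": doc_id}
    for name in fnmatch.filter(known_fields, pattern):
        update[name] = {"set": None}  # atomic "set" to null deletes the field
    return update

fields = ["id", "first_name_i", "last_name_i", "price_f"]
print(expand_update("100", "*_name_i", fields))
```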
Re: SpanQuery with Boolean Queries
Pretty neat. Thanks! On Fri, Apr 25, 2014 at 2:44 AM, Ahmet Arslan wrote: > Hi, > > I am not sure how OR clauses are executed. > > But after re-reading your mail, I think you can use SpanOrQuery (for your > q1) in your custom query parser plugin. > > val q2 = new SpanOrQuery( > new SpanTermQuery(new Term("BookingRecordId", > "ID_1")), > new SpanTermQuery(new Term("BookingRecordId", > "ID_N")) > ); > > > > > On Friday, April 25, 2014 3:22 AM, Vijay Kokatnur < > kokatnur.vi...@gmail.com> wrote: > Thanks Ahmet. It worked! > > Does solr execute these nested queries in parallel? > > > > On Thu, Apr 24, 2014 at 12:53 PM, Ahmet Arslan wrote: > > > Hi Vijay, > > > > May be you can use _query_ hook? > > > > _query_:"{!span}BookingRecordId:234 OrderLineType:11" OR _query_:"{!span} > > OrderLineType:13 + BookingRecordId:ID_N" > > > > Ahmet > > > > > > On Thursday, April 24, 2014 9:34 PM, Vijay Kokatnur < > > kokatnur.vi...@gmail.com> wrote: > > Hi, > > > > I have defined a SpanQuery for proximity search like - > > > > val q1 = new SpanTermQuery(new Term("BookingRecordId", "234")) > > val q2 = new SpanTermQuery(new Term("OrderLineType", "11")) > > val q2m = new FieldMaskingSpanQuery(q2, "BookingRecordId") > > val sp = Array[SpanQuery](q1, q2m) > > > > val q = new SpanNearQuery(sp, -1, false) > > > > Query: > > *&fq={!span} BookingRecordId: 234+OrderLineType11* > > > > However, I need to look up by multiple BookingRecordIds with an OR - > > > > *&fq={!span}OrderLineType:"13" + (BookingRecordId:ID_1 OR ... OR > > BookingRecordId:ID_N)* > > > > I can't specify multiple *span* in the same query like - > > > > *{!span} OrderLineType:"13" + BookingRecordId:ID_1 OR ... OR {!span} > > OrderLineType:"13" + BookingRecordId:ID_N* > > > > Is there any recommended to way to achieve this? > > Thanks, Vijay > > > > > >
[ANNOUNCE] Apache Solr 4.8.0 released
28 April 2014, Apache Solr™ 4.8.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0.

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.8.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.8.0 Release Highlights:

* Apache Solr now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Solr).
* Apache Solr is fully compatible with Java 8.
* The <fields> and <types> tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file; they may be safely removed. This allows intermixing of <fieldType>, <field> and <copyField> definitions if desired.
* The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries.
* New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties.
* Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API.
* JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries.
* Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents.
* Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status.
* Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries. * In Solr single-node mode, cores can now be created using named configsets. * New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the "TTL" expression, as well as automatically deleting expired documents on a periodic basis. Solr 4.8.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release. Please report any feedback to the mailing lists (http://lucene.apache.org/solr/discussion.html) Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access. - Uwe Schindler uschind...@apache.org Apache Lucene PMC Chair / Committer Bremen, Germany http://lucene.apache.org/
Re: how to write my first solr query
Hi Evan,

Confusing use case :) You don't want "foo bar" prefixed with "cat", but you are OK with a document that has "cat foo bar"? Isn't this a contradiction?

On Monday, April 28, 2014 6:26 PM, Evan Smith wrote:

Hello, I would like to find all documents that have say "foo bar" with a filter to remove any cases where "foo bar" is prefixed with things like "cat", "a", ... I am ok with a document that has "cat foo bar" and "foo bar", but if it only has "cat foo bar" then I don't want it, while if it has "foo bar" I want it. I looked at span queries but was not able to come up with how to phrase this. Any pointers would be great!

Thank you in advance,
Evan

-- View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stemming not working with wildcard search
Hi Naresh,

Quotes are only meaningful when there are two or more terms, so don't use quotes for products* and product*. Regarding stemming and wildcards, use the following chain, and your wildcard searches will be happier.

Ahmet

On Monday, April 28, 2014 5:41 PM, Jack Krupansky wrote:

Wildcards and stemming are incompatible at query time - you need to manually stem the term before applying your wildcard. Wildcards are not supported in quoted phrases. They will be treated as punctuation, and ignored by the standard tokenizer or the word delimiter filter.

-- Jack Krupansky

-Original Message-
From: Geepalem
Sent: Sunday, April 27, 2014 3:13 PM
To: solr-user@lucene.apache.org
Subject: Stemming not working with wildcard search

Hi, I have added SnowballPorterFilterFactory filter to field type to make singular and plural search terms return same results. So below queries (double quotes around search term) returning similar results which is fine. http://localhost:8080/solr/master/select?q=page_title_t:"product*"; http://localhost:8080/solr/master/select?q=page_title_t:"products*"; But when I have analyzed results, in both result sets, documents which dont start with words "Product" or "products" didnt come though there are few documents available. So I have added * as prefix and suffix to search term without double quotes to do wildcard search. http://localhost:8080/solr/master/select?q=page_title_t:*product* http://localhost:8080/solr/master/select?q=page_title_t:*products* Now, stemming is not working as above second query is not returning similar results as query 1. If double quotes are added around search term then its returning similar results but results are not as expected. With double quotes it wont return results like "Old products", "New products", "Cool Product". It will only return results with the values like "Product 1", "Product 2","Products of USA". Please suggest or guide how to make stemming work with wildcard search. Appreciate immediate response!!
Thanks, G. Naresh Kumar -- View this message in context: http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382.html Sent from the Solr - User mailing list archive at Nabble.com.
how to write my first solr query
Hello, I would like to find all documents that have say "foo bar" with a filter to remove any cases where "foo bar" is prefixed with things like "cat", "a", ... I am ok with a document that has "cat foo bar" and "foo bar", but if it only has "cat foo bar" then I don't want it while if it has "foo bar" I want it. I looked at span queries but was not able to come up with how to phrase this. Any pointers would be great! Thank you in advance, Evan -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stemming not working with wildcard search
Wildcards and stemming are incompatible at query time - you need to manually stem the term before applying your wildcard. Wildcards are not supported in quoted phrases. They will be treated as punctuation, and ignored by the standard tokenizer or the word delimiter filter. -- Jack Krupansky -Original Message- From: Geepalem Sent: Sunday, April 27, 2014 3:13 PM To: solr-user@lucene.apache.org Subject: Stemming not working with wildcard search Hi, I have added SnowballPorterFilterFactory filter to field type to make singular and plural search terms return same results. So below queries (double quotes around search term) returning similar results which is fine. http://localhost:8080/solr/master/select?q=page_title_t:"product*"; http://localhost:8080/solr/master/select?q=page_title_t:"products*"; But when I have analyzed results, in both result sets, documents which dont start with words "Product" or "products" didnt come though there are few documents available. So I have added * as prefix and suffix to search term without double quotes to do wildcard search. http://localhost:8080/solr/master/select?q=page_title_t:*product* http://localhost:8080/solr/master/select?q=page_title_t:*products* Now, stemming is not working as above second query is not returning similar results as query 1. If double quotes are added around search term then its returning similar results but results are not as expected. With double quotes it wont return results like "Old products", "New products", "Cool Product". It will only return results with the values like "Product 1", "Product 2","Products of USA". Please suggest or guide how to make stemming work with wildcard search. Appreciate immediate response!! Thanks, G. Naresh Kumar -- View this message in context: http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Wildcard search not working with search term having special characters and digits
Wildcard query only works for single terms. Any embedded special characters will cause a term to be split into multiple terms at index time. The use of a wildcard in a query term with embedded special characters bypasses normal analysis - you need to enter the term exactly as it would be analyzed at index time for the wildcard to work. Ditto if your field type uses the word delimiter filter with the split-on-digits option enabled - the alpha and numeric portions will generate separate terms - and cause a wildcard to fail.

-- Jack Krupansky

-Original Message-
From: Geepalem
Sent: Sunday, April 27, 2014 3:30 PM
To: solr-user@lucene.apache.org
Subject: Wildcard search not working with search term having special characters and digits

Hi, Below query without wildcard search is returning results. http://localhost:8080/solr/master/select?q=page_title_t:"an-138"; But below query with wildcard is not returning results http://localhost:8080/solr/master/select?q=page_title_t:"an-13*"; Below query with wildcard search and no digits is returning results. http://localhost:8080/solr/master/select?q=page_title_t:"an-*"; I have tried by adding WordDelimeter Filter but there is no luck. Please suggest or guide how to make wildcard search works with special characters and digits. Appreciate immediate response!!

Thanks, G. Naresh Kumar -- View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385.html Sent from the Solr - User mailing list archive at Nabble.com.
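Jack's point can be seen with a toy simulation. This is not Solr code, just a rough stand-in for what a StandardTokenizer-style chain does to "AN-138" at index time, versus a wildcard term that skips analysis entirely:

```python
import re

def index_terms(text):
    # Rough stand-in for StandardTokenizer + lowercasing: split on
    # non-alphanumeric characters. Real analysis chains differ.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def wildcard_matches(prefix, terms):
    # A wildcard term like "an-13*" bypasses analysis and is compared
    # as one raw string against the single indexed terms.
    return [t for t in terms if t.startswith(prefix)]

terms = index_terms("AN-138")
print(terms)                             # ['an', '138']
print(wildcard_matches("an-13", terms))  # [] - no single term starts with "an-13"
print(wildcard_matches("an", terms))     # ['an']
```

Since "AN-138" is indexed as the two terms "an" and "138", the raw wildcard term "an-13*" can never match.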
Re: Solr Cluster management having too many cores
On 4/28/2014 5:05 AM, Mukesh Jha wrote: > Thanks Erik, > > Sounds about right. > > BTW how long can I keep adding collections i.e. can I keep 5/10 years data > like this? > > Also what do you think of bullet 2) of having collection specific > configurations in zookeeper? Regarding bullet 2, there is work underway right now to create a separate clusterstate within zookeeper for each collection. I do not know how far along that work is. There are no hard limits in SolrCloud at all. The things that will cause issues with scalability are resource-related problems. You'll exceed the 1MB default limit on a zookeeper database pretty quickly. If you're not using the example jetty included with Solr, you'll exceed the default maxThreads on most servlet containers very quickly. You may run into problems with the default limits on Solr's HttpShardHandler. Running hundreds or thousands of cores efficiently will require lots of RAM, both for the OS disk cache and the java heap. A large java heap will require significant tuning of Java garbage collection parameters. Most operating systems limit a user to 1024 open files and 1024 running processes (which includes threads). These limits will need to be increased. There may be other limits imposed by the Solr config, Java, and/or the operating system that I have not thought of or stated here. Thanks, Shawn
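The per-user open-file and process limits Shawn mentions are usually raised in /etc/security/limits.conf on Linux. The user name and values below are only examples; tune them for your installation:

```
# /etc/security/limits.conf - assuming Solr runs as user "solr"
solr  soft  nofile  65535
solr  hard  nofile  65535
solr  soft  nproc   8192
solr  hard  nproc   8192
```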
Re: Solr Cloud and Replication request handler
Hello Shawn, Thanks for your reply, that's good news! All the best. 2014-04-28 15:28 GMT+02:00 Shawn Heisey : > On 4/28/2014 3:33 AM, Amanjit Gill wrote: > > Hi everybody, > > > > Considering a solr cloud configuration (4.6+) > > > > a) I am wondering if the solr replication handler always has to be > > configured completely, aka by choosing one master, then setting the > config > > accordingly (enable, masterUrl) etc ... Do we really need a replication > > master? > > > You simply need the replication handler to be present with a name of > "/replication" for SolrCloud to work properly. You do not need to > configure it for master or slave. SolrCloud will take care of > configuring which instance needs to be a slave whenever it needs to > recover an index. You literally just need one line in your solrconfig.xml: > > > > Thanks, > Shawn > >
Re: Wildcard search not working with search term having special characters and digits
Can someone please help me with this, as I am stuck with this issue? -- View this message in context: http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133478.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stemming not working with wildcard search
Can someone please help me with this, as I am stuck with this issue? -- View this message in context: http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382p4133477.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Cloud and Replication request handler
On 4/28/2014 3:33 AM, Amanjit Gill wrote: > Hi everybody, > > Considering a solr cloud configuration (4.6+) > > a) I am wondering if the solr replication handler always has to be > configured completely, aka by choosing one master, then setting the config > accordingly (enable, masterUrl) etc ... Do we really need a replication > master? You simply need the replication handler to be present with a name of "/replication" for SolrCloud to work properly. You do not need to configure it for master or slave. SolrCloud will take care of configuring which instance needs to be a slave whenever it needs to recover an index. You literally just need one line in your solrconfig.xml: Thanks, Shawn
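The one-line solrconfig.xml snippet did not survive the list archive; based on Shawn's description it is presumably just the bare handler declaration, with no master/slave options at all:

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler" />
```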
Re: merge shards indexes
Yes, according to this documentation: https://wiki.apache.org/solr/MergingSolrIndexes On Mon, Apr 28, 2014 at 12:14 PM, Gastone Penzo wrote: > Hi, > it's possible to merge 2 shards indexes into one? > > Thank you > > -- > *Gastone Penzo* > -- Dmitry Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan
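For reference, the wiki page describes two routes; roughly, they look like this (paths, core names, and ports are examples, and the Lucene jar versions must match your Solr):

```
# Lucene IndexMergeTool - the target and source indexes must not be
# open in a running Solr while merging
java -cp lucene-core.jar:lucene-misc.jar org.apache.lucene.misc.IndexMergeTool \
    /path/to/merged/index /path/to/shard1/index /path/to/shard2/index

# Or CoreAdmin mergeindexes against a running Solr
curl 'http://localhost:8983/solr/admin/cores?action=mergeindexes&core=merged&indexDir=/path/to/shard1/index&indexDir=/path/to/shard2/index'
```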
Re: Solr Cluster management having too many cores
Thanks Erik, Sounds about right. BTW how long can I keep adding collections i.e. can I keep 5/10 years data like this? Also what do you think of bullet 2) of having collection specific configurations in zookeeper? On Fri, Apr 25, 2014 at 11:44 PM, Erick Erickson wrote: > So you're talking about 700 or so collections. That should be do-able, > especially as Solr is rapidly evolving to handle more and more > collections and there's two years for that to happen. > > The aging out bit is manual (well, you'd script it I suppose). So > every day there'd be a script that ran and "just knew" the right > collection to change the alias on, there's nothing automatic yet. > > Best, > Erick > > On Fri, Apr 25, 2014 at 9:37 AM, Mukesh Jha > wrote: > > Thanks for quick reply Erik, > > > > I want to keep my collections till I run out of hardware, which is at > least > > a couple of years worth data. > > I'd like to know more on ageing out aliases, did a quick search but > didn't > > find much. > > > > > > On Fri, Apr 25, 2014 at 9:45 PM, Erick Erickson >wrote: > > > >> Hmmm, tell us a little more about your use-case. In particular, how > >> long do you need to keep the data around? Days? Months? Years? > >> > >> Because if you only need to keep the data for a specified period, you > >> can use the collection aliasing process to age-out collections and > >> keep the number of cores from growing too large. > >> > >> Best, > >> Erick > >> > >> On Fri, Apr 25, 2014 at 6:49 AM, Mukesh Jha > >> wrote: > >> > Hi Experts, > >> > > >> > I need to divide my indexes based on hour/day with each index having > >> ~50-80 > >> > GB data & ~50-80 mill docs, so I'm planning to create daily collection > >> with > >> > names e.g. *sample_colledction__mm_dd_hh.* > >> > I'll also create an alias *sample_collection* and update it whenever I > >> will > >> > create a new collection so that the entire data set is searchable. 
> >> > > >> > I've a couple of question on the above design > >> > 1) How far can it scale? As my collections will increase (so will the > >> > shards & replicas) do we have a breaking point when adding > more/searching > >> > will become an issue? > >> > 2) As my cluster will grow because of huge number of collections the > >> > clusterstate.json file present in zookeeper will grow too, won't this > be > >> a > >> > limiting factor? If so instead of storing all this info in one > >> > clusterstate.json file shouldn't Solr save cluster specific details in > >> this > >> > file & have collection specific config files present on zookeeper? > >> > 3) How can I easily manage all these collections? Do we have Java > >> Coreadmin > >> > API's available. I cannot find much documented on it. > >> > > >> > -- > >> > Txz, > >> > > >> > *Mukesh Jha * > >> > > > > > > > > -- > > > > > > Thanks & Regards, > > > > *Mukesh Jha * > -- Thanks & Regards, *Mukesh Jha *
Solr Cloud and Replication request handler
Hi everybody, Considering a solr cloud configuration (4.6+) a) I am wondering if the solr replication handler always has to be configured completely, aka by choosing one master, then setting the config accordingly (enable, masterUrl) etc ... Do we really need a replication master? solrconfig.xml excerpt true [..] false http://mysolrinstance::port /default/replication [..] b) what happens to the cloud if the "master" instance goes down? Thanks for your info ... All the best, Amanjit
Re: Application of different stemmers / stopword lists within a single field
Why not take advantage of your use case - the characters belong to different character classes? You can index this text into a single Solr field (no copyField) and apply an analysis chain that includes both languages' analysis - stopwords, stemmers, etc. Since every filter should apply only to its specific language (e.g. an Arabic stemmer should not stem a Latin word), you can do cross-language searches on this single field.

On Mon, Apr 28, 2014 at 5:59 AM, Alexandre Rafalovitch wrote:
> If you can throw money at the problem:
> http://www.basistech.com/text-analytics/rosette/language-identifier/ .
> Language Boundary Locator at the bottom of the page seems to be
> part/all of your solution.
>
> Otherwise, specifically for English and Arabic, you could play with
> Unicode ranges to try detecting text blocks:
> 1) Create an UpdateRequestProcessor chain that
> a) clones text into field_EN and field_AR.
> b) applies regular expression transformations that strip English or
> Arabic unicode text range correspondingly, so field_EN only has
> English characters left, etc. Of course, you need to decide what you
> want to do with occasional EN or neutral characters happening in the
> middle of Arabic text (numbers: Arabic or Indic? brackets, dashes,
> etc). But if you just index text, it might be ok even if it is not
> perfect.
> c) deletes empty fields, just in case not all of them have mix language
> 2) Use eDismax to search over both fields, each with its own processor.
>
> Regards,
>    Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency
>
> On Fri, Apr 25, 2014 at 5:34 PM, Timothy Hill wrote:
> > This may not be a practically solvable problem, but the company I work
> > for has a large number of lengthy mixed-language documents - for example,
> > scholarly articles about Islam written in English but containing lengthy
> > passages of Arabic.
> > Ideally, we would like users to be able to search both
> > the English and Arabic portions of the text, using the full complement of
> > language-processing tools such as stemming and stopword removal.
> >
> > The problem, of course, is that these two languages co-occur in the same
> > field. Is there any way to apply different processing to different words or
> > paragraphs within a single field through language detection? Is this to all
> > intents and purposes impossible within Solr? Or is another approach (using
> > language detection to split the single large field into
> > language-differentiated smaller fields, for example) possible/recommended?
> >
> > Thanks,
> >
> > Tim Hill
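The Unicode-range splitting in Alex's step 1b can be prototyped outside Solr. A rough sketch (field names follow his example; the regex covers the two main Arabic blocks, and digits/punctuation are left on the Latin side, which is exactly the policy decision he mentions):

```python
import re

# Basic Arabic plus Arabic Supplement blocks; extend the ranges as needed.
ARABIC = re.compile(r"[\u0600-\u06FF\u0750-\u077F]+")

def split_languages(text):
    # Everything matching the Arabic ranges goes to field_AR; the
    # remaining runs of text go to field_EN.
    arabic = " ".join(ARABIC.findall(text))
    latin = " ".join(p.strip() for p in ARABIC.split(text) if p.strip())
    return {"field_EN": latin, "field_AR": arabic}

print(split_languages("Islam means submission: \u0627\u0644\u0625\u0633\u0644\u0627\u0645"))
```

In an UpdateRequestProcessor the same logic would run per document at index time; the sketch only demonstrates the splitting rule itself.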
merge shards indexes
Hi, it's possible to merge 2 shards indexes into one? Thank you -- *Gastone Penzo*
Re: space issue in search results
On 28 April 2014 12:42, PAVAN wrote: > > I have indexed title in the following way. > > honda cars in rajaji nagar > honda cars in rajajinagar. > > suppose if i search for > > honda cars in rajainagar (OR) > honda cars in rajaji nagar > > it has to display both the results. Please do not start multiple threads with the same question. The straightforward way to do what you want is to use synonyms: rajaji nagar, rajajinagar as presumably you want to collapse spaces only for things like place names. Regards, Gora
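As concrete config, the synonym approach would look roughly like this (file name and filter placement are examples): add the mapping to synonyms.txt

```
# synonyms.txt
rajaji nagar, rajajinagar
```

and reference it from the field's index-time analyzer in schema.xml:

```xml
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
```

Note that multi-word synonyms behave best when expanded at index time; query-time multi-word synonym expansion has known quirks.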
space issue in search results
I have indexed titles in the following way:

honda cars in rajaji nagar
honda cars in rajajinagar

Suppose I search for

honda cars in rajajinagar (OR)
honda cars in rajaji nagar

it has to display both results. Can anybody help me with how we can do this?

-- View this message in context: http://lucene.472066.n3.nabble.com/space-issue-in-search-results-tp4133421.html Sent from the Solr - User mailing list archive at Nabble.com.