Re: Raw query parameters

2014-04-28 Thread Shawn Heisey
On 4/28/2014 7:54 PM, Xavier Morera wrote:
> Would anyone be so kind as to explain what the "Raw query parameters" are
> in Solr's admin UI? I can't find an explanation in the reference
> guide, the wiki, or a web search.

The query API supports a lot more parameters than are shown on the admin
UI.  For instance, if you are doing a faceted search, there are only
boxes for facet.query, facet.field, and facet.prefix ... but faceted
search supports a lot more parameters (like facet.method, facet.limit,
facet.mincount, facet.sort, etc).  Raw Query Parameters gives you a way
to use the entire query API, not just the few things that have UI input
boxes.
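
As a rough illustration of what the Raw Query Parameters box amounts to, the sketch below assembles the query string that such a request would carry. The facet parameter names are the ones mentioned above; the field name `cat`, the `q` value, the `wt` setting and the helper name `build_select_query` are assumptions made up for the example.

```python
from urllib.parse import urlencode


def build_select_query(q, raw_params):
    """Return a query string combining q with extra 'raw' parameters.

    raw_params mimics what one might type into the Raw Query Parameters
    box, for example "facet.limit=5&facet.mincount=1".
    """
    params = [("q", q), ("wt", "json")]
    for pair in raw_params.split("&"):
        if not pair:
            continue  # tolerate stray or trailing ampersands
        key, _, value = pair.partition("=")
        params.append((key, value))
    return urlencode(params)


# Hypothetical request: facet on an assumed field named "cat".
query_string = build_select_query(
    "*:*",
    "facet=true&facet.field=cat&facet.limit=5&facet.mincount=1",
)
print(query_string)
```

The resulting string is what would follow the `?` in a request to the select handler; anything the query API accepts can be passed the same way.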

Thanks,
Shawn



Re: Delete fields from document using a wildcard

2014-04-28 Thread Alexandre Rafalovitch
Not out of the box, as far as I know.

A custom UpdateRequestProcessor could possibly do some sort of expansion
of the field name by checking it against the actual schema. I'm not sure
whether the API supports that level of flexibility. Or, with the latest Solr,
you can request the list of known field names via REST and do client-side
expansion instead.
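
A minimal sketch of the client-side expansion idea, assuming the list of field names has already been obtained somehow (the REST call itself is not shown). The field list, the `id` unique key name and the helper name are hypothetical; the only Solr behavior relied on is that an atomic update with "set" to null removes a field.

```python
import json
from fnmatch import fnmatchcase


def build_removal_update(doc_id, field_names, pattern, id_field="id"):
    """Build an atomic-update document that sets every field whose name
    matches the glob pattern to null, which removes it from the document."""
    update = {id_field: doc_id}
    for name in field_names:
        if name != id_field and fnmatchcase(name, pattern):
            update[name] = {"set": None}
    return update


# Hypothetical field list; in practice it would come from the schema.
known_fields = ["id", "first_name_i", "last_name_i", "title_s"]
update_doc = build_removal_update("100", known_fields, "*_name_i")

# JSON body that could be posted to the update handler.
payload = json.dumps([update_doc])
print(payload)
```

Removing the fields from all documents would mean repeating this for every document id, which is one reason a server-side processor may be the better fit for large indexes.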

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, Apr 29, 2014 at 12:20 AM, Costi Muraru  wrote:
> Hi guys,
>
> Would it be possible, using Atomic Updates in Solr 4, to remove all fields
> matching a pattern? For instance, something like:
>
> 
>   100
>   <field name="*_name_i" update="set" null="true"></field>
> 
>
> Or something similar to remove certain fields in all documents.
>
> Thanks,
> Costi


Re: saving user actions on item in solr for later retrieval

2014-04-28 Thread Alexandre Rafalovitch
1. This might be too expensive in terms of commits and the performance cost
of refreshing the index every time.

3. Have you looked at external fields, custom components, etc.? For example:
http://www.slideshare.net/lucenerevolution/potter-timothy-boosting-documents-in-solr
http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-td4040200.html
(past discussion that seems relevant)

Regards,
   Alex.


On Tue, Apr 29, 2014 at 1:48 AM, nolim  wrote:
> Hi,
> We are using Solr in a production system for around 500 users and we have
> around ~1 queries per day.
> Our users' search topics are static most of the time and repeat themselves
> over time.
>
> Our system has an option to specify a "specific search subject" (we
> also call it a "specific information need"), and most of our users use
> this option.
> Our system logs keep each query and each document retrieved for each
> "information need",
> and the user can also give feedback on whether a document is relevant to his
> "information need".
>
> We also have a special query expansion technique and a diversity algorithm
> based on MMR.
>
> We want to use this information from the logs as a data set for training our
> ranking system
> and performing "Learning To Rank" for each "information need" or cluster of
> "information needs".
> We also want to give the user the option to filter by "relevant" and "read",
> based on his actions or his friends' actions in the same topic.
> When he runs the same query again, or a similar one, he can skip documents
> he has already read. That's an important requirement for our users.
>
> We are considering 2 possibilities for implementing it:
> 1. Updating each item in Solr and creating 2 fields named "read" and
> "relevant".
> Each field is a multivalued field holding the corresponding label of the
> "information need".
> When the user reads a document, an update is sent to Solr and the field
> "read" gets a label with
> the "information need" the user is working on...
> This will cause an update each time an item is read by a user (still nothing
> compared to the new items coming in each day).
> We would be saving information that "belongs" to the application in Solr,
> which may be the wrong architecture.
>
> 2. Saving the information in a DB, and then filtering the
> retrieved results.
> This option is much more complicated (we would have "fields" that aren't in
> Solr but that the user uses for search). We won't get facets, autocomplete and
> other nice stuff that a regular field in Solr can have.
> It costs performance, we can't easily retrieve "give me the top 10 documents
> that answer the query and are unread for the information need", and the code
> is more complicated to maintain.
>
> 3. Do you have more ideas?
>
> Which of those options is the better?
>
> Thanks in advance!
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/saving-user-actions-on-item-in-solr-for-later-retrieval-tp4133558.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Selectively hiding SOLR facets.

2014-04-28 Thread atuldj.jadhav
Yes, but my query *country:"USA"* still returns languages
belonging to countries other than the USA.

Is there any way I can keep such languages from appearing in my facet filters?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Selectively-hiding-SOLR-facets-tp4132770p4133638.html


Raw query parameters

2014-04-28 Thread Xavier Morera
Hi,

Would anyone be so kind as to explain what the "Raw query parameters" are in
Solr's admin UI? I can't find an explanation in the reference guide,
the wiki, or a web search.

I'm a bit confused about what it is actually for.

Thanks in advance,
Xavier
-- 
*Xavier Morera*
email: xav...@familiamorera.com
CR: +(506) 8849 8866
US: +1 (305) 600 4919
skype: xmorera


Indexing an array of maps get transformed to a map

2014-04-28 Thread Jinsu Oh
Our team is upgrading to Solr 4.7.0 and running into an issue with
indexing an array of map objects.

I understand that it makes no sense to index an array of map objects
in Solr, but I want to figure out why Solr produces this particular
error.
So we have a document structure that goes something like:
{ id: 1234,
  url: abcd,
  modules: [ { id: 1, name: a} ]
}

When this goes through SolrJ, I receive this error:

[http-bio-8080-exec-9] ERROR org.apache.solr.servlet.SolrDispatchFilter  –
null:org.apache.solr.common.SolrException: Can't use SignatureUpdateProcessor with partial update request containing signature field: url
        at org.apache.solr.update.processor.SignatureUpdateProcessorFactory$SignatureUpdateProcessor.processAdd(SignatureUpdateProcessorFactory.java:159)
        at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)

For some reason, by the time the SignatureUpdateProcessorFactory receives
the update command, the Solr document has become:
{ id: 1234,
  url: abcd,
  modules: {id: 1, name: a}
}
The processor then thinks I'm sending a partial update, when I'm
actually trying to index a full document. :/

When I trace the code, I can see that I'm creating a SolrInputDocument
with key 'modules' and value '[ { id: 1, name: 1} ]'. But when I call
SolrJ to add it to Solr, the document values are transformed...

Does anyone know why this is happening?
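
One possible explanation, offered as an assumption rather than something confirmed in this thread: a field value that is a map is also what an atomic ("partial") update looks like, so a map nested inside a list may end up being read as an update operation on its way to the server. A workaround sketch is to flatten nested maps into plain multi-valued fields before building the document. The `modules_id` / `modules_name` naming scheme and the helper name are hypothetical.

```python
def flatten_document(doc, separator="_"):
    """Flatten lists of maps into plain multi-valued fields.

    {"modules": [{"id": 1, "name": "a"}]} becomes
    {"modules_id": [1], "modules_name": ["a"]}.
    """
    flat = {}
    for key, value in doc.items():
        items = value if isinstance(value, list) else [value]
        if any(isinstance(item, dict) for item in items):
            for item in items:
                if not isinstance(item, dict):
                    raise ValueError(
                        f"mixed map and scalar values in field {key!r}"
                    )
                for sub_key, sub_value in item.items():
                    flat_key = f"{key}{separator}{sub_key}"
                    flat.setdefault(flat_key, []).append(sub_value)
        else:
            # Scalars and lists of scalars pass through unchanged.
            flat[key] = value
    return flat


doc = {"id": 1234, "url": "abcd", "modules": [{"id": 1, "name": "a"}]}
flat_doc = flatten_document(doc)
print(flat_doc)
```

With no map values left in the document, nothing in it can be mistaken for an update operation, whatever the exact cause turns out to be.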

-- 
Jinsu Oh


Re: Issue with SpanQuery

2014-04-28 Thread Vijay Kokatnur
Adding positionIncrementGap="1" to the fields worked for me.  I didn't
re-index the existing docs, so it only works for newly indexed documents.


On Mon, Apr 28, 2014 at 3:54 PM, Ahmet Arslan  wrote:

>
>
> Hi Vijay,
>
> It is an index-time setting, so yes, a Solr restart and re-indexing are
> required. A small test case would be handy.


Re: Issue with SpanQuery

2014-04-28 Thread Ahmet Arslan


Hi Vijay,

It is an index-time setting, so yes, a Solr restart and re-indexing are
required. A small test case would be handy.




On Tuesday, April 29, 2014 1:35 AM, Vijay Kokatnur  
wrote:
Thanks Ahmet, I'll give that a try.  Do I need to re-index to add/update
positionIncrementGap?






Re: Issue with SpanQuery

2014-04-28 Thread Ethan
I tried testing with positionIncrementGap, but that didn't work.  The values
I passed for it were 0, 1, 4 and 100.

Reindexing also didn't help.


On Mon, Apr 28, 2014 at 3:34 PM, Vijay Kokatnur wrote:

> Thanks Ahmet, I'll give that a try.  Do I need to re-index to add/update
> positionIncrementGap?


Re: Issue with SpanQuery

2014-04-28 Thread Vijay Kokatnur
Thanks Ahmet, I'll give that a try.  Do I need to re-index to add/update
positionIncrementGap?


On Mon, Apr 28, 2014 at 3:31 PM, Ahmet Arslan  wrote:

> Hi,
>
> I would add positionIncrementGap to fieldType definitions and experiment
> with different values. 0, 1 and 100.
>
>
>  positionIncrementGap="1">
>
> Same with OrderLineType too
>
>
>
>
> On Tuesday, April 29, 2014 1:25 AM, Vijay Kokatnur <
> kokatnur.vi...@gmail.com> wrote:
> Hey Ahmet,
>
> Here is the field def -
>
>  multiValued="true" omitTermFreqAndPositions="false"/>
>
>  
>   class="solr.LowerCaseFilterFactory"/>  


Re: Issue with SpanQuery

2014-04-28 Thread Ahmet Arslan
Hi,

I would add positionIncrementGap to the fieldType definitions and experiment
with different values: 0, 1 and 100.




Same with OrderLineType too
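
To see why the gap could matter here, the following sketch uses a deliberately simplified model of token positions in a multi-valued field. It assumes every value analyzes to exactly one token and that value number i lands at position i * (gap + 1); both are assumptions for illustration, not a description of Lucene's exact behavior. Under that model, two fields line up index by index only when they share the same gap, which would be consistent with only the first index position matching.

```python
def value_positions(values, gap):
    """Map each value to its token position under the simplified model:
    one token per value, value i at position i * (gap + 1)."""
    return {value: index * (gap + 1) for index, value in enumerate(values)}


def same_position(values_a, gap_a, a, values_b, gap_b, b):
    """True when value a of the first field and value b of the second
    field fall on the same position."""
    return (
        value_positions(values_a, gap_a)[a]
        == value_positions(values_b, gap_b)[b]
    )


booking_ids = ["100268421", "190131", "8263325"]
order_types = ["13", "1", "11"]

# Same gap on both fields: the third values line up.
aligned = same_position(booking_ids, 1, "8263325", order_types, 1, "11")

# Different gaps (100 versus 0): the third values drift apart ...
misaligned_third = same_position(
    booking_ids, 100, "8263325", order_types, 0, "11"
)
# ... while the first values still coincide at position 0.
first_still_aligned = same_position(
    booking_ids, 100, "100268421", order_types, 0, "13"
)
print(aligned, misaligned_third, first_still_aligned)
```

Whatever value is chosen, the point of the experiment is to give both field types the same gap and then re-index.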




On Tuesday, April 29, 2014 1:25 AM, Vijay Kokatnur  
wrote:
Hey Ahmet,

Here is the field def -



 
   







Re: Issue with SpanQuery

2014-04-28 Thread Vijay Kokatnur
Hey Ahmet,

Here is the field def -



 
   




On Mon, Apr 28, 2014 at 3:19 PM, Ahmet Arslan  wrote:

> Hi,
>
> Can you paste your field definition of BookingRecordId and OrderLineType?
> It could be something related to positionIncrementGap.
>
> Ahmet


RE: how to write my first solr query

2014-04-28 Thread Evan Smith
Hello,

Thank you!  I will try out what you suggested and post back once I know
more.

Yes, given things like
cat foo bar
house foo bar
foo bar

I want to know when the term "foo bar" (but not the prefixed cases I specify)
exists in my documents.

Thanks!
Evan




--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133601.html


Re: Issue with SpanQuery

2014-04-28 Thread Ahmet Arslan
Hi,

Can you paste your field definition of BookingRecordId and OrderLineType? It 
could be something related to positionIncrementGap.

Ahmet



On Tuesday, April 29, 2014 12:58 AM, Ethan  wrote:
Facing the same problem!! I have noticed it works fine as long as you're
looking up the first index position.

Anyone faced similar problem before?






Re: How to get a list of currently executing queries?

2014-04-28 Thread Otis Gospodnetic
No, though one could write a custom SearchComponent, I imagine.  Not
terribly useful for most situations where queries typically run for only a
few milliseconds, but

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Thu, Apr 17, 2014 at 7:34 AM, Nikhil Chhaochharia wrote:

> Hello,
>
> Is there some way of getting a list of all queries that are currently
> executing?  Something similar to 'show full processlist' in MySQL.
>
> Thanks,
> Nikhil


Re: Issue with SpanQuery

2014-04-28 Thread Ethan
Facing the same problem!! I have noticed it works fine as long as you're
looking up the first index position.

Anyone faced similar problem before?


On Mon, Apr 28, 2014 at 12:22 PM, Vijay Kokatnur
wrote:

> I have been working on SpanQuery for some time now to look up multivalued
> fields and found one more issue  -
>
> Now if a document has following lookup fields among others
>
> "*BookingRecordId*": [ "100268421", "190131", "8263325" ],
>
> "*OrderLineType*": [ "13", "1", "11" ],
>
> Here is the query I construct -
>
> val q1 = new SpanTermQuery(new Term("BookingRecordId", "100268421"))
> val q2 = new SpanTermQuery(new Term("OrderLineType", "13"))
> val q2m = new FieldMaskingSpanQuery(q2, "BookingRecordId")
> val sp = Array[SpanQuery](q1, q2m)
>
> val q = new SpanNearQuery(sp, -1, false)
>
> Query to find element at first index position works fine -
>
> *{!span} BookingRecordId: 100268421 +OrderLineType:13*
> but query to find element at third index position doesn't return any
> result. -
>
> *{!span} BookingRecordId: 8263325 +OrderLineType:11 *
>
> If I increase the slope to 4 then it returns correct result. But it also
> matches BookingRecordId: 100268421 with OrderLineType:11 which is incorrect.
>
> I thought SpanQuery works for any multiValued field size.  Any ideas how I
> can fix this?
>
> Thanks,
> -Vijay
>


RE: spellcheck.q and local parameters

2014-04-28 Thread Jeroen Steggink
Thanks James, I was afraid of that. The problem is that spellcheck.q is not
always provided by the users, and therefore it gives wrong suggestions. I'll
just turn off spellcheck by default.

Cheers,
Jeroen

-Original Message-
From: Dyer, James [mailto:james.d...@ingramcontent.com] 
Sent: maandag 28 april 2014 22:55
To: solr-user@lucene.apache.org
Subject: RE: spellcheck.q and local parameters

spellcheck.q is supposed to take a list of raw query terms, so what you're
trying to do in your example won't work.  What you should do instead is
space-delimit the actual query terms that exist in "qq" (and nothing else) and
use that as your value of spellcheck.q.

James Dyer
Ingram Content Group
(615) 213-4311





RE: spellcheck.q and local parameters

2014-04-28 Thread Dyer, James
spellcheck.q is supposed to take a list of raw query terms, so what you're
trying to do in your example won't work.  What you should do instead is
space-delimit the actual query terms that exist in "qq" (and nothing else) and
use that as your value of spellcheck.q.
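
A small sketch of that advice, assuming the client builds the request itself and can therefore always send the raw terms twice: once as qq and once as spellcheck.q. The `q` template is copied from the configuration quoted below; the helper name and the sample input are made up for illustration.

```python
from urllib.parse import urlencode

# The q template from the original question; $qq is resolved by Solr.
Q_TEMPLATE = (
    '_query_:"{!edismax qf=$qfQuery pf=$pfQuery bq=$boostQuery '
    'bf=$boostFunction v=$qq}"'
)


def build_params(user_query):
    """Send the space-delimited raw terms both as qq and as spellcheck.q."""
    terms = " ".join(user_query.split())  # collapse stray whitespace
    return {
        "q": Q_TEMPLATE,
        "qq": terms,
        "spellcheck": "true",
        "spellcheck.q": terms,
    }


params = build_params("  solr   serch ")
print(urlencode(params))
```

This keeps spellcheck.q free of local parameters, at the cost of requiring every client to set it explicitly.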

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: Jeroen Steggink [mailto:jeroen.stegg...@contentstrategy.nl] 
Sent: Monday, April 28, 2014 3:01 PM
To: solr-user@lucene.apache.org
Subject: spellcheck.q and local parameters

Hi,

I'm having some trouble using the spellcheck.q parameter. The user's query is 
defined in the qq parameter and q parameter contains several other parameters 
for boosting.
I would like to use the qq parameter as a default for spellcheck.q.
I tried several ways of adding the qq parameter in the spellcheck.q parameter, 
but it doesn't seem to work. Is this at all possible or do I need to write a 
custom QueryConverter?

This is the configuration:

 _query_:"{!edismax qf=$qfQuery pf=$pfQuery bq=$boostQuery 
bf=$boostFunction v=$qq}"
{!v=$qq}

I haven't included all the variables, because they seem unnecessary.

Regards,
Jeroen



RE: how to write my first solr query

2014-04-28 Thread Jeroen Steggink
Hi Evan,

If I understand correctly, a document has to have at least one "foo bar"
that does not have "cat" in front of it.

A solution would be to use a combination of the ShingleFilterFactory and a
query for at least one occurrence of "foo bar" using the termfreq function.

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter
https://cwiki.apache.org/confluence/display/solr/Function+Queries

The number of shingles depends on how many terms are in the query and how many 
terms cannot be prefixed.

It might be easier to just retrieve all the documents which contain the phrase 
and process the results outside of Solr.
If you could shed some more light on what you are trying to accomplish, maybe 
we can help you find an even better solution to fit your problem.
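
The "process the results outside of Solr" option could look roughly like the sketch below: fetch every document containing the phrase, then keep only those with at least one occurrence that is not preceded by an excluded prefix. Tokenization by whitespace and the helper name are assumptions; the four sample documents are the ones from the quoted message.

```python
def has_unprefixed_phrase(text, phrase, excluded_prefixes):
    """True if text contains phrase at least once without any of the
    excluded prefixes directly in front of that occurrence."""
    tokens = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != target:
            continue
        blocked = False
        for prefix in excluded_prefixes:
            p = prefix.lower().split()
            if i >= len(p) and tokens[i - len(p):i] == p:
                blocked = True
                break
        if not blocked:
            return True
    return False


docs = {
    "A": "dear foo bar hello",
    "B": "dear cat foo bar hello",
    "C": "dear cat foo bar hello foo bar",
    "D": "dear car foo bar",
}
matches = [
    name
    for name, text in docs.items()
    if has_unprefixed_phrase(text, "foo bar", ["cat"])
]
print(matches)
```

Document C is kept because its second "foo bar" is not preceded by "cat", which is the case a plain ("foo bar" AND NOT "cat foo bar") query would miss.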

Jeroen

-Original Message-
From: Evan Smith [mailto:e...@wingonwing.com] 
Sent: maandag 28 april 2014 19:20
To: solr-user@lucene.apache.org
Subject: Re: how to write my first solr query

Hello,

Here is a better use case

Documents A, B, C, and D

A: "dear foo bar hello"
B: "dear cat foo bar hello"
C: "dear cat foo bar hello foo bar"
D: "dear car foo bar"

I have a dictionary of items outside of Solr: "foo bar" and "cat foo bar".
Associated with each item is the set of "suffixes of that item",
so I know that "foo bar" has "cat foo bar" as a "suffix".

I would like to search my corpus of documents A, B, C and D and just get
documents that contain "foo bar", and not the ones that only contain "cat foo bar".

So if I searched on "foo bar" but not "cat foo bar",
I want to get documents A, C and D,
but not B, which does not have a bare "foo bar" but only "cat foo bar".
I am OK with C, as it has a "foo bar" that is not prefixed with "cat".

Does this make sense?  I see that ("foo bar" and not "cat foo bar") would
not work, as it would miss document C.  Or at least I think it would.

Evan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133537.html


spellcheck.q and local parameters

2014-04-28 Thread Jeroen Steggink
Hi,

I'm having some trouble using the spellcheck.q parameter. The user's query is 
defined in the qq parameter and q parameter contains several other parameters 
for boosting.
I would like to use the qq parameter as a default for spellcheck.q.
I tried several ways of adding the qq parameter in the spellcheck.q parameter, 
but it doesn't seem to work. Is this at all possible or do I need to write a 
custom QueryConverter?

This is the configuration:

<str name="q">_query_:"{!edismax qf=$qfQuery pf=$pfQuery bq=$boostQuery bf=$boostFunction v=$qq}"</str>
<str name="spellcheck.q">{!v=$qq}</str>

I haven't included all the variables, because they seem unnecessary.

Regards,
Jeroen


Issue with SpanQuery

2014-04-28 Thread Vijay Kokatnur
I have been working on SpanQuery for some time now to look up multivalued
fields and found one more issue  -

Now if a document has following lookup fields among others

"*BookingRecordId*": [ "100268421", "190131", "8263325" ],

"*OrderLineType*": [ "13", "1", "11" ],

Here is the query I construct -

val q1 = new SpanTermQuery(new Term("BookingRecordId", "100268421"))
val q2 = new SpanTermQuery(new Term("OrderLineType", "13"))
val q2m = new FieldMaskingSpanQuery(q2, "BookingRecordId")
val sp = Array[SpanQuery](q1, q2m)

val q = new SpanNearQuery(sp, -1, false)

Query to find element at first index position works fine -

*{!span} BookingRecordId: 100268421 +OrderLineType:13*
but query to find element at third index position doesn't return any
result. -

*{!span} BookingRecordId: 8263325 +OrderLineType:11 *

If I increase the slop to 4 then it returns the correct result. But it also
matches BookingRecordId: 100268421 with OrderLineType:11, which is incorrect.

I thought SpanQuery works for any multiValued field size.  Any ideas how I
can fix this?
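
For reference, the behaviour being asked for can be stated independently of
Lucene. The sketch below is plain Python, not Lucene code, and the helper name
aligned_matches is made up for illustration. It treats the two multivalued
fields as parallel lists and reports the index positions at which both hold
the wanted values, which is what the slop of -1 is meant to enforce.

```python
def aligned_matches(doc, wanted):
    """Return the value indices at which every field holds its wanted value."""
    fields = list(wanted)
    # Only compare positions that exist in all of the fields involved.
    length = min(len(doc[name]) for name in fields)
    return [
        i for i in range(length)
        if all(doc[name][i] == wanted[name] for name in fields)
    ]


doc = {
    "BookingRecordId": ["100268421", "190131", "8263325"],
    "OrderLineType": ["13", "1", "11"],
}

first = aligned_matches(doc, {"BookingRecordId": "100268421", "OrderLineType": "13"})
third = aligned_matches(doc, {"BookingRecordId": "8263325", "OrderLineType": "11"})
crossed = aligned_matches(doc, {"BookingRecordId": "100268421", "OrderLineType": "11"})
```

Under this definition the first and third pairs should match and the crossed
pair should not, regardless of how many values the fields hold.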

Thanks,
-Vijay


saving user actions on item in solr for later retrieval

2014-04-28 Thread nolim
Hi,
We are using Solr in a production system for around 500 users and we have
around ~1 queries per day.
Our users' search topics are static most of the time and repeat themselves over
time. 

We have in our system an option to specify "specific search subject" (we
also call it "specific information need") and most of our users are using
this option.
We keep in our system logs each query and document retrieved from each
"information need"
and the user can also give feedback if the document is relevant for his
"information need".

We also have special query expansion technique and diversity algorithm based
on MMR.

We want to use this information from the logs as a data set for training our
ranking system
and performing "Learning To Rank" for each "information need" or cluster of
"information needs".
We also want to give the user the option to filter by "relevant" and "read"
based on his actions or his friends' actions in the same topic.
When he runs a query again, or a similar one, he can skip already-read
documents. That's an important requirement for our users.

We think about 2 possibilities to implement it:
1. Updating each item in solr and creating 2 fields named: "read",
"relevant".
Each field is multivalue field with the corresponding label of the
"information need".
When the user reads a document an update is sent to solr and the field
"read" gets a label with
the "information need" the user is working on...
This will cause an update each time an item is read by a user (still nothing
compared to the new items coming in each day).
We would be saving information that "belongs" to the application in Solr,
which may be the wrong architecture.

2. Save the information in a DB, and then perform filtering on the
retrieved results.
This option is much more complicated (we would have "fields" that aren't in
Solr but that the user uses for search). We won't get facets, autocomplete and
other nice stuff that a regular field in Solr can have.
It costs performance, we can't easily retrieve "give me the top 10 documents
that answer the query and are unread for the information need", and the code
is more complicated to maintain.

3. Do you have more ideas?

Which of those options is the better?
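
To make option 1 concrete, here is a minimal Python sketch of the two pieces
it needs: an atomic update that appends an "information need" label to the
multivalued "read" field, and a filter query that hides documents already
read for that need. The unique key name "id", the document id and the label
are assumptions for illustration. Atomic updates in Solr 4 require the fields
involved to be stored.

```python
import json


def mark_read(doc_id, information_need):
    """Build an atomic-update document that appends one label to "read"."""
    return {"id": doc_id, "read": {"add": information_need}}


def unread_filter(information_need):
    """Build an fq value that excludes documents already read for this need."""
    escaped = information_need.replace('"', '\\"')
    return '-read:"%s"' % escaped


# Body that would be POSTed to the JSON update handler.
payload = json.dumps([mark_read("doc-17", "need-42")])
# Value that would be sent as fq on the next search.
fq = unread_filter("need-42")
```

With this layout "top 10 unread documents for the information need" stays a
single Solr request, at the cost of one small update per read action.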

Thanks in advance!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/saving-user-actions-on-item-in-solr-for-later-retrieval-tp4133558.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Stemming not working with wildcard search

2014-04-28 Thread Geepalem
Hi Ahmet,

Thanks for your prompt response!

I have added the filters you specified, but it's still not working.
Below is the field's query analyzer:

 

 




 

http://localhost:8080/solr/master/select?q=page_title_t:*products*
http://localhost:8080/solr/master/select?q=page_title_t:*product*


Please let me know if I am doing anything wrong.

Thanks,
G. Naresh Kumar



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382p4133556.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: zkCli zkhost parameter

2014-04-28 Thread Scott Stults
I did, but it looks like I mixed in the chroot too after every entry rather
than once at the very end (thanks to David Smiley for catching that). I'll
try again and update if it's still a problem.
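
For anyone hitting the same thing, the format described above can be sketched
in a few lines of Python. The host names and the /solr chroot are made up for
illustration. The point is only that the chroot is appended once, after the
last host, and not after every entry.

```python
def zk_host(hosts, chroot=""):
    """Join host:port entries with commas and append the chroot once at the end."""
    if chroot and not chroot.startswith("/"):
        chroot = "/" + chroot
    return ",".join(hosts) + chroot


value = zk_host(["zk1:2181", "zk2:2181", "zk3:2181"], "/solr")
```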

Thanks!
-Scott




On Sat, Apr 26, 2014 at 1:08 PM, Mark Miller  wrote:

> Have you tried a comma-separated list or are you going by documentation?
> It should work.
> --
> Mark Miller
> about.me/markrmiller
>
> On April 26, 2014 at 1:03:25 PM, Scott Stults (
> sstu...@opensourceconnections.com) wrote:
>
> It looks like this only takes a single host as its value, whereas the
> zkHost environment variable for Solr takes a comma-separated list.
> Shouldn't the client also take a comma-separated list?
>
> k/r,
> Scott
>



-- 
Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
| 434.409.2780
http://www.opensourceconnections.com


Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Geepalem
Thanks, Jack, for the prompt response!

So is there any solution to make this scenario work?
Or do wildcards not work with special characters and numerics?

Thanks,
G. Naresh Kumar



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133554.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to write my first solr query

2014-04-28 Thread Evan Smith
Hello,

Here is a better use case

Documents A, B, C, and D

A: "dear foo bar hello"
B: "dear cat foo bar hello"
C: "dear cat foo bar hello foo bar"
D: "dear car foo bar"

I have a dictionary of items outside of solr 
"foo bar" and "cat foo bar"
And associated with each item is the set of "suffixes of that item"
So I know that "foo bar" has "cat foo bar" as a "suffix"

I would like to search my corpus of documents A, B, C and D
And just get documents that contain "foo bar" and not the ones that contain
"cat foo bar"

So if I searched on "foo bar" but not "cat foo bar"
I want to get documents A, C, D
But not B which does not have just "foo bar" but has "cat foo bar".
I am ok with C as it has a "foo bar" that is not prefixed with "cat".

Does this make sense?  I see that the ("foo bar" and not "cat foo bar")
would not work as it would miss document C.  Or at least I think it would.
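
The rule can be pinned down with a small Python sketch that works on
whitespace-separated tokens. This is not Solr code and the function names are
made up. A document matches when at least one occurrence of "foo bar" does not
overlap any occurrence of "cat foo bar". Lucene's SpanNotQuery is built around
a similar include/exclude idea, though whether it fits this case has not been
tested here.

```python
def find_phrase(tokens, phrase):
    """Return the start offsets at which phrase occurs in tokens."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def has_unprefixed(text, phrase, prefixed):
    """True if phrase occurs at least once outside every occurrence of prefixed."""
    tokens = text.split()
    wanted, excluded = phrase.split(), prefixed.split()
    covered = set()
    for start in find_phrase(tokens, excluded):
        covered.update(range(start, start + len(excluded)))
    return any(
        not covered.intersection(range(start, start + len(wanted)))
        for start in find_phrase(tokens, wanted)
    )


docs = {
    "A": "dear foo bar hello",
    "B": "dear cat foo bar hello",
    "C": "dear cat foo bar hello foo bar",
    "D": "dear car foo bar",
}
hits = sorted(k for k, v in docs.items() if has_unprefixed(v, "foo bar", "cat foo bar"))
```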

Evan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509p4133537.html
Sent from the Solr - User mailing list archive at Nabble.com.


Delete fields from document using a wildcard

2014-04-28 Thread Costi Muraru
Hi guys,

Would it be possible, using Atomic Updates in Solr 4, to remove all fields
matching a pattern? For instance, something like:


  100
  <field name="*_name_i" update="set" null="true"></field>


Or something similar to remove certain fields in all documents.
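
One workaround is to expand the pattern on the client and send an ordinary
atomic update. The Python sketch below assumes the list of field names is
already known (for example from the schema), that the unique key is called
"id", and that the field names shown are purely illustrative. Setting a field
to null with the "set" modifier is the usual way to remove it in an atomic
update.

```python
import json
from fnmatch import fnmatchcase


def null_matching_fields(doc_id, field_names, pattern):
    """Build an atomic update that removes every field whose name matches pattern."""
    update = {"id": doc_id}
    for name in field_names:
        if name != "id" and fnmatchcase(name, pattern):
            update[name] = {"set": None}  # serialised as {"set": null}
    return update


known_fields = ["id", "first_name_i", "last_name_i", "title_s"]
update = null_matching_fields("100", known_fields, "*_name_i")
body = json.dumps([update])
```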

Thanks,
Costi


Re: SpanQuery with Boolean Queries

2014-04-28 Thread Vijay Kokatnur
Pretty neat. Thanks!


On Fri, Apr 25, 2014 at 2:44 AM, Ahmet Arslan  wrote:

> Hi,
>
> I am not sure how OR clauses are executed.
>
> But after re-reading your mail, I think you can use SpanOrQuery (for your
> q1) in your custom query parser plugin.
>
> val q2 = new SpanOrQuery(
>   new SpanTermQuery(new Term("BookingRecordId", "ID_1")),
>   new SpanTermQuery(new Term("BookingRecordId", "ID_N"))
> );
>
>
>
>
> On Friday, April 25, 2014 3:22 AM, Vijay Kokatnur <
> kokatnur.vi...@gmail.com> wrote:
> Thanks Ahmet. It worked!
>
> Does solr execute these nested queries in parallel?
>
>
>
> On Thu, Apr 24, 2014 at 12:53 PM, Ahmet Arslan  wrote:
>
> > Hi Vijay,
> >
> > May be you can use _query_ hook?
> >
> > _query_:"{!span}BookingRecordId:234 OrderLineType:11" OR _query_:"{!span}
> > OrderLineType:13 + BookingRecordId:ID_N"
> >
> > Ahmet
> >
> >
> > On Thursday, April 24, 2014 9:34 PM, Vijay Kokatnur <
> > kokatnur.vi...@gmail.com> wrote:
> > Hi,
> >
> > I have defined a SpanQuery for proximity search like -
> >
> > val q1 = new SpanTermQuery(new Term("BookingRecordId", "234"))
> > val q2 = new SpanTermQuery(new Term("OrderLineType", "11"))
> > val q2m = new FieldMaskingSpanQuery(q2, "BookingRecordId")
> > val sp = Array[SpanQuery](q1, q2m)
> >
> > val q = new SpanNearQuery(sp, -1, false)
> >
> > Query:
> > *&fq={!span} BookingRecordId: 234+OrderLineType11*
> >
> > However, I need to look up by multiple BookingRecordIds with an OR -
> >
> > *&fq={!span}OrderLineType:"13" + (BookingRecordId:ID_1 OR ... OR
> > BookingRecordId:ID_N)*
> >
> > I can't specify multiple *span* in the same query like -
> >
> > *{!span} OrderLineType:"13" + BookingRecordId:ID_1 OR ... OR {!span}
> > OrderLineType:"13" + BookingRecordId:ID_N*
> >
> > Is there any recommended way to achieve this?
> > Thanks, Vijay
> >
> >
>
>
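
The _query_ approach quoted above is easy to generate for any number of ids.
A small Python sketch follows. It assumes the custom {!span} parser from this
thread and simply mirrors the clause layout of the example. The function name
is made up.

```python
def span_or_filter(order_line_type, booking_ids):
    """OR together one nested {!span} query per booking record id."""
    clauses = [
        '_query_:"{!span}BookingRecordId:%s OrderLineType:%s"' % (bid, order_line_type)
        for bid in booking_ids
    ]
    return " OR ".join(clauses)


fq = span_or_filter("13", ["ID_1", "ID_N"])
```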


[ANNOUNCE] Apache Solr 4.8.0 released

2014-04-28 Thread Uwe Schindler
28 April 2014, Apache Solr™ 4.8.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search.  Solr is highly scalable, providing
fault tolerant distributed search and indexing, and powers the search
and navigation features of many of the world's largest internet sites.

Solr 4.8.0 is available for immediate download at:
  http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of
details.

Solr 4.8.0 Release Highlights:

* Apache Solr now requires Java 7 or greater (recommended is
  Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions
  have known JVM bugs affecting Solr).

* Apache Solr is fully compatible with Java 8.

* <fields> and <types> tags have been deprecated from schema.xml.
  There is no longer any reason to keep them in the schema file,
  they may be safely removed. This allows intermixing of <fieldType>,
  <field> and <copyField> definitions if desired.

* The new {!complexphrase} query parser supports wildcards, ORs etc.
  inside Phrase Queries. 

* New Collections API CLUSTERSTATUS action reports the status of
  collections, shards, and replicas, and also lists collection
  aliases and cluster properties.
 
* Added managed synonym and stopword filter factories, which enable
  synonym and stopword lists to be dynamically managed via REST API.

* JSON updates now support nested child documents, enabling {!child}
  and {!parent} block join queries. 

* Added ExpandComponent to expand results collapsed by the
  CollapsingQParserPlugin, as well as the parent/child relationship
  of nested child documents.

* Long-running Collections API tasks can now be executed
  asynchronously; the new REQUESTSTATUS action provides status.

* Added a hl.qparser parameter to allow you to define a query parser
  for hl.q highlight queries.

* In Solr single-node mode, cores can now be created using named
  configsets.

* New DocExpirationUpdateProcessorFactory supports computing an
  expiration date for documents from the "TTL" expression, as well as
  automatically deleting expired documents on a periodic basis. 
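
As an illustration of the two Collections API actions named in the
highlights, the following Python sketch builds the request URLs. The base URL
and the request id 1000 are assumptions. The sketch only constructs the URLs
and does not contact a server.

```python
from urllib.parse import urlencode


def collections_api(base_url, action, **params):
    """Build a Collections API URL for the given action, asking for JSON output."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(query)


status_url = collections_api("http://localhost:8983/solr", "CLUSTERSTATUS")
task_url = collections_api("http://localhost:8983/solr", "REQUESTSTATUS", requestid="1000")
```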

Solr 4.8.0 also includes many other new features as well as numerous
optimizations and bugfixes of the corresponding Apache Lucene release.

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases.  It is possible that the mirror you are using
may not have replicated the release yet.  If that is the case, please
try another mirror.  This also goes for Maven access.

-
Uwe Schindler
uschind...@apache.org 
Apache Lucene PMC Chair / Committer
Bremen, Germany
http://lucene.apache.org/




Re: how to write my first solr query

2014-04-28 Thread Ahmet Arslan


Hi Evan,

Confusing use case :)

You don't want "foo bar" prefixed with "cat"?

But you are ok with a document that has "cat foo bar"?

Isn't this a contradiction?




On Monday, April 28, 2014 6:26 PM, Evan Smith  wrote:
Hello,

I would like to find all documents that have say "foo bar" with a filter to
remove any cases where "foo bar" is prefixed with things like "cat", "a",
...

I am ok with a document that has "cat foo bar"  and "foo bar", but if it
only has "cat foo bar" then I don't want it while if it has "foo bar" I want
it.

I looked at span queries but was not able to come up with how to phrase
this.

Any pointers would be great!

Thank you in advance,
Evan




--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Stemming not working with wildcard search

2014-04-28 Thread Ahmet Arslan
Hi Naresh,

Quotes are only meaningful when there are two or more terms. Don't use quotes 
for products* and product*.

Regarding stemming and wildcards, use the following chain, and your wildcard 
searches will be happier.





Ahmet


On Monday, April 28, 2014 5:41 PM, Jack Krupansky  
wrote:
Wildcards and stemming are incompatible at query time - you need to manually 
stem the term before applying your wildcard.

Wildcards are not supported in quoted phrases. They will be treated as 
punctuation, and ignored by the standard tokenizer or the word delimiter 
filter.

-- Jack Krupansky

-Original Message- 
From: Geepalem
Sent: Sunday, April 27, 2014 3:13 PM
To: solr-user@lucene.apache.org
Subject: Stemming not working with wildcard search

Hi,

I have added  SnowballPorterFilterFactory filter to field type to make
singular and plural search terms return same results.

So below queries (double quotes around search term) returning similar
results which is fine.

http://localhost:8080/solr/master/select?q=page_title_t:"product*"
http://localhost:8080/solr/master/select?q=page_title_t:"products*"

But when I have analyzed results, in both result sets, documents which dont
start with words "Product" or "products" didnt come though there are few
documents available.

So I have added * as prefix and suffix to search term without double quotes
to do wildcard search.

http://localhost:8080/solr/master/select?q=page_title_t:*product*
http://localhost:8080/solr/master/select?q=page_title_t:*products*

Now, stemming is not working as above second query is not returning similar
results as query 1.

If double quotes are added around search term then its returning similar
results but results are not as expected. With double quotes it wont return
results like "Old products", "New products", "Cool Product".
It will only return results with the values like "Product 1", "Product
2","Products of USA".

Please suggest or guide how to make stemming work with wildcard search.


Appreciate immediate response!!

Thanks,
G. Naresh Kumar





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382.html
Sent from the Solr - User mailing list archive at Nabble.com.


how to write my first solr query

2014-04-28 Thread Evan Smith
Hello,

I would like to find all documents that have say "foo bar" with a filter to
remove any cases where "foo bar" is prefixed with things like "cat", "a",
...

I am ok with a document that has "cat foo bar"  and "foo bar", but if it
only has "cat foo bar" then I don't want it while if it has "foo bar" I want
it.

I looked at span queries but was not able to come up with how to phrase
this.

Any pointers would be great!

Thank you in advance,
Evan




--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-write-my-first-solr-query-tp4133509.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Stemming not working with wildcard search

2014-04-28 Thread Jack Krupansky
Wildcards and stemming are incompatible at query time - you need to manually 
stem the term before applying your wildcard.


Wildcards are not supported in quoted phrases. They will be treated as 
punctuation, and ignored by the standard tokenizer or the word delimiter 
filter.
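
A rough Python sketch of "stem first, then add the wildcard" on the client
side. The toy_stem function is only a stand-in that drops a plural "s". It is
not the Snowball stemmer, and a real client would need to reproduce whatever
the index-time stemmer does.

```python
def toy_stem(term):
    """Very rough stand-in for a real stemmer: lowercase and drop a plural "s"."""
    term = term.lower()
    if len(term) > 3 and term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term


def wildcard_query(field, term):
    """Stem the user's term manually, then wrap it in wildcards."""
    return "%s:*%s*" % (field, toy_stem(term))


singular = wildcard_query("page_title_t", "product")
plural = wildcard_query("page_title_t", "Products")
```

Both inputs end up as the same wildcard query, which is the behaviour the
original question was after.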


-- Jack Krupansky

-Original Message- 
From: Geepalem

Sent: Sunday, April 27, 2014 3:13 PM
To: solr-user@lucene.apache.org
Subject: Stemming not working with wildcard search

Hi,

I have added the SnowballPorterFilterFactory filter to the field type to make
singular and plural search terms return the same results.

So the queries below (double quotes around the search term) return similar
results, which is fine.

http://localhost:8080/solr/master/select?q=page_title_t:"product*"
http://localhost:8080/solr/master/select?q=page_title_t:"products*"

But when I analyzed the results, in both result sets, documents which don't
start with the words "Product" or "products" didn't come back, though a few
such documents are available.

So I added * as a prefix and suffix to the search term, without double quotes,
to do a wildcard search.

http://localhost:8080/solr/master/select?q=page_title_t:*product*
http://localhost:8080/solr/master/select?q=page_title_t:*products*

Now stemming is not working, as the second query above is not returning
results similar to the first.

If double quotes are added around the search term then it returns similar
results, but the results are not as expected. With double quotes it won't return
results like "Old products", "New products", "Cool Product".
It will only return results with values like "Product 1", "Product
2", "Products of USA".

Please suggest or guide how to make stemming work with wildcard search.


Appreciate immediate response!!

Thanks,
G. Naresh Kumar





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Jack Krupansky
Wildcard query only works for single terms. Any embedded special characters 
will cause a term to be split into multiple terms at index time. The use of 
a wildcard in a query term with embedded special characters will bypass 
normal analysis - you need to enter the term exactly as it would be analyzed 
at index time for wildcard to work.


Ditto if your field type uses the word delimiter filter with the split 
digits option enabled - the alpha and numeric portions will generate 
separate terms - and cause a wildcard to fail.
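
The effect can be imitated with a short Python sketch. This is a rough
simulation of index-time splitting, not the real tokenizer or word delimiter
filter, and the function names are made up. It splits on punctuation and on
letter/digit boundaries, then shows that the unanalyzed prefix "an-13" cannot
match any indexed term.

```python
import re


def split_term(term):
    """Split on non-alphanumerics and on letter/digit boundaries, then lowercase."""
    return [part.lower() for part in re.findall(r"[A-Za-z]+|[0-9]+", term)]


def prefix_hits(indexed_terms, prefix):
    """Return the indexed terms that a prefix wildcard would match."""
    return [term for term in indexed_terms if term.startswith(prefix)]


indexed = split_term("an-138")
as_typed = prefix_hits(indexed, "an-13")   # the wildcard term is not analyzed
as_indexed = prefix_hits(indexed, "13")    # the prefix as it exists in the index
```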


-- Jack Krupansky

-Original Message- 
From: Geepalem

Sent: Sunday, April 27, 2014 3:30 PM
To: solr-user@lucene.apache.org
Subject: Wildcard search not working with search term having special 
characters and digits


Hi,

The query below, without a wildcard, returns results:
http://localhost:8080/solr/master/select?q=page_title_t:"an-138"

But the query below, with a wildcard, does not return results:
http://localhost:8080/solr/master/select?q=page_title_t:"an-13*"

The query below, with a wildcard and no digits, returns results:
http://localhost:8080/solr/master/select?q=page_title_t:"an-*"

I have tried adding the WordDelimiterFilter, but had no luck.



Please suggest or guide how to make wildcard search work with special
characters and digits.

Appreciate immediate response!!

Thanks,
G. Naresh Kumar






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Solr Cluster management having too many cores

2014-04-28 Thread Shawn Heisey
On 4/28/2014 5:05 AM, Mukesh Jha wrote:
> Thanks Erik,
> 
> Sounds about right.
> 
> BTW how long can I keep adding collections i.e. can I keep 5/10 years data
> like this?
> 
> Also what do you think of bullet 2) of having collection specific
> configurations in zookeeper?

Regarding bullet 2, there is work underway right now to create a
separate clusterstate within zookeeper for each collection.  I do not
know how far along that work is.

There are no hard limits in SolrCloud at all.  The things that will
cause issues with scalability are resource-related problems.  You'll
exceed the 1MB default limit on a zookeeper database pretty quickly.  If
you're not using the example jetty included with Solr, you'll exceed the
default maxThreads on most servlet containers very quickly.  You may run
into problems with the default limits on Solr's HttpShardHandler.

Running hundreds or thousands of cores efficiently will require lots of
RAM, both for the OS disk cache and the java heap.  A large java heap
will require significant tuning of Java garbage collection parameters.

Most operating systems limit a user to 1024 open files and 1024 running
processes (which includes threads).  These limits will need to be increased.
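
A quick way to see the current limits for the account that runs Solr is
Python's standard resource module, shown in the sketch below. It only works on
Unix-like systems, and RLIMIT_NPROC is not defined on every platform, so it is
looked up defensively. The function name is made up.

```python
import resource


def current_limits():
    """Return (soft, hard) limits for open files and, where defined, processes."""
    wanted = {"open files": "RLIMIT_NOFILE", "processes": "RLIMIT_NPROC"}
    found = {}
    for label, attr in wanted.items():
        if hasattr(resource, attr):
            found[label] = resource.getrlimit(getattr(resource, attr))
    return found


for label, (soft, hard) in current_limits().items():
    print("%s: soft=%s hard=%s" % (label, soft, hard))
```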

There may be other limits imposed by the Solr config, Java, and/or the
operating system that I have not thought of or stated here.

Thanks,
Shawn



Re: Solr Cloud and Replication request handler

2014-04-28 Thread Amanjit Gill
Hello Shawn,

Thanks for your reply, that's good news!

All the best.


2014-04-28 15:28 GMT+02:00 Shawn Heisey :

> On 4/28/2014 3:33 AM, Amanjit Gill wrote:
> > Hi everybody,
> >
> > Considering a solr cloud configuration (4.6+)
> >
> > a) I am wondering if the solr replication handler always has to be
> > configured completely, aka by choosing one master, then setting the
> config
> > accordingly (enable, masterUrl) etc ...  Do we really need a replication
> > master?
>
>
> You simply need the replication handler to be present with a name of
> "/replication" for SolrCloud to work properly.  You do not need to
> configure it for master or slave.  SolrCloud will take care of
> configuring which instance needs to be a slave whenever it needs to
> recover an index.  You literally just need one line in your solrconfig.xml:
>
>   <requestHandler name="/replication" class="solr.ReplicationHandler" />
>
> Thanks,
> Shawn
>
>


Re: Wildcard search not working with search term having special characters and digits

2014-04-28 Thread Geepalem
Can someone please help me with this, as I am stuck on this issue?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-tp4133385p4133478.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Stemming not working with wildcard search

2014-04-28 Thread Geepalem
Can someone please help me with this, as I am stuck on this issue?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Stemming-not-working-with-wildcard-search-tp4133382p4133477.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Cloud and Replication request handler

2014-04-28 Thread Shawn Heisey
On 4/28/2014 3:33 AM, Amanjit Gill wrote:
> Hi everybody,
> 
> Considering a solr cloud configuration (4.6+)
> 
> a) I am wondering if the solr replication handler always has to be
> configured completely, aka by choosing one master, then setting the config
> accordingly (enable, masterUrl) etc ...  Do we really need a replication
> master?


You simply need the replication handler to be present with a name of
"/replication" for SolrCloud to work properly.  You do not need to
configure it for master or slave.  SolrCloud will take care of
configuring which instance needs to be a slave whenever it needs to
recover an index.  You literally just need one line in your solrconfig.xml:

  <requestHandler name="/replication" class="solr.ReplicationHandler" />

Thanks,
Shawn



Re: merge shards indexes

2014-04-28 Thread Dmitry Kan
Yes, according to this documentation:
https://wiki.apache.org/solr/MergingSolrIndexes
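
For the CoreAdmin route described on that page, the request can be sketched in
Python as below. The base URL and core names are assumptions, and the sketch
only builds the URL. Whether merging is appropriate for shards of a SolrCloud
collection is a separate question that this does not address.

```python
from urllib.parse import urlencode


def merge_url(base_url, target_core, source_cores):
    """Build a CoreAdmin mergeindexes URL that merges source cores into target_core."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("srcCore", name) for name in source_cores]
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


url = merge_url("http://localhost:8983/solr", "core0", ["core1", "core2"])
```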


On Mon, Apr 28, 2014 at 12:14 PM, Gastone Penzo wrote:

> Hi,
> it's possible to merge 2 shards indexes into one?
>
> Thank you
>
> --
> *Gastone Penzo*
>



-- 
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan


Re: Solr Cluster management having too many cores

2014-04-28 Thread Mukesh Jha
Thanks Erik,

Sounds about right.

BTW how long can I keep adding collections i.e. can I keep 5/10 years data
like this?

Also what do you think of bullet 2) of having collection specific
configurations in zookeeper?


On Fri, Apr 25, 2014 at 11:44 PM, Erick Erickson wrote:

> So you're talking about 700 or so collections. That should be do-able,
> especially as Solr is rapidly evolving to handle more and more
> collections and there's two years for that to happen.
>
> The aging out bit is manual (well, you'd script it I suppose). So
> every day there'd be a script that ran and "just knew" the right
> collection to change the alias on, there's nothing automatic yet.
>
> Best,
> Erick
>
> On Fri, Apr 25, 2014 at 9:37 AM, Mukesh Jha 
> wrote:
> > Thanks for quick reply Erik,
> >
> > I want to keep my collections till I run out of hardware, which is at
> least
> > a couple of years worth data.
> > I'd like to know more on ageing out aliases, did a quick search but
> didn't
> > find much.
> >
> >
> > On Fri, Apr 25, 2014 at 9:45 PM, Erick Erickson  >wrote:
> >
> >> Hmmm, tell us a little more about your use-case. In particular, how
> >> long do you need to keep the data around? Days? Months? Years?
> >>
> >> Because if you only need to keep the data for a specified period, you
> >> can use the collection aliasing process to age-out collections and
> >> keep the number of cores from growing too large.
> >>
> >> Best,
> >> Erick
> >>
> >> On Fri, Apr 25, 2014 at 6:49 AM, Mukesh Jha 
> >> wrote:
> >> > Hi Experts,
> >> >
> >> > I need to divide my indexes based on hour/day with each index having
> >> ~50-80
> >> > GB data & ~50-80 mill docs, so I'm planning to create daily collection
> >> with
> >> > names e.g. *sample_colledction__mm_dd_hh.*
> >> > I'll also create an alias *sample_collection* and update it whenever I
> >> will
> >> > create a new collection so that the entire data set is searchable.
> >> >
> >> > I've a couple of question on the above design
> >> > 1) How far can it scale? As my collections will increase (so will the
> >> > shards & replicas) do we have a breaking point when adding
> more/searching
> >> > will become an issue?
> >> > 2) As my cluster will grow because of huge number of collections the
> >> > clusterstate.json file present in zookeeper will grow too, won't this
> be
> >> a
> >> > limiting factor? If so instead of storing all this info in one
> >> > clusterstate.json file shouldn't Solr save cluster specific details in
> >> this
> >> > file & have collection specific config files present on zookeeper?
> >> > 3) How can I easily manage all these collections? Do we have Java
> >> Coreadmin
> >> > API's available. I cannot find much documented on it.
> >> >
> >> > --
> >> > Txz,
> >> >
> >> > *Mukesh Jha *
> >>
> >
> >
> >
> > --
> >
> >
> > Thanks & Regards,
> >
> > *Mukesh Jha *
>



-- 


Thanks & Regards,

*Mukesh Jha *
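
The alias-based aging-out that Erick describes in the quoted text could be
scripted roughly as in the Python sketch below. The daily naming scheme, the
retention window and the base URL are assumptions for illustration. As far as
I know, issuing CREATEALIAS again with an existing alias name replaces the
previous definition, which is what a daily job would rely on.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def alias_url(base_url, alias, newest, days, prefix="sample_collection_"):
    """Build a CREATEALIAS URL that covers the most recent `days` daily collections."""
    names = [
        prefix + (newest - timedelta(days=n)).strftime("%Y_%m_%d")
        for n in range(days)
    ]
    query = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(names),
    })
    return base_url.rstrip("/") + "/admin/collections?" + query


url = alias_url("http://localhost:8983/solr", "sample_collection", date(2014, 4, 28), 2)
```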


Solr Cloud and Replication request handler

2014-04-28 Thread Amanjit Gill
Hi everybody,

Considering a solr cloud configuration (4.6+)

a) I am wondering if the solr replication handler always has to be
configured completely, aka by choosing one master, then setting the config
accordingly (enable, masterUrl) etc ...  Do we really need a replication
master?

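solrconfig.xml excerpt

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">true</str>
[..]
  </lst>
  <lst name="slave">
    <str name="enable">false</str>
    <str name="masterUrl">http://mysolrinstance::port/default/replication</str>
[..]
  </lst>
</requestHandler>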


b) what happens to the cloud if the "master" instance goes down?

Thanks for your info ...

All the best,
Amanjit


Re: Application of different stemmers / stopword lists within a single field

2014-04-28 Thread Manuel Le Normand
Why not take advantage of your use case? The characters belong to
different character classes.

You can index this field into a single Solr field (no copyField) and apply an
analysis chain that includes analysis for both languages: stopwords, stemmers,
etc.
As every filter should apply only to its specific language (e.g. an Arabic
stemmer should not stem a Latin word), you can run cross-language searches
on this single field.
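
To illustrate the character-class point, here is a small Python sketch that
decides per token whether it is Arabic, using only the basic Arabic Unicode
block (U+0600 to U+06FF). That range is a simplifying assumption. Digits,
punctuation and the extended Arabic blocks would need their own rules, and the
function names are made up.

```python
import re

ARABIC = re.compile(r"[\u0600-\u06FF]")


def script_of(token):
    """Classify a token as Arabic if it contains any character from the Arabic block."""
    return "ar" if ARABIC.search(token) else "other"


def route(text):
    """Group whitespace-separated tokens by detected script."""
    buckets = {"ar": [], "other": []}
    for token in text.split():
        buckets[script_of(token)].append(token)
    return buckets


# The escaped word is the Arabic spelling of "al-Islam".
sample = "Islam \u0627\u0644\u0625\u0633\u0644\u0627\u0645 means submission"
buckets = route(sample)
```

A language-specific filter in a shared analysis chain would make the same kind
of decision before stemming or dropping a token.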


On Mon, Apr 28, 2014 at 5:59 AM, Alexandre Rafalovitch
wrote:

> If you can throw money at the problem:
> http://www.basistech.com/text-analytics/rosette/language-identifier/ .
> Language Boundary Locator at the bottom of the page seems to be
> part/all of your solution.
>
> Otherwise, specifically for English and Arabic, you could play with
> Unicode ranges to try detecting text blocks:
> 1) Create an UpdateRequestProcessor chain that
> a) clones text into field_EN and field_AR.
> b) applies regular expression transformations that strip English or
> Arabic unicode text range correspondingly, so field_EN only has
> English characters left, etc. Of course, you need to decide what you
> want to do with occasional EN or neutral characters happening in the
> middle of Arabic text (numbers: Arabic or Indic? brackets, dashes,
> etc). But if you just index text, it might be ok even if it is not
> perfect.
> c) deletes empty fields, just in case not all of them have mix language
> 2) Use eDismax to search over both fields, each with its own processor.
>
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency
>
>
> On Fri, Apr 25, 2014 at 5:34 PM, Timothy Hill 
> wrote:
> > This may not be a practically solvable problem, but the company I work
> for
> > has a large number of lengthy mixed-language documents - for example,
> > scholarly articles about Islam written in English but containing lengthy
> > passages of Arabic. Ideally, we would like users to be able to search
> both
> > the English and Arabic portions of the text, using the full complement of
> > language-processing tools such as stemming and stopword removal.
> >
> > The problem, of course, is that these two languages co-occur in the same
> > field. Is there any way to apply different processing to different words
> or
> > paragraphs within a single field through language detection? Is this to
> all
> > intents and purposes impossible within Solr? Or is another approach
> (using
> > language detection to split the single large field into
> > language-differentiated smaller fields, for example)
> possible/recommended?
> >
> > Thanks,
> >
> > Tim Hill
>


merge shards indexes

2014-04-28 Thread Gastone Penzo
Hi,
Is it possible to merge two shard indexes into one?

Thank you

-- 
*Gastone Penzo*


Re: space issue in search results

2014-04-28 Thread Gora Mohanty
On 28 April 2014 12:42, PAVAN  wrote:
>
> I have indexed title in the following way.
>
> honda cars in rajaji nagar
> honda cars in rajajinagar.
>
> suppose if i search for
>
> honda cars in rajainagar (OR)
> honda cars in rajaji nagar
>
> it has to display both the results.

Please do not start multiple threads with the same question.

The straightforward way to do what you want is to use synonyms:
  rajaji nagar, rajajinagar
as presumably you want to collapse spaces only for things like
place names.
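
The idea behind the synonym entry, sketched in Python rather than as Solr
configuration: both spellings are reduced to one canonical form, so a title
and a query match whichever spelling either side uses. This is a crude
illustration based on substring replacement, not how Solr's synonym filter is
implemented, and the names are made up.

```python
SYNONYMS = {"rajaji nagar": "rajajinagar"}


def normalize(text):
    """Lowercase, squeeze whitespace and map each synonym phrase to its canonical form."""
    text = " ".join(text.lower().split())
    for phrase, canonical in SYNONYMS.items():
        text = text.replace(phrase, canonical)
    return text


titles = ["honda cars in rajaji nagar", "honda cars in rajajinagar"]
query = "Honda cars in Rajaji  Nagar"
matches = [title for title in titles if normalize(title) == normalize(query)]
```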

Regards,
Gora


space issue in search results

2014-04-28 Thread PAVAN
I have indexed titles in the following way.

honda cars in rajaji nagar
honda cars in rajajinagar.

Suppose I search for 

honda cars in rajainagar (OR) 
honda cars in rajaji nagar 

It has to display both results.

Can anybody help me with how to do this?








--
View this message in context: 
http://lucene.472066.n3.nabble.com/space-issue-in-search-results-tp4133421.html
Sent from the Solr - User mailing list archive at Nabble.com.