Re: AndQueryNode to NearSpanQuery

2011-06-14 Thread mtraynham
Thanks for your help, great solution! Turned out perfectly. Too bad they don't actually add this to the sdk. -- View this message in context: http://lucene.472066.n3.nabble.com/AndQueryNode-to-NearSpanQuery-tp3061286p3066035.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to involve JMX by configuration

2011-06-14 Thread Gora Mohanty
On Wed, Jun 15, 2011 at 7:41 AM, kun xiong wrote: > Hi, > > I am wondering how to start JMX monitor without code change. > > Currently, I have to insert code "LocateRegistry.createRegistry();" into > SolrCore.java. > > And I specify serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:/sol

How to involve JMX by configuration

2011-06-14 Thread kun xiong
Hi, I am wondering how to start JMX monitor without code change. Currently, I have to insert code "LocateRegistry.createRegistry();" into SolrCore.java. And I specify at solrconfig.xml. Can I make it by only configuration change? Thanks Kun

Re: International filters/tokenizers doing too much

2011-06-14 Thread Shawn Heisey
On 6/14/2011 5:34 PM, Robert Muir wrote: On Tue, Jun 14, 2011 at 7:07 PM, Shawn Heisey wrote: Because the text in my index comes in many different languages with no ability to know the language ahead of time, I have a need to use ICUTokenizer and/or the CJK filters, but I have a problem with th

Re: How to avoid double counting for facet query

2011-06-14 Thread Way Cool
I just checked SolrQueryParser.java from 3.2.0 source. Looks like Yonik Seeley's changes for LUCENE-996 are not in. I will check trunk later. Thanks! On Tue, Jun 14, 2011 at 5:34 PM, Way Cool wrote: > I already checked out facet range query. By the

Re: International filters/tokenizers doing too much

2011-06-14 Thread Robert Muir
On Tue, Jun 14, 2011 at 7:07 PM, Shawn Heisey wrote: > Because the text in my index comes in many different languages with no > ability to know the language ahead of time, I have a need to use > ICUTokenizer and/or the CJK filters, but I have a problem with them as they > are implemented currently

Re: How to avoid double counting for facet query

2011-06-14 Thread Way Cool
I already checked out facet range query. By the way, I did put the facet.range.include as below: lower Couple things I don't like though are: 1. It returns the following without end values (I have to re-calculate the end values) : 20 3 50.0 0.0 600.0 0 2. I can't specify custom ranges of values

International filters/tokenizers doing too much

2011-06-14 Thread Shawn Heisey
Because the text in my index comes in many different languages with no ability to know the language ahead of time, I have a need to use ICUTokenizer and/or the CJK filters, but I have a problem with them as they are implemented currently. They do extra things like handle email addresses, token

Re: Modifying Configuration from a Browser

2011-06-14 Thread Stefan Matheis
Brandon, actually afaik there is no way to do this, neither via an API nor by other means. If you really want to start building such an API, would you mind building a generic one? I'm asking because this was already requested as a feature for the new admin UI [https://issues.apache.org/jira/brow

Re: How to avoid double counting for facet query

2011-06-14 Thread Chris Hostetter
: You can use exclusive range queries which are denoted by curly brackets. that will solve the problem of making the fq exclude a bound, but for the range facet counts you'll want to pay attention to look at facet.range.include... http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.i

Search with Dynamic indexing

2011-06-14 Thread zarni aung
Hi, I have a requirement to make a large number of documents (> 5 million) searchable. The problem is that more than half have highly volatile field values. I will also have a data store specifically for metadata. Committing frequently isn't a solution. What I'm basically trying to achieve

Re: Text field case sensitivity problem

2011-06-14 Thread Mike Sokolov
Oops, please s/Highlight/Wildcard/ On 06/14/2011 05:31 PM, Mike Sokolov wrote: Wildcard queries aren't analyzed, I think? I'm not completely sure what the best workaround is here: perhaps simply lowercasing the query terms yourself in the application. Also - I hope someone more knowledgeable

RE: Text field case sensitivity problem

2011-06-14 Thread Bob Sandiford
Unfortunately, wild card search terms don't get processed by the analyzers. One suggestion that's fairly common is to make sure you lower case your wild card search terms yourself before issuing the query. Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@
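
The workaround suggested in this thread (lowercasing wildcard terms in the application before sending the query) could look roughly like the sketch below. It assumes the field's index-time analyzer lowercases tokens; the function name `lowercase_wildcard_terms` is hypothetical and not part of Solr or SolrJ.

```python
import re


def lowercase_wildcard_terms(query: str) -> str:
    """Lowercase only the terms that contain a wildcard (* or ?).

    Assumption: the index-time analyzer lowercases tokens, so a
    lowercased wildcard term can match. Field names are left untouched.
    """

    def fix(match: re.Match) -> str:
        token = match.group(0)
        # Split "field:term" on the last colon; a bare term has no field part.
        field, sep, term = token.rpartition(":")
        if "*" in term or "?" in term:
            term = term.lower()
        return field + sep + term

    # Process each whitespace-separated token independently.
    return re.sub(r"\S+", fix, query)


if __name__ == "__main__":
    print(lowercase_wildcard_terms("Person_Name:Kris*"))
```

Terms without a wildcard (including boolean operators such as AND) are passed through unchanged, since those still go through the analyzer on the Solr side.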

Re: Text field case sensitivity problem

2011-06-14 Thread Mike Sokolov
Wildcard queries aren't analyzed, I think? I'm not completely sure what the best workaround is here: perhaps simply lowercasing the query terms yourself in the application. Also - I hope someone more knowledgeable will say that the new HighlightQuery in trunk doesn't have this restriction, bu

Re: Strange behavior

2011-06-14 Thread Erick Erickson
Well, you could provide the results with &debugQuery=on. You could provide the schema.xml and solrconfig.xml files for both. You could provide a listing of your index files. You could provide some evidence that you've tried chasing down your problem using tools like Luke or the Solr admin interface

Re: ampersand, dismax, combining two fields, one of which is keywordTokenizer

2011-06-14 Thread Jonathan Rochkind
Okay, let's try the debug trace again without a pf to be less confusing. One field in qf, that's ordinary text tokenized, and does get hits: q=churchill%20%3A%20roosevelt&qt=search&qf=title1_t&mm=100%&debugQuery=true&pf= churchill : roosevelt churchill : roosevelt +((DisjunctionMaxQuery((title

ampersand, dismax, combining two fields, one of which is keywordTokenizer

2011-06-14 Thread Jonathan Rochkind
I'm aware that using a field tokenized with KeywordTokenizerFactory in a dismax 'qf' is often going to result in 0 hits on that field (when a whitespace-containing query is entered). But I do it anyway, because when a non-whitespace-containing query is entered, it hits. And in t

Re: Text field case sensitivity problem

2011-06-14 Thread Jamie Johnson
Also of interest to me is this returns results http://localhost:8983/solr/select?defType=lucene&q=Person_Name:Kristine On Tue, Jun 14, 2011 at 5:08 PM, Jamie Johnson wrote: > I am using the following for my text field: > > positionIncrementGap="100" autoGeneratePhraseQueries="true"> >

Text field case sensitivity problem

2011-06-14 Thread Jamie Johnson
I am using the following for my text field: I have a field defined as when I execute a go to the following url I get re

Re: How to avoid double counting for facet query

2011-06-14 Thread Ahmet Arslan
> That's good to know. From the ticket, > looks like the fix will be in 4.0 > then? It is already committed. You can use trunk: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk > Currently I can see {} and [] worked, but not combined for > Solr 3.1. I will > try 3.2 soon. After re

Re: Modifying Configuration from a Browser

2011-06-14 Thread Way Cool
+1 Good idea! I was thinking to write a web interface to change contents for elevate.xml and feed back to Solr core. On Tue, Jun 14, 2011 at 1:51 PM, Markus Jelsma wrote: > There is no API. Upload and restart the core is the way to go. > > > Does anyone have any examples of modifying a configurat

Re: How to avoid double counting for facet query

2011-06-14 Thread Way Cool
That's good to know. From the ticket, looks like the fix will be in 4.0 then? Currently I can see {} and [] worked, but not combined for Solr 3.1. I will try 3.2 soon. Thanks. On Tue, Jun 14, 2011 at 2:07 PM, Ahmet Arslan wrote: > > You sure Solr supports that? > > I am getting exceptions by do

Re: How to avoid double counting for facet query

2011-06-14 Thread Ahmet Arslan
> You sure Solr supports that? > I am getting exceptions by doing that. Ahmet, do you > remember where you see > that document? Thanks. I tested it with trunk. https://issues.apache.org/jira/browse/SOLR-355 https://issues.apache.org/jira/browse/LUCENE-996

Re: How to avoid double counting for facet query

2011-06-14 Thread Way Cool
You sure Solr supports that? I am getting exceptions by doing that. Ahmet, do you remember where you see that document? Thanks. On Tue, Jun 14, 2011 at 1:58 PM, Way Cool wrote: > Thanks! That's what I was trying to find. > > > On Tue, Jun 14, 2011 at 1:48 PM, Ahmet Arslan wrote: > >> > 23 >>

Re: How to avoid double counting for facet query

2011-06-14 Thread Way Cool
Thanks! That's what I was trying to find. On Tue, Jun 14, 2011 at 1:48 PM, Ahmet Arslan wrote: > > 23 > > 1 > > > > ... > > * > > > > As you notice, the number of the results is 23, however an > > extra doc was > > found in the 160-200 range. > > > > Any way I can avoid double counting issue? >

Re: Updating only one indexed field for all documents quickly.

2011-06-14 Thread karthik
Look at SOLR-2272. It might help in your situation. You can have a separate core and join using the document unique id. This way the separate core holds just the document id and the view stats, and you can keep updating those 2 fields alone instead of the entire document. -- karthik On T

Re: Modifying Configuration from a Browser

2011-06-14 Thread Markus Jelsma
There is no API. Upload and restart the core is the way to go. > Does anyone have any examples of modifying a configuration file, like > "elevate.xml" from a browser? Is there an API that would help for this? > > If nothing exists for this, I am considering implementing something that > would cha

Re: How to avoid double counting for facet query

2011-06-14 Thread Ahmet Arslan
> 23 > 1 > > ... > * > > As you notice, the number of the results is 23, however an > extra doc was > found in the 160-200 range. > > Any way I can avoid double counting issue? You can use exclusive range queries which are denoted by curly brackets. price:[110 TO 160} price:[160 TO 200}
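
To illustrate why the half-open ranges above avoid the double count, here is a small sketch. It builds range queries in the `[lo TO hi}` form shown in the reply and compares inclusive and half-open counting on made-up prices. The helper names and the sample data are hypothetical; per the follow-ups in this thread, mixed brackets were reported to work on trunk but not on Solr 3.1.

```python
def half_open_ranges(field, bounds):
    """Build range queries that include the lower bound and exclude the upper."""
    return ["%s:[%s TO %s}" % (field, lo, hi) for lo, hi in zip(bounds, bounds[1:])]


def count_inclusive(values, lo, hi):
    """Count values in [lo, hi]; a value equal to a shared bound lands in two ranges."""
    return sum(1 for v in values if lo <= v <= hi)


def count_half_open(values, lo, hi):
    """Count values in [lo, hi); each value lands in exactly one range."""
    return sum(1 for v in values if lo <= v < hi)


if __name__ == "__main__":
    prices = [120, 160, 199]  # made-up sample data
    print(half_open_ranges("price", [110, 160, 200]))
    print([count_inclusive(prices, lo, hi) for lo, hi in [(110, 160), (160, 200)]])
    print([count_half_open(prices, lo, hi) for lo, hi in [(110, 160), (160, 200)]])
```

With inclusive bounds the document priced at 160 is counted in both ranges, so the facet counts add up to more than the number of results; with half-open ranges the counts add up exactly.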

Modifying Configuration from a Browser

2011-06-14 Thread Brandon Fish
Does anyone have any examples of modifying a configuration file, like "elevate.xml" from a browser? Is there an API that would help for this? If nothing exists for this, I am considering implementing something that would change the "elevate.xml" file then reload the core. Or is there a better appr

Re: Updating only one indexed field for all documents quickly.

2011-06-14 Thread Adam Duston
Hi Erick, Thanks for your message. > What is the use-case you're considering? The use case is actually quite similar to the one in the blog post. We have view counts for Videos in our mysql database. We want to be able to find "most viewed videos" that match certain search criteria. So, for exam

Re: Updating only one indexed field for all documents quickly.

2011-06-14 Thread Erick Erickson
Nope, there isn't a way to index a single field, it's always the entire document. That said, the URL you pointed to is very interesting, but it may be overkill depending upon what you want to do with the integer field. If you just want to influence the score, then just plain external field fields

How to avoid double counting for facet query

2011-06-14 Thread Way Cool
Hi, guys, I fixed Solr search UI (solr/browse) to display the price range facet values via http://thetechietutorials.blogspot.com/2011/06/fix-price-facet-display-in-solr-search.html: - Under 50 (1331) - [50.0 TO

Re: query parsing - removes a term

2011-06-14 Thread Dmitry Kan
Do you use stop word removal on text field? Dmitry On Tue, Jun 14, 2011 at 9:18 PM, Andrea Eakin < andrea.ea...@systemsbiology.org> wrote: > I am trying to do the following type of query: > > +text:(was wasp) +pub_date_year:[1991 TO 2011] > > When I turn debugQuery=on I find that the parsedquery

query parsing - removes a term

2011-06-14 Thread Andrea Eakin
I am trying to do the following type of query: +text:(was wasp) +pub_date_year:[1991 TO 2011] When I turn debugQuery=on I find that the parsedquery is only sending in the +text:(wasp) on parsing, and doesn't use the "was" value. Why is it removing one of the terms? Thanks! Andrea
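
The reply in this thread asks whether stop word removal is configured on the text field. A minimal sketch of that effect, assuming "was" is in the field's stop word list (the list below is an illustrative subset, not the actual configured file):

```python
# Illustrative subset only; the real list comes from the field type's configuration.
STOP_WORDS = {"a", "an", "and", "is", "the", "was"}


def remove_stop_words(terms, stop_words=STOP_WORDS):
    """Drop every term that appears in the stop word set (case-insensitive)."""
    return [t for t in terms if t.lower() not in stop_words]


if __name__ == "__main__":
    # Mirrors the query terms from the question: text:(was wasp)
    print(remove_stop_words(["was", "wasp"]))
```

If the field's analyzer applies such a filter at query time, "was" never reaches the parsed query, which would match what debugQuery shows.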

Re: huge shards (300GB each) and load balancing

2011-06-14 Thread Dmitry Kan
Hi Tom, Thanks a lot for sharing this. We have about half a terabyte total index size, and we have split our index over 10 shards (horizontal scaling, no replication). Each shard currently is allocated max 12GB memory. We use facet search a lot and non-facet search with parameter values generated

Re: Using Edismax

2011-06-14 Thread Jan Høydahl
Hi, Let's assume you're using Solr version 3.1.0 and an unmodified FieldType "text_rev". It looks like this: ... Also let's assume that you have two docs in your index with these URLs: A:"http://my.host/SPC265_SharePoint

Solr Data Import Handler - German Language Database

2011-06-14 Thread venkat
Hi All, I am experiencing a problem with DataImportHandler when trying to index a column in a table that contains several German umlaut characters. The field is not getting indexed at all, and I get an error log stating that this field, which has been defined as required in the schema, is missing

Re: AndQueryNode to NearSpanQuery

2011-06-14 Thread mtraynham
Thanks for the correction! I thought I had read that phrases were assumed to be in order and the slop was the distance between them. I'll look into this also.

Re: AndQueryNode to NearSpanQuery

2011-06-14 Thread mtraynham
That is a really good idea. I'll have to try that.

Re: Strange behavior

2011-06-14 Thread Denis Kuzmenok
What should i provide, OS is the same, environment is the same, solr is completely copied, searches work, except that one, and that is strange.. > I think you will need to provide more information than this, no-one on this > list is omniscient AFAIK. > François > On Jun 14, 2011, at 10:

DIH entity threads

2011-06-14 Thread Mark
Hello all, We are using DIH to index our data (~6M documents) and it's taking an extremely long time (~24 hours). I am trying to find ways that we can speed this up. I've been reading through older posts and it's my understanding this should not take that long. One probable bottleneck is that

Re: Strange behavior

2011-06-14 Thread François Schiettecatte
I think you will need to provide more information than this, no-one on this list is omniscient AFAIK. François On Jun 14, 2011, at 10:44 AM, Denis Kuzmenok wrote: > Hi. > > I've debugged search on test machine, after copying to production server > the entire directory (entire solr director

Strange behavior

2011-06-14 Thread Denis Kuzmenok
Hi. I've debugged search on test machine, after copying to production server the entire directory (entire solr directory), i've noticed that one query (SDR S70EE K) does match on test server, and does not on production. How can that be?

Updating only one indexed field for all documents quickly.

2011-06-14 Thread Adam Duston
We are updating one indexed integer field in Solr for all documents once every two hours. We're using Solr through Haystack so we're not exactly Solr experts. Is there a way to update just one indexed field for all documents without reindexing all other fields also? We saw this blog post [1], which

Re: Huge performance drop in distributed search w/ shards on the same server/container

2011-06-14 Thread Johannes Goll
I increased the maximum POST size and headerBufferSize to 10MB; lowThreads to 50, maxThreads to 10 and lowResourceMaxIdleTime=15000. We tried Tomcat 6 using the following Connector settings: I am getting the same exception as for Jetty SEVERE: org.apache.solr.common.SolrException: org.ap

RE: ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory

2011-06-14 Thread Steven A Rowe
On 6/14/2011 at 7:12 AM, Ahmet Arslan wrote: > --- On Tue, 6/14/11, Nils Weinander wrote: > > The documentation states that ISOLatin1AccentFilterFactory > > is deprecated in favour of ASCIIFoldingFilterFactory: [...] > > Is there a way to limit which characters are folded? > > With MappingCharFil

Re: WordDelimiter and stemEnglishPossessive doesn't work

2011-06-14 Thread roySolr
THANK YOU!! I thought i only could use one character for the pattern.. Now i use a regular expression:) I don't need the wordDelimiter anymore. It's split on # and whitespace dataset: mcdonald's#burgerking#Free record shop#h&m mcdonald's burgerking free record shop h&m This is exactly how we
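
The pattern-based splitting described here can be sketched outside Solr as follows. This assumes the pattern `#|\s` suggested elsewhere in the thread and a lowercasing step (inferred from the lowercase output in the message); it imitates the tokenizer's behavior in plain Python and is not Solr's PatternTokenizerFactory itself.

```python
import re


def tokenize(text, pattern=r"#|\s"):
    """Split on '#' or any whitespace character, drop empty tokens, lowercase the rest."""
    return [token.lower() for token in re.split(pattern, text) if token]


if __name__ == "__main__":
    print(tokenize("mcdonald's#burgerking#Free record shop#h&m"))
```

Because neither the apostrophe nor the ampersand is a delimiter in this pattern, "mcdonald's" and "h&m" survive as single tokens, which is the behavior the poster wanted without the word delimiter filter.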

Re: WordDelimiter and stemEnglishPossessive doesn't work

2011-06-14 Thread Erick Erickson
It's a little obscure, but you can use http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternReplaceCharFilterFactory in front of WhitespaceTokenizer if you prefer. Note that a CharFilterFactory is different than a FilterFactory, so read carefully .. Best Erick On Tue, Jun 14,

Re: AndQueryNode to NearSpanQuery

2011-06-14 Thread Erick Erickson
<<>> This is not true. "slop" includes re-arranging the terms, it just takes a little more slop (see Lucene In Action for an excellent pictorial explanation). Best Erick On Mon, Jun 13, 2011 at 10:45 PM, mtraynham wrote: > Hey Erick, > > Thanks for the feedback, but I it doesn't particularly so

Re: Adding documents in a batch using Solrj

2011-06-14 Thread Erick Erickson
Have fun. Note that the intent is to have the logging/record keeping in the superclass (whose name escapes me) and each update type should be able to use that Best Erick On Mon, Jun 13, 2011 at 11:15 PM, karthik wrote: > Thanks Erick. Will certainly take a look. > > I am looking to do this f

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Peter Sturge
SOLR-1872 doesn't add discrete booleans to the query, it does it programmatically, so you shouldn't see this problem. (if you have a look at the code, you'll see how it filters queries) I suppose you could modify SOLR-1872 to use an in-memory, dynamically-updated user list (+ associated filters) in

Re: WordDelimiter and stemEnglishPossessive doesn't work

2011-06-14 Thread lee carroll
do you need the word delimiter ? #|\s i think its just regex in the pattern tokeniser - i might be wrong though ? On 14 June 2011 11:15, roySolr wrote: > Ok, with catenatewords the index term will be mcdonalds. But that's not what > i want. > > I only use the wordDelimiter to split on whitespa

Re: disable sort by score

2011-06-14 Thread Ahmet Arslan
--- On Tue, 6/14/11, Jason, Kim wrote: > From: Jason, Kim > Subject: Re: disable sort by score > To: solr-user@lucene.apache.org > Date: Tuesday, June 14, 2011, 6:57 AM > Thanks to reply, Erick! > > Actually, I need sort by score. > I was just curious that seach result without sorting is > po

Re: ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory

2011-06-14 Thread Nils Weinander
On Tue, Jun 14, 2011 at 1:11 PM, Ahmet Arslan wrote: > > With MappingCharFilterFactory you have fully control over which characters > are folded. You can see the default mappings in > mapping-ISOLatin1Accent.txt file. > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.MappingCha

Re: ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory

2011-06-14 Thread Ahmet Arslan
--- On Tue, 6/14/11, Nils Weinander wrote: > From: Nils Weinander > Subject: ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory > To: solr-user@lucene.apache.org > Date: Tuesday, June 14, 2011, 12:18 PM > Hi all, I'm new to the list (but not > totally new to Solr). > > The documentatio

Re: Using Edismax

2011-06-14 Thread Ahmet Arslan
> Thx for the reply. But what can I do to avoid getting 2010. > I wanted a phrase query with underscore, so it would return > results with underscore2010 only. For example, you can remove WordDelimiterFilterFactory from your field type definition. According to your needs, you can use another fi

Re: WordDelimiter and stemEnglishPossessive doesn't work

2011-06-14 Thread roySolr
Ok, with catenatewords the index term will be mcdonalds. But that's not what i want. I only use the wordDelimiter to split on whitespace. I have already used the PatternTokenizerFactory so i can't use the whitespacetokenizer. I want my index looks like this: dataset: mcdonald's#burgerking#Free r

Re: Query on Synonyms feature in Solr

2011-06-14 Thread roySolr
Maybe you can try to escape the whitespace in the synonyms so they are not tokenized on whitespace: Private\ schools,NGO\ Schools,Unaided\ schools

Re: Using Edismax

2011-06-14 Thread Tirthankar Chatterjee
Eric Thx for the reply. But what can I do to avoid getting 2010. I wanted a phrase query with underscore, so it would return results with underscore2010 only. Sent from iPod On Jun 13, 2011, at 3:47 PM, "Erick Erickson" wrote: > You haven't supplied the information that's really > needed to he

ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory

2011-06-14 Thread Nils Weinander
Hi all, I'm new to the list (but not totally new to Solr). The documentation states that ISOLatin1AccentFilterFactory is deprecated in favour of ASCIIFoldingFilterFactory: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ISOLatin1AccentFilterFactory I see problems with this. If I

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Sujatha Arun
Thanks, Peter, for your input. I really would like a document- and schema-agnostic solution as in SOLR-1872. Am I right in my assumption that SOLR-1872 is the same as the solution that we currently have, where we add a filter query of the products to the original query, and hence (SOLR-1872) will als

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Peter Sturge
Hi, SOLR-1834 is good when the original documents' ACL is accessible. SOLR-1872 is good where the usernames are persistent - neither of these really fit your use case. It sounds like you need more of an 'in-memory', transient access control mechanism. Does the access have to exist beyond the user'

Re: AndQueryNode to NearSpanQuery

2011-06-14 Thread karsten-solr
Hi member of digitalsmiths, I also implemented SpanNearQueryNode and some QueryNodeProcessors. Most likely you can solve your problem by using QueryNode#setTag: In QueryNodeProcessor#preProcessNode you can set and remove and reset a Tag to mark the AndNodes that should become SpanNodes; after t

Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Sujatha Arun
Hello, Our use case is as follows: several Solr webapps (one JVM), each webapp catering to one client. Each client has their users who can purchase products from the site. Once they purchase, they have full access to the products; otherwise they can only view details. The products are not tie