Programmatic Access to Solr schema?

2010-08-09 Thread Aparna Chaudhary
Hi, I need access to the Solr schema definition file (schema.xml) to perform some validations while constructing the document. I see there is a class, IndexSchema, which provides information about the field metadata. How do I get a handle to this class? Aparna Chaudhary http://blog.aparnachaudhary.net

how to query a string using solr URL in the browser

2010-08-09 Thread e8en
Hi everyone, this is my Solr query URL and the result shown in my browser where ITEM_CAT = 1191: http://172.11.18.120:9000/search/select/?q=ITEM_CAT:1191&q.op=AND&start=0&rows=1000 <doc> <int name="AP_AUC_PHOTO_AVAIL">1</int> <double name="AUC_AD_PRICE">1.0</double> <int name="ITEM_CLIENT_ID">27017</int> <str

Re: Indexing fieldvalues with dashes and spaces

2010-08-09 Thread PeterKerk
Hi Erick, OK, it's clearer now. I do indeed have the whitespace tokenizer: <fieldType name="textTrue" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"

Re: how to create a custom type in Solr

2010-08-09 Thread Mark Allan
I see you have the exact same requirements I did and have also hit the same problem I did a month or so ago. I ended up writing a custom field type based on solr.schema.PointType and making some very minor modifications to one of the Solr classes (AbstractSubTypeFieldType) to allow the field

How do i update some document when i use sharding indexs?

2010-08-09 Thread lu.rongbin
My index has 76 million documents. I split it into 20 indexes because the size of the index is 33G. I deploy 20 shards on 20 EC2 instances for search response performance. But when I want to update a document, I must traverse each index and find which shard index the document is in, and

wildcards in solr synonyms file

2010-08-09 Thread M.Rizwan
Hi, I want to write an entry in the Solr synonyms file so that when I search for the word "discount", Solr returns all records which have [10]% kind of words in the title. So for example, if I have a document "cars - 10% off" and I search for the word "discount", this document should be returned as well. In synonyms file,

Re: Solr 1.4.1 and 3x: Grouping of query changes results

2010-08-09 Thread gwk
On 8/9/2010 12:01 AM, David Benson wrote: I'm seeing what I believe to be a logic error in the processing of a query. Returns document 1234 as expected: id:1234 AND -indexid:1 AND -indexid:2 AND -indexid:3 Does not return document as expected: id:1234 AND (-indexid:1 AND -indexid:2) AND

Re: How do i update some document when i use sharding indexs?

2010-08-09 Thread Geert-Jan Brits
I'm not sure if Solr has some built-in support for sharding functions, but you should generally use some hashing algorithm to split the indices and use the same hash algorithm to locate which shard contains a document. http://en.wikipedia.org/wiki/Hash_function Without employing any domain
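
The hashing approach described in this reply could be sketched as follows. This is a minimal illustration, not something taken from the thread: the function name `shard_for`, the choice of MD5, and the use of a string document ID are all assumptions. The only requirement stated in the thread is that the same hash function is used both when splitting the index and when locating a document for update.

```python
import hashlib

NUM_SHARDS = 20  # the original question mentions 20 shards


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard number in [0, num_shards).

    Hypothetical helper: MD5 is used only because it is stable across
    processes and machines (unlike Python's built-in hash() for strings).
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce.
    return int.from_bytes(digest[:8], "big") % num_shards


# The indexer and the updater both call shard_for(), so an update goes
# straight to the one shard that holds the document.
target = shard_for("doc-1234")
```

One design caveat under this scheme: changing the number of shards changes the mapping for most documents, so the index would need to be re-split with the new count.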

Re: How do i update some document when i use sharding indexs?

2010-08-09 Thread Geert-Jan Brits
Just to be completely clear: the program that splits your index into 20 shards should employ this algo as well. 2010/8/9 Geert-Jan Brits gbr...@gmail.com I'm not sure if Solr has some built-in support for sharding functions, but you should generally use some hashing algorithm to split the

Re: how to query a string using solr URL in the browser

2010-08-09 Thread e8en
someone please help me, I'm running out of time :(

Re: solr single threaded?

2010-08-09 Thread Andy
Otis, Thanks. In that case what does it mean that Lucene search is single threaded? How is that different from the Solr behavior? Andy --- On Mon, 8/9/10, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: From: Otis Gospodnetic otis_gospodne...@yahoo.com Subject: Re: solr single threaded?

Re: solr single threaded?

2010-08-09 Thread Otis Gospodnetic
Andy, A single non-distributed search request uses a single thread both in Lucene and Solr. This is kind of normal, so I wouldn't worry about it if I were you. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ -

Re: wildcards in solr synonyms file

2010-08-09 Thread Otis Gospodnetic
Riz, The synonyms file doesn't support wildcards. You'll need to list all numbers explicitly. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: M.Rizwan muhammad.riz...@sigmatec.com.pk

Re: how to create a custom type in Solr

2010-08-09 Thread Otis Gospodnetic
Mark, A good way to get your changes/improvements into Solr is by putting them in JIRA. Please see http://wiki.apache.org/solr/HowToContribute Thanks! Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original

Re: how to query a string using solr URL in the browser

2010-08-09 Thread Otis Gospodnetic
Eben, Your URL was: http://172.11.18.120:9000/search/select/?q=text:bracket&q.op=AND&start=0&rows=1000 Change text:bracket to: AUC_DESCR_SHORT:bracket AND AUC_TITLE:bracket This will need to be URL-encoded, of course. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene
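
As a rough illustration of the URL-encoding step mentioned in this reply, the sketch below builds the select URL with the Python standard library. The helper name `build_select_url` is hypothetical; the host, path, field names, and parameters are the ones quoted in the thread.

```python
from urllib.parse import urlencode

BASE = "http://172.11.18.120:9000/search/select/"  # host and path from the thread


def build_select_url(query: str, start: int = 0, rows: int = 1000) -> str:
    """Return a select URL with the query string safely URL-encoded.

    Hypothetical helper: urlencode() escapes the colons and turns the
    spaces around AND into '+', so the query survives the browser.
    """
    params = {"q": query, "q.op": "AND", "start": start, "rows": rows}
    return BASE + "?" + urlencode(params)


url = build_select_url("AUC_DESCR_SHORT:bracket AND AUC_TITLE:bracket")
print(url)
```

Building the parameters as a dictionary avoids hand-assembling the `&` separators between `q`, `q.op`, `start`, and `rows`.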

Re: how to query a string using solr URL in the browser

2010-08-09 Thread e8en
Hi Otis, your answer shed some light :) The input must be: http://172.11.18.120:9000/search/select/?q=text:bracket&q.op=AND&start=0&rows=1000 How can I change 'AUC_DESCR_SHORT:bracket' and 'AUC_TITLE:bracket' into 'text:bracket'? Is there any solution? Thanks, Eben

Re: how to query a string using solr URL in the browser

2010-08-09 Thread e8en
I forgot something: when I enter this: http://172.11.18.120:9000/search/select/?q=text:bracket&q.op=AND&start=0&rows=1000 the result will show every ITEM_ID that contains the word 'bracket' in either or both of ITEM_DESCR_SHORT and ITEM_TITLE

Re: wildcards in solr synonyms file

2010-08-09 Thread dotriz
Does the synonyms file support regular expressions?

Re: how to create a custom type in Solr

2010-08-09 Thread Mark Allan
On 9 Aug 2010, at 1:01 pm, Otis Gospodnetic wrote: Mark, A good way to get your changes/improvements into Solr is by putting them in JIRA. Please see http://wiki.apache.org/solr/HowToContribute Thanks! Otis Hi Otis, For the class which requires only minor modifications, I tested it to

Re: how to create a custom type in Solr

2010-08-09 Thread Thomas Joiner
I'd love to see your code on this, however what I've really been wondering is the following: When did AbstractSubTypeFieldType get added? It isn't in 1.4.1 (as far as I can tell that's the latest one that is bundled on their site). So, do I just need to grab it from subversion, and build it?

how to support implicit trailing wildcards

2010-08-09 Thread yandong yao
Hi everyone, How can I support an 'implicit trailing wildcard *' using Solr? E.g., using Google to search 'umoun', 'umount' will be matched; search 'mounta', 'mountain' will be matched. From my point of view, there are several ways, each with disadvantages: 1) Using EdgeNGramFilterFactory, thus

Re: Programmatic Access to Solr schema?

2010-08-09 Thread Erik Hatcher
Are you looking to get access to a remote schema? You can pull schema.xml via HTTP using a URL like: http://localhost:8983/solr/admin/file/?file=schema.xml If you're accessing the schema from inside a custom Solr component then the IndexSchema API (which you can get to from pretty much
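
For the remote case described in this reply, a client could fetch schema.xml over HTTP and read the field definitions from it. The following is only a sketch: the function names, the UTF-8 decoding, and the inline `SAMPLE` schema are assumptions made for illustration, while the URL is the one given in the reply.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# URL quoted in the reply; adjust host and port for your own instance.
SCHEMA_URL = "http://localhost:8983/solr/admin/file/?file=schema.xml"


def fetch_schema(url: str = SCHEMA_URL) -> str:
    """Download schema.xml as text (assumes the response is UTF-8)."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def field_types(schema_xml: str) -> dict:
    """Return {field name: field type} for every <field> element."""
    root = ET.fromstring(schema_xml)
    return {f.get("name"): f.get("type") for f in root.iter("field")}


# Tiny made-up schema so the parser can be tried without a running server.
SAMPLE = """<schema name="example" version="1.2">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
  </fields>
</schema>"""

fields = field_types(SAMPLE)
```

Against a live server, `field_types(fetch_schema())` would give a mapping that can be used to validate a document's field names before it is sent for indexing.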

Re: how to support implicit trailing wildcards

2010-08-09 Thread Bastian Spitzer
Wildcard search is already built in; just use: ?q=umoun* ?q=mounta* -Original Message- From: yandong yao [mailto:yydz...@gmail.com] Sent: Monday, 9 August 2010 15:57 To: solr-user@lucene.apache.org Subject: how to support implicit trailing wildcards Hi everyone, How to

Re: can't use strdist in sorting either?

2010-08-09 Thread solr-user
Finally figured out that I can simply escape the quotation marks in the query URL using backslashes to use strdist as a function query (sorry all, that should have been a no-brainer): http://10.0.11.54:8994/solr/select?q=(*:*)^0%20_val_:strdist(\"phoenix\",city,edit)&fl=score,*&sort=score%20desc

Re: can't use strdist in sorting either?

2010-08-09 Thread solr-user
Issue resolved. I should have read the documentation with more care: "Calculate the distance between two strings". My city field was a tokenized text field, so changing it to string type got things working. Sorry, all.

hl.usePhraseHighlighter

2010-08-09 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
I am trying to do an exact match. For example, I hope to get only "study" highlighted if I search for "study", not others ("studies", "studied" and so on). I didn't find any function for it in SolrQuery. I added the following in solrconfig.xml: <str name="hl.usePhraseHighlighter">true</str>. Unfortunately I didn't get

Re: how to query a string using solr URL in the browser

2010-08-09 Thread Gora Mohanty
On Mon, 9 Aug 2010 05:31:36 -0700 (PDT) e8en e...@tokobagus.com wrote: I forgot something, when I enter this: http://172.11.18.120:9000/search/select/?q=text:bracket&q.op=AND&start=0&rows=1000 the result will show all ITEM_ID that contain 'bracket' word in both or one of ITEM_DESCR_SHORT

It seems like using a wildcard causes lowercase filter to not do the lowercasing?

2010-08-09 Thread Robert Petersen
I have a field with lowercase filter on search and index sides, and searching in this field works fine with uppercase or lowercase terms, except if I wildcard! So searching for 'gps' or 'GPS' returns the same result set, but searching for 'gps*' returns results as expected and searching for

[ANN] Free technical webinar: Mastering the Lucene Index: Wednesday, August 11, 2010 11:00 AM PST / 2:00 PM EST / 20:00 CET

2010-08-09 Thread Mark Miller
Hey all - apologies for the quick cross post - just to let you know, Andrzej is giving a free webinar this Wednesday. His presentations are always fantastic, so check it out: Lucid Imagination Presents a free technical webinar: Mastering the Lucene Index Wednesday, August 11, 2010 11:00 AM PST / 2:00

Re: It seems like using a wildcard causes lowercase filter to not do the lowercasing?

2010-08-09 Thread Ahmet Arslan
I have a field with lowercase filter on search and index sides, and searching in this field works fine with uppercase or lowercase terms, except if I wildcard! So searching for 'gps' or 'GPS' returns the same result set, but searching for 'gps*' returns results as expected and searching

Re: hl.usePhraseHighlighter

2010-08-09 Thread Ahmet Arslan
I am trying to do an exact match. For example, I hope to get only "study" highlighted if I search for "study", not others ("studies", "studied" and so on). This has nothing to do with highlighting and its parameters. You need to remove the stemming filter factory (Porter, Snowball) from your analyzer chain.

RE: It seems like using a wildcard causes lowercase filter to not do the lowercasing?

2010-08-09 Thread Robert Petersen
Aha, I overlooked that. Thank you. -Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com] Sent: Monday, August 09, 2010 1:28 PM To: solr-user@lucene.apache.org Subject: Re: It seems like using a wildcard causes lowercase filter to not do the lowercasing? I have a field with

Re: anti-words - exact match

2010-08-09 Thread Satish Kumar
Thanks Jon. My initial thought was exactly like yours. My preference was to implement this requirement completely at the Solr level so that different applications won't each have to implement this logic. However, I am not sure how to shingle-ize the input query and use that in a filter query with a NOT operator

Facet Fields - ID vs. Display Value

2010-08-09 Thread Frank A
Is there a general best practice on whether facet fields should be on IDs or Display values? -Frank

RE: hl.usePhraseHighlighter

2010-08-09 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Thanks so much for your help! I used text type and found the following in schema.xml. I don't know which ones I should remove. *** <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer

randomness - percent share

2010-08-09 Thread Satish Kumar
Hi, We have some identical records in our data set (e.g. "what is swine flu?" written by two different authors). When a user searches for "What is swine flu?", we want the result by author1 to appear as the first result for x% of the queries and the result by author2 for y% of the queries (where x and y

Re: Facet Fields - ID vs. Display Value

2010-08-09 Thread Otis Gospodnetic
Hi Frank, I'm not sure what you mean by that. If the question is about what should be shown in the UI, it should be something pretty and human-readable, such as the original facet string value, assuming it was nice and clean. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

RE: hl.usePhraseHighlighter

2010-08-09 Thread Ahmet Arslan
I used text type and found the following in schema.xml. I don't know which ones I should remove. *** You should remove <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> from both the index-time and query-time analyzers.

Problems to clustering on tomcat

2010-08-09 Thread Claudio Devecchi
Hi everybody, I need to do some tests in my Solr installation. Previously I configured my application on a single node, and now I need to run some tests on a cluster configuration. I followed the steps on http://wiki.apache.org/solr/ClusteringComponent and when I start up the example system

Re: Facet Fields - ID vs. Display Value

2010-08-09 Thread Frank A
What I meant (which I realize now wasn't very clear) was if I have something like categoryID and categorylabel - is the normal practice to define categoryID as the facet field and then have the UI layer display the label? Or would it be normal to directly use categorylabel as the facet field?

RE: Re: Facet Fields - ID vs. Display Value

2010-08-09 Thread Markus Jelsma
Well, you can do both, of course, but there's no need for additional code if you get it for free. I'd prefer - as most do, I assume - to use the label as a facet field. -Original message- From: Frank A fsa...@gmail.com Sent: Tue 10-08-2010 01:11 To: solr-user@lucene.apache.org; Subject:

Re: Facet Fields - ID vs. Display Value

2010-08-09 Thread dc tech
I think it depends on what you need: 1) Simple, unique category - use the display value as the facet 2) Categories may be duplicated from a display perspective (e.g. authors): store display#id in the facet field but show only the display part 3) Internationalization requirements - store the ID but have the UI pull and display the translated
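
Option 2 in this reply (store display#id in the facet field, show only the display part) could look roughly like the sketch below on the client side. The helper names are hypothetical, and splitting on the last `#` is an assumption made so that labels which themselves contain `#` still round-trip.

```python
def encode_facet(display: str, item_id: str) -> str:
    """Combine a display label and an ID into one facet value."""
    return f"{display}#{item_id}"


def decode_facet(value: str) -> tuple:
    """Split a facet value back into (display, id).

    Splits on the LAST '#', so a label such as 'C# in Depth' survives.
    A value with no '#' is treated as a bare label with an empty ID.
    """
    display, sep, item_id = value.rpartition("#")
    if not sep:
        return value, ""
    return display, item_id


# Index time: write encode_facet(label, id) into the facet field.
# Display time: show only the label, keep the ID for the filter query.
label, author_id = decode_facet(encode_facet("John Smith", "42"))
```

Two authors who share a name then produce distinct facet values while the UI still shows the same label for both.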

Re: DIH and multivariable fields problems

2010-08-09 Thread harrysmith
This is looking more and more like a bug. To recap, I am trying to use the DIH to import multivalued dynamic fields and using a variable to name that field. Upon further testing, the multivalued import works fine with a static/constant name, but only keeps the first record when naming the

Re: how to support implicit trailing wildcards

2010-08-09 Thread yandong yao
Hi Bastian, Sorry for not making it clear. I also want an exact match to score higher than a wildcard match; that is, if searching 'mount', documents with 'mount' should score higher than documents with 'mountain', while 'mount*' seems to treat 'mount' and 'mountain' the same. besides, also want

solr query result not read the latest xml file

2010-08-09 Thread e8en
Hi everyone, I do these steps every time a new XML file is created (for example, cat_978.xml has just been created): 1. delete the index (<delete><query>AUC_CAT:978</query></delete>) 2. commit the new cat_978.xml (java -jar post.jar cat_978.xml) 3. restart Java (stop and java -jar start.jar) if I'm

Re: enhancing auto complete

2010-08-09 Thread Bhavnik Gajjar
Thanks Avlesh for sharing the info. Will try it! In the meantime, I also found another solution: http://metaoptimize.com/qa/questions/17/stemming-problems-when-writing-search-auto-complete Kind regards. On 8/4/2010 9:13 PM, Avlesh Singh wrote: I preferred to answer this question privately

Re: randomness - percent share

2010-08-09 Thread Lance Norskog
There is a Random value type documented in the standard schema.xml. I'm not sure if it reseeds each time. You can use this in a function query along with your percentage bias fields. On Mon, Aug 9, 2010 at 2:44 PM, Satish Kumar satish.kumar.just.d...@gmail.com wrote: Hi, We have some identical