Re: TermsComponent from deleted document

2011-09-09 Thread Manish Bafna
Which is preferable for autosuggest: TermsComponent or facets? On Fri, Sep 9, 2011 at 10:33 PM, Chris Hostetter wrote: > > : http://wiki.apache.org/solr/TermsComponent states that TermsComponent > will > : return frequencies from deleted documents too. > : > : Is there anyway to omit the de

Re: SolrCloud and replica question

2011-09-09 Thread Jamie Johnson
great, thanks Yury, that's what I thought but just wanted to verify. 2011/9/9 Yury Kats : > On 9/9/2011 4:48 PM, Jamie Johnson wrote: >> When doing writes do all writes need to be done to the primary shard >> or are writes that are done to the replica also pushed to all replicas >> of that shard?

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Jamie Johnson
as a note you could change out the values in solr.xml to be as follows and pull these values from System Properties. unless someone says otherwise, but the quick tests I've run seem to work perfectly well with this setup. 2011/9/9 Yury Kats : > On 9/9/2011 6:54 PM, Pulkit Singhal wrot

Re: MMapDirectory failed to map a 23G compound index segment

2011-09-09 Thread Lance Norskog
I remember now: by memory-mapping one block of address space that big, the garbage collector has problems working around it. If the OOM is repeatable, you could try watching the app with jconsole and watch the memory spaces. Lance On Thu, Sep 8, 2011 at 8:58 PM, Lance Norskog wrote: > Do you ne

Stemming and other tokenizers

2011-09-09 Thread Patrick Sauts
Hello, I want to implement some kind of AutoStemming that will detect the language of a field based on a tag at the start of this field, like #en#. My field is stored on disk, but I don't want this tag to be stored. Is there a way to prevent this from being stored? To me all the filters and the t

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Yury Kats
On 9/9/2011 6:54 PM, Pulkit Singhal wrote: > Thanks Again. > > Another question: > > My solr.xml has: > > > > > And I omitted -Dcollection.configName=myconf from the startup command > because I felt that specifying collection="myconf" should take care of > that: > cd /trunk/solr/examp

Re: SolrCloud and replica question

2011-09-09 Thread Yury Kats
On 9/9/2011 4:48 PM, Jamie Johnson wrote: > When doing writes do all writes need to be done to the primary shard > or are writes that are done to the replica also pushed to all replicas > of that shard? > If you have replication setup between cores, all changes to the slave will be overwritten by

searching for terms containing embedded spaces

2011-09-09 Thread Mark juszczec
Hi folks I've got a field that contains 2 words separated by a single blank. What's the trick to creating a search string that contains the single blank? Mark

solr equivalent of "select distinct"

2011-09-09 Thread Mark juszczec
Hello everyone Let's say each record in my index contains fields named PK, FLD1, FLD2, FLD3 FLD100 PK is my solr primary key and I'm creating it by concatenating FLD1+FLD2+FLD3 and I'm guaranteed that combination will be unique Let's say 2 of these records have FLD1 = A and FLD2 = B. I am

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Pulkit Singhal
I had forgotten to save the file, the collection name at least shows up but the core name is still not used, is it simply decorative? /collections (v=6 children=1) myconf (v=0 children=1) "configName=configuration1" shards (v=0 children=1) shard1 (v=0 children=1) tiklup-mac.loc

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Pulkit Singhal
Thanks Again. Another question: My solr.xml has: And I omitted -Dcollection.configName=myconf from the startup command because I felt that specifying collection="myconf" should take care of that: cd /trunk/solr/example java -Dbootstrap_confdir=./solr/conf -Dslave=disabled -DzkRun -jar

SolrCloud and replica question

2011-09-09 Thread Jamie Johnson
When doing writes do all writes need to be done to the primary shard or are writes that are done to the replica also pushed to all replicas of that shard?

RE: Alias name for a index field

2011-09-09 Thread Tirthankar Chatterjee
We are using fielded query using EDISMAX which is passed into the q parameter. -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Friday, September 09, 2011 10:21 AM To: solr-user@lucene.apache.org Subject: Re: Alias name for a index field How are you doing field

Re: Running solr on small amounts of RAM

2011-09-09 Thread Mike Austin
or actually disabling caching as mentioned here: http://wiki.apache.org/solr/SolrCaching#Cache_Sizing On Fri, Sep 9, 2011 at 11:48 AM, Mike Austin wrote: > I'm trying to push to get solr used in our environment. I know I could have > responses saying WHY can't you get more RAM etc.., but lets ju

"String index out of range: -1" for hl.fl=* in Solr 1.4.1?

2011-09-09 Thread Demian Katz
I'm running into a strange problem with Solr 1.4.1 - this request: http://localhost:8080/solr/website/select/?q=*%3A*&rows=20&start=0&indent=yes&fl=score&facet=true&facet.mincount=1&facet.limit=30&facet.field=category&facet.field=linktype&facet.field=subject&facet.prefix=&facet.sort=&fq=category%3

Re: FunctionQueryNode pipeline?

2011-09-09 Thread Chris Hostetter
: space, so identifying a function vs. a group clause hinders any progress. : Is that why they separated the functionality of queries using the : defType=func? The function syntax in solr predates the new QueryNode based QueryParser in lucene. The main motivation behind "defType" was to refa

Re: SolrCloud Feedback

2011-09-09 Thread Pulkit Singhal
I think I understand it a bit better now but wouldn't mind some validation. 1) solr.xml does not become part of ZooKeeper 2) The default looks like this out-of-box: so that may leave one wondering where the core's association to a collection name is made? It can be made like so: a) sta

Re: TermsComponent from deleted document

2011-09-09 Thread Chris Hostetter
: http://wiki.apache.org/solr/TermsComponent states that TermsComponent will : return frequencies from deleted documents too. : : Is there anyway to omit the deleted documents to get the frequencies. not really -- until a deleted document is expunged from segment merging, they are still include
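As a hedged illustration of the "expunged" point above: Solr's XML update syntax accepts a commit with an expungeDeletes attribute, which asks Solr to merge away segments that contain deletions. The sketch below only builds that message; where to POST it (normally the core's /update handler) and whether the cost is acceptable on a given index are assumptions left to the reader.

```python
import xml.etree.ElementTree as ET


def expunge_deletes_message():
    """Build the XML body for a commit that also expunges deleted documents."""
    # expungeDeletes is a documented attribute of Solr's <commit/> update command.
    commit = ET.Element("commit", {"expungeDeletes": "true"})
    return ET.tostring(commit, encoding="unicode")


message = expunge_deletes_message()
print(message)
```

Until such a merge happens, term frequencies reported by TermsComponent can still count the deleted documents.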

Re: problems of getting frequency and position for a paticular word

2011-09-09 Thread Chris Hostetter
: Is there a way for solr to return only the frequency and position of a : particular word back to client? I don't think so. It would probably be relatively straightforward to add to TermVectorComponent -- I don't know that it would save any *time* (I think it would still have to process all t

Re: Sorting groups by numFound group size

2011-09-09 Thread O. Klein
I am also looking for a way to sort on numFound. Has an issue been created? -- View this message in context: http://lucene.472066.n3.nabble.com/Sorting-groups-by-numFound-group-size-tp3315740p3323420.html Sent from the Solr - User mailing list archive at Nabble.com.

Running solr on small amounts of RAM

2011-09-09 Thread Mike Austin
I'm trying to push to get solr used in our environment. I know I could get responses asking WHY can't you get more RAM etc., but let's just skip those and work with this situation. Our index is very small with 100k documents and a light load at the moment. If I wanted to use the smallest possible

Re: SolrCloud Feedback

2011-09-09 Thread Pulkit Singhal
Hello Jan, You've made a very good point in (b). I would be happy to make the edit to the wiki if I understood your explanation completely. When you say that it is "looking up what collection that core is part of" ... I'm curious how a core is being put under a particular collection in the first

Re: How to order results by word position???

2011-09-09 Thread Chris Hostetter
: I have a problem with solr search. If I search after "vitamin" I receive : : 1 - arrca MULTIVITAMIN FRUCHTSAFTBÄRCHEN : 2 - VITAMIN E-KAPSELN NAT. 400 1) That first example will not match a query for "vitamin" using the analyzers you specified -- so if those are the results you are getting,

Weird behaviors with not operators.

2011-09-09 Thread electroyou
Hi all. I'm running into a weird behavior with - operators. If I execute the query -text AND -text I get all the expected results (a lot), but if I add some parentheses, like -text AND (-text) or (-text) AND (-text), then I get no results at all. I can't understand why. Do you have an explanation for this

Re: How to write this query?

2011-09-09 Thread Erick Erickson
That's not valid query syntax at all. what are you trying to do with key=? You probably want something like key:(value1^8 value2^4 value3^2) or key:value1^8 key:value2^4 key:value3^2 Best Erick On Thu, Sep 8, 2011 at 8:29 AM, crisfromnova wrote: > You can try this: q=key:value1^8 key=value2^
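A minimal sketch of the grouped form suggested above, assuming the field really is named key and the terms and boosts are the ones from the thread. The helper name boosted_field_query is hypothetical; it only assembles the query string and URL-encodes it for the q parameter.

```python
from urllib.parse import urlencode


def boosted_field_query(field, boosts):
    """Build field:(term^boost ...) from a list of (term, boost) pairs."""
    clauses = " ".join(f"{term}^{boost}" for term, boost in boosts)
    return f"{field}:({clauses})"


q = boosted_field_query("key", [("value1", 8), ("value2", 4), ("value3", 2)])
params = urlencode({"q": q})  # percent-encodes ':', '(', ')', '^' and turns spaces into '+'
print(q)
print(params)
```

Terms containing spaces or query-parser special characters would need escaping or quoting first; this sketch does not handle that.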

RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-09 Thread Marc Des Garets
Ok thanks, I don't know why the behaviour is different from my 1.4 index then but hopefully it will be the same by doing what you tell me. Thanks again, Marc -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: 09 September 2011 14:40 To: solr-user@lucene.apache.org Sub

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Yury Kats
On 9/9/2011 10:52 AM, Pulkit Singhal wrote: > Thank You Yury. After looking at your thread, there's something I must > clarify: Is solr.xml not uploaded and held in ZooKeeper? Not as far as I understand. Cores are loaded/created by the local Solr server based on solr.xml and then registered with

Re: Solr Cloud - is replication really a feature on the trunk?

2011-09-09 Thread Pulkit Singhal
Thank You Yury. After looking at your thread, there's something I must clarify: Is solr.xml not uploaded and held in ZooKeeper? I ask this because you have a slightly different config between Node 1 & 2: http://lucene.472066.n3.nabble.com/Replication-setup-with-SolrCloud-Zk-td2952602.html On Wed,

Re: scoring only by higher boost

2011-09-09 Thread Jamie Johnson
No problem, occasionally a blind squirrel finds a nut :) On Fri, Sep 9, 2011 at 8:46 AM, crisfromnova wrote: > It works with dismax. > Thank you very much!! > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/scoring-only-by-higher-boost-tp3322666p3322800.html > Sent from

Re: Alias name for a index field

2011-09-09 Thread Erik Hatcher
How are you doing fielded search currently? End users using the "lucene" query parser? Or using dismax/qf? I'm just curious to drill into your needs here exactly in terms of request/response and whether a simple application layer handling of the alias need would suffice or this is something

Re: Alias name for a index field

2011-09-09 Thread darren
See http://wiki.apache.org/solr/FieldAliasesAndGlobsInParams On Fri, 9 Sep 2011 09:59:57 -0400, Tirthankar Chatterjee wrote: > Hi, > Is there a way that we can give an alias name for a field so that the > schema is not required to change. > > Use Case: We defined the schema with a field called

Alias name for a index field

2011-09-09 Thread Tirthankar Chatterjee
Hi, Is there a way that we can give an alias name for a field so that the schema is not required to change? Use Case: We defined the schema with a field called "conv" (basically to store the conversation of an email). There are users who want this to be used as "subject". One Solution: Use copy fiel

Adding Query Filter custom implementation to Solr's pipeline

2011-09-09 Thread Eugene Prystupa
Hi, When I was using Lucene directly I used a custom implementation of query filter to enforce entitlements of search results. Now that I'm switching my infrastructure from a custom host to Solr, what is the best way to configure Solr to use my custom query filter for every request? Thanks! -Eu

TermsComponent from deleted document

2011-09-09 Thread Manish Bafna
Hi, http://wiki.apache.org/solr/TermsComponent states that TermsComponent will return frequencies from deleted documents too. Is there any way to omit the deleted documents when getting the frequencies? I know facets can be used instead. Is it recommended to use facets for an autosuggest feature?

RE: NRT and commit behavior

2011-09-09 Thread Tirthankar Chatterjee
Erick, What you said is correct. For us, the searches are based on some Active Directory permissions, which are populated in the filter query parameter. So we don't have any warming query concept, as we cannot fire one for every user ahead of time. What we do here is that when user logs in we do an invalid

RE: question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-09 Thread Steven A Rowe
Hi Marc, StandardAnalyzer includes StopFilter. See the Javadocs for Lucene 3.3 here: This is not new behavior - StandardAnalyzer in Lucene 2.9.1 (the version of Lucene bundled with Solr 1.4)

Re: Indexing Lotus Notes database using API

2011-09-09 Thread Alexandre Rafalovitch
I was looking at doing something similar a little while ago and I would not actually go with entry-by-entry extraction code. There is a semi-secret way to export the whole Lotus Notes database into an XML format. It can then be processed to extract and import whatever information you want, much mo

Re: indexing data from rich documents - Tika with solr3.1

2011-09-09 Thread Erik Hatcher
If the only thing you're doing is indexing file content, then you can bypass using the Data Import Handler altogether and use the ExtractingRequestHandler (aka Solr Cell). And you can feed in a file from a URL using the stream.url capability, like the stream.file example here:
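A rough sketch of what such a request could look like, built only as a URL. The Solr base URL, the document id, and the PDF link are placeholders, not values from this thread; stream.url, literal.id, and commit are standard parameters of the /update/extract handler. Note that stream.url is honoured only when remote streaming is enabled in solrconfig.xml.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def extract_url(solr_base, file_url, doc_id):
    """Build a Solr Cell request that asks Solr to fetch and index a remote file."""
    params = {
        "stream.url": file_url,  # remote document for Solr to fetch
        "literal.id": doc_id,    # unique key value to assign to the document
        "commit": "true",        # commit right after indexing
    }
    return f"{solr_base}/update/extract?{urlencode(params)}"


# Placeholder host, link, and id for illustration only.
url = extract_url("http://localhost:8983/solr", "http://myweb/filename.pdf", "doc1")
print(url)
```

Issuing a GET or POST against the printed URL would trigger the extraction; the sketch stops short of that so it runs without a Solr server.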

Re: scoring only by higher boost

2011-09-09 Thread Jamie Johnson
I could be wrong, but isn't that what edismax does? http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/search/DisjunctionMaxQuery.html On Fri, Sep 9, 2011 at 7:49 AM, crisfromnova wrote: > Hi, > > For my search I want to calculate the score only by higher boost. For > example: > doc1
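To make the DisjunctionMaxQuery idea concrete, here is a small sketch of how it combines per-field scores: the best clause wins, and the others contribute only through an optional tie-breaker. The input numbers are hypothetical stand-ins for the boosts in the question; real Lucene clause scores also depend on term frequency, idf, and norms, so equal boosts alone do not guarantee equal scores.

```python
def dismax_score(clause_scores, tie=0.0):
    """Combine per-clause scores the way a disjunction-max query does.

    The result is the maximum clause score plus `tie` times the sum of
    the remaining clause scores.
    """
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie * (sum(clause_scores) - best)


# Hypothetical scores: a document matching one field vs. both fields.
one_field = dismax_score([5.0])
both_fields = dismax_score([5.0, 2.0])
print(one_field, both_fields)
```

With tie left at 0.0, the document that matches both fields scores no higher than the one matching only the stronger field, which is the behaviour asked for in the thread.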

scoring only by higher boost

2011-09-09 Thread crisfromnova
Hi, For my search I want to calculate the score only by the higher boost. For example: doc1 Charlie Jonhson doc2 Charlie Charlie So when I use the query : "q=name:Charlie^5 surname:Charlie^2", I want both documents to have the same score, based on the boost value of the first field m

indexing data from rich documents - Tika with solr3.1

2011-09-09 Thread scorpking
Hi everyone, I have a problem with Tika and Solr. I succeeded in indexing data from various file formats (pdf, doc...) using an absolute file path, but now I have a link from the internet (ex: http://myweb/filename.pdf). I want to index from this link, but it does not work, and I don't know why. This is my file

question about StandardAnalyzer, differences between solr 1.4 and solr 3.3

2011-09-09 Thread Marc Des Garets
Hi, I have a simple field defined like this: Which I use here: In solr 1.4, I could do: ?q=(middlename:a*) And I was getting all documents where middlename = A or where middlename starts by the letter A. In solr 3.3, I get only results where middlename starts by the lette

Checkout SearchWorkings.org - it just went live!

2011-09-09 Thread Simon Willnauer
Hey folks, Some of you might have heard, myself and a small group of other passionate search technology professionals have been working hard in the last few months to launch a community site known as SearchWorkings.org [1]. This initiative has been set up for other search professionals to have a s