Re: Stopword filter - refreshing stop word list periodically

2011-11-03 Thread Jithin
Thanks Sami. I ended up setting up a proper core as per the documentation, named core0. On Thu, Nov 3, 2011 at 11:07 PM, Sami Siren-2 [via Lucene] wrote: > On Fri, Oct 14, 2011 at 10:06 PM, Jithin

how to achieve google.com like results for phrase queries

2011-11-03 Thread alxsss
Hello, I use nutch-1.3 crawled results in solr-3.4. I noticed that for two word phrases like newspaper latimes, latimes.com is not in results at all. This may be due to the dismax def type that I use in request handler dismax url^1.5 id^1.5 content^ title^1.2 url^1.5 id^1.5 content^0.5 title^1

Re: Using Solr components for dictionary matching?

2011-11-03 Thread Vijay Ramachandran
On Thu, Nov 3, 2011 at 4:06 PM, Nagendra Mishr wrote: > The scenarios that could use dictionary matching: > > 1. Document being processed to see if it contains one of 10,000 terms. > > 2. Query completion as you type > > 3. Basically the inverse of finding a document.. Instead the document > is t

RE: Using Solr components for dictionary matching?

2011-11-03 Thread Nagendra Mishr
The scenarios that could use dictionary matching: 1. Document being processed to see if it contains one of 10,000 terms. 2. Query completion as you type 3. Basically the inverse of finding a document.. Instead the document is the query term and the dictionary of terms is being matched in paralle

Highlighter showing matched query words only

2011-11-03 Thread Nikeman
Hello Folks, I am new to Solr. I wonder if the Solr Highlighter can show the matched query words only. Suppose my query is "godfather AND pacino." I just want to display "godfather" and "pacino" in any of the highlighted fields. For the sake of performance, I do not want to use regular expression

Re: UnInvertedField vs FieldCache for facets for single-token text fields

2011-11-03 Thread Martijn v Groningen
Hi Michael, The FieldCache is a simpler data structure and easier to create, so I also expect it to be faster. Unfortunately, for TextField, UnInvertedField is always used even if you have one token per document. I think overriding the multiValuedFieldCache method and returning false would work. If yo

Re: Access Document Score in Custom Function Query (ValueSource)

2011-11-03 Thread sangrish
I understand that. Thanks. I just posted a related question, titled "Access Score in Custom Function Query", where (among other things) I am asking about the performance aspects of this method. As you said, I need to execute "some" query first to create a constrained recall set & then apply

Access Score in Custom Function Query

2011-11-03 Thread sangrish
Hi, I have a custom function query (value source) where I want to use the score for some computation. For example, for every document I want to add some number (obtained from an external file) to its score. I am achieving this like the following: http://localhost:PORT/myCore/select?q=querySt

Re: facet with group by (or field collapsing)

2011-11-03 Thread Martijn v Groningen
collapse.facet=after doesn't exist in Solr 3.3. This parameter exists in the SOLR-236 patches and is implemented differently in the released versions of Solr. From Solr 3.4 you can use group.truncate. The facet counts are then computed based on the most relevant documents per group. Martijn On
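A minimal sketch of what such a request could look like, assuming Solr 3.4 or later. The host, port, and handler path are placeholders, and the query and field names (cesy, SIP, REPOSITORYNAME) are borrowed from the question in this thread. It only builds the URL; it does not contact a server.

```python
from urllib.parse import urlencode


def grouped_facet_url(base_url, query, group_field, facet_field):
    """Build a select URL that groups results and asks for truncated facets.

    group.truncate=true asks Solr to compute facet counts from the most
    relevant document of each group rather than from every match.
    """
    params = [
        ("q", query),
        ("group", "true"),
        ("group.field", group_field),
        ("group.limit", "1"),
        ("group.truncate", "true"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    return base_url + "?" + urlencode(params)


# Placeholder host and path; adjust to the actual deployment.
url = grouped_facet_url(
    "http://localhost:8983/solr/select", "cesy", "SIP", "REPOSITORYNAME"
)
print(url)
```

With group.truncate enabled, a group of four matching documents should contribute one document to the REPOSITORYNAME facet counts instead of four.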

Re: facet with group by (or field collapsing)

2011-11-03 Thread erj35
I'm attempting the following query: http://{host}/apache-solr-3.3.0/select/?q=cesy&version=2.2&start=0&rows=10&indent=on&group=true&group.field=SIP&group.limit=1&facet=true&facet.field=REPOSITORYNAME The result is 4 matches all in 1 group (with group.limit=1). Rather than show facet.field=REPOSI

Re: Default value for dynamic fields

2011-11-03 Thread Yonik Seeley
On Thu, Nov 3, 2011 at 12:59 PM, Milan Dobrota wrote: > Is there any way to define the default value for the dynamic fields in > SOLR? I use some dynamic fields of type float with _val_ and if they > haven't been created at index time, the value defaults to 0. I would want > this to be 1. Can that

Re: BaseTokenFilterFactory not found in plugin

2011-11-03 Thread Chris Hostetter
: myorg/solr/analysis/*.java`. I then made a `.jar` file from the .class files : and put the .jar file in the solr/lib/ directory. I modified schema.xml to : include the new filter: what exactly do you mean by "the solr/lib/ directory" ? ... if you mean that "solr" is the solr home dir where you

RE: Questions about Solr's security

2011-11-03 Thread Robert Petersen
Me too! -----Original Message----- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Tuesday, November 01, 2011 1:02 PM To: solr-user@lucene.apache.org Subject: Re: Questions about Solr's security I once had to deal with a severe performance problem caused by a bot that was requesting

Re: Dismax and phrases

2011-11-03 Thread Chris Hostetter
: ...is this perhaps a side effect of the new autoGeneratePhraseQueries : option? ... you are explicitly specifying a quoted phrase, but : maybe somewhere in the code path of the dismax parser that information is : getting lost? FWIW: a) I just realized you said in your first message you were

UnInvertedField vs FieldCache for facets for single-token text fields

2011-11-03 Thread Michael Ryan
I have some fields I facet on that are TextFields but have just a single token. The fieldType looks like this: SimpleFacets uses an UnInvertedField for these fields because multiValuedFieldCache() returns true for TextField. I tried changing the type for these fields to the plain "s

Re: Dismax and phrases

2011-11-03 Thread Chris Hostetter
Interesting, in the case where you use quotes... : + ... : "asuntojen hinnat" : "asuntojen hinnat" ...there is one DisjunctionMaxQuery (expected) for the entire phrase, but in the sub-clauses for each individual field the clauses coming from your "_fi" fields are just building boolean

Re: DIH doesn't handle bound namespaces?

2011-11-03 Thread Chris Hostetter
: *It does not support namespaces, but it can handle xmls with namespaces. The real crux of the issue is that XPathEntityProcessor is terribly named. It should have been called "LimitedXPathishSyntaxEntityProcessor" or something like that because it doesn't support full xpath syntax... "The

admin index version not updating

2011-11-03 Thread Nathan Moon
I have a setup with a master and a single slave, using the collection distribution scripts. I'm not sure if it's relevant, but I'm running multicore also. I am on version 3.4.0 (we are upgrading from 1.3). My understanding is that the indexVersion (a number) reported by the stats page (admin/stats

Re: Access Document Score in Custom Function Query (ValueSource)

2011-11-03 Thread Chris Hostetter
: In this value source I compute another score for every document : using some features. I want to access the score of the query myField^2 : (for a given document) in this same value source. : : Ideas? your ValueSource can wrap the score from the other query using a QueryValueSource.

Re: score based on unique words matching

2011-11-03 Thread Chris Hostetter
: > q=david bowie changes : > : > Problem : If a record mentions david bowie a lot, it beats out something : > more relevant (more unique matches) ... : > : > A. (now appearing david bowie at the cineplex 7pm david bowie goes on stage, : > then mr. bowie will sign autographs) : > B. song :david

performance - dynamic fields versus static fields

2011-11-03 Thread Memory Makers
Hi, Is there a handy resource on the: a. performance of: dynamic fields versus static fields b. other pros-cons? Thanks.

Re: Can you please guide me through step-by-step installation of Solr Cell ?

2011-11-03 Thread Chris Hostetter
: Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler' : : With the jetty and the provided example, I have no problem. It all happens when I use tomcat and solr. : : My setup is as follows: : : I downloaded the apache-solr-3.3.0 and

Re: Default value for dynamic fields

2011-11-03 Thread Milan Dobrota
It doesn't work for me. 2011/11/3 Yury Kats > On 11/3/2011 12:59 PM, Milan Dobrota wrote: > > Is there any way to define the default value for the dynamic fields in > > SOLR? I use some dynamic fields of type float with _val_ and if they > > haven't been created at index time, the value defaults

Ordered proximity search

2011-11-03 Thread LT.thomas
Hi, By ordered I mean term1 will always come before term2 in the document. I have two documents: 1. "By ordered I mean term1 will always come before term2 in the document" 2. "By ordered I mean term2 will always come before term1 in the document" if I make the query: "term1 term2"~Integer.MAX_V

Three questions about: Commit, single index vs multiple indexes and implementation advice

2011-11-03 Thread Gustavo Falco
Hi guys! I have a couple of questions that I hope someone could help me with: 1) Recently I've implemented Solr in my app. My use case is not complicated. Suppose that there will be 50 concurrent users tops. This is an app like, let's say, a CRM. I tell you this so you have an idea in terms of ho

Re: how to apply sort and search both on multivalued field in solr

2011-11-03 Thread Erick Erickson
Right, the behavior when sorting on a multivalued field is not defined, so results are unreliable. There's nothing that I know of that'll allow your sort to occur on the matched terms in a multiValued field. But, again, defining correct behavior here isn't easy. What if you searched for two terms

Re: Stopword filter - refreshing stop word list periodically

2011-11-03 Thread Sami Siren
On Fri, Oct 14, 2011 at 10:06 PM, Jithin wrote: > What will be the name of this hard coded core? I was re arranging my > directory structure adding a separate directory for code. And it does work > with a single core. In trunk the "single core setup" core is called "collection1". So to reload tha
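As a rough illustration of reloading a core by name so that changed resources such as a stop word list are read again, the sketch below builds a CoreAdmin RELOAD URL. The base URL is a placeholder, and it assumes the CoreAdmin handler is reachable at admin/cores, which depends on how the instance is configured.

```python
from urllib.parse import urlencode


def core_reload_url(solr_base, core_name):
    """Return the CoreAdmin URL that asks Solr to reload the named core."""
    query = urlencode({"action": "RELOAD", "core": core_name})
    return solr_base.rstrip("/") + "/admin/cores?" + query


# "collection1" is the core name mentioned above for the single-core setup
# in trunk; "core0" would be used for an explicitly configured core.
reload_url = core_reload_url("http://localhost:8983/solr", "collection1")
print(reload_url)
```

Requesting this URL periodically (for example from a scheduled job) after updating the stop word file would be one way to refresh the list without restarting the servlet container.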

Re: Default value for dynamic fields

2011-11-03 Thread Yury Kats
On 11/3/2011 12:59 PM, Milan Dobrota wrote: > Is there any way to define the default value for the dynamic fields in > SOLR? I use some dynamic fields of type float with _val_ and if they > haven't been created at index time, the value defaults to 0. I would want > this to be 1. Can that be changed

Re: Selective Result Grouping

2011-11-03 Thread Martijn v Groningen
Ok I think I get this. I think this can be achieved if one could specify a filter inside a group and only documents that pass the filter get grouped. For example only group documents with the value image for the mimetype field. This filter should be specified per group command. Maybe we should open

Re: exact matches possible?

2011-11-03 Thread Roland Tollenaar
Hi Erik, you are spot on with your guess. I had reinserted my data but apparently that does not reindex. Delete everything and re-enter was required. Behaviour now seems to be as desired. Thank you very much. PS, thanks for pointing out that the !term is literal. Where can I find that kind

Re: how to apply sort and search both on multivalued field in solr

2011-11-03 Thread vrpar...@gmail.com
Thanks Erick. The values I gave ('abc', etc.) are the values of one multivalued field in one document, which may have been confusing. Let's say I have one field named Array1 with multiValued=true. I want to search on Array1, but I want only the matching values (which I can get via "highlighting"), now

Default value for dynamic fields

2011-11-03 Thread Milan Dobrota
Is there any way to define the default value for the dynamic fields in SOLR? I use some dynamic fields of type float with _val_ and if they haven't been created at index time, the value defaults to 0. I would want this to be 1. Can that be changed?

Re: Stream still in memory after tika exception? Possible memoryleak?

2011-11-03 Thread P Williams
Hi All, I'm experiencing a similar problem to the others in the thread. I've recently upgraded from apache-solr-4.0-2011-06-14_08-33-23.war to apache-solr-4.0-2011-10-14_08-56-59.war and then apache-solr-4.0-2011-10-30_09-00-00.war to index ~5300 pdfs, of various sizes, using the TikaEntityProce

Re: DIH doesn't handle bound namespaces?

2011-11-03 Thread P Williams
Hi Gary, From http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource *It does not support namespaces, but it can handle xmls with namespaces. When you provide the xpath, just drop the namespace and give the rest (eg if the tag is '' the mapping should just contain 's

Re: how to apply sort and search both on multivalued field in solr

2011-11-03 Thread Erick Erickson
What does "sorting on a multivalued field" mean? Should the document appear, in your example, in the a's? c's? e's? p's? There's no logical place to sort a document into a list when there's more than one token that makes sense in the general case that I can think of. Why wouldn't searching oh y
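The ambiguity can be seen in a small plain-Python sketch. The documents and values below are hypothetical, and this is not how Solr sorts internally. It only shows that two equally plausible rules (sort by the smallest value, sort by the largest value) put the same documents in different orders.

```python
# Hypothetical documents, each holding several values in one field.
docs = {
    "doc1": ["abc", "pqr"],
    "doc2": ["def"],
    "doc3": ["ghi", "aaa"],
}

# Rule 1: order documents by their smallest value.
by_min = sorted(docs, key=lambda doc_id: min(docs[doc_id]))

# Rule 2: order documents by their largest value.
by_max = sorted(docs, key=lambda doc_id: max(docs[doc_id]))

print(by_min)
print(by_max)
```

Neither ordering is more correct than the other, which is why a sort on such a field needs an explicit rule, for example a separate single-valued field populated at index time.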

RE: Jetty logging

2011-11-03 Thread darul
Well, jetty is running as a unix service. Here is the run command: jetty-logging.xml: With this configuration I have logs of jetty but no logs of log4j: example "/logs/_mm_dd.stderrout.log" 2011-11-03 14:36:59.306:INFO::jetty-6.1-SNAPSHOT Nov 3, 2011 2:36:59 PM org.apache.solr.core.Solr

Re: Multivalued fields question

2011-11-03 Thread Travis Low
Thanks much, Erick. Between your explanation, and what I read at http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html, the utility of multiValued fields is clear. On Thu, Nov 3, 2011 at 8:26 AM, Erick Erickson wrote: > multiValued has nothing to do with how many to

AW: large scale indexing issues / single threaded bottleneck

2011-11-03 Thread sebastian.reese
Hi, we are currently thinking about the performance figures too. I wonder if there are any sites on the net describing what a large index is? People always talk about huge indexes and heavy commits etc., but I can't find any stats about it in numbers, and no information about the hardware used. M

RE: Questions about Solr's security

2011-11-03 Thread Jaeger, Jay - DOT
It seems to me that this issue needs to be addressed in the FAQ and in the tutorial, and that somewhere there should be a /select lock-down "how to". This is not obvious to many (most?) users of Solr. It certainly wasn't obvious to me before I read this. JRJ -----Original Message----- From:

RE: change solr url

2011-11-03 Thread Jaeger, Jay - DOT
The file that he refers to, web.xml, is inside the solr WAR file in folder web-inf. That WAR file is in ...\example\webapps. You would have to uncomment the section under and change the to something else. But, as the comments in the section explain, you would also have to make other cha

RE: large scale indexing issues / single threaded bottleneck

2011-11-03 Thread Jaeger, Jay - DOT
Shishir, we have 35 million "documents", and should be doing about 5000-1 new "documents" a day, but with very small "documents": 40 fields which have at most a few terms, with many being single terms. You may occasionally see some impact from top level index merges but those should be

Re: Using Solr components for dictionary matching?

2011-11-03 Thread Andrea Gazzarini
Assuming that by "dictionary" you (also) mean a thesaurus, you could consider using SIREn, which is a Solr / Lucene add-on able to index (and search) RDF data. In this way, you could index an already available thesaurus like LCSH or Agrovoc, or build and index your own vocabulary. subsequentl

RE: Jetty logging

2011-11-03 Thread Kai Gülzau
Hi, remove slf4j-jdk14-1.6.1.jar from the war and repack it with slf4j-log4j12.jar and log4j-1.2.14.jar instead. ->http://wiki.apache.org/solr/SolrLogging Regards, Kai Gülzau -----Original Message----- From: darul [mailto:daru...@gmail.com] Sent: Thursday, November 03, 2011 11:26 AM To: solr

Re: pingQuery problem ?

2011-11-03 Thread darul
One of my cores had a missing ping request handler.

Re: Multivalued fields question

2011-11-03 Thread Erick Erickson
multiValued has nothing to do with how many tokens are in the field, it's just whether you can call document.add("field1", val1) more than once on the same field. Or, equivalently, whether the input document in XML has two entries with the same name="field". So it strictly depends upon whether you
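To make the XML side of this concrete, here is a small sketch that builds an update message in which the same field name appears twice in one document. The document id and the values are made up for illustration, and the `id` field is assumed to be the schema's unique key.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id, field_name, values):
    """Build a Solr <add> message that repeats field_name once per value."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = doc_id
    for value in values:
        # One <field> element per value: this repetition is what requires
        # multiValued="true" on the field in schema.xml.
        field = ET.SubElement(doc, "field", name=field_name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


xml_payload = build_add_xml("doc1", "field1", ["val1", "val2"])
print(xml_payload)
```

A field that always receives a single entry per document can stay single-valued no matter how many tokens its analyzer produces from that entry.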

Re: exact matches possible?

2011-11-03 Thread Erik Hatcher
Roland - Is it possible that you indexed with a different field type and then changed to "string" without reindexing? A query on a string will only match literally the exact value (barring any wildcard/regex syntax), so something is fishy with your example. Your query example was odd, not su
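The difference between the two field types can be imitated in plain Python. This is only a simulation under simplifying assumptions: the "tokenized" matcher below is a crude stand-in for a text analyzer, not Solr's actual analysis chain, and the sample values are the ones reported elsewhere in this thread.

```python
import re


def string_field_matches(stored_value, query):
    """A string field is indexed as one literal term: only exact equality hits."""
    return stored_value == query


def tokenized_field_matches(stored_value, query):
    """Crude stand-in for a text field: lowercase, split on non-alphanumerics."""
    tokens = re.findall(r"[a-z0-9]+", stored_value.lower())
    return query.lower() in tokens


values = ["apple", "the red apple", "pine-apple"]

exact_hits = [v for v in values if string_field_matches(v, "apple")]
tokenized_hits = [v for v in values if tokenized_field_matches(v, "apple")]

print(exact_hits)
print(tokenized_hits)
```

If a query for apple still returns all three values after the schema was switched to string, the index most likely still holds terms produced by the old field type, which is what the reindexing question above is probing.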

Re: SOLRJ commitWithin inconsistent

2011-11-03 Thread Nagendra Nagarajayya
Vijay: You may want to try Solr 3.3/3.4 with RankingAlgorithm as it supports NRT (Real Time Updates). You can set the commit interval to about 15 mins or as desired. You can get more information about NRT with 3.3/3.4.0 from here: http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x

Re: Using Solr components for dictionary matching?

2011-11-03 Thread Erick Erickson
I really don't understand what you're asking. Could you give some examples of what you're trying to do? Best Erick On Tue, Nov 1, 2011 at 10:38 AM, Nagendra Mishr wrote: > Hi all, > > Is there a good guide on using Solr components as a dictionary > matcher? I need to do some pre-processing th

pingQuery problem ?

2011-11-03 Thread darul
My Solr instance works well; when calling the ping page I get no problem: But in the logs, I see these error lines repeated. Do you know how to solve this? solrconfig.xml Thanks

Re: exact matches possible?

2011-11-03 Thread Roland Tollenaar
Hi Erik, thanks for the response. I have ensured the type is string and that the field is indexed. No luck though: (Schema setting under solr/conf): Query: Word:apple Desired result: apple Achieved Results: apple, the red apple, pine-apple, etc, etc I have also tried your other sugges