Re: How real-time are Solr/Lucene queries?

2010-05-25 Thread Amit Nithian
This is an interesting discussion and I have a few questions: 1) My apologies but I haven't been following the NRT patch beyond what was presented at a meetup some months back and the wiki but what is the status of it in Solr? 2) What are typical/accepted definitions of "Real Time" vs "Near Real Ti

Re: Full Import failed

2010-05-25 Thread Mohamed Parvez
I am just using the solr.war file that came with the Solr 1.4 download on WebLogic. I did not add or remove any jar. On Tue, May 25, 2010 at 9:54 PM, Chris Hostetter wrote: > > : yes i am running 1.5, Any idea how we can run Solr 1.4 using Java 1.5 > > Solr 1.4 works just fine with Java 1.5

Re: Full Import failed

2010-05-25 Thread Chris Hostetter
: yes i am running 1.5, Any idea how we can run Solr 1.4 using Java 1.5 Solr 1.4 works just fine with Java 1.5 -- even when using the DataImportHandler. There are some features of DIH, like the ScriptTransformer, that require Java 1.6, but that's not your issue... : > Last I encountered that

Re: Solr highlighter and custom queries?

2010-05-25 Thread Chris Hostetter
: Actually, it's not as much a Solr problem as a Lucene one, as it turns : out, the WeightedSpanTermExtractor is in Lucene and not Solr. : : Why they decided to only highlight queries that are in Lucene I don't : know, but what I did to solve this problem was simply to make my queries : extends

Re: solr caches from external caching system like memcached

2010-05-25 Thread Chris Hostetter
: Is it possible to use solr caches such as query cache , filter cache : and document cache from external caching system like memcached as it : has several advantages such as centralized caching system and reducing the : pause time of JVM 's garbage collection as we can assign less

Re: Solr Delta Queries

2010-05-25 Thread Chris Hostetter
: : For some reason when doing delta indexing via DIH, this field is not being updated. : : Are timestamp fields updated during DELTA updates? timestamp fields aren't treated any differently than any other field -- as far as Solr is concerned this is just a date field that happens to have a defa

Re: question about indexing...

2010-05-25 Thread Erick Erickson
Don't forget to re-index after you make the change Lance suggested... Erick On Tue, May 25, 2010 at 4:51 PM, Lance Norskog wrote: > Change type="string" to type="text". This causes the field to be > analyzed and then searching on words finds the document. > > > > On Tue, May 25, 2010 at 8:34 AM

Re: Solr Cell and encrypted pdf files

2010-05-25 Thread Chris Hostetter
: I can't seem to get solr cell to index password protected pdf files. : I can't figure out how to pass the password to tika and looking at : ExtractingDocumentLoader, : it doesn't seem to pass any pdf password related metadata to the tika parser. I suspect you are correct, i don't think anyone h

Re: Debugging - DIH Delta Queries-

2010-05-25 Thread Chris Hostetter
: Subject: Debugging - DIH Delta Queries- : References: : <1659766275.5213.1274376509278.javamail.r...@vicenza.dmz.lexum.pri> : In-Reply-To: : <1659766275.5213.1274376509278.javamail.r...@vicenza.dmz.lexum.pri> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing L

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread Ryan McKinley
You may also want to look at: ClientUtils.escapeQueryChars( String s ) http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars%28java.lang.String%29 this will escape any lucene query chars, then pass it to URLEncoder and you should be good to go. On
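A minimal sketch of the escape-then-encode order described above, assuming plain Java without SolrJ on the classpath. The `escape` helper is a hypothetical, simplified stand-in for `ClientUtils.escapeQueryChars`, not its actual implementation, and the set of characters it treats as special is an assumption based on the Lucene query syntax.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryEscapeSketch {
    // Characters treated as special in this sketch (an assumption, not SolrJ's list).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    /** Prefixes every special character with a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Escapes query syntax first, then URL-encodes the result as UTF-8. */
    static String escapeAndEncode(String s) {
        try {
            return URLEncoder.encode(escape(s), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The '?' is escaped for the query parser before the URL encoding is applied.
        System.out.println(escapeAndEncode("oh n?"));
    }
}
```

The order matters: the backslash added by the escaping step is itself URL-encoded, so it survives the HTTP request and reaches the query parser intact.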

Enhancing Solr relevance functions through predefined constants

2010-05-25 Thread Prasanna R
Hi all, I have a suggestion for improving relevance functions in Solr by way of providing access to a set of pre-defined constants in Solr queries. Specifically, the number of documents indexed, the number of unique terms in a field, the total number of terms in a field, etc. are some of the query

Re: question about indexing...

2010-05-25 Thread Lance Norskog
Change type="string" to type="text". This causes the field to be analyzed and then searching on words finds the document. On Tue, May 25, 2010 at 8:34 AM, Jörg Agatz wrote: > i create a new Index, but nothing Change. > >   multiValued="true"/> > > > > > > > > > I search for : > > " *:* " > I fo

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
I was assuming that I needed to leave the special characters in the http get, but running the solr admin it looks like it converts them the same way that URLEncoder.encode does. What is the need to preserve special characters? http://localhost:8983/solr/select?indent=on&version=2.2&q=%22mr.+bill

RE: Solr read-only core

2010-05-25 Thread Yao
My motivation is more from the performance perspective than the functional perspective. I was hoping that by opening the Solr index/core read-only, the underlying Lucene IndexReader can be opened in read-only mode for optimum query performance (removing the overhead of multi-thread management). -- View this m

Re: IndexSearcher and Caches

2010-05-25 Thread Lance Norskog
The stats.jsp page walks the internal JMX beans. It prints out the numbers of documents among other things. I would look at how that works instead of writing your own thing for the internal APIs. They may have changed from Solr 1.3 to 1.4 and will change further for 1.5 (4.0 is the new name?). On

RE: Solr read-only core

2010-05-25 Thread Markus Jelsma
Hi, I'd guess there are two ways of doing this, but I've never seen any solrconfig.xml file with directives that explicitly disallow updates. You'd either have a proxy in front that simply won't allow any other HTTP method than GET and HEAD, or you could remove the update re

Solr read-only core

2010-05-25 Thread Yao
Is there a way to open a Solr index/core in read-only mode? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-read-only-core-tp843049p843049.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing stalls reads

2010-05-25 Thread Lance Norskog
This sounds like you have the same solrconfig for the slave and the master? You should turn off autoCommit on the slave. Only the master should autoCommit. You should set up the ReplicationHandler. This moves index updates from the indexer to the query server. http://www.lucidimagination.com/sear

Re: SOLR-343 date facet mincount patch

2010-05-25 Thread Umesh_
Chris, Please ignore the repeated response header due to typo in the previous message. ~Umesh

Re: SOLR-343 date facet mincount patch

2010-05-25 Thread Umesh_
Hoss, I was able to successfully apply the patch SOLR-343, but even after applying the patch, date facet minCount does not work. The relevant parts of the response are given below: ["responseHeader"] => object(SolrObject)#107 (3) { ["status"] => int(0) ["QTime"] => int(4) ["params"] =>

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
Thanks Sean, that was exactly what I needed. One question though... How do I correctly retain the Solr specific characters? I tried adding escape chars but URLEncoder doesn't seem to care about that: Example: String s1 = "\"mr. bill\" oh n?"; String s2 = "\\\"mr. bill\\\" oh n\\?"; String encoded1

Re: Faceted search not working? (RESOLVED)

2010-05-25 Thread Ilya Sterin
Ah, the issue was explicitly specifying components... query I don't remember changing this during default install, commenting this out enabled faceted search component. Thanks all for the help. Ilya On Tue, May 25, 2010 at 10:38 AM, Sascha Szott wrote: > Hi, > > please note, that the Face

Re: Faceted search not working?

2010-05-25 Thread Ilya Sterin
Sascha thanks for the response, here is the output... 0 0 xml title:* title Baseball game Soccer game Football game On Mon, May 24, 2010 at 5:39 PM, Sascha Szott wrote: > Hi Ilya, > > Ilya Sterin

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread Chris Hostetter
:Is there any way to get all the fields (irrespective of whether : it contains a value or null) in solrDocument. no. a document only has "Field" instances for the fields which it has values for. it's also not a feature that would even be theoretically possible to add, because of d

Re: How real-time are Solr/Lucene queries?

2010-05-25 Thread Thomas J. Buhr
My documents are all quite small, if not downright tiny, so there is not much analysis to do. I plan to mainly use Solr for indexing application configuration data, which there is a lot of and which I have all pre-formatted. Since it is a music application there are many score templates, scale and rhythm

Re: Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread Sean Timm
Java provides one. You probably want to use utf-8 as the encoding scheme. http://java.sun.com/javase/6/docs/api/java/net/URLEncoder.html Note you also will want to strip or escape characters that are meaningful in the Solr/Lucene query syntax. http://lucene.apache.org/java/2_4_0/queryparsersyn
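A minimal sketch of the URLEncoder approach suggested here, using only the JDK. The class and method names are made up for illustration; the input is the example search string from the original question in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SolrUrlEncodingSketch {
    /** URL-encodes a raw search string as UTF-8 for use as an HTTP GET parameter value. */
    static String encodeParam(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String userInput = "\"mr. bill\" oh no";
        // Quotes become %22 and spaces become '+'.
        System.out.println("q=" + encodeParam(userInput));
    }
}
```

For this input the encoded value is `%22mr.+bill%22+oh+no`, the same `q` value that appears in the admin page URL quoted in the original question.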

Does SOLR provide a java class to perform url-encoding

2010-05-25 Thread JohnRodey
I would like to leverage whatever SOLR provides to properly url-encode a search string. For example a user enters: "mr. bill" oh no The URL submitted by the admin page is: http://localhost:8983/solr/select?indent=on&version=2.2&q=%22mr.+bill%22+oh+no&fq=&start=0&rows=10&fl=*%2Cscore&qt=standa

Re: Problem with extended dismax, minus prefix (to mean NOT) and interaction with mm?

2010-05-25 Thread Chris Hostetter
: I'm running edismax (on both a 1.4 with patch and a branch_3x version) and : I'm seeing something I don't expect. ... : dog cat -trilogy : dog cat -trilogy : allfields:dog allfields:cat : -allfields:trilogi : allfields:dog allfields:cat : -allfields:trilogi Hmmm... something is reall

Re: How real-time are Solr/Lucene queries?

2010-05-25 Thread Jason Rutherglen
The main issue is if you're using facets, which are currently inefficient for the realtime use case because they're created on the entire set of segment/readers. Field caches in Lucene are per segment and so don't have this problem. On Tue, May 25, 2010 at 4:09 AM, Grant Ingersoll wrote: > How m

Help me understand query syntax of subqueries

2010-05-25 Thread Tigi Scramble
Any idea why this query returns 0 records: "sexual assault" AND (-obama) while this one returns 1400 ? "sexual assault" AND -(obama) Some debug info: "sexual assault" AND (-obama), translates to: +text:"sexual assault" +(-text:obama), returns 0 records "sexual assault" AND -(obama), tr

Re: caching on unique queries

2010-05-25 Thread Chris Hostetter
: Pretty much every one of my queries is going to be unique. However, the : query is fairly complex and also contains both unique and non-unique : data. In the query, some fields will be unique (e.g description), but : other fields will be fairly common (e.g. category). If we could use : those

Re: question about indexing...

2010-05-25 Thread Jörg Agatz
I created a new index, but nothing changed. When I search for " *:* " I find it. When I search for "hallo", "Hallo", "hallo*", "Hallo*" or some other content from the CDATA field, I don't.

Re: question about indexing...

2010-05-25 Thread Erik Hatcher
You have to provide more details than that. We need to know the field definition for that named field, the corresponding field type definition, and the exact request you're making to Solr that you think should find this document. And most importantly, did you :) Erik On May 25,

Re: question about indexing...

2010-05-25 Thread Jörg Agatz
OK, done. But now I don't find any word in the CDATA field. I made: it is a multivalued string field. King

Re: question about indexing...

2010-05-25 Thread Erik Hatcher
Well, you'll just have to create valid XML, either encoding some characters or using CDATA sections. Erik On May 25, 2010, at 10:06 AM, Jörg Agatz wrote: I have a task: I must index a lot of e-mails, so I will create a script to generate an XML file of the mails. Now the que

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
Hi, please note, that the FacetComponent is one of the six search components that are automatically associated with solr.SearchHandler (this holds also for the QueryComponent). Another note: By using name="components" all default components will be replaced by the components you explicitly m

Re: IndexSearcher and Caches

2010-05-25 Thread Rahul R
Chris, I am using SolrIndexSearcher to get a handle to the total number of records in the index. I am doing it like this : int num = Integer.parseInt(solrSearcher.getStatistics().get("numDocs").toString()); Please let me know if there is a better way to do this. Mark, I can tell you what I

question about indexing...

2010-05-25 Thread Jörg Agatz
I have a task: I must index a lot of e-mails, so I will create a script to generate an XML file of the mails. Now the question is: what happens when I create a field "body" and this field contains a lot of "<" or ">", like this: Confidentiality Caution: This message and all its included content a

Re: Faceted search not working?

2010-05-25 Thread Jean-Sebastien Vachon
Is the FacetComponent loaded at all? query facet On 2010-05-25, at 3:32 AM, Sascha Szott wrote: > Hi Birger, > > Birger Lie wrote: >> I don't think the boolean fields are mapped to "on" and "off" :) > You can use true and on interchangeably. > > -Sascha > >> >> >> -birg

Re: sort by field length

2010-05-25 Thread Erick Erickson
Ah, I may have misunderstood, I somehow got it in my mind you were talking about the length of each term (as in string length). But if you're looking at the field length as the count of terms, that's another question, sorry for the confusion... I have to ask, though, why you want to sort this way

Re: Tagging and excluding Filters

2010-05-25 Thread Lukas Kahwe Smith
On 25.05.2010, at 08:55, Lukas Kahwe Smith wrote: > Now when I deselect one of the checkboxes I add an fq parameters: > facet=true&fl=*,score&sort=score+desc&start=0&q=(tag_ids:("23"))&facet.field={!ex%3Ddt}organisation_id&facet.field={!ex%3Ddt}tag_ids&facet.field={!ex%3Ddt}addressee_ids&facet.fi

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
If a field doesn't have a value, You will get NULL on retrieving it. How could you expect a value for a field which is not provided? You have two options, choose either one.. 1. If the fieldvalue is returned NULL then display a proper error / user defined message. Handle the error. 2. Add a dummy

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread Rakhi Khatwani
Hi Aditya, I can retrieve all documents, but cannot retrieve all the fields in a document (if it does not have any value). For example, I get a list of documents; some of the documents have some value for the title field, and others might not contain a value for the title field. In any case I need to

Re: Machine utilization while indexing

2010-05-25 Thread Thijs
Hi all, I did some further investigation and (after turning off some filters in yourkit) found that it was actually the machine sending the files to solr that was slowing things down. At first I couldn't find this as it turned out that yourkit hides org.apache.* classes. When I removed this f

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
Resending it as there is a typo. To retrieve all documents, you need to use the query/filter FieldName:*:* . Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:29 PM, findbestopensource < findbestopensou...@gmail.com> wrote: > To retrieve all documents, You need to use

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
To retrieve all documents, you need to use the query/filter *FieldName:*:** Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:14 PM, Rakhi Khatwani wrote: > Hi, > Is there any way to get all the fields (irrespective of whether > it contains a value or null) in solrDocu

Re: How real-time are Solr/Lucene queries?

2010-05-25 Thread Grant Ingersoll
How many docs are in the batch you are pulling down? How many docs/second do you expect on the index size? How big are the docs? What do you expect in terms of queries per second? How fast do new documents need to be available on the local server? How much analysis do you have to do? Also,

Re: Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread findbestopensource
To retrieve all documents, you need to use the query/filter *FieldName:*:** Regards Aditya www.findbestopensource.com On Tue, May 25, 2010 at 4:14 PM, Rakhi Khatwani wrote: > Hi, > Is there any way to get all the fields (irrespective of whether > it contains a value or null) in solrDo

Using solrJ to get all fields in a particular schema/index

2010-05-25 Thread Rakhi Khatwani
Hi, Is there any way to get all the fields (irrespective of whether it contains a value or null) in solrDocument. or Is there any way to get all the fields in schema.xml of the url link ( http://localhost:8983/solr/core0/)?? Regards, Raakhi

RE: Highlighting is not happening

2010-05-25 Thread Doddamani, Prakash
Thanks much all I am using following looks good to me Regards Prakash -Original Message- From: Sascha Szott [mailto:sz...@zib.de] Sent: Tuesday, May 25, 2010 1:16 PM To: solr-user@lucene.apache.org Subject: Re: Highlighting is not happening Hi, to accomplish that, use the highligh

Re: How well does Solr scale over large number of facet values?

2010-05-25 Thread Marc Sturlese
Since Solr 1.4 I think the uninverted method is on by default. Anyway, you can choose which to use with the method param: facet.method=fc/enum (where fc is the uninverted one) http://wiki.apache.org/solr/SimpleFacetParameters
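A minimal sketch of passing facet.method on a request, assuming a plain query string built with the JDK only. The helper name is hypothetical, and the field name "group" is taken from the original question in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetMethodSketch {
    /** Builds a query string that facets on one field with an explicit facet.method. */
    static String facetQuery(String field, String method) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "*:*");
        params.put("facet", "true");
        params.put("facet.field", field);
        params.put("facet.method", method); // "fc" or "enum"

        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Append the result to the select handler URL of your own Solr instance.
        System.out.println(facetQuery("group", "fc"));
    }
}
```

Switching the second argument to "enum" selects the older algorithm, which makes it easy to compare the two on the same field.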

Re: Problem with extended dismax, minus prefix (to mean NOT) and interaction with mm?

2010-05-25 Thread Erik Hatcher
This looks like a case where the extended dismax parser is creating a Lucene QueryParser parsed query rather than a disjunction maximum query. A case of "too much magic" maybe? Looks like this one should be parsed quite differently. Try dismax and see what you get, it'll be quite differe

Re: How well does Solr scale over large number of facet values?

2010-05-25 Thread Andy
Thanks. Do I need to configure Solr to use the uninverted algorithm or is it the default algorithm? --- On Tue, 5/25/10, Marc Sturlese wrote: > From: Marc Sturlese > Subject: Re: How well does Solr scale over large number of facet values? > To: solr-user@lucene.apache.org > Date: Tuesday, Ma

Re: How well does Solr scale over large number of facet values?

2010-05-25 Thread Marc Sturlese
With the uninverted algorithm it will be very fast whatever the number of unique terms is. But be careful with memory, because it uses quite a lot. Using the older facet algorithm, if you have a lot of different terms it will be slow.

Re: Apache or Nginx In front of SOLR?

2010-05-25 Thread Kranti™ K K Parisa
Thanks Paul, I shall continue doing some more R&D with your inputs. Best Regards, Kranti K K Parisa On Tue, May 25, 2010 at 12:54 PM, Paul Dhaliwal wrote: > It depends on what kind of load you are talking about and what your > expertise is. > > NGINX does perform better than apache for most p

Re: Highlighting is not happening

2010-05-25 Thread Sascha Szott
Hi, to accomplish that, use the highlighting parameters hl.simple.pre and hl.simple.post. By the way, there are plenty of other parameters that affect highlighting. Take a look at: http://wiki.apache.org/solr/HighlightingParameters -Sascha Doddamani, Prakash wrote: Hey, I thought the

Re: sort by field length

2010-05-25 Thread Sascha Szott
Hi Erick, Erick Erickson wrote: Are you sure you want to recompute the length when sorting? It's the classic time/space tradeoff, but I'd suggest that when your index is big enough to make taking up some more space a problem, it's far too big to spend the cycles calculating each term length for

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
Hi Birger, Birger Lie wrote: I don't think the boolean fields are mapped to "on" and "off" :) You can use true and on interchangeably. -Sascha -birger -Original Message- From: Ilya Sterin [mailto:ster...@gmail.com] Sent: 24. mai 2010 23:11 To: solr-user@lucene.apache.org Subject: Fa

RE: Highlighting is not happening

2010-05-25 Thread Doddamani, Prakash
Hey, I thought the highlights would appear in the fields of the documents returned by SolrJ, but it gives a separate highlighting list below them; sorry for the confusion. I was wondering whether there is a way for the returned fields themselves to contain the bold characters. Eg : if searched for "query" re
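One way to get highlighted text into the returned fields is to merge the separate highlighting list back into each document on the client side. The sketch below assumes the document and its highlighting entry are already available as plain Java maps (field name to value, and field name to list of snippets); the class and method names are hypothetical, and no SolrJ calls are used.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HighlightMergeSketch {
    /**
     * Returns a copy of the document in which every field that has at least
     * one highlight snippet is replaced by its first snippet.
     */
    static Map<String, Object> applyHighlights(Map<String, Object> doc,
                                               Map<String, List<String>> snippetsForDoc) {
        Map<String, Object> merged = new LinkedHashMap<String, Object>(doc);
        if (snippetsForDoc == null) {
            return merged; // nothing was highlighted for this document
        }
        for (Map.Entry<String, List<String>> e : snippetsForDoc.entrySet()) {
            List<String> snippets = e.getValue();
            if (snippets != null && !snippets.isEmpty()) {
                merged.put(e.getKey(), snippets.get(0));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<String, Object>();
        doc.put("id", "1");
        doc.put("title", "a query example");

        Map<String, List<String>> snippets = new LinkedHashMap<String, List<String>>();
        snippets.put("title", Arrays.asList("a <b>query</b> example"));

        System.out.println(applyHighlights(doc, snippets));
    }
}
```

Note that a snippet may be only a fragment of the stored field value, so replacing the field this way can shorten it; whether that is acceptable depends on how the field is displayed.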

Re: Apache or Nginx In front of SOLR?

2010-05-25 Thread Paul Dhaliwal
It depends on what kind of load you are talking about and what your expertise is. NGINX does perform better than apache for most people, however fewer people know about NGINX than apache. If you have more than 100K searchers a day doing a few searches each, you will benefit from NGINX. If your tra

How well does Solr scale over large number of facet values?

2010-05-25 Thread Andy
I want to facet over a field "group". Since "group" is created by users, potentially there can be a huge number of values for "group". - Would Solr be able to handle a use case like this? Or is Solr not really appropriate for facet fields with a large number of values? - I understand that I ca