What does this mean?

2011-12-13 Thread Kashif Khan
I have field value cache stats. Can anyone tell me whether this shows any problem with respect to performance? name:fieldCache class: org.apache.solr.search.SolrFieldCacheMBean version: 1.0 description: Provides introspection of the Lucene FieldCache, this is **NOT** a

Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
Hi, I am using solr replication and am experiencing a lot of connections in the state CLOSE_WAIT at the master solr server. These disappear after a while, but till then the master solr stops responding. There are about 130 open connections on the master server with the client as the slave m/c

Re: Virtual Memory very high

2011-12-13 Thread Dmitry Kan
If you allow me to chime in, is there a way to check for which DirectoryFactory is in use, if ${solr.directoryFactory:solr.StandardDirectoryFactory} has been configured? Dmitry 2011/12/12 Yury Kats yuryk...@yahoo.com On 12/11/2011 4:57 AM, Rohit wrote: What are the difference in the

Looking for a good commit/merge strategy

2011-12-13 Thread peter_solr
Hi all, we are indexing real-time documents from various sources. Since we have multiple sources, we encounter quite a number of duplicates which we delete from the index. This mostly occurs within a short timeframe; deletes of older documents may happen, but they do not have a high priority.

Re: Looking for a good commit/merge strategy

2011-12-13 Thread darren
How do you determine a duplicate? Solr has de-duplication built in and also you may consider hashing documents on some fields to create a consistent doc id that would be the same for same documents and let Solr re-write them. Either approach would reduce or eliminate the possibility of duplicates
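The hashing idea mentioned above can be sketched roughly as follows. This is only an illustration, assuming documents are plain dicts on the client side; the field names (`title`, `body`, `source`) and the helper `make_doc_id` are hypothetical, and SHA-1 is just one reasonable choice of hash. Because the ID depends only on the chosen fields, re-sending the same content produces the same ID, so Solr would overwrite the earlier copy instead of storing a duplicate.

```python
import hashlib

def make_doc_id(doc, fields):
    """Derive a stable ID from selected fields of a document (hypothetical helper)."""
    h = hashlib.sha1()
    for name in fields:
        value = str(doc.get(name, ""))
        # Prefix each value with its field name and length so that
        # ("ab", "c") and ("a", "bc") do not hash to the same ID.
        h.update(f"{name}:{len(value)}:".encode("utf-8"))
        h.update(value.encode("utf-8"))
    return h.hexdigest()

# Same content from two sources, plus one genuinely different document.
a = {"title": "Solr tips", "body": "Use *:* to match all docs", "source": "feed-1"}
b = {"title": "Solr tips", "body": "Use *:* to match all docs", "source": "feed-2"}
c = {"title": "Solr tips", "body": "Different text", "source": "feed-1"}

key_fields = ["title", "body"]  # assumed to define what "duplicate" means
print(make_doc_id(a, key_fields) == make_doc_id(b, key_fields))
print(make_doc_id(a, key_fields) == make_doc_id(c, key_fields))
```

Which fields go into the hash is the real design decision: leaving `source` out is what makes the two feeds collapse into one document here.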

Problem with result grouping

2011-12-13 Thread Kissue Kissue
Hi, maybe there is something I am missing here, but I have a field in my Solr index called categoryId. The field definition is as follows: <field name="categoryId" type="string" indexed="true" stored="true" required="true"/> I am trying to group on this field and I get a result as follows: str

Re: Looking for a good commit/merge strategy

2011-12-13 Thread solr-ra
Peter: You may want to take a look at Solr 3.4 with RankingAlgorithm 1.3. It has NRT support that allows you to search in real time with updates. The performance is about 1 docs / sec with the MBArtists index (approx 43 fields ). MBArtists index is the index of artists from musicbrainz.org in

Re: Quick relevance question

2011-12-13 Thread Erick Erickson
There's actually a Solr JIRA about this: https://issues.apache.org/jira/browse/SOLR-2953 But it begs the question of why you want to do this? Are you sure this would actually provide a better experience for your users? The reason I ask is that you could put a lot of effort into making this

Re: Highlighter highlighting terms which are not part of the search

2011-12-13 Thread Erick Erickson
Well, we need some more details to even guess. Please review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Mon, Dec 12, 2011 at 12:04 AM, Shyam Bhaskaran shyam.bhaska...@synopsys.com wrote: Hi We recently upgraded our Solr to the latest 4.0 trunk and we are seeing a weird

Re: Ask about the question of solr cache

2011-12-13 Thread Erick Erickson
Are you sure you commit after you're done? If you change the index, this should all be automatic. Although that doesn't make a lot of sense if you restart Solr because the changes would probably be lost then. But I'm a bit confused about what you mean by caches not being updated. Do you mean

Re: manipulate the results coming back from SOLR? (was: possible to do arithmetic on returned values?)

2011-12-13 Thread Erick Erickson
Erik Hatcher wrote you a comment assuming you were using Velocity. The more generic form of that comment is that this is an app-level issue by and large. Solr is in charge of searching and returning data, the app is a better place to change that into something pretty... Best Erick On Mon, Dec

MLT as a nested query

2011-12-13 Thread Vyacheslav Zholudev
Hi, is it possible to use MLT as a nested query? I tried the following: select?q=field1:foo field2:bar AND _query_:{!mlt fl=mltField mindf=1 mintf=1 mlt.match.include=false} selectField:baz} but it does not work with an error: Unknown query type 'mlt' I guess I should have an MLT parser

social/collaboration features on top of solr

2011-12-13 Thread Robert Stewart
Has anyone implemented some social/collaboration features on top of SOLR? What I am thinking is ability to add ratings and comments to documents in SOLR and then be able to fetch comments and ratings for each document in results (and have as part of response from SOLR), similar in fashion to

RE: social/collaboration features on top of solr

2011-12-13 Thread Demian Katz
VuFind (http://vufind.org) uses Solr for library catalog (or similar) applications and features a MySQL database which it uses for storing user tags and comments outside of Solr itself. If there were a mechanism more closely tied to Solr for achieving this sort of effect, that would allow

Re: Looking for a good commit/merge strategy

2011-12-13 Thread peter_solr
@ project: Thanks for the hints, I will take a look! @ Nagendra: Solr-RA seems very interesting! I take it that you can use it with an existing index? -- View this message in context: http://lucene.472066.n3.nabble.com/Looking-for-a-good-commit-merge-strategy-tp3582294p3582626.html Sent from

edismax phrase matching with a non-word char inbetween

2011-12-13 Thread Robert Brown
I have a field which is indexed and queried as follows: <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="text-synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"

Re: Looking for a good commit/merge strategy

2011-12-13 Thread Nagendra Nagarajayya
Yes, no changes to your existing index. No commit needed. You may want to change your autocommit interval to about 15 mins ... Regards, - Nagendra Nagarajayya http://solr-ra.tgels.org http://rankingalgorithm.tgels.org On 12/13/2011 7:32 AM, peter_solr wrote: @ project: Thanks for the hints,

Matching all documents in the index

2011-12-13 Thread Kissue Kissue
Hi, I have come across this query in the admin interface: *.* Is this meant to match all documents in my index? Currently when I run a query with q=*.*, numFound is 130310 but the actual number of documents in my index is 603308. When I then run the query with q=* then numFound is 603308

CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Tarun Jain
Hi, When replication is happening between master and slave, what operations can we do on the master and what operations are possible on the slave? I know it is not advisable to do DML on the slave index but I wanted to know this anyway. Also I understand that doing DML on a slave will make the slave

Re: Matching all documents in the index

2011-12-13 Thread Simon Willnauer
try *:* instead of *.* simon On Tue, Dec 13, 2011 at 5:03 PM, Kissue Kissue kissue...@gmail.com wrote: Hi, I have come across this query in the admin interface: *.* Is this meant to match all documents in my index? Currently when i run query with q= *.*, numFound is 130310 but the actuall

Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
I want to modify incoming queries such that a field is always transformed to a filter query. For example, I want to convert a query field like q= ... part_page=3 ... to a filter query like q= ... fq=partpage(3) . Is the right way to do this in a custom component, or is there someplace else

Re: Matching all documents in the index

2011-12-13 Thread Kissue Kissue
Hi Simon, Thanks for this. Query time dramatically reduced to 27ms with this. Many thanks. On Tue, Dec 13, 2011 at 4:20 PM, Simon Willnauer simon.willna...@googlemail.com wrote: try *:* instead of *.* simon On Tue, Dec 13, 2011 at 5:03 PM, Kissue Kissue kissue...@gmail.com wrote: Hi,

Re: Create 2 index with solr

2011-12-13 Thread pratikmhta
Thank you Dimitry Kan -- View this message in context: http://lucene.472066.n3.nabble.com/Create-2-index-with-solr-tp2730485p3581568.html Sent from the Solr - User mailing list archive at Nabble.com.

reposting highlighting questions

2011-12-13 Thread Bent Jensen
I am new to solr/xml/xslt, and trying to figure out how to display search query fields highlighted in html. I can enable the highlighting in the query, and I think I get the correct xml response back (See below: I search using 'Contents' and the highlighting is shown with <strong> and </strong>.

Combination of edgengram and ngram

2011-12-13 Thread Shawn Heisey
I am interested in a new filter type, one that would combine edgengram and ngram. The idea is that it would create all ngrams specified by the min/max size, but the ngrams that happen to be edgengrams (specifically the left side) would get an index-time boost. Optionally the boost would be

Re: MySQL data import

2011-12-13 Thread Shawn Heisey
On 12/11/2011 1:54 PM, Brian Lamb wrote: By nature of my schema, I have several multivalued fields. Each one I populate with a separate entity. Is there a better way to do it? For example, could I pull in all the singular data in one sitting and then come back in later and populate with the

RE: Virtual Memory very high

2011-12-13 Thread Rohit
Thanks Yurykats. Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -Original Message- From: Dmitry Kan [mailto:dmitry@gmail.com] Sent: 13 December 2011 11:17 To: solr-user@lucene.apache.org Subject: Re: Virtual Memory very high If you allow me to chime in, is

How to get SolrServer

2011-12-13 Thread Joey
Hi, I am new to Solr and want to do some custom development. I have wrapped Solr in my own web application, and want to write a servlet to index a file system. The question is: how can I get a SolrServer inside my servlet? -- View this message in context:

Re: Matching all documents in the index

2011-12-13 Thread Chris Hostetter
: Thanks for this. Query time dramatically reduced to 27ms with this. to understand what is going on, use debugQuery=true with each of those examples and look at the query toString info. *:* is the one and only true syntax (in any solr QParser that i know of) for finding all docs efficiently.
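A small sketch of how such a request could be built, for anyone who wants to compare the two queries side by side. The base URL is an assumption (a default local install), `match_all_url` is a hypothetical helper, and `rows=0` is added only to keep the response small while inspecting `numFound` and the debug output.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed location of the select handler; adjust for your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"

def match_all_url(base=SOLR_SELECT, query="*:*", debug=True, rows=0):
    """Build a select URL for a query, optionally with debugQuery=true."""
    params = {"q": query, "rows": rows}
    if debug:
        # debugQuery=true makes Solr include the parsed query in the response.
        params["debugQuery"] = "true"
    # urlencode percent-escapes the query string so it survives the HTTP request.
    return base + "?" + urlencode(params)

good = match_all_url()             # q=*:*  (match all documents)
bad = match_all_url(query="*.*")   # q=*.*  (wildcard search on the default field)
print(good)
print(bad)
```

Fetching both URLs and comparing the parsed query in the debug section should show why the two behave so differently.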

RE: Problem with result grouping

2011-12-13 Thread Young, Cody
There's another discussion going on about this in the solr users mail list, but I think you're using *.* instead of *:* to match all documents. *.* ends up doing a search against the default field where *:* means match all documents. Cody -Original Message- From: Kissue Kissue

Re: highlighting questions

2011-12-13 Thread Erick Erickson
How to get *what* to display in HTML? The VelocityResponseWriter? Extracting this content to show in your webapp? How are you displaying any page at all? You can look at the examples in the VelocityResponseWriter to get an idea of how to do this with that templating engine. But the general

solr ignore duplicate documents

2011-12-13 Thread Alexander Aristov
People, I am asking for your help with solr. When a document is sent to solr and such document already exists in its index (by its ID) then the new doc replaces the old one. But I don't want to automatically replace documents. Just ignore and proceed to the next. How can I configure solr to do

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Joey
Could anybody help? -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-get-SolrServer-within-my-own-servlet-tp3583304p3583368.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Patrick Plaatje
Have a look here first; you will probably be using EmbeddedSolrServer. http://wiki.apache.org/solr/Solrj Patrick On 13 Dec 2011, at 20:38, Joey vanjo...@gmail.com wrote: Could anybody help? -- View this message in context:

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Joey
Thanks Patrick for the reply. What I did was un-jar solr.war and created my own web application. Now I want to write my own servlet to index all files inside a folder. I suppose there is already solrserver instance initialized when my web app started. How can I access that solr server

Re: Best way to convert a field in a query to a fq?

2011-12-13 Thread Otis Gospodnetic
Hi, We've done similar query rewriting in a custom SearchComponent that runs before QueryComponent. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ From: Andrew Lundgren

Re: CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Otis Gospodnetic
Hi, Master: Update/insert/delete docs -- Yes. Slaves: Search -- Yes. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ From: Tarun Jain tjai...@yahoo.com

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Patrick Plaatje
Hey Joey, You should first configure your deployed Solr instance by adding/changing the schema.xml and solrconfig.xml. After that you can use SolrJ to connect to that Solr instance and add documents to it. On the link I posted earlier, you'll find a couple of examples on how to do that. -

RE: Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
Thanks for the confirmation! -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: Tuesday, December 13, 2011 1:02 PM To: solr-user@lucene.apache.org Subject: Re: Best way to convert a field in a query to a fq? Hi, We've done similar query

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Mikhail Khludnev
The first drawback of SolrJ is using xml serialization for in-process communication. I guess you can start from SolrDispatchFilter source, and get something for your servlet from there. Regards On Tue, Dec 13, 2011 at 11:45 PM, Patrick Plaatje pplaa...@gmail.com wrote: Have a look here first

Re: solr ignore duplicate documents

2011-12-13 Thread Mikhail Khludnev
Man, Does overwrite=false work for you? http://wiki.apache.org/solr/UpdateXmlMessages#add.2BAC8-replace_documents Regards On Tue, Dec 13, 2011 at 11:34 PM, Alexander Aristov alexander.aris...@gmail.com wrote: People, I am asking for your help with solr. When a document is sent to solr

Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread Mikhail Khludnev
You can try to reuse your connections (prevent them from closing) by specifying -Dhttp.maxConnections=N (see http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html) in the JVM startup params, on the client JVM. The number should be chosen considering the number of connections you'd like to keep

Re: How to get SolrServer

2011-12-13 Thread Schmidt Jeff
Joey: I'm not sure what you mean by wrapping Solr in your own web application. There is a way to embed Solr into your application (same JVM), but I've never used that. If you're talking about your servlet running in one JVM and Solr in another, then use the SolrJ client library to interact

Re: CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Tarun Jain
Hi, Thanks. So just to clarify here again while replicating we cannot search on master index ? Tarun Jain -=- - Original Message - From: Otis Gospodnetic otis_gospodne...@yahoo.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Cc: Sent: Tuesday, December 13, 2011 3:03

Re: solr ignore duplicate documents

2011-12-13 Thread Erick Erickson
You're probably talking a custom update handler here. That way you can do a document ID lookup, that is just see if the incoming document ID is in the index already and throw the document away if you find one. This should be very efficient, much more efficient than making a separate query for each
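The "look up the ID and throw the document away if it already exists" logic can be illustrated with a few lines of plain Python. This is not a Solr update handler, only a stand-in for the decision it would make: the in-memory set plays the role of the index lookup, and `skip_existing` and the sample documents are hypothetical.

```python
def skip_existing(incoming, indexed_ids, id_field="id"):
    """Return only the documents whose ID is not already known.

    `indexed_ids` stands in for an ID lookup against the index.
    """
    seen = set(indexed_ids)
    accepted = []
    for doc in incoming:
        doc_id = doc[id_field]
        if doc_id in seen:
            # Already indexed (or already accepted in this batch): ignore it
            # instead of overwriting the existing document.
            continue
        seen.add(doc_id)
        accepted.append(doc)
    return accepted

batch = [
    {"id": "1", "title": "old doc, sent again"},
    {"id": "2", "title": "new doc"},
    {"id": "2", "title": "same new doc, repeated in the batch"},
    {"id": "3", "title": "another new doc"},
]
print([d["id"] for d in skip_existing(batch, {"1"})])
```

The first document seen for a given ID wins, which is the opposite of Solr's default behaviour of letting the latest one replace the earlier one.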

Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread Erick Erickson
Replicating 40 cores every 20 seconds is just *asking* for trouble. How often do your cores change on the master? How big are they? Is there any chance you just have too many cores replicating at once? Best Erick On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

Re: CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Erick Erickson
No, you can search on the master when replicating, no problem. But why do you want to? The whole point of master/slave setups is to separate indexing from searching machines. Best Erick On Tue, Dec 13, 2011 at 4:10 PM, Tarun Jain tjai...@yahoo.com wrote: Hi, Thanks. So just to clarify here

Migrate Lucene 2.9 To SOLR

2011-12-13 Thread Anderson vasconcelos
Hi, I have an old project that uses Lucene 2.9. Is it possible to use the index created by Lucene in SOLR? May I just copy the index to the data directory of SOLR, or does some mechanism exist to import a Lucene index? Thanks

Re: How to get SolrServer within my own servlet

2011-12-13 Thread Joey
Thank you guys for the reply. So what I want to do is to modify Solr a bit - add one servlet so I can trigger a full index of a folder in the file system. What I did: un-jar solr.war; Create a web app and copy the un-jarred Solr files to this app; Create my servlet;

Re: Ask about the question of solr cache

2011-12-13 Thread Samuel García Martínez
Is it possible that your Solr client (or the way you communicate with it) is aware of HTTP caching? If you are using a browser in order to confirm these updates and commits, try disabling HTTP caching. On Tue, Dec 13, 2011 at 3:24 PM, Erick Erickson erickerick...@gmail.com wrote: Are you sure

Re: Migrate Lucene 2.9 To SOLR

2011-12-13 Thread Robert Stewart
I am about to try exact same thing, running SOLR on top of Lucene indexes created by Lucene.Net 2.9.2. AFAIK, it should work. Not sure if indexes become non-backwards compatible once any new documents are written to them by SOLR though. Probably good to make a backup first. On Dec 13, 2011,

Re: Suggest component

2011-12-13 Thread kmf
I think I may have solved my problem. Not 100% certain what the solution was because I've been trying so many things, but in the end what I did was revisit this article and re-step my configuration. http://www.lucidimagination.com/blog/2011/04/08/solr-powered-isfdb-part-9/ I believe what the

Re: Virtual Memory very high

2011-12-13 Thread Yury Kats
On 12/13/2011 6:16 AM, Dmitry Kan wrote: If you allow me to chime in, is there a way to check for which DirectoryFactory is in use, if ${solr.directoryFactory:solr.StandardDirectoryFactory} has been configured? I think you can get the currently used factory in a Luke response, if you hit your

Re: Difference between field collapsing and result grouping

2011-12-13 Thread Chris Hostetter
: Nope, they're the same. The original name was Field Collapsing, : but it was changed to Grouping later. Specifically: Field Collapsing is one type of usecase for the more general concept of Result Grouping (other types of result grouping are group by query, group by function results, etc...)

Re: Maximum File Size Handled by post.jar / Speed of Deletes?

2011-12-13 Thread Chris Hostetter
: We would like to know is there a maximum size of a xml file that can be : posted to Solr using the post.jar, maximum number of docs, etc. at one time : as well as how fast deletes can be achieved. post.jar is provided purely as an extremely trivial tool for beginners to use to manually post

RE: MoreLikeThis questions

2011-12-13 Thread Chris Hostetter
: I'm implementing a MoreLikeThis search. I have a couple of questions. : I'm implementing this with solrj so I would appreciate it if any code : snippets reflect that. : : First, I want to provide the text that Solr should check for : interesting words and do the search on. This means I

Re: Reducing heap space consumption for large dictionaries?

2011-12-13 Thread Maciej Lisiewski
On 2011-12-13 05:48, Chris Male wrote: Hi, It's good to hear some feedback on using the Hunspell dictionaries. Lucene's support is pretty new so we're obviously looking to improve it. Could you open a JIRA issue so we can explore whether there are some ways to reduce memory consumption?

Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
The updates to the master are user driven, and are needed to be visible quickly. Hence, the high frequency of replication. It may be that too many replication requests are being handled at a time, but why should that result in half closed connections? On Wed, Dec 14, 2011 at 2:47 AM, Erick

Re: How to get SolrServer within my own servlet

2011-12-13 Thread yunfei wu
Just curious, it sounds like you are trying to deploy your servlet with Solr support. Why don't you just deploy your app as a separate servlet alongside the Solr war in the server, then let your servlet send requests to Solr? This will bring much benefit in maintaining your app with Solr support. Yunfei On