Re: Very basic questions: Indexing text

2010-06-29 Thread Ahmet Arslan
Could you give an example? E.g. let's say I have a field 'title' and a field 'fulltext' and my search term is 'solr'. What would be the right set of parameters to get back the whole title field but only a snippet of 50 words (or three sentences, or whatever the unit) from the fulltext field?

Re: Wither field compresed=true ?

2010-06-29 Thread MitchK
David, well, I am no committer, but I noticed that Lucene no longer takes care of compression (I think this was because of the trouble it caused), and maybe this is the reason why Solr no longer offers this option. Unfortunately, I do not have a link for it, but I think this

Re: AutoSuggest Question

2010-06-29 Thread Ahmet Arslan
<fieldType name="autocomplete3" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.LetterTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EdgeNGramFilterFactory"

optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Lukas Kahwe Smith
Hi, I am a bit confused about the +/- syntax. Am I understanding it properly that when using the normal query handler + means required and - means prohibited, whereas in the dismax handler + means required and - means optional? http://lucene.apache.org/java/2_9_1/queryparsersyntax.html The + or

Some issues concerning SOLR 1.4.1

2010-06-29 Thread Bastian Spitzer
Hi, We just migrated from SOLR 1.4 to 1.4.1. We are observing some new errors in the logs that did not occur before the migration, so we want to share them with you in the hope of getting some help solving them. We are using 1 master and 1 slave with replication on 2 different machines running only

Solr search streaming/callback

2010-06-29 Thread Peter Sturge
Hi, I was wondering if anyone was aware of any existing functionality where clients/server components could register some search criteria and be notified of newly committed data matching the search when it becomes available - a 'push/streaming' search, rather than 'pull'? Thanks!

AW: Some issues concerning SOLR 1.4.1

2010-06-29 Thread Bastian Spitzer
What I forgot to mention is that those errors only occur on the slave; the master is working just fine. RAM/hardware/Java version/config/startup parameters etc. are exactly the same on both machines. -Original Message- From: Bastian Spitzer [mailto:bspit...@magix.net] Sent:

Diiferences in avgRequestsPerSecond of Solr ..

2010-06-29 Thread Na_D
Hi, I am fetching the following details programmatically: --- --- Name :: /replication Class :: org.apache.solr.handler.ReplicationHandler Version :: $Revision: 829682 $ Description :: ReplicationHandler provides replication of

use copyField to gather and then split

2010-06-29 Thread solr
(sorry if this message ends up being sent twice) We have a use-case where we'd like to fill a field from multiple sources, i.e. <copyField source="title" dest="text"/> <copyField source="body" dest="text"/> … (other source-fields are copied in to text as well) and then analyze the resulting text-field

Where to check optimize status

2010-06-29 Thread Frederico Azeiteiro
Hi, I'm using solr1.4.0 default installation. Is there a place where I can find the optimization status? I sent an optimize HTTP request and it should have finished by now, but I still see the lock file in the index folder. Can I see somewhere if the optimization is still running?

schemaxml: field-property required for any field?

2010-06-29 Thread Alexander Rothenberg
Hi, I was curious whether the field property 'required' can be added to any field, not just the unique field. The wiki has no info about it. I would like to set that property on some fields in the schema.xml that don't belong to the root entity of the document schema (looking at data-config.xml)... I want

Re: Where to check optimize status

2010-06-29 Thread Alexander Rothenberg
To determine if the optimize is still in progress, you can look at the admin frontend on the page THREAD DUMP for something like Lucene Merge Thread. If it's there, then the optimize is still running. Also, the index file sizes and filenames in your index dir are changing a lot... On Tuesday 29 June

Re: optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Jan Høydahl / Cominvent
Hi, In DisMax the mm parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add +. You can change to mm=0 and you will get the same behaviour as standard parser, i.e. an OR behaviour, where the + would say that a
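
As a rough illustration of the mm behaviour described above, the request parameters for the two cases might be assembled as follows. This is only a sketch: the helper name and the field names in qf are hypothetical, and nothing is taken from the thread beyond the mm values 100% and 0.

```python
from urllib.parse import urlencode


def dismax_params(user_query, mm):
    """Build request parameters for a DisMax query with a given mm value."""
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "title fulltext",  # hypothetical field names, not from the thread
        "mm": mm,
    }


# mm=100%: every term must match, so no explicit + is needed.
strict = dismax_params("solr replication", "100%")

# mm=0: terms are optional unless marked with +.
loose = dismax_params("solr +replication", "0")

print(urlencode(strict))
print(urlencode(loose))
```

The resulting strings would be appended to the select URL of whichever Solr instance is being queried.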

Re: use copyField to gather and then split

2010-06-29 Thread Jan Høydahl / Cominvent
Hi pal :) Unfortunately copyField works only BEFORE analysis and you cannot chain them... The simplest solution would be to duplicate your copyFields: <copyField source="title" dest="textanayzemethod2"/> <copyField source="body" dest="textanayzemethod2"/> <copyField source="title" dest="textanayzemethod1"

Re: optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Lukas Kahwe Smith
On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote: Hi, In DisMax the mm parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add +. You can change to mm=0 and you will get the same behaviour as

Re: Wither field compresed=true ?

2010-06-29 Thread Mark Miller
On 6/27/10 4:51 PM, David Smiley (@MITRE.org) wrote: I just noticed that field compression (e.g. compressed=true) is no longer in Solr, nor can I find why this was done. Can a committer offer an explanation? If the reason is that it eats up CPU, then I'd rather accept this tradeoff for a

Re: schemaxml: field-property required for any field?

2010-06-29 Thread Alexander Rothenberg
I just saw that it works exactly as expected (see below); thanks anyway. On Tuesday 29 June 2010 13:05:15 Alexander Rothenberg wrote: Hi, was curious if the field-property 'required' can be added to any field, not just the unique-field. Wiki has no info about it. I would like to set that

Re: optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Lukas Kahwe Smith
On 29.06.2010, at 13:38, Lukas Kahwe Smith wrote: On 29.06.2010, at 13:24, Jan Høydahl / Cominvent wrote: Hi, In DisMax the mm parameter controls whether terms are required or optional. The default is 100% which means all terms required, i.e. you do not need to add +. You can change

Re: optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Jan Høydahl / Cominvent
When you mix query handlers like this you will need to add a + or an AND in front of the _query_: part as well, in order for it to be required. I.e.

Re: optional vs. probhibited aka standard vs. dismax handler

2010-06-29 Thread Lukas Kahwe Smith
On 29.06.2010, at 15:01, Jan Høydahl / Cominvent wrote: When you mix query handlers like this you will need to add a + or an AND in front of the _query_: part as well, in order for it to be required. You will see the difference when you try the above query directly on your Solr instance

RE: solr data config questions

2010-06-29 Thread Peng, Wei
Thank you for your answer, Alex. I tried it, but I got some weird output commentreply:[[...@1b06a21,[...@107dcfe,[...@13dcd27,[...@67cd84,[...@e5ace9,[...@bb05de,[...@7e56 It is supposed to be commentreply:[234234,2,87979,343,... Both comment_id and reply_id are integers. Concat should be

Cache hits exposed by API

2010-06-29 Thread Na_D
This is just an enquiry. I wanted to know whether Solr's cache hit rates are exposed via the Solr API. -- View this message in context: http://lucene.472066.n3.nabble.com/Cache-hits-exposed-by-API-tp930602p930602.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How I can use score value for my function

2010-06-29 Thread Geert-Jan Brits
It's possible using function queries. See this link: http://wiki.apache.org/solr/FunctionQuery#query 2010/6/29 MitchK mitc...@web.de Ramzesua, this is not possible, because Solr does not know what is the resulting score at query-time (as far as I know). The score will be computed, when

Re: Cache hits exposed by API

2010-06-29 Thread Markus Jelsma
Hi, The AdminRequestHandler exposes a JSP [1] that'll return a nice XML document with all the information you need about cache statistics and more. [1]: http://localhost:8983/solr/admin/stats.jsp Cheers, On Tuesday 29 June 2010 15:52:56 Na_D wrote: This is just an enquiry.I just wanted to

RE: Where to check optimize status

2010-06-29 Thread Frederico Azeiteiro
Thank you, but I didn't find anything like a merge thread and I still had the lock file. The segments were not merged, so I stopped SOLR and restarted it. The lock disappeared, but I guess the optimization didn't complete. I'll try again tomorrow. -Original Message- From: Alexander

Re: Cache hits exposed by API

2010-06-29 Thread Na_D
I knew that the JSP page http://localhost:8983/solr/admin/stats.jsp shows the different statistics, but I am actually trying to read the hit rate of the Solr caches from Java code. That's why I asked whether the same is exposed via Solr's APIs... Please share if you know about this.

problem with formulating a negative query

2010-06-29 Thread Sascha Szott
Hi folks, I have a (multi-valued) field topic in my index which does not need to exist in every document. Now, I'm struggling with formulating a query that returns all documents that either have no topic field at all *or* whose topic field value is R. Unfortunately, the query

Re: How to wait for StreamingUpdateSolrServer to finish?

2010-06-29 Thread Stephen Duncan Jr
On Tue, Jun 22, 2010 at 9:38 AM, Stephen Duncan Jr stephen.dun...@gmail.com wrote: I'm prototyping using StreamingUpdateSolrServer. I want to send a commit (or optimize) after I'm done adding all of my docs, rather than wait for the autoCommit to kick in. However, since

Re: problem with formulating a negative query

2010-06-29 Thread Ahmet Arslan
I have a (multi-valued) field topic in my index which does not need to exist in every document. Now, I'm struggling with formulating a query that returns all documents that either have no topic field at all *or* whose topic field value is R. Does this work? defType=lucene&q.op=OR&q=topic:R

Disabling Access to Solr Admin Panel

2010-06-29 Thread Vladimir Sutskever
Hi All, How can I forbid access to the SOLR index admin panel? Can I configure this in /jetty.xml? I understand that it's not true security, considering update/delete/re-indexing commands will still be allowed via GET request. Kind regards, Vladimir Sutskever Investment Bank -

Indexing a database

2010-06-29 Thread Lance Hill
How do I know if solr is actually loading my database driver properly? I added the mysql connector to the solr/lib directory, I added <lib dir="./lib" /> to the solrconfig.xml just to be sure it would find the connector. When I start the application, I see it loaded my dataImporter data config, but

Re: Indexing a database

2010-06-29 Thread Ahmet Arslan
How do I know if solr is actually loading my database driver properly? I added the mysql connector to the solr/lib directory, I added <lib dir="./lib" /> to the solrconfig.xml just to be sure it would find the connector. When I start the application, I see it loaded my dataImporter data

RE: Disabling Access to Solr Admin Panel

2010-06-29 Thread Markus Jelsma
Hi,   Check out the wiki [1] on this subject.   [1]: http://wiki.apache.org/solr/SolrSecurity   Cheers,   -Original message- From: Vladimir Sutskever vladimir.sutske...@jpmorgan.com Sent: Tue 29-06-2010 18:05 To: solr-user@lucene.apache.org; Subject: Disabling Access to Solr Admin

Re: How to wait for StreamingUpdateSolrServer to finish?

2010-06-29 Thread Yonik Seeley
On Tue, Jun 22, 2010 at 9:38 AM, Stephen Duncan Jr stephen.dun...@gmail.com wrote: I'm prototyping using StreamingUpdateSolrServer.  I want to send a commit (or optimize) after I'm done adding all of my docs, rather than wait for the autoCommit to kick in.  However, since

RE: Indexing a database

2010-06-29 Thread Lance Hill
Yes, it is registered exactly as you indicated in solrconfig and when the application starts up, I can see a message indicating the data-config is loaded successfully. So although the data config is loaded successfully, I cannot seem to access the dataimport handler. Regards, L. Hill

RE: Indexing a database

2010-06-29 Thread Ahmet Arslan
Yes, it is registered exactly as you indicated in solrconfig and when the application starts up, I can see a message indicating the data-config is loaded successfully. So although the data config is loaded successfully, I cannot seem to access the dataimport handler. Strange,

Faceted search outofmemory

2010-06-29 Thread olivier sallou
Hi, I am trying to run a faceted search on a very large index (around 200GB with 200M docs). I get an out of memory error. With no facets it works fine. There are quite a few questions about this, but I could not find the answer. How can we know the required memory when facets are used so that I try to

RE: Faceted search outofmemory

2010-06-29 Thread Ankit Bhatnagar
Did you try paging them? -Original Message- From: olivier sallou [mailto:olivier.sal...@gmail.com] Sent: Tuesday, June 29, 2010 2:04 PM To: solr-user@lucene.apache.org Subject: Faceted search outofmemory Hi, I try to make a faceted search on a very large index (around 200GB with

Re: Faceted search outofmemory

2010-06-29 Thread olivier sallou
How do I page over facets? 2010/6/29 Ankit Bhatnagar abhatna...@vantage.com Did you trying paging them? -Original Message- From: olivier sallou [mailto:olivier.sal...@gmail.com] Sent: Tuesday, June 29, 2010 2:04 PM To: solr-user@lucene.apache.org Subject: Faceted search

Re: Very basic questions: Indexing text - working, but slow!

2010-06-29 Thread Peter Spam
Thanks for everyone's help - I have this working now, but sometimes the queries are incredibly slow!! For example, <int name="QTime">461360</int>. Also, I had to bump up the min/max RAM size to 1GB/3.5GB for things to inject without throwing heap memory errors. However, my data set is very small!

RE: Re: Faceted search outofmemory

2010-06-29 Thread Markus Jelsma
http://wiki.apache.org/solr/SimpleFacetParameters#facet.limit   -Original message- From: olivier sallou olivier.sal...@gmail.com Sent: Tue 29-06-2010 20:11 To: solr-user@lucene.apache.org; Subject: Re: Faceted search outofmemory How do make paging over facets? 2010/6/29 Ankit Bhatnagar

Re: Faceted search outofmemory

2010-06-29 Thread olivier sallou
I have given 6G to Tomcat. Using facet.method=enum and facet.limit seems to fix the issue with a few tests, but I do know that it is not a final solution. It will work under certain configurations. The real issue is being able to know the RAM required for an index... 2010/6/29 Nagelberg, Kallin
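
For readers following the paging suggestion, facet values can be fetched one page at a time by combining facet.limit with facet.offset. The sketch below only builds the request parameters; the helper name and the field name "category" are hypothetical, and it makes no claim about how much memory a given index will need.

```python
from urllib.parse import urlencode


def facet_page_params(field, page, page_size):
    """Parameters for one page of facet values on a single field."""
    return {
        "q": "*:*",
        "rows": 0,                         # only facet counts, no documents
        "facet": "true",
        "facet.field": field,
        "facet.method": "enum",            # the method mentioned in the thread
        "facet.limit": page_size,          # number of facet values per page
        "facet.offset": page * page_size,  # skip the values of earlier pages
    }


# Third page (zero-based page 2) of 50 values for a hypothetical field.
params = facet_page_params("category", page=2, page_size=50)
print(urlencode(params))
```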

Re: How I can use score value for my function

2010-06-29 Thread MitchK
Britske, good workaround! I had not thought about the possibility of using subqueries. Regards - Mitch

RE: solr data config questions

2010-06-29 Thread Peng, Wei
I tried query=select cast(concat(replytable.comment_id,',', replytable.SID) as char), and it works now! Thank you, Alex :) Vivian -Original Message- From: Alexey Serba [mailto:ase...@gmail.com] Sent: Tuesday, June 29, 2010 4:38 PM To: solr-user@lucene.apache.org Subject: Re: solr data

Leading Wildcard query strangeness

2010-06-29 Thread dbashford
We've got an app in production that executes leading wildcard queries just fine. <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">1298</int> <lst name="params"> <str name="q">title:*news</str> </lst> </lst> <result name="response" numFound="5514" start="0"> The same app in dev/qa has undergone a

Solrj Question

2010-06-29 Thread Neil Lott
Hi, I'm a little confused on how either solrj is working or how solr is working. I'm using solr 1.4. @Test (groups = {integration}, enabled = true) public void testDate() throws Exception { SolrServer solr =

Re: Leading Wildcard query strangeness

2010-06-29 Thread Ahmet Arslan
We've got an app in production that executes leading wildcard queries just fine. <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">1298</int> <lst name="params"> <str name="q">title:*news</str> </lst> </lst> <result name="response" numFound="5514" start="0"> The same app in dev/qa has

Re: preside != president

2010-06-29 Thread Darren Govoni
Jan, Looks interesting. I will try this. Thanks! Darren On Mon, 2010-06-28 at 19:54 +0200, Jan Høydahl / Cominvent wrote: Hi, You might also want to check out the new Lucene-Hunspell stemmer at http://code.google.com/p/lucene-hunspell/ It uses OpenOffice dictionaries with known stems

OOM on uninvert field request

2010-06-29 Thread Robert Petersen
Hello, I am trying to find the right max and min settings for Java 1.6 on a 20GB index with 8 million docs, running the 1.6_018 JVM with solr 1.4, and currently have Java set to an even 4GB (export JAVA_OPTS=-Xmx4096m -Xms4096m) for both min and max, which is doing pretty well but occasionally still

Re: Indexing a database

2010-06-29 Thread Koji Sekiguchi
(10/06/30 1:11), Lance Hill wrote: How do I know if solr is actually loading my database driver properly? I added the mysql connector to the solr/lib directory, I added <lib dir="./lib" /> to the solrconfig.xml just to be sure it would find the connector. When I start the application, I see it

Re: problem with formulating a negative query

2010-06-29 Thread Erick Erickson
This may help: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#Boolean%20operators But the clause you specified translates roughly as find all the documents that contain R, then remove any of them that match * TO *. * TO * contains all the documents with R, so everything you just
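
One common way around the problem described in this reply is to give the negative clause its own set of documents to subtract from, by pairing it with the match-all query *:*. The sketch below is an assumption on my part rather than something stated in the thread; the helper name is hypothetical, and the field name and value come from the original question.

```python
from urllib.parse import urlencode


def missing_or_equal(field, value):
    """Query for documents where `field` is absent or equals `value`."""
    # "*:*" selects every document; the range clause then removes those
    # that have any value at all in the field.
    has_no_value = f"(*:* -{field}:[* TO *])"
    return f"{field}:{value} OR {has_no_value}"


query = missing_or_equal("topic", "R")
params = {"q": query, "q.op": "OR"}

print(query)
print(urlencode(params))
```

Whether this returns the expected documents is best checked directly against the index in question.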

Re: Very basic questions: Indexing text - working, but slow!

2010-06-29 Thread Peter Spam
To follow up, I've found that my queries are very fast (even with fq=), until I add hl=true. What can I do to speed up highlighting? Should I consider injecting a line at a time, rather than the entire file as a field? -Pete On Jun 29, 2010, at 11:07 AM, Peter Spam wrote: Thanks for

Re: Very basic questions: Indexing text - working, but slow!

2010-06-29 Thread Erick Erickson
What are your actual highlighting requirements? You could try things like maxAnalyzedChars, requireFieldMatch, etc. http://wiki.apache.org/solr/HighlightingParameters has a good list, but you've probably already seen that page. Best Erick On Tue, Jun 29, 2010 at 9:11 PM, Peter Spam
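
To make the parameters mentioned in this reply concrete, here is a minimal sketch of a highlighting request. The helper name and all numeric values are illustrative assumptions, not recommendations from the thread; the field name fulltext is borrowed from the earlier question in this thread. Note that hl.fragsize is measured in characters, not words or sentences.

```python
from urllib.parse import urlencode


def highlight_params(query, field, max_analyzed_chars):
    """Request parameters that limit how much text the highlighter analyzes."""
    return {
        "q": query,
        "hl": "true",
        "hl.fl": field,                 # field to build snippets from
        "hl.snippets": 1,               # one snippet per document
        "hl.fragsize": 200,             # snippet length in characters
        "hl.maxAnalyzedChars": max_analyzed_chars,
        "hl.requireFieldMatch": "true",
    }


# Illustrative values only; suitable limits depend on the documents.
params = highlight_params("fulltext:solr", "fulltext", 10000)
print(urlencode(params))
```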

Re: OOM on uninvert field request

2010-06-29 Thread Lance Norskog
Yes, it is better to use ints for ids than strings. Also, the Trie int fields have a compressed format that may cut the storage needs even more. 8m * 4 = 32mb, times a few hundred, we'll say 300, is about 9.6gb of IDs. I don't know how these fields are stored, but if they are separate objects we've

Re: REST calls

2010-06-29 Thread Lance Norskog
Not at all. For one thing, a RESTful service does not allow a GET to alter any data. It is just an HTTP-based web service. On Sat, Jun 26, 2010 at 5:29 PM, Jason Chaffee jchaf...@ebates.com wrote: The solr docs say it is RESTful, yet it seems that it doesn't use http headers in a RESTful way.  

Re: one to many denormalization approach

2010-06-29 Thread Lance Norskog
Solr supports multi-valued fields. You can add various skills to one field and it will store all of the values in order. You can search on any of the values. For numbers, you might want a subtype_value convention: skillYears1_9 as one of the values for the skillYears field. Lance On Mon, Jun 28,

Re: unknown handler dataimport

2010-06-29 Thread Lance Norskog
The 'bind error' means that you already had another Solr running. Use 'jps' to find all of the processes called 'start.jar' and kill them. Lance On Mon, Jun 28, 2010 at 2:36 PM, Lance Hill la...@baldhead.com wrote: Hi, I am trying to get db indexing up and running, but I am having trouble

Re: Faceted search outofmemory

2010-06-29 Thread Lance Norskog
There is memory used for each facet. All of the facets are loaded for any facet query. Your best shot is to limit the number of facets. On Tue, Jun 29, 2010 at 11:42 AM, olivier sallou olivier.sal...@gmail.com wrote: I have given 6G to Tomcat. Using facet.method=enum and facet.limit seems to

Re: Very basic questions: Indexing text - working, but slow!

2010-06-29 Thread Lance Norskog
To highlight a field, Solr needs some extra Lucene values. If these are not configured for the field in the schema, Solr has to re-analyze the field to highlight it. If you want faster highlighting, you have to add term vectors to the schema. Here is the grand map of such things:

Re: Cache hits exposed by API

2010-06-29 Thread Lance Norskog
Yes, the StatsComponent returns the values in an XML. http://wiki.apache.org/solr/StatsComponent On Tue, Jun 29, 2010 at 7:23 AM, Na_D nabam...@zaloni.com wrote:  I knew that  the jsp page=  http://localhost:8983/solr/admin/stats.jsp  shows the different statistics but actually I am

Re: REST calls

2010-06-29 Thread Don Werve
2010/6/27 Jason Chaffee jchaf...@ebates.com The solr docs say it is RESTful, yet it seems that it doesn't use http headers in a RESTful way. For example, it doesn't seem to use the Accept: request header to determine the media-type to be returned. Instead, it requires a query parameter to

Custom PhraseQuery

2010-06-29 Thread Blargy
Is there any way to override/change the default PhraseQuery class that is used... similar to how you can change out the Similarity class? Let me explain what I am trying to do. I would like to override how the TF is calculated... always returning a max of 1 for phraseFreq. For example: Query: