Indexing Content of a Directory

2011-05-11 Thread Robert Naczinski
Hi all, I want to periodically (once every 30 s) index the contents of a directory into Solr, indexing only the new files and adding them to the index. Can I use standard Solr for that, or must I write my own indexer? Can I implement a custom DIH? The files are readable via HTTP. Can someone

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread Ahmet Arslan
> Is the keyField over here the same thing as the "score" of > the field? It is not the same as score (it is an independent additional field) but you use this field in FunctionQueries which means you can influence (multiply, add, etc) score with it, or sort by it. http://wiki.apache.org/solr/

Re: search by url in Solr?

2011-05-11 Thread Ahmet Arslan
> Can I mention > <defaultSearchField>url,content</defaultSearchField> > to include two default fields? No, you cannot define two fields, though you can change the default field via the URL with the &df= parameter. If you want to query multiple fields, consider using (e)dismax. http://wiki.apache.org/solr/DisMaxQParser
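As a rough illustration of the two options mentioned in this reply, the sketch below builds request URLs for a per-request default field (df) and for a multi-field search via edismax (qf). The host, port, handler path, and helper names are assumptions made for the example, not details from the thread.

```python
from urllib.parse import urlencode

# Assumed Solr location; adjust to your own installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def query_with_default_field(text, field):
    """Override the default search field for a single request using df."""
    return SOLR_SELECT + "?" + urlencode({"q": text, "df": field})


def query_multiple_fields(text, fields):
    """Search several fields at once using the edismax parser's qf parameter."""
    params = {"q": text, "defType": "edismax", "qf": " ".join(fields)}
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(query_with_default_field("apache", "url"))
    print(query_multiple_fields("apache", ["url", "content"]))
```

The first helper keeps the standard parser and only switches which single field is searched; the second hands the field list to edismax, which is the usual route when one query has to match against both url and content.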

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread karan veer singh
How exactly do I define these external keyfields? For example, I have a document in monitor.xml, and I want to define another field, let's say score1, to be used in FunctionQueries. From what I got from the documentation, I have to define a file called external_score1 in the index directory. In th

Re: Indexing Content of a Directory

2011-05-11 Thread Ahmet Arslan
> i want PERIODIC ( 1x per 30s ) index the content of a > directory to > solr. In doing that i will to index only the new files and > add it to > the index. Probably you can use DIH, there are different DataSource implementations. http://wiki.apache.org/solr/DataImportHandler#DataSource For exam

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread Ahmet Arslan
> How exactly do I define these external keyfields?For > example, I have a document in monitor.xml, and I want to > define another field, lets say score1, to be used in > FunctionQueries.From what I got from the documentation, I > have to define a file called external_score1 in the index > director

Re: Indexing Content of a Directory

2011-05-11 Thread Robert Naczinski
Thanks for your help. Can I configure native Solr with these components, or must I write my own indexer that uses org.apache.solr.handler.dataimport.FileListEntityProcessor? Somewhere I must transform the input documents to the schema. My input documents are in different formats. Robert 2011/5/11 Ahmet Ars

Re: Indexing Content of a Directory

2011-05-11 Thread Ahmet Arslan
> Can I configure with this components native solr or must I > write own > Indexer that use > org.apache.solr.handler.dataimport.FileListEntityProcessor? You can use native solr. All things are configured in xml files, data-config.xml, schema.xml etc. There are examples in the directory example-

Re: Indexing Content of a Directory

2011-05-11 Thread Robert Naczinski
Hi, we are trying to index log files. These are XML files (which may not be valid) and text files. Unfortunately, the files do not all have the same content. Regards, Robert 2011/5/11 Ahmet Arslan : >> Can I configure with this components native solr or must I >> write own >> Indexer that use >> org.apache.solr.

Re: Building hierarchies of query object instead of flat string queries

2011-05-11 Thread Geir Gullestad Pettersen
I've just become aware of the XML Query parser, which seems to fit with my needs which really are ability to send query syntax trees to Solr (I need to do all query parsing in my client application). However, I cannot find any examples on how to configure a solr request handler for this query pars

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread karan veer singh
I did exactly the stuff that was told in the link. I have certain items with ids In schema.xml:I have set a fieldtype with name = idRankFile and keyField = id Also, a field with name idRank and type idRankFile In data directory, I made a text file external_idRankIn it, I set 3007WFP = 1.0wh

Document match with no highlight

2011-05-11 Thread Phong Dais
HI, I am having a problem with highlighting which I cannot comprehend. I'm using the solr/admin/form.jsp (full interface) to submit a search for "3 1 15" (with the quotes). I have "Enable Highlighting" checked and I have specified the field to highlight, in my case DOC_TEXT. Everything else defau

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread Ahmet Arslan
> I did exactly the stuff that was told in the link. I have > certain items with ids > > In schema.xml:I have set a fieldtype with name = idRankFile > and keyField = id Also, a field with name idRank and type > idRankFile > In data directory, I made a text file external_idRankIn it, > I set 3007WF

Re: Document match with no highlight

2011-05-11 Thread Ahmet Arslan
--- On Wed, 5/11/11, Phong Dais wrote: > From: Phong Dais > Subject: Document match with no highlight > To: solr-user@lucene.apache.org > Date: Wednesday, May 11, 2011, 1:29 PM > HI, > > I am having a problem with highlighting which I cannot > comprehend. > I'm using the solr/admin/form.jsp (

Solr performance

2011-05-11 Thread javaxmlsoapdev
I have some 25-odd fields with "stored=true" in schema.xml. Retrieving 5,000 records takes a few seconds. I also tried passing "fl" to include only one field in the response, but the response time is the same. What should I look at to tune performance? Thanks, -- View this message

Re: Solr performance

2011-05-11 Thread Ahmet Arslan
--- On Wed, 5/11/11, javaxmlsoapdev wrote: > From: javaxmlsoapdev > Subject: Solr performance > To: solr-user@lucene.apache.org > Date: Wednesday, May 11, 2011, 2:07 PM > I have some 25 odd fields with > "stored=true" in schema.xml. Retrieving back > 5,000 records back takes a few secs. I also

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread karan veer singh
I just want to confirm (as I'm not quite familiar with the function query syntax): will the following query work for searching "car power adaptor" so that the entry with the highest idRank is displayed first? http://localhost:8983/solr/select?indent=on&q=car%20power%20adaptor&fl=id,score,name&_va

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread Ahmet Arslan
> I just want to confirm, (as I'm not quiet familiar with the > function query syntax) will the following query be alright > to search "car power adaptor", so that the entry with > highest idRank be displayed first? > http://localhost:8983/solr/select?indent=on&q=car%20power%20adaptor&fl=id,score,n

Re: Document match with no highlight

2011-05-11 Thread Phong Dais
Hi, Already tried that. Tried a ridiculously huge number and -1. Same result. Some clarification. I submitted the search string: DOC_TEXT:"3 1 15" Thanks, P. On Wed, May 11, 2011 at 7:01 AM, Ahmet Arslan wrote: > > > --- On Wed, 5/11/11, Phong Dais wrote: > > > From: Phong Dais > > Subj

Re: How to set a common field to several values types ?

2011-05-11 Thread cocowww
Finally, I found a workaround! It's absolutely ugly, but it works :D : I have to put the plain text data in a file, which is then extracted by Tika. Now resolved. Best regards. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-set-a-common-field-to-several-values-t

Re: Document match with no highlight

2011-05-11 Thread Jan-Eirik B . Nævdal
Have you checked that the search phrase is in the field you use as the highlight field? By default, if there are no hits in the defined highlight field, it returns an empty result. A way around this problem is to add more fields to highlight or merge the searchable text into a single text field and

Re: Document match with no highlight

2011-05-11 Thread Phong Dais
Hi, When I "eyeball" the highlighted field, I do not find the search phrase in the document that was returned as a match. The search field is DOC_TEXT, the highlighted field is DOC_TEXT, and the search query is DOC_TEXT:"3 1 15". I get a match with "empty" highlight but it looks to me like it shou

Re: Document match with no highlight

2011-05-11 Thread Ahmet Arslan
> Already tried that.  Tried a ridiculously huge number > and -1.  Same result. > > Some clarification.  I submitted the search string: > > DOC_TEXT:"3 1 15" Can you append &debugQuery=on and give us its output? And the complete search URL will also help.
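For readers following along, a minimal sketch of how the request asked for here could be assembled: the phrase query on DOC_TEXT with highlighting enabled and debugQuery=on appended. The base URL and the helper name are assumptions for illustration; only the field name, the phrase, and the debugQuery parameter come from the thread.

```python
from urllib.parse import urlencode


def debug_highlight_url(base, field, phrase):
    """Build a phrase query with highlighting and debug output enabled."""
    params = {
        "q": f'{field}:"{phrase}"',  # phrase query against one field
        "hl": "true",                # turn highlighting on
        "hl.fl": field,              # highlight the same field that is searched
        "debugQuery": "on",          # include parsed query and score explanations
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local Solr instance.
    print(debug_highlight_url("http://localhost:8983/solr/select", "DOC_TEXT", "3 1 15"))
```

Posting the resulting URL together with the debug section of the response shows how the phrase was parsed without exposing the stored document text.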

Frequency of words in index

2011-05-11 Thread Jasneet Sabharwal
I did a facet query on my data field and it showed a list of words with their counts, but a lot of words are missing from the facet counts. The query used was: http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=Data How can I get the count of each word in my index and one more question is it

Re: Frequency of words in index

2011-05-11 Thread Ahmet Arslan
> I did a facet query on my data field > and it showed a list of words with their count but it miss > lot of words in facet count. > > The query used was :- > > http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=Data > > How can I get the count of each word in my index and one > mor

Re: Replication Clarification Please

2011-05-11 Thread Ravi Solr
Mr. Bell, Thank you for your help. Yes, the full index replicated every 1000, 1, 10 etc, if mergeFactor is 10 as per it's definition. We do index every 5 minutes and replicate every 3 minutes just to make sure consumers have immediate access to the indexed docs. Thanks, Ravi Kiran B

Re: how to do offline adding/updating index

2011-05-11 Thread kenf_nc
My understanding is that the Master has done all the indexing, that replication is a series of file copies to a temp directory, then a move and commit. The slave only gets hit with the effects of a commit, so whatever warming queries are in place, and the caches get reset. Doing too many commits to

Anyone familiar with Solandra?

2011-05-11 Thread kenf_nc
The recent Amazon outage exposed a weakness in our architecture. We could really use a Master-Master redundancy. We already have Master to multiple Slaves. I've looked at the various options of converting a Slave into a Master, of having a Repeater (hybrid master/slave) become the Master etc. But,

Re: Solr performance

2011-05-11 Thread Jay Luker
On Wed, May 11, 2011 at 7:07 AM, javaxmlsoapdev wrote: > I have some 25 odd fields with "stored=true" in schema.xml. Retrieving back > 5,000 records back takes a few secs. I also tried passing "fl" and only > include one field in the response but still response time is same. What are > the things

RE: how to do offline adding/updating index

2011-05-11 Thread Jonathan Rochkind
Theoretically, a commit alone should have negligible effect on the slave, because of the same aspect of Solr architecture that makes too frequent commits problematic --- an existing Searcher continues to serve requests off the old version of the index, until the new commit (plus all it's warming

RE: how to do offline adding/updating index

2011-05-11 Thread Jonathan Rochkind
You can also turn off automatic replication pulling, and just manually issue a 'replicate' command to slave exactly when you want, without relying on it being triggered by optimization or whatever. (Well probably not 'manually', probably some custom update process you run that you'll have issue
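A hedged sketch of what such an update process might do: build the HTTP requests that stop the slave from polling and then trigger a pull on demand. In the ReplicationHandler HTTP API the pull command is named fetchindex and polling is switched with disablepoll and enablepoll; the slave host, port, and helper name below are assumptions for the example.

```python
from urllib.parse import urlencode


def replication_command_url(slave_base, command):
    """Build a ReplicationHandler request URL for the given command."""
    return f"{slave_base}/replication?" + urlencode({"command": command})


if __name__ == "__main__":
    slave = "http://slave:8983/solr"  # assumed slave address
    # Stop the slave from polling the master on its own schedule.
    print(replication_command_url(slave, "disablepoll"))
    # Later, pull the latest index from the master exactly when wanted.
    print(replication_command_url(slave, "fetchindex"))
```

The sketch only constructs the URLs; issuing them (for example with urllib.request.urlopen) would be the job of whatever process finishes the indexing run on the master.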

No more standard query type?

2011-05-11 Thread Gabriele Kahlout
Is the tagged release of solr 3.1 different from the one distributed in the downloads page? It looks like a reproducible bug. svn co -r 1101526 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1 solr This is the default query I get from http://localhost:8080/solr/admin/form.jsp: htt

DataImportHandler and multithreading/threads

2011-05-11 Thread Mark
Is this option available on entities of type JdbcDataSource? If so, which version(s) of Solr are required to take advantage of this option? Also, are there any limitations of this feature that I should be aware of? Thanks

Spatial search - SOLR 3.1

2011-05-11 Thread roySolr
Hello, I'm using the spatial solr plugin from jteam for SOLR 1.4. Now i want to use SOLR 3.1 because it contains a lot of bugfixes:) Now i want to get a distance field back in my results. How can i do it? My url looks like: q=testquery&fq={!geofilt pt=52.78556,3.4546 sfield=latlon d=50} I get

Debugging same SOLR installation on 2 different servers

2011-05-11 Thread Paul Michalet
Hello everyone We have successfully installed SOLR on 2 servers (development and production), using the same configuration files and paths. Both SOLR instances have indexed the same contents and most queries give identical results, but there are a few exceptions where the production instance re

Re: Spatial search - SOLR 3.1

2011-05-11 Thread Smiley, David W.
Hi Roy. See this: http://wiki.apache.org/solr/SpatialSearch#Returning_the_distance I recommend returning the point location and calculating the distance yourself -- it's not hard. Getting Solr to return it is a bit of a hack now. ~ David Smiley Author: http://www.packtpub.com/solr-1-4-enterpri

Enable/disable mainIndex component

2011-05-11 Thread dan sutton
Hi, Does anyone know if I can do the following: 10 ... 2 ... Cheers, Dan

Re: Frequency of words in index

2011-05-11 Thread dan whelan
On 5/11/11 6:12 AM, Jasneet Sabharwal wrote: I did a facet query on my data field and it showed a list of words with their count but it miss lot of words in facet count. The query used was :- http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=Data How can I get the count of each

Re: Debugging same SOLR installation on 2 different servers

2011-05-11 Thread Paul Libbrecht
Could it be something in the transmission of the query? Or is it also identical? paul On 11 May 2011 at 17:19, Paul Michalet wrote: > Hello everyone > > We have succesfully installed SOLR on 2 servers (developpement and > production), using the same configuration files and paths. > Both SOL

Re: Debugging same SOLR installation on 2 different servers

2011-05-11 Thread Paul Michalet
Thanks for the hint :) We ruled that out after having tested special characters, and if it were an application bug, it wouldn't work consistently like it currently does for the majority of queries. The only difference we noticed was in the HTTP headers in the SOLR response: occasionally, the "C

Re: Document match with no highlight

2011-05-11 Thread Phong Dais
Hi, I can upload the search URL and part of the output but not all of it. Company trade secrets do not allow me to upload the content of the DOC_TEXT field. I can upload the "debug" output section and whatever else is needed but I cannot upload the actual document data. Please let me know if

Re: Document match with no highlight

2011-05-11 Thread Ahmet Arslan
> I can upload the search URL and part of the output but not > all of it. > Company trade secrets does not allow me to upload the > content of the > DOC_TEXT field.  I can upload the "debug" output > section and whatever else > is needed but I cannot upload the actual document data. > > Please le

Re: Indexing Content of a Directory

2011-05-11 Thread Gora Mohanty
On Wed, May 11, 2011 at 2:24 PM, Robert Naczinski wrote: > Hi, > > we try to index logfiles. That are xmlfiles ( this can be not valid ) > and text files. Unfortunately have all files not the same content Sorry: While Ahmet has done his best to help you out, you are not providing enough details f

Re: No more standard query type?

2011-05-11 Thread Jan Høydahl
Fixed in 3.2 https://issues.apache.org/jira/browse/SOLR-2445 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 11. mai 2011, at 16.33, Gabriele Kahlout wrote: > Is the tagged release of solr 3.1 different from the one distributed in the > downloads page? It looks lik

Result docs missing only when shards parameter present in query?

2011-05-11 Thread mrw
We have two Solr nodes, each with multiple shards. If we query each shard directly (no shards parameter), we get the expected results: response lst name="responseHeader" int name="status" 0 int name="QTime" 22 result name="response" numFound="100" start="0" d

Re: Can ExtractingRequestHandler ignore documents metadata

2011-05-11 Thread Grant Ingersoll
You can map the attributes to the ignore field. Alternatively, override the SolrContentHandler's newMethod() method to skip adding them. Come to think of it, I'll put up a quick patch that breaks that out a bit more and makes it easier to override. Longer term, a patch to exclude metadata wou

Re: Replication Clarification Please

2011-05-11 Thread Alexander Kanarsky
Ravi, if you have what looks like a full replication each time even if the master generation is greater than the slave's, try watching the index on both master and slave at the same time to see which files are getting replicated. You may need to adjust your merge factor, as Bill mentioned. -A

Field collapsing on multiple fields and/or ranges?

2011-05-11 Thread arian487
I'm wondering if there is a way to get the field collapsing to collapse on multiple things? For example, is there a way to get it to collapse on a field (lets say 'domain') but ALSO something else (maybe time or something)? To visualize maybe something like this: Group1 has common field 'www.for

K-Stemmer for Solr 3.1

2011-05-11 Thread Mark
It appears that the older version of the Lucid Works KStemmer is incompatible with Solr 3.1. Has anyone been able to get this to work? If not, what are you using as an alternative? Thanks

RE: Boosting score of a document without deleting and adding another document

2011-05-11 Thread karan veer singh
So now I did the following : http://localhost:8983/solr/select?indent=on&q=car%20power%20adaptor&sort=idRank%20desc&fl=id,score,name Yet, it's not taking idRank into account, it's just giving the results according to the score. Also, when I try to output idRank by including it in the fl

Facet filter: how to specify OR expression?

2011-05-11 Thread cnyee
Hi, Is there any way to specify an 'OR' expression for a facet filter? For example docType="pdf" or docType="txt" Many thanks in advance. Yee -- View this message in context: http://lucene.472066.n3.nabble.com/Facet-filter-how-to-specify-OR-expression-tp2930570p2930570.html Sent from the Solr - U

Re: Facet filter: how to specify OR expression?

2011-05-11 Thread Grijesh
How about fq=docType:(pdf OR txt) - Thanx: Grijesh www.gettinhahead.co.in
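The suggested filter can be generated for any list of values. The following is a small sketch, with the helper names and the Solr URL being assumptions for the example; it does not escape special characters in the values, so it suits simple tokens such as pdf and txt.

```python
from urllib.parse import urlencode


def or_filter(field, values):
    """Build a filter query matching any of the given values, e.g. docType:(pdf OR txt)."""
    return f"{field}:({' OR '.join(values)})"


def filtered_query_url(base, query, field, values):
    """Attach the OR filter to a query as an fq parameter."""
    return base + "?" + urlencode({"q": query, "fq": or_filter(field, values)})


if __name__ == "__main__":
    print(or_filter("docType", ["pdf", "txt"]))
    # Assumed local Solr instance.
    print(filtered_query_url("http://localhost:8983/solr/select", "*:*", "docType", ["pdf", "txt"]))
```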

Re: K-Stemmer for Solr 3.1

2011-05-11 Thread Bernd Fehling
On 12.05.2011 02:05, Mark wrote: It appears that the older version of the Lucid Works KStemmer is incompatible with Solr 3.1. Has anyone been able to get this to work? If not, what are you using as an alternative? Thanks Lucid KStemmer works nicely with Solr 3.1 after some minor mods to KStem

Applying SOLR-236 field collapse patch to Solr 3.1.0

2011-05-11 Thread karan singh
I've been trying to install the field collapse patch to solr 3.1.0 using the following link: https://issues.apache.org/jira/browse/SOLR-236 However, I'm not entirely sure which patch to download. How do I decide on that? Also, as I understand it, I have to cd into the apache-solr-3.1.0 directory

Re: Indexing Mails

2011-05-11 Thread Chandan Tamrakar
What kind of emails do you want to parse? MS emails? You could integrate Apache Tika, but it depends on what kind of emails the Tika parser is able to parse. You can define the fields that could be parsed and define them in your XML schema. Thanks On Tue, May 10, 2011 at 2:07 PM, Jörg Agatz wro