RE: Different Filters
> Can the client send in which filters should be turned on and off, but leave the definition of the filters in solrconfig.xml?

The client must set the property; how Solr deals with that is how I want it to work.

> If so, you can get this effect with the new query parser plugin framework. Part of that includes what I call local parameters (not really documented yet), which includes parameter dereferencing.

In what version of Solr does this first appear? We're using a nightly build from December which was heavily hacked to do database result returning and word offset highlighting (and some other fixes), so we'd like to avoid using anything newer.

> So you could add something like this to your query (fq=<!val=$filter1>&fq=<!val=$filter3>) and have the various filters be a default defined in a handler in solrconfig.xml

How does this work? I'm still confused by your explanation. Are the query options turning the filters on or off? What kind of handler would go into solrconfig.xml?

Best Regards, Martin Owens
RE: Different Filters
> This feature was first committed 10/22/07

Great! It should be there then.

> Now put filter1 as a default in your handler (same as any other default), and the client can turn on and off filter1 without knowing what exactly it is.

OK, so I have to add a new search handler into solrconfig.xml with a set name, and I then use that in the query line to specify which field the search handler should use? Are you able to do an example including the solrconfig or schema changes, and show the field and how it works with the English stemmer, for instance?

Sorry for being a dunce today, I'm just not sure I'm totally understanding everything.

Best Regards, Martin Owens
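A minimal client-side sketch of the idea in this exchange, written in Python rather than Perl: the client only names which filters to switch on, and the filter definitions stay as handler defaults in solrconfig.xml. The host, the default names `filter1` and `filter3`, the helper name `build_select_url`, and the `{!v=$name}` local-parameter form are all assumptions for illustration. That form is the one used by later Solr releases, and the nightly build discussed here may expect a different syntax.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, enabled_filters):
    """Build a /select URL that switches on named filters by reference."""
    params = [("q", query)]
    for name in enabled_filters:
        # Each fq only points at a parameter name; the real filter query is
        # assumed to be defined as a default of the handler in solrconfig.xml.
        params.append(("fq", "{!v=$%s}" % name))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host and filter names.
url = build_select_url("http://localhost:8983/solr", "title:solr", ["filter1", "filter3"])
print(url)
```

Leaving a name out of `enabled_filters` simply omits that `fq`, which is how the client turns a filter off without knowing what it contains.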
RE: Different Filters
> What field??? or what filter? I'm not really sure I still understand what you are trying to accomplish. Perhaps if you have some explicit examples of what types of things clients would send in as query parameters to Solr, and what types of lucene queries you actually want to be generated.

Oh dear, a complete breakdown. OK, so our Perl-based software uses HTTP to send a request to Solr. We want our software to be able to control the query filters being used with each search by modifying attributes in the HTTP query string, as I think you were suggesting. I need examples of how to implement what you were talking about.

Best Regards, Martin Owens
RE: Indexing Directly, searching with solr
It doesn't appear to be working:

Sending URI: http://text4:8983/solr/@10324_1_155/update
Sending Content: <?xml version='1.0' encoding='UTF-8'?><commit/>

This multi core update request to commit the data doesn't change the numDocs of the core :-/

Hmm, and the response was interesting:

Response: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst><str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

Any thoughts?

Best Regards, Martin Owens

-Original Message-
From: Stu Hood [mailto:[EMAIL PROTECTED]
Sent: Mon 1/28/2008 5:10 PM
To: solr-user@lucene.apache.org
Subject: RE: Indexing Directly, searching with solr

Performing a '<commit/>' command on the Solr server will force it to open a new IndexReader, and make your changes visible.

Thanks, Stu

-Original Message-
From: Owens, Martin [EMAIL PROTECTED]
Sent: Monday, January 28, 2008 5:38pm
To: solr-user@lucene.apache.org
Subject: Indexing Directly, searching with solr

Hello all,

In order to get around problems with Solr indexing very large files (mostly memory issues, not being able to deal with streams, not being able to handle file pointers and handling everything in java as huge strings) we've decided to index using lucene directly. Now we have an index created, but Solr just doesn't see any documents there, even though we know there are documents indexed and the segments and cfs contains data.

Any thoughts?

Best Regards, Martin Owens
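For reference, the commit request described in this thread can be sketched in Python as below. This only prepares the HTTP POST and does not send it. The URL is the one quoted in the message, while the helper name `build_commit_request` and the `Content-Type` value are assumptions for illustration.

```python
import urllib.request


def build_commit_request(core_update_url):
    """Prepare (but do not send) an HTTP POST carrying a <commit/> command."""
    body = b"<?xml version='1.0' encoding='UTF-8'?><commit/>"
    return urllib.request.Request(
        core_update_url,
        data=body,
        # Assumed header; the update handler is expected to want XML.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_commit_request("http://text4:8983/solr/@10324_1_155/update")
# To actually send it: urllib.request.urlopen(req).read()
```

Comparing numDocs on that core's admin page before and after sending is one way to check whether the new IndexReader picked up the externally written segments.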
Updating and Appending
Hello,

We've got some memory constraint worries from using Java RMI, although I can see this problem could affect the XML requests too. The Java code doesn't seem to handle large files as streams.

Now we're thinking that there are two possible solutions. Either there exists, or we create, a file path plugin which tells the server to load the contents from a file as a buffer; but this runs the risk that Solr isn't built to deal with buffers and would simply eat up all the RAM trying to load the file as a full string. Or there exists, or we create, some kind of update method which appends the contents to a field's data and runs all the applicable indexing and filters.

But I want to know what you guys think is the best solution for the problem.

Best Regards, Martin Owens
RE: AND as a default search operator
Yes, near the bottom of the schema.xml file there is a defaultOperator property (an attribute of the solrQueryParser element).

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Fri 12/21/2007 4:29 PM
To: solr-user@lucene.apache.org
Subject: AND as a default search operator

Hi,

Is there any way to set AND as the default search operator in Solr instead of OR?

cheers Y.
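Besides the schema-wide setting, the standard query parser also takes a per-request `q.op` parameter. A minimal sketch, assuming a hypothetical host and helper name `build_and_query_url`; check that your build honours `q.op` before relying on it.

```python
from urllib.parse import urlencode


def build_and_query_url(base_url, query):
    """Build a /select URL that asks for AND between terms for this request."""
    # q.op overrides the default operator for this one request only;
    # the schema.xml setting remains the fallback for every other query.
    params = {"q": query, "q.op": "AND"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host.
and_url = build_and_query_url("http://localhost:8983/solr", "solr lucene")
print(and_url)
```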
RE: MultiCore problem
> If you started with the example config, make small changes till it stops working as expected.

The problem was relying on consistency assumptions instead of looking at what the real URL was. I was using solr/select?core=core1 instead of solr/@core1/select, simply because the multicore admin works that way. Silly me.

Best Regards, Martin Owens
Python Solr Writer
I'm having some trouble understanding how the Solr writer integrates into Python. I can't find any examples, so does anyone have any good examples of a Python writer?

Best Regards, Martin Owens
RE: Python Solr Writer
That would be a Python Solr client, not a Solr writer. I mean using this: http://lucene.apache.org/solr/api/org/apache/solr/request/PythonResponseWriter.html

Not sure how the hell it's supposed to work, to be honest.

-Original Message-
From: [EMAIL PROTECTED] on behalf of Ed Summers
Sent: Fri 12/14/2007 11:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Python Solr Writer

Do you mean something like: http://svn.apache.org/repos/asf/lucene/solr/trunk/client/python/solr.py

//Ed

On Dec 14, 2007 10:20 AM, Owens, Martin [EMAIL PROTECTED] wrote:
> I'm having some trouble understanding how the Solr writer integrates into Python. I can't find any examples, so does anyone have any good examples of a Python writer?
>
> Best Regards, Martin Owens
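As far as I understand it, the Python response writer (selected with `wt=python`) returns the response body as a Python literal that the client can evaluate. A minimal sketch of consuming such a body follows. The sample text is hypothetical and trimmed, not captured from a real server, and `parse_python_response` is an illustrative name.

```python
import ast

# Hypothetical, trimmed body of a response requested with wt=python.
sample_body = (
    "{'responseHeader':{'status':0,'QTime':5},"
    "'response':{'numFound':1,'start':0,'docs':[{'id':'doc1'}]}}"
)


def parse_python_response(body):
    """Turn the text of a wt=python response into a Python dict."""
    # literal_eval accepts only literals, so it is safer than eval()
    # on text that came over the network.
    return ast.literal_eval(body)


result = parse_python_response(sample_body)
print(result["response"]["numFound"])  # prints 1
```

So the "writer" lives entirely on the Solr side; the Python side just fetches the URL and evaluates the body.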
Solr 1.3 expected release date
What date or year do we believe Solr 1.3 will be released? Regards, Martin Owens
RE: Solr Highlighting, word index
Hello Everyone,

Just to keep you all up to date about the madness I've created. I managed to get the data I wanted by hacking:

lucene-2.2.0 highlight/Highlighter.java
solr-1.2 util/HighlightingUtils.java

I got it to output either the word index or pairs of letter offsets (end and start), depending on what is required. That means I'm getting the data I want.

I'm not very happy with the code, since it returns these numbers as strings and ends up loading and storing the entire file string. But beggars can't be choosers, and I'm certainly no artisan at Java.

Best Regards, Martin Owens

-Original Message-
From: Binkley, Peter [mailto:[EMAIL PROTECTED]
Sent: Wed 12/5/2007 4:07 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr Highlighting, word index

We're doing a similar process using term vectors to look up the bounding-box data in a custom response writer for a specific project, but we're trying to get this packaged up in a more generally usable way along with handling paging: see https://issues.apache.org/jira/browse/SOLR-380. We're looking at using Lucene's new payload functionality.

Peter
Solr result offsets
Hello again,

So I've been concentrating on hacking the Util/Highlighting.java to see if I could get it to output the result offsets I need to do the highlighting I need. It turns out that this method requires that the field be stored as well as indexed.

I would like to be able to just set termOffsets and termPositions and have that data returned to me when I do a specific kind of search. I ended up getting very confused about the Request Handler plugin that I'm sure will be the solution in the end; it just seems to want the search to be performed again for no good reason. Surely the term offsets are returned when a search is done on a field with that data available?

Best Regards, Martin Owens
RE: Solr Highlighting, word index
> You do not necessarily need two requests; instead, you can override or modify the request handler you are using (StandardRequestHandler, DisMaxRequestHandler) to return the information. You'll have to process the Query to extract the terms (like HighlightingUtils does), then get the TermVector token offset data for each matching doc and look for the terms in the Query. I haven't worked with Term Vectors (a Lucene API), so I'm not sure exactly how to go about this.

Thanks Mike,

So in essence I need to write a new RequestHandler plugin which takes the query string, tokenises it, then performs some kind of action against the index to return results which I should then be able to get the termVectors from? Would not the termVectors already be available from the normal search, so that we'd just be asking for the term vectors from that?

Any advice for a Perl/Python programmer who is trying to badly hack this in Java?

Best Regards, Martin Owens
RE: Solr Highlighting, word index
> You can tell lucene to store token offsets using TermVectors (configurable via schema.xml). Then you can customize the request handler to return the token offsets (and/or positions) by retrieving the TVs.

I think that is the best plan of action. How do I create a custom request handler that will use the existing indexed fields? There will be 2 requests as I see it: 1 for the search and 1 to retrieve the offsets when you view one of those found items.

Any advice you can give me will be much appreciated, as I've had no luck with Google so far.

Thanks for your help so far,

Best Regards, Martin Owens
RE: Solr Highlighting, word index
> Or I'm just completely off base here.

A little. We already have the locations for each word on every OCR; we just need the word index to feed into the existing program.

Best Regards, Martin Owens
Solr Highlighting, word index
Hello everyone,

We're working to replace the old Linux version of dtSearch with Lucene/Solr, using the HTTP requests for our Perl side and Java for the indexing.

The functionality that is causing the most problems is the highlighting, since we're not storing the text in Solr (only indexing) and we need to highlight an image file (OCR). What we really need is to request from Solr the word indexes of the matches; we then tie this up to the OCR image and create HTML boxes to do the highlighting.

The text is also multi-page, with each page separated by Ctrl-L page breaks. Should we handle the paging ourselves, or can Solr tell us which page the match happened on too?

Thanks for your help,

Best Regards, Martin Owens
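If the paging ends up being handled on the client side, one possible approach is to map a match's character offset back to a page number and a word index by counting the Ctrl-L (form feed) separators. The sketch below does not involve Solr at all. It assumes the client still has the original text, that the offset points at the first character of a matched word, and that words are whitespace-delimited. The name `locate_offset` is hypothetical.

```python
def locate_offset(text, char_offset):
    """Return (page_number, word_index_on_page) for a match start offset.

    Pages are 1-based and separated by form feeds (Ctrl-L); the word index
    is 0-based and counts whitespace-delimited words on that page.
    """
    if not 0 <= char_offset < len(text):
        raise ValueError("offset outside the text")
    before = text[:char_offset]
    # Every form feed before the offset starts a new page.
    page_number = before.count("\f") + 1
    # rfind returns -1 when there is no form feed, so the page starts at 0.
    page_start = before.rfind("\f") + 1
    # Words that end before the offset give the index of the matched word.
    word_index = len(text[page_start:char_offset].split())
    return page_number, word_index


sample = "alpha beta\fgamma delta epsilon"
print(locate_offset(sample, sample.index("delta")))  # prints (2, 1)
```

The resulting pair can then be looked up against the per-page word locations already held for each OCR image.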