RE: Different Filters

2008-02-21 Thread Owens, Martin

 Can the client send in which filters should be turned on and off, but
 leave the definition of the filters in solrconfig.xml?

The client must set the property, how solr deals with that is how I want it to 
work.

 If so, you can get this effect with the new query parser plugin
 framework.  Part of that includes what I call local parameters (not
 really documented yet), which includes parameter dereferencing.

In what version of Solr does this first appear? We're using a nightly build from 
December which was heavily hacked to do database result returning and word 
offset highlighting (and some other fixes), so we'd like to avoid using anything 
newer.

 So you could add something like this to your query
 fq=!val=$filter1fq=!val=$filter3)
 and have the various filters be a default defined in a handler in 
 solrconfig.xml

How does this work? I'm still confused by your explanation. Are the query 
options turning the filters on or off? What kind of handler would go into 
solrconfig.xml?

Best Regards, Martin Owens


RE: Different Filters

2008-02-21 Thread Owens, Martin


 This feature was first committed 10/22/07

Great! It should be there then.

 Now put filter1 as a default in your handler (same as any other
 default), and the client can turn on and off filter1 without knowing
 what exactly it is.

OK, so I have to add a new search handler to solrconfig.xml with a set name, 
and I then use that in the query line to specify which field the search handler 
should use?

Are you able to do an example including the solrconfig or schema changes, and 
show the field and how it works with the English Stemmer, for instance?

Sorry for being a dunce today, I'm just not sure I'm totally understanding 
everything.

Best Regards, Martin Owens


RE: Different Filters

2008-02-21 Thread Owens, Martin

 What field??? or what filter?
 I'm not really sure I still understand what you are trying to accomplish.
 Perhaps if you have some explicit examples of what types of things
 clients would send in as query parameters to Solr, and what types of
 lucene queries you actually want to be generated.

Oh dear, a complete breakdown.

OK, so our Perl-based software uses HTTP to send a request to Solr. We want 
our software to be able to control the query filters being used with each 
search by modifying attributes in the HTTP query string, as I think you 
were suggesting. I need examples of how to implement what you were talking 
about.

Best Regards, Martin Owens
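
A minimal sketch of what the client side of this could look like, written in Python rather than Perl purely for illustration. It assumes the filters are defined as defaults named filter1, filter2 and so on in a handler in solrconfig.xml, and that a filter is referenced with local parameters written as {!v=$name}. The thread itself writes the key as "val", and the exact syntax in a given nightly build may differ, so the key is a parameter here. The host, the query and the filter names are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, enabled_filters, value_key="v"):
    """Build a /select URL that switches on named filters by reference.

    Each enabled filter becomes fq={!<value_key>=$<name>}; the filter
    definition itself stays on the server as a handler default.
    The local-params syntax is an assumption, not confirmed by the thread.
    """
    params = [("q", query)]
    for name in enabled_filters:
        params.append(("fq", "{!%s=$%s}" % (value_key, name)))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host and filter names; filter2 is simply left out (turned off).
url = build_select_url("http://localhost:8983/solr", "title:report",
                       ["filter1", "filter3"])
print(url)
```

The client only ever sends filter names, so a filter definition can change in solrconfig.xml without touching the client code.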


RE: Indexing Directly, searching with solr

2008-01-29 Thread Owens, Martin

It doesn't appear to be working:

Sending URI: http://text4:8983/solr/@10324_1_155/update
Sending Content: <?xml version='1.0' encoding='UTF-8'?>
<commit/>

This multi core update request to commit the data doesn't change the numDocs of 
the core :-/ hmm and the response was interesting:

Response: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst>
<str name="WARNING">This response format is experimental.  It is likely to change in the future.</str>

</response>

Any thoughts?
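
For reference, a small sketch of how the commit request above could be built from a script. It only constructs the request and does not send it. The URL layout (solr/@core/update) and the host are taken from the message above; the function name and the Content-Type header are assumptions made for illustration.

```python
import urllib.request


def build_commit_request(base_url, core):
    """Return a POST request that sends <commit/> to one core's update handler."""
    url = "%s/@%s/update" % (base_url.rstrip("/"), core)
    body = b"<?xml version='1.0' encoding='UTF-8'?><commit/>"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_commit_request("http://text4:8983/solr", "10324_1_155")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).read()
```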

Best Regards, Martin Owens 

-Original Message-
From: Stu Hood [mailto:[EMAIL PROTECTED]
Sent: Mon 1/28/2008 5:10 PM
To: solr-user@lucene.apache.org
Subject: RE: Indexing Directly, searching with solr
 
Performing a '<commit/>' command on the Solr server will force it to open a new 
IndexReader, and make your changes visible.

Thanks,
Stu


-Original Message-
From: Owens, Martin [EMAIL PROTECTED]
Sent: Monday, January 28, 2008 5:38pm
To: solr-user@lucene.apache.org
Subject: Indexing Directly, searching with solr

Hello all,

In order to get around problems with Solr indexing very large files (mostly 
memory issues, not being able to deal with streams, not being able to handle 
file pointers and handling everything in java as huge strings) we've decided to 
index using lucene directly.

Now we have an index created, but Solr just doesn't see any documents there, 
even though we know there are documents indexed and the segments and cfs 
contains data.

Any thoughts?

Best Regards, Martin Owens







Updating and Appending

2008-01-22 Thread Owens, Martin
Hello,

We've got some memory constraint worries from using Java RMI, although I can 
see this problem could affect the XML requests too. The Java code doesn't seem 
to handle large files as streams. We think there are two possible solutions. 
Either there exists, or we create, a file path plugin which tells the server to 
load the contents from a file as a buffer; this runs the risk that Solr isn't 
built to deal with buffers and would simply eat up all the RAM trying to load 
the file as a full string. Or there exists, or we create, some kind of update 
method which appends the contents to a field's data and runs all the applicable 
indexing and filters.

But I want to know what you guys think is the best solution for the problem.
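
To make the difference between "load the file as a full string" and "read it as a buffer" concrete, here is a small sketch in Python. It is not a Solr plugin and says nothing about how Solr handles input internally; it only illustrates reading a large file in bounded chunks. The function name and chunk size are made up for the example.

```python
import os
import tempfile


def iter_chunks(path, chunk_size=1 << 20):
    """Yield a text file's contents as chunks of at most chunk_size characters."""
    with open(path, "r", encoding="utf-8") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo on a small temporary file: 100 characters read 32 at a time.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("abcdefghij" * 10)
    demo_path = tmp.name

chunk_sizes = [len(chunk) for chunk in iter_chunks(demo_path, chunk_size=32)]
os.remove(demo_path)
print(chunk_sizes)
```

Only one chunk is held in memory at a time, which is the property the "append to a field" idea would need on the server side as well.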

Best Regards, Martin Owens


RE: AND as a default search operator

2007-12-21 Thread Owens, Martin
Yes, near the bottom of the schema.xml file there is a defaultOperator attribute 
(on the solrQueryParser element); set it to AND.
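
As an alternative to the schema.xml setting, the operator can also be passed per request with the q.op parameter. The sketch below just builds such a URL; the host and query are hypothetical, and whether a particular build honours q.op is worth checking.

```python
from urllib.parse import urlencode


def select_url(base_url, query, default_operator="AND"):
    """Build a /select URL that sets the default operator for this request only."""
    params = {"q": query, "q.op": default_operator}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


url = select_url("http://localhost:8983/solr", "solr lucene")
print(url)
```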


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Fri 12/21/2007 4:29 PM
To: solr-user@lucene.apache.org
Subject: AND as a default search operator
 
Hi,

Is there any way to set AND as the default search operator in Solr instead of 
OR?

cheers
Y. 




RE: MultiCore problem

2007-12-17 Thread Owens, Martin

 If you started with the example config, make small changes till it  
 stops working as expected.


The problem was that I assumed consistency instead of looking at what the 
real URL was. I was using solr/select?core=core1 instead of 
solr/@core1/select, simply because the multicore admin works that way. Silly me.

Best Regards, Martin Owens


Python Solr Writer

2007-12-14 Thread Owens, Martin
I'm having some trouble understanding how the Solr writer integrates into 
Python. I can't find any examples, so does anyone have any good examples of a 
Python writer?

Best Regards, Martin Owens


RE: Python Solr Writer

2007-12-14 Thread Owens, Martin
That would be a python solr client, not a solr writer using this:

http://lucene.apache.org/solr/api/org/apache/solr/request/PythonResponseWriter.html

Not sure how the hell it's supposed to work to be honest.
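
For what it's worth, a sketch of how the output of the Python response writer can be consumed. A query sent with wt=python comes back as text shaped like a Python dictionary literal, which the client can parse. The sample body below is hypothetical, not captured from a real server, and the function name is made up for the example.

```python
import ast

# Hypothetical response body, shaped like the output of a query with wt=python.
sample_body = (
    "{'responseHeader':{'status':0,'QTime':5},"
    "'response':{'numFound':1,'start':0,'docs':[{'id':'doc1'}]}}"
)


def parse_python_response(body):
    """Parse a wt=python response body into plain Python objects.

    ast.literal_eval accepts only literals, so it is safer than eval()
    on text that came over the network.
    """
    return ast.literal_eval(body)


result = parse_python_response(sample_body)
print(result["response"]["numFound"])
```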

-Original Message-
From: [EMAIL PROTECTED] on behalf of Ed Summers
Sent: Fri 12/14/2007 11:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Python Solr Writer
 
Do you mean something like:

  http://svn.apache.org/repos/asf/lucene/solr/trunk/client/python/solr.py

//Ed

On Dec 14, 2007 10:20 AM, Owens, Martin [EMAIL PROTECTED] wrote:
 I'm having some trouble understanding how the Solr writer integrates into 
 Python. I can't find any examples, so does anyone have any good examples of a 
 Python writer?

 Best Regards, Martin Owens




Solr 1.3 expected release date

2007-12-12 Thread Owens, Martin
What date or year do we believe Solr 1.3 will be released?

Regards, Martin Owens


RE: Solr Highlighting, word index

2007-12-10 Thread Owens, Martin
Hello Everyone,

Just to keep you all up to date about the madness I've created. I managed to 
get the data I wanted by hacking:

lucene-2.2.0 highlight/Highlighter.java
solr-1.2 util/HighlightingUtils.java

I got it to output either the word index or pairs of letter offsets (end and 
start) depending on what is required. That means I'm getting the data I want. 
I'm not very happy with the code since it returns these numbers as strings and 
ends up loading and storing the entire file string. But beggars can't be 
choosers and I'm certainly no artisan at java.
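
To illustrate the relationship between the two outputs mentioned above (word index versus pairs of letter offsets), here is a sketch that maps character offsets back to word indexes. It is not the patched Java code. It assumes words are separated by whitespace, which is only a stand-in for whatever analyzer the field actually uses, and the function names are invented for the example.

```python
import re


def word_spans(text):
    """Return (start, end) character offsets for each whitespace-separated word."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def offsets_to_word_indexes(text, offsets):
    """Map (start, end) character offsets to 0-based word indexes.

    An offset pair matches a word when it starts inside that word's span.
    Pairs that start in whitespace are skipped.
    """
    spans = word_spans(text)
    indexes = []
    for start, _end in offsets:
        for i, (w_start, w_end) in enumerate(spans):
            if w_start <= start < w_end:
                indexes.append(i)
                break
    return indexes


text = "the quick brown fox"
word_indexes = offsets_to_word_indexes(text, [(4, 9), (16, 19)])
print(word_indexes)
```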

Best Regards, Martin Owens

-Original Message-
From: Binkley, Peter [mailto:[EMAIL PROTECTED]
Sent: Wed 12/5/2007 4:07 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr Highlighting, word index
 
We're doing a similar process using term vectors to look up the
bounding-box data in a custom response writer for a specific project,
but we're trying to get this packaged up in a more generally usable way
along with handling paging: see
https://issues.apache.org/jira/browse/SOLR-380. We're looking at using
Lucene's new payload functionality.

Peter


Solr result offsets

2007-12-05 Thread Owens, Martin
Hello again,

So I've been concentrating on hacking the Util/Highlighting.java to see if I 
could get it to output the results offsets I need to do the highlighting I 
need. It turns out that this method requires that the field be stored as well 
as indexed. I would like to be able to just set termOffsets and termPositions 
and have that data returned to me when I do a specific kind of search.

I ended up getting very confused about the Request Handler plugin, which I'm sure 
will be the solution in the end. It just seems to want the search to be 
performed again for no good reason; surely the term offsets are returned when a 
search is done on a field with that data available?

Best Regards, Martin Owens



RE: Solr Highlighting, word index

2007-12-05 Thread Owens, Martin

You do not necessarily need two requests; instead, you can override  
or modify the request handler you are using (StandardRequestHandler,  
DisMaxRequestHandler) to return the information.  You'll have to  
process the Query to extract the terms (like HighlightingUtils does),  
then get the TermVector token offset data for each matching doc and  
look for the terms in the Query.  I haven't worked with Term Vectors  
(a Lucene API), so I'm not sure exactly how to go about this.

Thanks Mike. So in essence I need to write a new RequestHandler plugin which 
takes the query string, tokenises it, then performs some kind of action against 
the index to return results, from which I should then be able to get the 
termVectors?

Would not the termVectors already be available from the normal search and we'd 
just be asking for the term vectors from that?

Any advice for a Perl/Python programmer who is trying to badly hack this in 
Java?

Best Regards, Martin Owens


RE: Solr Highlighting, word index

2007-12-03 Thread Owens, Martin


 You can tell lucene to store token offsets using TermVectors  
 (configurable via schema.xml).  Then you can customize the request  
 handler to return the token offsets (and/or positions) by retrieving  
 the TVs.

I think that is the best plan of action. How do I create a custom request 
handler that will use the existing indexed fields? There will be 2 requests as 
I see it: 1 for the search and 1 to retrieve the offsets when you view one of 
those found items. Any advice you can give me will be much appreciated, as I've 
had no luck with Google so far.

Thanks for your help so far,

Best Regards, Martin Owens



RE: Solr Highlighting, word index

2007-11-30 Thread Owens, Martin

 Or I'm just completely off base here.

A little, we already have the locations for each word on every ocr, we just 
need the word index to feed into the existing program.

Best Regards, Martin Owens


Solr Highlighting, word index

2007-11-30 Thread Owens, Martin

Hello everyone,

We're working to replace the old Linux version of dtSearch with Lucene/Solr, 
using HTTP requests on our Perl side and Java for the indexing.

The functionality causing the most problems is the highlighting. We're not 
storing the text in Solr (only indexing it), and we need to highlight an image 
file (OCR). What we really need is to request from Solr the word indexes of the 
matches; we then tie these to the OCR image and create HTML boxes to do the 
highlighting.

The text is also multi-page; each page is separated by Ctrl-L page breaks. 
Should we handle the paging ourselves, or can Solr tell us which page the 
match happened on too?
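
If the paging ends up being handled on the client side, the lookup could be as simple as the sketch below. It assumes the word index counts whitespace-separated words from the start of the document, and that pages are separated by the form feed character (Ctrl-L). Both are assumptions; the real tokenisation depends on the analyzer used at index time, and the function name is made up.

```python
def page_of_word(text, word_index):
    """Return the 1-based page holding the word at 0-based word_index.

    Pages are separated by form feed (Ctrl-L); words by whitespace.
    """
    seen = 0
    for page_number, page in enumerate(text.split("\f"), start=1):
        seen += len(page.split())
        if word_index < seen:
            return page_number
    raise IndexError("word_index is past the end of the text")


# Three pages: words 0-1 on page 1, words 2-4 on page 2, word 5 on page 3.
doc = "alpha beta\fgamma delta epsilon\fzeta"
pages = [page_of_word(doc, i) for i in range(6)]
print(pages)
```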

Thanks for your help,

Best Regards, Martin Owens