Solr 1.4 schedule?

2009-08-04 Thread Robert Young
Hi,
When is Solr 1.4 scheduled for release? Is there any ballpark date yet?

Thanks
Rob


Mock solr server

2008-11-27 Thread Robert Young
Hi,

Does anyone know of an easy-to-use mock Solr server?

Thanks
Rob


Re: Mock solr server

2008-11-28 Thread Robert Young
I'm not using Java unfortunately. Is there anything that allows me to
interact with it much like a normal mock object, setting expectations and
return values?

On Fri, Nov 28, 2008 at 12:06 AM, Jeryl Cook <[EMAIL PROTECTED]> wrote:

> are you trying to unit test something? I would simply make use  of the
> Embedded SOLR component in your unit tests..
>
> On 11/27/08, Robert Young <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Does anyone know of an easy to use Mock solr server?
> >
> > Thanks
> > Rob
> >
>
>
> --
> Jeryl Cook
> /^\ Pharaoh /^\
> http://pharaohofkush.blogspot.com/
> "Whether we bring our enemies to justice, or bring justice to our
> enemies, justice will be done."
> --George W. Bush, Address to a Joint Session of Congress and the
> American People, September 20, 2001
>


Re: Mock solr server

2008-11-28 Thread Robert Young
Will look into it, thanks.

On Fri, Nov 28, 2008 at 9:01 AM, Erik Hatcher <[EMAIL PROTECTED]> wrote:

> In solr-ruby there is a basic "mock" Solr server implementation:
>
>  <
> http://svn.apache.org/viewvc/lucene/solr/trunk/client/ruby/solr-ruby/test/unit/solr_mock_base.rb?view=markup
> >
>
> It's used to test some core response handling routines, like this:
>
>  <
> http://svn.apache.org/viewvc/lucene/solr/trunk/client/ruby/solr-ruby/test/unit/standard_response_test.rb?view=markup
> >
>
>Erik
>
>
>
> On Nov 28, 2008, at 3:41 AM, Robert Young wrote:
>
>  I'm not using Java unfortunately. Is there anything that allows me to
>> interact with it much like a normal mock object, setting expectations and
>> return values?
>>
>> On Fri, Nov 28, 2008 at 12:06 AM, Jeryl Cook <[EMAIL PROTECTED]> wrote:
>>
>>  are you trying to unit test something? I would simply make use  of the
>>> Embedded SOLR component in your unit tests..
>>>
>>> On 11/27/08, Robert Young <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Does anyone know of an easy to use Mock solr server?
>>>>
>>>> Thanks
>>>> Rob
>>>>
>>>>
>>>
>>> --
>>> Jeryl Cook
>>> /^\ Pharaoh /^\
>>> http://pharaohofkush.blogspot.com/
>>> "Whether we bring our enemies to justice, or bring justice to our
>>> enemies, justice will be done."
>>> --George W. Bush, Address to a Joint Session of Congress and the
>>> American People, September 20, 2001
>>>
>>>
>
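[Editor's note: since Rob is not on Java, one lightweight alternative to EmbeddedSolrServer is a throwaway HTTP server that returns canned Solr responses. A minimal sketch in Python; the endpoint path and the canned XML payload are assumptions for illustration, not real Solr output.]

```python
# A throwaway mock Solr endpoint for unit tests: any GET returns a canned
# XML response.  A sketch, not a real Solr -- path and payload are made up.
import http.server
import threading
import urllib.request

CANNED = (b'<?xml version="1.0" encoding="UTF-8"?><response>'
          b'<lst name="responseHeader"><int name="status">0</int></lst>'
          b'<result name="response" numFound="0" start="0"/></response>')

class MockSolrHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/xml; charset=utf-8")
        self.end_headers()
        self.wfile.write(CANNED)

    def log_message(self, fmt, *args):  # silence per-request logging
        pass

# Port 0 lets the OS pick a free port, so tests don't collide.
server = http.server.HTTPServer(("127.0.0.1", 0), MockSolrHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

port = server.server_address[1]
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/solr/select?q=apple").read()
server.shutdown()
```

A real test would swap in whatever response the code under test expects, and assert on how the client parsed it.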


Response status

2008-12-04 Thread Robert Young
In the standard response format, what does the status mean? It always seems
to be 0.

Thanks
Rob


Re: Response status

2008-12-04 Thread Robert Young
Thanks

On Thu, Dec 4, 2008 at 2:53 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote:

> It means the request was successful.  If the status is non-zero (err, 1)
> then there was an error of some sort.
>
>Erik
>
>
> On Dec 4, 2008, at 9:32 AM, Robert Young wrote:
>
>  In the standard response format, what does the status mean? It always
>> seems
>> to be 0.
>>
>> Thanks
>> Rob
>>
>
>
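[Editor's note: Erik's answer in code form: clients should treat a non-zero status as an error. A Python sketch; the sample XML below is hand-written to match the 2.2 response format, not captured output.]

```python
# Pull the status out of a standard Solr XML response and fail on non-zero.
# The sample document here is hand-written, not real Solr output.
import xml.etree.ElementTree as ET

raw = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
</response>"""

header = ET.fromstring(raw).find("lst[@name='responseHeader']")
status = int(header.find("int[@name='status']").text)
if status != 0:
    raise RuntimeError(f"Solr request failed, status={status}")
```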


Re: Newbie Question on boosting

2008-12-10 Thread Robert Young
On Thu, Dec 11, 2008 at 6:49 AM, ayyanar <[EMAIL PROTECTED]> wrote:

> 1) Can you give an example of field-level boosting and document-level
> boosting, and the difference between the two?

Field-level boosting is used when one field is considered more or less
important than another. For example, you may want the title field of a
document to carry more weight, so that a term appearing in the title counts
for more than the same term appearing in the body. Document-level boosting,
on the other hand, is for when one document is more or less important than
another. For example, an FAQ is often considered a very important page and,
as such, may need to appear higher in the results than it otherwise would.
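[Editor's note: for illustration, here is how both kinds of index-time boost look in Solr's XML update format; the field names and boost values are made up.]

```xml
<add>
  <!-- document-level boost: this whole FAQ page scores higher -->
  <doc boost="2.0">
    <!-- field-level boost: matches in title count 4x -->
    <field name="title" boost="4.0">FAQ: returns and refunds</field>
    <field name="body">To return an item...</field>
  </doc>
</add>
```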

>
>
> 2) If we set the boost at field level (index time), should the query
> contain that particular field?
> For example, if we set the boost for the title field, should we create the
> term query for the title field?

Yes, if you want it to make any difference.


Rob


Does Solr Have?

2007-10-04 Thread Robert Young
Hi,

We're just about to start work on a project in Solr and there are a
couple of points which I haven't been able to find out from the wiki
which I'm interested in.

1. Is there a REST interface for getting index stats? I would
particularly like access to terms and their document frequencies,
preferably filtered by a query.

2. Is it possible to use different synonym sets for different queries
OR is it possible to search across multiple indexes with a single
query?

3. Is it possible to change stopword and synonym sets at runtime?

I'm sure I'll have lots more questions as time goes by and, hopefully,
I'll be able to answer others' questions in the future.

Thanks
Rob


Re: Does Solr Have?

2007-10-04 Thread Robert Young
Brilliant, thank you, that LukeRequestHandler looks very useful.

On 10/4/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> > 3. Is it possible to change stopword and synonym sets at runtime?
>
> Only if the underlying text file is changed.

Will Solr automatically reload the file if it changes or does it have
to be informed of the change? Is changing the underlying file while
Solr is running dangerous?

Cheers
Rob


Re: Does Solr Have?

2007-10-04 Thread Robert Young
Is there, or are there plans to start, a plugin and extension repository?

Cheers
Rob

On 10/4/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> dooh, should check all my email first!
>
> >>
> >> Will Solr automatically reload the file if it changes or does it have
> >> to be informed of the change?
> >
> > I'll expose my confusion here and say that I don't know for sure, but
> > I'm pretty sure that once it's been loaded it won't get reloaded without
> > bouncing Solr altogether.
>
> Correct.  The StopFilterFactory is initialized at startup, any changes
> to the file won't take effect 'till solr restarts.
>
> but you can write a custom FilterFactory based on StopFilterFactory that
> lets you change it dynamically.  Most likely this would also require
> writing a custom RequestHandler to manipulate it.
>
> note - changing the stop words at runtime will only affect queries, the
> index will keep whatever was there at index time.
>
> ryan
>


Opensearch XSLT

2007-10-12 Thread Robert Young
Hi,

Does anyone know of an XSLT out there for transforming Solr's default
output to Opensearch format? Our current frontend system uses
opensearch so we would like to integrate it like this.

Cheers
Rob


Querying for an id with a colon in it

2007-10-15 Thread Robert Young
Hi,

If my unique identifier field is called guid and one of the ids in it is,
for example, "article:123", how can I query for that article id? I have
tried a number of ways, but I always either get no results or an error. It
seems to be to do with having the colon in the id value.

eg.
?q=guid:article:123 -> error
?q=guid:"article:123" -> error
?q=guid:article%3A123 -> error

Any ideas?
Cheers
Rob


Re: Querying for an id with a colon in it

2007-10-15 Thread Robert Young
Hey,

Thanks Brian, that works perfectly.

Cheers
Rob

On 10/15/07, Brian Carmalt <[EMAIL PROTECTED]> wrote:
> Robert Young schrieb:
> > Hi,
> >
> > If my unique identifier is called guid and one of the ids in it is,
> > for example, "article:123". How can I query for that article id? I
> > have tried a number of ways but I always either get no results or an
> > error. It seems to be to do with having the colon in the id value.
> >
> > eg.
> > ?q=guid:article:123 -> error
> > ?q=guid:"article:123" -> error
> > ?q=guid:article%3A123 -> error
> >
> > Any ideas?
> > Cheers
> > Rob
> >
> >
> Try it with a backslash (\:); that's what the Lucene Query Parser Syntax
> page says. It doesn't cause an error, but I don't know if it will provide
> the results you want.
>
> Brian
>
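[Editor's note: Brian's fix, sketched in Python: backslash-escape the colon per Lucene query syntax, then URL-encode the whole query parameter. The guid value is taken from the example above; the Solr host/port is an assumption.]

```python
# Escape a Lucene special character (:) inside a field value, then
# URL-encode the query string for use in Solr's ?q= parameter.
import urllib.parse

guid = "article:123"
escaped = guid.replace(":", r"\:")           # article\:123
q = urllib.parse.quote(f"guid:{escaped}")    # colon -> %3A, backslash -> %5C
url = f"http://localhost:8983/solr/select?q={q}"
```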


Re: preconfiguring which xsl file to use

2007-10-19 Thread Robert Young
Thanks Erik. For the moment we're only using one requestHandler for
basic querying, so that should work OK.

Cheers
Rob

On 10/19/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Oct 19, 2007, at 8:30 AM, Robert Young wrote:
> > Is it possible to configure which xsl file to use for a particular
> > queryResponseWriter in the solrconfig.xml?
>
> I don't believe so, but instead I think something like this will work:
>
> <requestHandler name="opensearch" class="solr.StandardRequestHandler">
>   <lst name="defaults">
>     <str name="wt">xslt</str>
>     <str name="tr">opensearch.xsl</str>
>   </lst>
> </requestHandler>
>
> And then a ?qt=opensearch should do the trick.   The dilemma here is
> that you can't then toggle between various request handlers, unless
> you mapped them separately.
>
> Erik
>
>
> >
> > I would like to have something like the following so that I don't have
> > put it in for every query.
> > <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
> >   <int name="xsltCacheLifetimeSeconds">5</int>
> >   <str name="tr">opensearch.xsl</str>
> > </queryResponseWriter>
> >
> > Any ideas?
> >
> > Cheers
> > Rob
>
>


preconfiguring which xsl file to use

2007-10-19 Thread Robert Young
Hi,

Is it possible to configure which xsl file to use for a particular
queryResponseWriter in the solrconfig.xml?

I would like to have something like the following so that I don't have
put it in for every query.

<queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
  <int name="xsltCacheLifetimeSeconds">5</int>
  <str name="tr">opensearch.xsl</str>
</queryResponseWriter>

Any ideas?

Cheers
Rob


Can't get query when using xslt

2007-10-30 Thread Robert Young
Hi,

I'm using the XsltResponseWriter but I can't seem to get hold of the
query. I have copied in the xsl file I'm using but, basically, I'm trying
to access the query element of the params list and I'm getting nothing
back. I've edited the xsl file a bit and, as far as I can see, the whole
params list isn't available. (ie. ...)

Does anyone know what this might be?

Cheers
Rob


[The attached XSL was mangled by the list archive; only fragments survive.
It declared the namespaces http://www.ipcmedia.com/opensearchrss/1.0/,
http://www.nutch.org/opensearchrss/1.0/ and
http://a9.com/-/spec/opensearch/1.1/, and produced an OpenSearch RSS
channel ("Search: ...", "Search results for query: ...", link
http://127.0.0.1/blah).]


fieldNorm seems to be killing my score

2007-11-01 Thread Robert Young
Hi,

I've been trying to debug why one of my test cases doesn't work. I
have an index with two documents in it, one talking mostly about apples
and one talking mostly about oranges (for the sake of this test case),
both of which have 'test_site' in their site field. If I run the query
+(apple^4 orange) +(site:"test_site") I would expect the document
which talks about apples to always appear first, but it does not.
Looking at the debug output (below), it looks like fieldNorm is killing
the first part of the query. Why is this, and how can I stop it?





[The XML response was mangled by the list archive. The surviving details:
status=0, QTime=4; params rows=10, start=0, debugQuery=on,
q=+(apple^4 orange) +(site:"test_site"), version=2.2; the oranges document
(guid test_index-test_site-integration:124) sorted ahead of the apples
document (guid test_index-test_site-integration:123); the parsed query was
+(text:appl^4.0 text:orang) +site:test_site. The score explanations:]

 
  
0.14332592 = (MATCH) sum of:
  0.0 = (MATCH) product of:
0.0 = (MATCH) sum of:
  0.0 = (MATCH) weight(text:orang in 13), product of:
0.24034579 = queryWeight(text:orang), product of:
  1.9162908 = idf(docFreq=5)
  0.1254224 = queryNorm
0.0 = (MATCH) fieldWeight(text:orang in 13), product of:
  2.236068 = tf(termFreq(text:orang)=5)
  1.9162908 = idf(docFreq=5)
  0.0 = fieldNorm(field=text, doc=13)
0.5 = coord(1/2)
  0.14332592 = (MATCH) weight(site:test_site in 13), product of:
0.13407566 = queryWeight(site:test_site), product of:
  1.0689929 = idf(docFreq=13)
  0.1254224 = queryNorm
1.0689929 = (MATCH) fieldWeight(site:test_site in 13), product of:
  1.0 = tf(termFreq(site:test_site)=1)
  1.0689929 = idf(docFreq=13)
  1.0 = fieldNorm(field=site, doc=13)

  
0.14332592 = (MATCH) sum of:
  0.0 = (MATCH) product of:
0.0 = (MATCH) sum of:
  0.0 = (MATCH) weight(text:appl^4.0 in 14), product of:
0.96138316 = queryWeight(text:appl^4.0), product of:
  4.0 = boost
  1.9162908 = idf(docFreq=5)
  0.1254224 = queryNorm
0.0 = (MATCH) fieldWeight(text:appl in 14), product of:
  2.236068 = tf(termFreq(text:appl)=5)
  1.9162908 = idf(docFreq=5)
  0.0 = fieldNorm(field=text, doc=14)
0.5 = coord(1/2)
  0.14332592 = (MATCH) weight(site:test_site in 14), product of:
0.13407566 = queryWeight(site:test_site), product of:
  1.0689929 = idf(docFreq=13)
  0.1254224 = queryNorm
1.0689929 = (MATCH) fieldWeight(site:test_site in 14), product of:
  1.0 = tf(termFreq(site:test_site)=1)
  1.0689929 = idf(docFreq=13)
  1.0 = fieldNorm(field=site, doc=14)

 




Re: fieldNorm seems to be killing my score

2007-11-01 Thread Robert Young
Oooh! I think I'll just get my coat...

My indexer was defaulting to zero for document boosts rather than 1.

On 11/1/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Hmmm, a norm of 0.0???  That implies that the boost for that field
> (text) was set to zero when it was indexed.
> How did you index the data (straight HTTP, SolrJ, etc)?  What does
> your schema for this field (and copyFields) look like?
>
> -Yonik
>
> On 11/1/07, Robert Young <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I've been trying to debug why one of my test cases doesn't work. I
> > have an index with two documents in, one talking mostly about apples
> > and one talking mostly about oranges (for the sake of this test case)
> > both of which have 'test_site' in their site field. If I run the query
> > +(apple^4 orange) +(site:"test_site") I would expect the document
> > which talks about apples to always appear first but it does not.
> > Looking at the debug output (below) it looks like fieldNorm is killing
> > the first part of the query. Why is this and how can I stop it?
> >
> > [debug output snipped]
> >
>
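[Editor's note: the arithmetic behind the mystery: in Lucene's DefaultSimilarity the stored norm is roughly documentBoost x fieldBoost x 1/sqrt(numTerms), so a document boost of 0 zeroes every term score in that field. A simplified sketch; real Lucene also quantizes the norm to a single byte.]

```python
import math

def field_norm(doc_boost, field_boost, num_terms):
    # Simplified DefaultSimilarity norm: docBoost * fieldBoost * lengthNorm,
    # where lengthNorm = 1/sqrt(numTerms).  Ignores the byte quantization.
    return doc_boost * field_boost / math.sqrt(num_terms)

field_norm(1.0, 1.0, 4)  # 0.5 -- a normal 4-term field
field_norm(0.0, 1.0, 4)  # 0.0 -- an indexer defaulting doc boost to 0
```

With a norm of 0.0, tf and idf no longer matter: the whole fieldWeight product collapses to zero, exactly as in the debug output above.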


Re: how to use PHP AND PHPS?

2007-11-05 Thread Robert Young
I would imagine you have to unserialize

On 11/5/07, James liu <[EMAIL PROTECTED]> wrote:
> i find they all return string
>
> <?php
> $url = 'http://localhost:8080/solr/select/?q=solr&version=2.2&start=0&rows=10&indent=on&wt=php';
> var_dump(file_get_contents($url));
> ?>
>
>
> --
> regards
> jl
>


Re: Boosting and copy fields

2007-11-30 Thread Robert Young
Right, ok, thanks.

On Nov 30, 2007 2:14 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> On Nov 30, 2007 7:18 AM, Robert Young <[EMAIL PROTECTED]> wrote:
> > How does the copy field work with boosted fields? If I have three fields
> > with different boost values and they all get copied into a copy field,
> > are these boosts taken into account during searching?
>
> They are all multiplied together (lucene does this in the indexing
> code as the index format only supports one boost per unique field per
> document).
>
> -Yonik
>


Boosting and copy fields

2007-11-30 Thread Robert Young
Hi,

How does the copy field work with boosted fields? If I have three fields
with different boost values and they all get copied into a copy field,
are these boosts taken into account during searching?

Cheers
Rob


Disabling the cache?

2007-12-14 Thread Robert Young
Hi,

Is it possible to disable all the caches in Solr? We want to be able
to load test our Solr-based application, but we don't want the caches
to affect the results (we're using Apache Benchmark, so we're sending
the same request over and over again).

Cheers
Rob
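[Editor's note: for reference, one way to get close to "no caching" is to shrink every cache in solrconfig.xml to zero. A sketch, not verified against Solr 1.2; the element names are as in the stock example config.]

```xml
<filterCache      class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="0" initialSize="0" autowarmCount="0"/>
<documentCache    class="solr.LRUCache" size="0" initialSize="0"/>
```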


Re: Best practice for storing relational data in Solr

2008-01-04 Thread Robert Young
Short answer: It depends.
Long answer: It depends on what you want to be able to search on.
If you need to search by recruiter name then obviously you'll need to
index it; if you don't, you only really need to index the most relevant
db identifier, then work out the relations from that in MySQL (it's
what it's good at, after all).

Cheers
Rob

On Jan 4, 2008 11:39 AM, steve.lillywhite
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
>
>
> This is a (possibly very naive) newbie question regarding Solr best 
> practice...
>
>
>
> I run a website that displays/stores data on job applicants, together with 
> information on where they came from (e.g. which recruiter), which office they 
> are applying to, etc. This data is stored in a mySQL database. I currently 
> have a basic search facility, but I  plan to introduce Solr to improve this, 
> by also storing applicant data in a Solr schema.
>
>
>
> My problem is that *related* applicant data can also be updated in the web 
> GUI (e.g. if there was a typo a recruiter could be changed from "My Rcruiter" 
> to "My Recruiter", and I don't know how best to reflect this in the Solr 
> schema.
>
> Example:
>
> We may have 2 applicants that came from recruiter "My Recruiter". If the 
> name of this recruiter is altered in the GUI then I would have to reindex all 
> 2 of those applicants in the Solr schema, which seems very overkill. The 
> alternative would be if I didn't store the recruiter name in the Solr schema, 
> and instead only stored its mySQL database identifier. Then, I would need to 
> parse any search results from Solr to put in the recruiter name before 
> displaying the data in the GUI.
>
>
>
> So I guess I'm asking which of these is the better approach;
>
>
>
> 1.   Use Solr to store the text value of related applicant data that 
> exists in a relational mySQL database. Whenever that data is updated in the 
> database reindex all dependent entries in the Solr schema. Advantage of this 
> approach I guess is that search results can be returned from Solr and 
> displayed as is (if XSLT is used). E.g. search result for "John Smith" of 
> recruiter "My Recruiter" could be returned in the required HTML format from 
> Solr, and displayed in the web GUI without any reformatting or further 
> processing.
>
> 2.   Use Solr to store database Ids of related applicant data that exists 
> in a relational mySQL database. When that data is updated in the database 
> there is no need to reindex Solr. However, search results from Solr will need 
> to be parsed before they can be output in the web GUI. E.g. if Solr returns 
> "John Smith" of recruiter with database ID 143, then 143 will need to be 
> mapped back to "My Recruiter" by my application before it can be displayed.
>
>
>
> Can anyone offer any guidance here?
>
>
>
> Regards
>
>
>
> Steve
>
>
>
>
>
>


Re: Duplicated Keyword

2008-01-04 Thread Robert Young
I don't quite understand what you're getting at. What is the problem
you're encountering or what are you trying to achieve?

Cheers
Rob

On Jan 4, 2008 3:26 PM, Jae Joo <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Is there any way to dedup the keyword cross the document?
>
> Ex.
>
> "china" keyword is in doc1 and doc2. Will Solr index have only 1 "china"
> keyword for both document?
>
> Thanks,
>
> Jae Joo
>


Re: Duplicated Keyword

2008-01-04 Thread Robert Young
You can think of it as the latter, but it's quite a bit more
complicated than that. For details on how Lucene stores its index,
check out the file formats page on the Lucene site:
http://lucene.apache.org/java/docs/fileformats.html

Cheers
Rob


On Jan 4, 2008 4:59 PM, Jae Joo <[EMAIL PROTECTED]> wrote:
> title of Document 1 - "This is document 1 regarding china" - fieldtype =
> text
> title of Document 2 - "This is document 2 regarding china"  fieldtype=text
>
> Once it is indexed, will index hold  2 "china"  text fields  or just 1 china
> word which is pointing document1 and document2?
>
> Jae
>
>
> On Jan 4, 2008 10:54 AM, Robert Young <[EMAIL PROTECTED]> wrote:
>
> > I don't quite understand what you're getting at. What is the problem
> > you're encountering or what are you trying to achieve?
> >
> > Cheers
> > Rob
> >
> > On Jan 4, 2008 3:26 PM, Jae Joo <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Is there any way to dedup the keyword cross the document?
> > >
> > > Ex.
> > >
> > > "china" keyword is in doc1 and doc2. Will Solr index have only 1 "china"
> > > keyword for both document?
> > >
> > > Thanks,
> > >
> > > Jae Joo
> > >
> >
>
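[Editor's note: a toy model of what the index holds: one dictionary entry for "china" with a postings list naming both documents, rather than two stored copies of the term. A Python sketch; the tokenization here is just split-on-whitespace, far simpler than a real analyzer.]

```python
from collections import defaultdict

docs = {
    1: "this is document 1 regarding china",
    2: "this is document 2 regarding china",
}

# Inverted index: term -> set of doc ids (a stand-in for Lucene's
# term dictionary plus postings lists).
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

sorted(index["china"])  # [1, 2]: "china" is stored once, pointing at both docs
```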


Luke response format explained

2008-01-08 Thread Robert Young
Hi,

In the response from the LukeRequestHandler, what do the different
fields mean? Some of them are obvious but some are less so. Is
numTerms the total number of terms or the total number of unique terms
(ie the dictionary)? If it is the former, how can I find the size of
the dictionary across all fields? I'm assuming that distinct in the
per-field sections is the number of unique terms in that field;
is this correct?

Thanks
Rob


Re: Luke response format explained

2008-01-08 Thread Robert Young
Thanks, that is very helpful. So, is there a way to find out the
total number of distinct tokens, regardless of which field they're
associated with? And to find which are most popular?

Cheers
Rob

On Jan 8, 2008 5:04 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> numTerms counts the unique terms (field:value pair) in the index.  The
> source is:
>
>  TermEnum te = reader.terms();
>  int numTerms = 0;
>  while (te.next()) {
>numTerms++;
>  }
>  indexInfo.add("numTerms", numTerms );
>
> "distinct" is a similar calculation, but for each field.
>
> ryan
>
>
>
> Robert Young wrote:
> > Hi,
> >
> > In the response from the LukeRequestHandler, what do the different
> > fields mean? Some of them are obvious but some are less so. Is
> > numTerms the total number of terms or the total number of unique terms
> > (ie the dictionary), if it is the former how can I find the size of
> > the dictionary across all fields? I'm assuming that distinct in the
> > specific field sections is the number of unique terms in that field,
> > is this correct?
> >
> > Thanks
> > Rob
> >
>
>


Re: Luke response format explained

2008-01-09 Thread Robert Young
On Jan 8, 2008 8:13 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Perhaps consider using a copyField to copy the relevant values into
> another field - then you can get the top tokens across all these fields
> with luke.
That sounds like the best solution, thanks. It also means I'd be able to
have it without stemming, to make it a little easier to read.

Cheers
Rob
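[Editor's note: the copyField idea from Ryan's reply would look something like this in schema.xml; the field and type names here are made up, and `stored` can stay false since Luke only needs the indexed tokens.]

```xml
<!-- catch-all field for inspecting top tokens via the LukeRequestHandler -->
<field name="all_tokens" type="text_unstemmed" indexed="true" stored="false"
       multiValued="true"/>

<copyField source="title" dest="all_tokens"/>
<copyField source="body"  dest="all_tokens"/>
```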


search abstraction library for PHP

2008-02-06 Thread Robert Young
Hi,

Thought you guys might be interested, I'm working on a search
abstraction library for PHP called Forage, you can check it out at the
link below. At the moment it just supports basic indexing and
searching with Solr, Xapian and Zend Search Lucene but I'm hoping to
add more engines and more features (tagging, faceting, etc) in the
very near future.

Cheers
Rob

http://code.google.com/p/forage


Re: upgrading to lucene 2.3

2008-02-12 Thread Robert Young
OK, and to make the change I just replace the jar directly in
solr/WEB-INF/lib and restart Tomcat?

Thanks
Rob

On Feb 12, 2008 1:55 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> Solr trunk is using the latest Lucene version.  Also note there are a
> couple of edge cases in Lucene 2.3 that are causing problems if you use
> SOLR-342 with luceneAutoCommit == false.
>
> But, yes, you should be able to drop in 2.3, as that is one of the
> back-compatible goals for Lucene minor releases.
>
> -Grant
>
>
> On Feb 12, 2008, at 8:06 AM, Robert Young wrote:
>
> > I have heard that upgrading to lucene 2.3 in Solr 1.2 is as simple as
> > replacing the lucene jar and restarting. Is this the case? Has anyone
> > had any experience with upgrading lucene to 2.3? Did you have any
> > problems? Is there anything I should be looking out for?
> >
> > Thanks
> > Rob
>
>


upgrading to lucene 2.3

2008-02-12 Thread Robert Young
I have heard that upgrading to lucene 2.3 in Solr 1.2 is as simple as
replacing the lucene jar and restarting. Is this the case? Has anyone
had any experience with upgrading lucene to 2.3? Did you have any
problems? Is there anything I should be looking out for?

Thanks
Rob