RE: facet question

2007-05-31 Thread Gal Nitzan


> -Original Message-
> From: Mike Klaas [mailto:[EMAIL PROTECTED]
> Sent: Friday, June 01, 2007 12:36 AM
> To: solr-user@lucene.apache.org
> Subject: Re: facet question
>
> On 31-May-07, at 1:35 PM, Gal Nitzan wrote:
>
> >>>
> >>> However, the cache size brings us to the 2GB limit.
> >>
> >> If the cardinality of many of the tags is low, you can use HashSet-
> >> based filters (the default size at which a HashSet is used is 3000).
> > [Gal Nitzan]
> >
> > I would appreciate a pointer to documentation on HashSet-based filters.
> > Thanks...
>
> http://wiki.apache.org/solr/SolrConfigXml#head-ffe19c34abf267ca2d49d9e7102feab8c79b5fb5
[Gal Nitzan]
Thanks...
>
> Scroll down to the HashDocSet comment.
>
> I'm not sure how much this will help--it depends greatly on the
> distribution of your tag values.
>
> Also, I'm still suspicious about your application.  You have 1.5M
> distinct tags for 4M documents?  That seems quite dense.
[Gal Nitzan]
Basically, the facet query runs on each access to the home page (the result is
shown as a tag cloud), but the result almost never changes...
Here are my cache stats:
description: LRU Cache(maxSize=100, initialSize=60, autowarmCount=30, [EMAIL PROTECTED])
stats:
lookups : 230538968
hits : 228863426
hitratio : 0.99
inserts : 1675543
evictions : 0
size : 944831
cumulative_lookups : 230538968
cumulative_hits : 228863426
cumulative_hitratio : 0.99
cumulative_inserts : 1675543
cumulative_evictions : 0

>
> -Mike
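
To make Mike's point about HashSet-based filters concrete, here is a rough back-of-the-envelope sketch of why low-cardinality tags are cheaper to cache as hash sets than as bitsets. The 4M document count and the 3000 cutoff come from the thread; the per-slot size, the load factor, and the exact comparison at the cutoff are assumptions made for illustration, not Solr's actual internals.

```python
import math

MAX_DOC = 4_000_000      # index size mentioned in the thread
HASH_THRESHOLD = 3000    # default HashSet cutoff quoted in the thread
BYTES_PER_SLOT = 4       # assumption: one 32-bit doc id per hash slot
LOAD_FACTOR = 0.75       # assumption: illustrative load factor only


def bitset_bytes(max_doc: int) -> int:
    """A bitset needs one bit per document in the index, however few match."""
    return (max_doc + 7) // 8


def hashset_bytes(num_matches: int) -> int:
    """Rough cost of a hash set holding only the matching doc ids."""
    return math.ceil(num_matches / LOAD_FACTOR) * BYTES_PER_SLOT


def filter_bytes(num_matches: int, max_doc: int = MAX_DOC) -> int:
    """Estimated size of one cached filter under the assumptions above."""
    if num_matches <= HASH_THRESHOLD:  # assumption: cutoff is inclusive
        return hashset_bytes(num_matches)
    return bitset_bytes(max_doc)


# Compare a rare tag, a moderately common tag, and a tag above the cutoff.
for matches in (10, 1000, 5000):
    print(matches, "matching docs ->", filter_bytes(matches), "bytes")
```

Under these assumptions, every tag above the cutoff costs the same fixed bitset size no matter how few documents it matches, so the savings depend on how many tags fall below the cutoff, which is the "distribution of your tag values" caveat above.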




RE: facet question

2007-05-31 Thread Gal Nitzan


> -Original Message-
> From: Mike Klaas [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 31, 2007 9:07 PM
> To: solr-user@lucene.apache.org
> Subject: Re: facet question
>
> On 31-May-07, at 1:33 AM, Gal Nitzan wrote:
>
> > Hi,
> >
> > We have a small index with about 4 million docs.
> >
> > On this index we have a field "tags" which is a multiple values field.
> >
> > Running a facet query on the index with something like:
> > facet=true&facetField=tags&q=type:video takes about 1 minute.
> >
> > We have defined a large cache which enables the query to run much
> > faster
> > (about 1 sec)
> >
> >  >  class="solr.LRUCache"
> >  size="150"
> >  initialSize="60"
> >  autowarmCount="30"/>
> >
> >
> > However, the cache size brings us to the 2GB limit.
>
> If the cardinality of many of the tags is low, you can use HashSet-
> based filters (the default size at which a HashSet is used is 3000).
[Gal Nitzan]

I would appreciate a pointer to documentation on HashSet-based filters.
Thanks...


>
> Do you really have 1.5M unique values in that field?  Are you
> analyzing the field (you probably shouldn't be)?

[Gal Nitzan]
No it is not analyzed. Just indexed and stored.



>
> -Mike
[Gal Nitzan]





RE: sorting problem

2007-05-08 Thread Gal Nitzan

Thank you, Hoss...

> -Original Message-
> From: Chris Hostetter [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 08, 2007 8:41 PM
> To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
> Subject: Re: sorting problem
>
>
> : My query tries to find all entries whose ctype is video, sorted by
> : tstamp descending and then by popularity:
>
> : However the results returned are sorted only by the tstamp.
>
>
> Solr stores datefields with millisecond precision, so if you index a date
> field without rounding, then all of that precision is going to be there
> when it comes time to sort ... you can clearly see in your output that the
> results are strictly sorted by your first criterion .. the secondary sort
> will only come into play if two docs have *exactly* the same value for the
> tstamp field.
>
> assuming you generate your tstamp field using the "NOW" default in your
> schema.xml, you can get rounding by using the DateMath feature, something
> like this...
>
>
>
> ...but if you are generating the tstamp field values in your client and
> then sending them to Solr, you'll need to do the rounding there.
>
> -Hoss
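
For the client-side case Hoss mentions, a minimal sketch of the rounding might look like the following. It truncates each timestamp to the minute before sorting; the choice of minute granularity, the helper names, and the reading of the sample output (which number is popularity, and the digits after the comma as milliseconds) are assumptions, not something confirmed in the thread.

```python
from datetime import datetime, timezone


def round_to_minute(ts: datetime) -> str:
    """Truncate a timestamp to the minute and format it as a UTC date string."""
    truncated = ts.astimezone(timezone.utc).replace(second=0, microsecond=0)
    return truncated.strftime("%Y-%m-%dT%H:%M:%SZ")


def sort_desc(docs, tstamp_key):
    """Sort by tstamp descending, then popularity descending."""
    return sorted(
        docs,
        key=lambda d: (tstamp_key(d["tstamp"]), d["popularity"]),
        reverse=True,
    )


def utc(second: int, millis: int) -> datetime:
    """Hypothetical helper: build a 2007-05-08 12:58 UTC timestamp."""
    return datetime(2007, 5, 8, 12, 58, second, millis * 1000, tzinfo=timezone.utc)


# The four results from the original question (assumed field mapping).
docs = [
    {"popularity": 0, "tstamp": utc(45, 23)},
    {"popularity": 0, "tstamp": utc(39, 62)},
    {"popularity": 4923, "tstamp": utc(37, 710)},
    {"popularity": 0, "tstamp": utc(37, 651)},
]

# Full precision: no two tstamps tie, so popularity never matters.
full_precision = sort_desc(docs, lambda ts: ts)

# Rounded: all four share the same minute, so popularity breaks the tie.
rounded = sort_desc(docs, round_to_minute)

print([d["popularity"] for d in full_precision])
print([d["popularity"] for d in rounded])
```

With full precision the order follows tstamp alone; once the values are truncated to a shared minute, the secondary sort on popularity takes effect.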




sorting problem

2007-05-08 Thread Gal Nitzan
Hi,

I have 2 fields which I would like to sort by, one is a "date" field and the 
other is "sint".

My query tries to find all entries whose ctype is video, sorted by
tstamp descending and then by popularity:

q=ctype:video;tstamp desc;popularity desc&fl=tstamp,popularity

However the results returned are sorted only by the tstamp.


0
2007-05-08T12:58:45,023Z


0
2007-05-08T12:58:39,062Z


4923
2007-05-08T12:58:37,710Z


0
2007-05-08T12:58:37,651Z


Any idea what I'm missing?

TIA,

Gal





simple query

2007-04-25 Thread Gal Nitzan
Hi,

I'm new to Solr...


I'm trying to create an Auto Complete combo which is based on results from 
Solr but I ran into a few problems which are related to the query.

The Schema field is:




What I try to do:
Q=title:"d*" # get all titles that have the letter d followed by anything
Q=title:"do*" # get all titles that have the letters do followed by anything
Q=title:"dog*" # get all titles with dog
Q=title:"dog%20*" ... returns
- My Dog
- Black Dog
- TACO DOG DIES
- Humping Dog
- Must Love Dogs - 2
- Dog Dayze

Which is great, BUT

When I search
Q=title:"dog%20da*"






Any clue as to what I am missing?

TIA,

Gal Nitzan