Re: filtering facets

2009-08-31 Thread Mike Topper
Hi Olivier,

Are the facet counts on the URLs you don't want 0?

If so, you can use facet.mincount to return only facets with a count
greater than 0.

-Mike

Olivier H. Beauchesne wrote:
> Hi,
>
> Long time lurker, first time poster.
>
> I have a multi-valued field, let's call it article_outlinks containing
> all outgoing urls from a document. I want to get all matching urls
> sorted by counts.
>
> For example, I want to get all outgoing Wikipedia URLs in my documents
> sorted by counts.
>
> So I execute a query like this:
> q=article_outlinks:http*wikipedia.org*  and I facet on article_outlinks
>
> But I get facets containing the other urls in the documents. I can get
> something close by using facet.prefix=http://en.wikipedia.org but I
> want to include other subdomains on wikipedia (ex: fr.wikipedia.org).
>
> Is there a way to do a search and get only the facets matching my query?
>
> I know facet.prefix isn't a query, but is there a way to get that
> behavior?
>
> Is it easy to extend solr to do something like that?
>
> Thank you,
>
> Olivier
>
> Sorry for my english.
>
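
A possible client-side workaround, since facet.prefix cannot express a
pattern: request the facet on article_outlinks as usual and keep only the
values whose host is wikipedia.org or one of its subdomains. This is a
minimal sketch, not a Solr feature. It assumes the facet values arrive as
the flat [value, count, value, count, ...] list that the JSON response
writer produces by default; the function name and the sample URLs are made
up for illustration.

```python
from urllib.parse import urlparse


def filter_facets_by_domain(flat_facets, domain):
    """Keep (url, count) pairs whose host is `domain` or a subdomain of it."""
    # The flat list alternates value, count, value, count, ...
    pairs = zip(flat_facets[0::2], flat_facets[1::2])
    kept = []
    for url, count in pairs:
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            kept.append((url, count))
    # Highest count first, as the original question asks.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up sample shaped like facet_counts.facet_fields.article_outlinks
sample = [
    "http://en.wikipedia.org/wiki/Solr", 3,
    "http://example.com/page", 9,
    "http://fr.wikipedia.org/wiki/Lucene", 5,
]
result = filter_facets_by_domain(sample, "wikipedia.org")
```

Because the filtering happens after the response arrives, the request
would need a facet.limit large enough to include every wikipedia.org value.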






indexing entire text but only storing first N characters?

2009-02-19 Thread Mike Topper
Hello,

In one of the fields in my schema I am sending somewhat large texts.  I
want to be able to index all of it since I want to search on the entire
text, but I only need the first N characters to be returned to me.  Is
there a way to do this with one field or would I just create two fields,
one that is indexed and not stored and one that is stored and not
indexed and only send the first N characters to the stored field?

-Mike


Re: indexing entire text but only storing first N characters?

2009-02-19 Thread Mike Topper
Cool, we are actually still on 1.2 but were planning on upgrading to 1.3.

Is this a feature of 1.3 or only of the nightly builds?

-Mike

Koji Sekiguchi wrote:
> Mike Topper wrote:
>> Hello,
>>
>> In one of the fields in my schema I am sending somewhat large texts.  I
>> want to be able to index all of it since I want to search on the entire
>> text, but I only need the first N characters to be returned to me.  Is
>> there a way to do this with one field or would I just create two fields,
>> one that is indexed and not stored and one that is stored and not
>> indexed and only send the first N characters to the stored field?
>>
>> -Mike
>>
>>   
>
> If you are using a nightly version, you can use the maxChars attribute
> of the copyField feature
> to implement your idea:
>
> <copyField source="..." dest="..." maxChars="2000" />
>
> Koji
>
>
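
If upgrading is not an option, the two-field approach from the original
question can be done on the client before posting. A minimal sketch: the
field names body_indexed and body_preview are hypothetical, and the schema
is assumed to declare the first as indexed but not stored and the second
as stored but not indexed.

```python
def build_doc(doc_id, text, max_chars=2000):
    """Build a document with the full text for search and a short preview."""
    return {
        "id": doc_id,
        # Hypothetical field: indexed, not stored (full text, searchable).
        "body_indexed": text,
        # Hypothetical field: stored, not indexed (first N characters only).
        "body_preview": text[:max_chars],
    }


doc = build_doc("42", "x" * 5000)
```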



Re: Querying Greater Than and Less Than

2008-08-26 Thread mike topper

You can also use queries like field:[* TO Z] or field:[Z TO *]

-Mike

Jake Conk wrote:

Hello,

I was trying to figure out how to query ranges greater than and less
than. The closest solution I could find was using the range format:

field:[x TO z]

While this solution works for querying greater-than items, how would I
query all items less than 10, assuming I have some items with a negative
number that should be selected as well? The closest thing I've come to
was this:

field:[0 TO 10]

Given that I don't know what the smallest negative number is, but I want
to be able to get all items, is there a way to do this?

Thanks,

- Jake
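
The open-ended ranges can be built with a small helper. This is only a
sketch: the helper name is made up, and the brackets make both ends
inclusive, so "less than 10" here really means "10 or less". It also
assumes the field uses a sortable numeric type, since negative numbers
will otherwise not order numerically; that is worth checking in the schema.

```python
def range_query(field, low=None, high=None):
    """Build an inclusive range query; a missing bound becomes '*'."""
    lower = "*" if low is None else str(low)
    upper = "*" if high is None else str(high)
    return f"{field}:[{lower} TO {upper}]"


at_most_10 = range_query("field", high=10)
at_least_minus_5 = range_query("field", low=-5)
```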
  




Re: NOT NULL Query

2008-10-15 Thread mike topper

I think you can do field:["" TO *] to grab everything that is not null.

-Mike

John E. McBride wrote:

Hello All,

I need to run a query which asks:
field = NOT NULL

should this perhaps be done with a filter?

I can't find out how to do NOT NULL from the documentation, would 
appreciate any advice.


Thanks,
John
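
A commonly used form for "field has any value" is the fully open range
field:[* TO *]. Since the question mentions a filter, the sketch below
passes it in the fq parameter next to a match-all q. The helper name is
made up, and "field" stands in for the real field name.

```python
from urllib.parse import urlencode, parse_qs


def has_value(field):
    """Query clause matching documents with any value in `field`."""
    return f"{field}:[* TO *]"


params = {"q": "*:*", "fq": has_value("field")}
query_string = urlencode(params)

# Decode again to confirm that URL encoding preserved the clause.
decoded = parse_qs(query_string)
```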


newbie question on determining fieldtype

2007-01-08 Thread mike topper

Hi,

I have a question that I couldn't find the exact answer to. 

I have some fields that I want to add to my schema but that will never be
searched on.  They are only used as additional information about a
document when retrieved.  They are integers, so should I just have the
field be:




I'm pretty sure this is right, but I just wanted to check that I'm not
missing any speedups from using a different field type or adding some
other parameters.

-Mike



problem with solr.HTMLStripWhitespaceTokenizerFactory

2007-03-06 Thread mike topper
I'm trying to use the HTML-stripping factory in order to strip HTML tags
from my description field when indexing.


I added this fieldtype:

   
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
   


And then in my schema I have this:





When inserting, it seems like nothing happens, i.e. when I do a query,
here is the response for a test description:




himynameistopperand this  blahblah is a 
test






Any ideas?

-Mike
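
One likely explanation, offered as a guess because the fieldtype and the
response did not survive intact in this archive: analysis only changes the
terms that get indexed, while the stored value is returned exactly as it
was sent. If the tags must be gone from the response too, one option is to
strip them on the client before posting. A minimal sketch using only the
Python standard library; the class and function names are made up.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags and ignore the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    # Join with spaces so adjacent blocks do not run together, then
    # collapse repeated whitespace. A tag inside a word would split it.
    return " ".join(" ".join(parser.parts).split())


clean = strip_html("<p>hi <b>my name</b> is topper</p>")
```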



limiting the rows returned for a query

2007-04-18 Thread mike topper

Hello,

I have a question that I couldn't really find the answer to and don't
really know if it's currently possible within Solr.


I want to run a simple query against the Solr index, something like
q=stateid:1 countryid:1


But I'm really only concerned with getting the record above and below a
certain (dynamic) recordid in the search results.


Is there a way to do this through a query, or is my only option to return
all the search results, parse them to find the record id I want, and
then get the one above and below that?  I'd also have to take pagination
and whatnot into account, which makes it a little bit harder to do this
way.


Anyway, I hope that makes sense.

Let me know!


-Mike
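
For the client-side fallback described above, the lookup itself is small
once the ids are in result order. A minimal sketch, assuming the ids were
fetched in one ordered list (for example by requesting only the id field
with fl); the function name is made up, and a record at either end of the
list simply has no neighbor on that side.

```python
def neighbors(ordered_ids, target_id):
    """Return the ids just before and just after target_id, or None."""
    try:
        position = ordered_ids.index(target_id)
    except ValueError:
        # The target is not in this result list at all.
        return None, None
    before = ordered_ids[position - 1] if position > 0 else None
    after = (
        ordered_ids[position + 1]
        if position + 1 < len(ordered_ids)
        else None
    )
    return before, after


ids = [17, 4, 99, 23]
around_99 = neighbors(ids, 99)
```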


adjusting score slightly by date field

2007-05-09 Thread mike topper

Hello,

In our application there are a lot of old records that we still want in 
our index but would like for them to be scored lower than some newer 
records.


Is it possible for a date field to weigh in on the score slightly in 
some way?  Or if not is there another way to push up newer records in 
the order of results while still maintaining the scoring?


-Mike


how to use function queries

2007-05-21 Thread mike topper
I'm trying to retrieve results from Solr such that newer documents'
scores are boosted.  The Solr wiki states that I should use a
function query to influence the score, but I'm a little confused about
how to use a function query.


Searching through the archives I found a suggestion of using the _val_:
hack in the standard query handler, but when I tried that with


recip(rord(date),1,1000,1000)^2

just to test it, I got an error saying

org.apache.solr.core.SolrException: undefined field recip

Can someone explain function queries a little more clearly, and whether
I would need to use a different query handler?

-Mike
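
The "undefined field recip" error is what one would expect if the function
was not wrapped in double quotes after _val_:, because the parser then
reads recip as a field name. The exact query string is not shown above, so
treat that as a guess. A minimal sketch of building the quoted form for
the standard query handler; the helper name and the stateid:1 user query
are only for illustration, and date is the field from the message.

```python
from urllib.parse import urlencode, parse_qs


def boosted_query(user_query, date_field="date", boost=2):
    """Append a quoted _val_ function query that favors newer documents."""
    function = f"recip(rord({date_field}),1,1000,1000)"
    # The double quotes around the function are the important part.
    return f'{user_query} _val_:"{function}"^{boost}'


q = boosted_query("stateid:1")
query_string = urlencode({"q": q})

# Decode again to confirm that URL encoding preserved the quotes.
round_trip = parse_qs(query_string)["q"][0]
```

With the dismax handler, the same function can instead be passed in the
bf parameter without the _val_ wrapper.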




almost realtime updates with replication

2007-08-22 Thread mike topper

Hello,

Currently in our application we are using the master/slave setup and 
have a batch update/commit about every 5 minutes.


There are a couple of queries that we would like to run in almost real
time, so I would like our client to send an update on every new
document and have Solr configured to autocommit every 5-10
seconds.


Reading the wiki, it seems like this isn't possible because of the
strain of snapshotting and pulling to the slaves at such a high rate.
What I was thinking was for these few queries to query the master only,
while the rest query the slave with the not-quite-realtime data. I'm
assuming this wouldn't work either, though: since a snapshot is created
on every commit, wouldn't we still impact performance too much?


Does anyone have any suggestions?  If I set autowarmCount=0, would I be
able to pull to the slave faster than every couple of minutes (say,
every 10 seconds)?


What if I take out the postcommit hook on the master and just have the
snapshooter run on a cron every 5 minutes?


-Mike