Re: filtering facets
Hi Olivier,

Are the facet counts on the URLs you don't want 0? If so, you can use facet.mincount (for example facet.mincount=1) to return only facet values with a count greater than 0.

-Mike

Olivier H. Beauchesne wrote:
> Hi,
>
> Long time lurker, first time poster.
>
> I have a multi-valued field, let's call it article_outlinks, containing
> all outgoing URLs from a document. I want to get all matching URLs
> sorted by count.
>
> For example, I want to get all outgoing Wikipedia URLs in my documents
> sorted by count.
>
> So I execute a query like this:
> q=article_outlinks:http*wikipedia.org* and I facet on article_outlinks
>
> But I get facets containing the other URLs in the documents. I can get
> something close by using facet.prefix=http://en.wikipedia.org but I
> want to include other subdomains on wikipedia (ex: fr.wikipedia.org).
>
> Is there a way to do a search and get facets only matching my query?
>
> I know facet.prefix isn't a query, but is there a way to get that
> behavior?
>
> Is it easy to extend Solr to do something like that?
>
> Thank you,
>
> Olivier
>
> Sorry for my English.
>

--
my public key can be found by: gpg --keyserver pgp.mit.edu --recv-keys 26A5C87F
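A rough sketch of the request discussed above, in Python. It only builds the query parameters (q, facet.field, facet.mincount, facet.prefix are the parameters named in the thread; rows and facet are standard Solr parameters). If the unwanted URLs have nonzero counts, facet.mincount will not remove them, so the second helper shows one possible workaround: filtering the returned facet values on the client. The helper names and the (value, count) pair format are assumptions made for illustration, not anything Solr prescribes.

```python
from urllib.parse import urlencode


def facet_params(query, facet_field, mincount=1, prefix=None):
    """Build Solr query parameters for a faceted search (hypothetical helper)."""
    params = {
        "q": query,
        "rows": 0,                    # only the facet counts are of interest
        "facet": "true",
        "facet.field": facet_field,
        "facet.mincount": mincount,   # drop facet values counted fewer than this
    }
    if prefix is not None:
        params["facet.prefix"] = prefix
    return params


def keep_matching_facets(facet_counts, needle):
    """Client-side filter: keep (value, count) pairs whose value contains
    `needle`, sorted by count in descending order."""
    kept = [(value, count) for value, count in facet_counts if needle in value]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


query_string = urlencode(
    facet_params("article_outlinks:http*wikipedia.org*", "article_outlinks")
)
```

The client-side filter covers all wikipedia.org subdomains at the cost of transferring facet values that are then thrown away.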
indexing entire text but only storing first N characters?
Hello,

In one of the fields in my schema I am sending somewhat large texts. I want to be able to index all of it since I want to search on the entire text, but I only need the first N characters to be returned to me. Is there a way to do this with one field, or would I just create two fields, one that is indexed and not stored and one that is stored and not indexed, and only send the first N characters to the stored field?

-Mike
Re: indexing entire text but only storing first N characters?
Cool. We are actually still on 1.2 but were planning on upgrading to 1.3. Is this a feature of 1.3 or just of the nightly builds?

-Mike

Koji Sekiguchi wrote:
> Mike Topper wrote:
>> Hello,
>>
>> In one of the fields in my schema I am sending somewhat large texts. I
>> want to be able to index all of it since I want to search on the entire
>> text, but I only need the first N characters to be returned to me. Is
>> there a way to do this with one field or would I just create two fields,
>> one that is indexed and not stored and one that is stored and not
>> indexed and only send the first N characters to the stored field?
>>
>> -Mike
>>
>
> If you are using a nightly version, you can use the maxChars attribute of
> the copyField feature to implement your idea:
>
> maxChars="2000" />
>
> Koji
>
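For a version without maxChars on copyField, the two-field idea from the original question can be done on the client before the document is sent. A minimal sketch in Python, assuming two hypothetical schema fields: body (indexed, not stored) and body_preview (stored, not indexed). The field names and the helper are illustrative only.

```python
def build_document(doc_id, text, n_chars,
                   indexed_field="body", stored_field="body_preview"):
    """Return a document dict carrying the full text for indexing and a
    truncated copy for retrieval. Field names are hypothetical."""
    return {
        "id": doc_id,
        indexed_field: text,            # full text: searchable, never returned
        stored_field: text[:n_chars],   # first N characters: returned, not searched
    }


doc = build_document("42", "a somewhat large text " * 100, n_chars=50)
```

Truncating on the client keeps the schema simple, but every indexing client then has to apply the same N.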
Re: Querying Greater Than and Less Than
You can also use queries like field:[* TO Z] or field:[Z TO *].

-Mike

Jake Conk wrote:
> Hello,
>
> I was trying to figure out how to query ranges greater than and less
> than. The closest solution I could find was using the range format:
>
> field:[x TO z]
>
> While this solution works for querying "greater than" items, how would I
> query all items less than 10, assuming I have some items with a negative
> number that should be selected as well? The closest thing I've come to
> was this:
>
> field:[0 TO 10]
>
> Given that I don't know what the smallest negative number is, but I want
> to be able to get all items, is there a way to do this?
>
> Thanks,
> - Jake
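The open-ended range syntax above can be wrapped in a small helper. This is only a sketch in Python: range_query is a made-up name, the square brackets make both bounds inclusive, and numeric ordering is assumed to be handled by the field type in the schema.

```python
def range_query(field, lower=None, upper=None):
    """Build an inclusive Solr range query; None means an open bound (*)."""
    lo = "*" if lower is None else str(lower)
    hi = "*" if upper is None else str(upper)
    return f"{field}:[{lo} TO {hi}]"


less_or_equal_10 = range_query("field", upper=10)     # includes negative values
greater_or_equal_0 = range_query("field", lower=0)
```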
Re: NOT NULL Query
I think you can do field:["" TO *] to grab everything that is not null.

-Mike

John E. McBride wrote:
> Hello All,
>
> I need to run a query which asks:
>
> field = NOT NULL
>
> Should this perhaps be done with a filter? I can't find out how to do
> NOT NULL from the documentation, and would appreciate any advice.
>
> Thanks,
> John
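On the filter part of the question: a sketch in Python of passing the not-null condition as a filter query (fq) next to the main query. Note that it uses the fully open range field:[* TO *], a commonly used variant, rather than the "" lower bound suggested above. The helper name is hypothetical.

```python
def with_not_null_filter(query, field):
    """Return request parameters that restrict `query` to documents having
    any value in `field`, using an fq filter query."""
    return {
        "q": query,
        "fq": f"{field}:[* TO *]",   # matches documents where the field has a value
    }


params = with_not_null_filter("title:solr", "field")
```

Putting the condition in fq keeps it out of relevance scoring.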
newbie question on determining fieldtype
Hi,

I have a question that I couldn't find the exact answer to. I have some fields that I want to add to my schema but that will never be searched on. They are only used as additional information about a document when retrieved. They are integers, so should I just have the field be:

I'm pretty sure this is right, but I just wanted to check that I'm not missing any speedups from using a different field or adding some other parameters.

-Mike
problem with solr.HTMLStripWhitespaceTokenizerFactory
I'm trying to use the HTML-stripping tokenizer factory in order to strip HTML tags from my description field when indexing. I added this fieldtype:

and then in my schema I have this:

When inserting, it seems like nothing happens, i.e. when I do a query, here is the response for a test description:

himynameistopperand this blahblah is a test

Any ideas?

-Mike
limiting the rows returned for a query
Hello,

I have a question that I couldn't really find the answer to, and I don't really know if it's currently possible within Solr. I want to do a simple query against the Solr index, something like

q=stateid:1 countryid:1

but I'm really only concerned with getting the record above and below a certain (dynamic) recordid in the search results. Is there a way to do this through a query, or is my only option to return all the search results, parse them to find the record id I want, and then get the one above and below it? I'd also have to take pagination into account, which makes it a little harder to do this way.

Anyway, I hope that makes sense. Let me know!

-Mike
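The client-side fallback described in the question (fetch the results, then pick the records around a given id) could look roughly like this in Python. It assumes the ordered list of record ids has already been retrieved, for instance by requesting only the id field with fl; the function name is made up for illustration.

```python
def neighbors(ordered_ids, record_id):
    """Return (previous_id, next_id) around `record_id` in the result order.
    A missing neighbor, or an id not in the results, is reported as None."""
    try:
        i = ordered_ids.index(record_id)
    except ValueError:
        return None, None
    prev_id = ordered_ids[i - 1] if i > 0 else None
    next_id = ordered_ids[i + 1] if i + 1 < len(ordered_ids) else None
    return prev_id, next_id


above, below = neighbors([17, 4, 23, 8], 23)
```

Working on the full id list sidesteps the pagination problem, since page boundaries no longer matter once all ids are in one sequence.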
adjusting score slightly by date field
Hello,

In our application there are a lot of old records that we still want in our index but would like to score lower than newer records. Is it possible for a date field to weigh in on the score slightly in some way? Or, if not, is there another way to push newer records up in the order of results while still maintaining the scoring?

-Mike
how to use function queries
I'm trying to retrieve results from Solr such that newer documents' scores are boosted. The Solr wiki states that I should use a function query to influence the score, but I'm a little confused about how to use a function query. Searching through the archives I found a suggestion of using the _val_: hack in the standard query handler, but when I tried that with

recip(rord(date),1,1000,1000)^2

just to test it, I got an error saying

org.apache.solr.core.SolrException: undefined field recip

Can someone explain function queries a little more clearly, and whether I would need to use a different query handler?

-Mike
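To make the function above concrete, here is a small Python sketch. recip(x,m,a,b) is documented as a/(m*x+b), and rord(date) is the reverse ordinal of the date, so the newest document has the smallest x and the largest value. The second helper builds a query in which the function is wrapped in double quotes after _val_:, which is the form shown on the Solr wiki; an unquoted function is one possible cause of the "undefined field recip" error, though that is a guess. Both helper names are hypothetical.

```python
def recip(x, m, a, b):
    """Value of Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def with_recency_function(user_query, field="date", m=1, a=1000, b=1000):
    """Append a quoted _val_ function query to a standard-handler query."""
    func = f"recip(rord({field}),{m},{a},{b})"
    return f'{user_query} _val_:"{func}"'


newest = recip(1, 1, 1000, 1000)        # reverse ordinal 1: close to 1.0
older = recip(1000, 1, 1000, 1000)      # reverse ordinal 1000: exactly 0.5
q = with_recency_function("ipod")
```

With these constants the value decays slowly, so it nudges the ranking instead of overriding the text relevance score.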
almost realtime updates with replication
Hello,

Currently in our application we are using the master/slave setup and have a batch update/commit about every 5 minutes. There are a couple of queries that we would like to run almost in real time, so I would like our client to send an update on every new document and have Solr configured to do an autocommit every 5-10 seconds. Reading the wiki, it seems like this isn't possible because of the strain of snapshotting and pulling to the slaves at such a high rate.

What I was thinking was for these few queries to just query the master, while the rest query the slave with the not-quite-realtime data. I'm assuming this wouldn't work either, though: since a snapshot is created on every commit, we would still impact performance too much?

Does anyone have any suggestions? If I set autowarmCount=0, would I be able to pull to the slave faster than every couple of minutes (say, every 10 seconds)? What if I take out the postCommit hook on the master and just have the snapshooter run from cron every 5 minutes?

-Mike