Fuzzy searching documents over multiple fields using Solr

2013-05-09 Thread britske
Not sure if this has ever come up (or has perhaps even been implemented without me knowing), but I'm interested in doing fuzzy search over multiple fields using Solr. What I mean is the ability to return documents based on some 'distance calculation' without documents having to match the query 100%.

Re: modeling prices based on daterange using multipoints

2012-12-11 Thread britske
g) [via Lucene] < ml-node+s472066n4026151...@n3.nabble.com> > Hi Britske, > This is a very interesting question! > > britske wrote > ... > I realize the new spatial-stuff in Solr 4 is no magic bullet, but I'm > wondering if I could model multiple prices per day as mul

modeling prices based on daterange using multipoints

2012-12-11 Thread britske
Hi all, based on some good discussion in "Modeling openinghours using multipoints", I was prompted to review an old pain point of mine: modeling pricing & availability of hotels which de

Re: Modeling openinghours using multipoints

2012-12-08 Thread britske
, at 05:35, "David Smiley (@MITRE.org) [via Lucene]" wrote: > britske wrote > That's seriously awesome! > > Some change in the query though: > You described: "To query for a business that is open during at least some > part of a given time duration&

Modeling openinghours using multipoints

2012-12-08 Thread britske
Hi all, Over a year ago I posted a use case to issue SOLR-2155 (familiar in this context) about modelling opening hours using multivalued points. https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13114839&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#commen
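
A minimal sketch of the overlap test behind the multivalued-points idea, in plain Python rather than Solr. Each opening interval is treated as a point (open, close), and a place counts as open during a query window when some point has open <= window end and close >= window start. The minutes-since-start-of-week encoding, the inclusive bounds, and the name is_open_during are assumptions for illustration, not taken from SOLR-2155.

```python
from typing import Iterable, Tuple

# One opening interval as an (open, close) pair.
# Assumption: both values are minutes since the start of the week.
Interval = Tuple[int, int]


def is_open_during(intervals: Iterable[Interval], query_start: int, query_end: int) -> bool:
    """Return True if any interval overlaps the query window (bounds inclusive)."""
    return any(
        open_at <= query_end and close_at >= query_start
        for open_at, close_at in intervals
    )


# Hypothetical data: open 09:00-12:00 and 13:00-17:00 on the first day.
hours = [(540, 720), (780, 1020)]
print(is_open_during(hours, 700, 800))
print(is_open_during(hours, 721, 779))
```

Seen as points in a plane, the same test is a rectangle query: x from the minimum up to the window end, y from the window start up to the maximum.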

multiple dateranges/timeslots per doc: modeling openinghours.

2011-09-26 Thread britske
Sorry for the somewhat lengthy post. I would like to make clear that I covered my bases here and am looking for an alternative solution, because the more trivial solutions don't seem to work for my use case. Consider bars, museums, etc. These places have multiple opening hours that can depend on: RE

Universal DataImport(AndExport)Handler

2010-06-08 Thread britske
Recently I looked a bit at DataImportHandler and I'm really impressed with the flexibility of transform / import options. Especially with integrations with Solr Cell / Tika this has become a great Data importer. Besides some use-cases that import to Solr (which I plan to migrate to DIH asap), DI

manually creating indices to speed up indexing with app-knowledge

2009-11-02 Thread Britske
This may seem like a strange question, but here it goes anyway. I'm considering the possibility of constructing indices at a low level for about 20.000 indexed fields (type sInt), if at all possible. (By indices in this context I mean the inverted indices from term to document ID, just to be 100% comp

Re: If field A is empty take field B. Functionality available?

2009-08-28 Thread Britske
g, you are really going to want them in the same field. > > > On Aug 28, 2009, at 1:16 PM, Britske wrote: > >> >> I have 2 fields: >> realprice >> avgprice >> >> I'd like to be able to take the contents of avgprice if realprice is >>

If field A is empty take field B. Functionality available?

2009-08-28 Thread Britske
I have 2 fields: realprice avgprice I'd like to be able to take the contents of avgprice if realprice is not available. By design, the average price cannot be encoded in the 'realprice' field. Since I need to be able to filter, sort and facet on these fields, it would be really nice to be
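
One way to picture the fallback is to resolve it on the client before indexing, so that a single field can be filtered, sorted and faceted on. The sketch below is plain Python under that assumption. The combined field name effectiveprice and both helper names are hypothetical, and this is not a built-in Solr feature.

```python
from typing import Optional


def effective_price(doc: dict) -> Optional[float]:
    """Return realprice when present, otherwise fall back to avgprice."""
    real = doc.get("realprice")
    if real is not None:
        return real
    return doc.get("avgprice")


def with_effective_price(doc: dict) -> dict:
    """Return a copy of the document with the combined field added.

    'effectiveprice' is a hypothetical field name used for illustration.
    """
    enriched = dict(doc)
    price = effective_price(doc)
    if price is not None:
        enriched["effectiveprice"] = price
    return enriched


print(with_effective_price({"id": "1", "realprice": 80.0, "avgprice": 95.0}))
print(with_effective_price({"id": "2", "avgprice": 95.0}))
```

The original two fields stay untouched, so the distinction between a real and an average price is not lost.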

Re: solr 1.4: extending StatsComponent to recognize localparm {!ex}

2009-08-26 Thread Britske
Thanks for that. It works now ;-) Erik Hatcher-4 wrote: > > > On Aug 25, 2009, at 6:35 PM, Britske wrote: >> Moreover, I can't seem to find the actual code in FacetComponent or >> anywhere >> else for that matter where the {!ex}-param case i

solr 1.4: extending StatsComponent to recognize localparm {!ex}

2009-08-25 Thread Britske
Hi, I'm looking for a way to extend StatsComponent to recognize localparams, especially the {!ex} param. To my knowledge this isn't implemented in the current trunk. One of my use cases for this is to be able to have a JavaScript price slider, where the user can operate the slider and thus set a

Re: highlighting on edgeGramTokenized field --> hightlighting incorrect bc. position not incremented..

2009-06-12 Thread Britske
Thanks, I'll check it out. Otis Gospodnetic wrote: > > > Britske, > > I'd have to dig, but there are a couple of JIRA issues in Lucene's JIRA > (the actual ngram code is part of Lucene) that have to do with ngram > positions. I have a feeling

highlighting on edgeGramTokenized field --> hightlighting incorrect bc. position not incremented..

2009-06-12 Thread Britske
Orlando Verenigde Staten the field def: I checked that removing the EdgeNGramFilterFactory results in correct positioning of highlighting. (But then I can't search for ngrams...) What am I missing? Thanks in advance, Britske -- View this message in co

Re: how to get to highlitghting results using solrJ

2009-06-11 Thread Britske
ing to supply additional code in a client. You would only need to refer to the annotated field name... Britske wrote: > > first time I'm using highlighting and results work ok. > Im using it for an auto-suggest function. For reference I used the > following query: > &g

how to get to highlitghting results using solrJ

2009-06-11 Thread Britske
empty.(?) But while debugging I see a field: QueryResponse._highlightingInfo with contents: {1-4167147={prefix1=[Orlando Verenigde Staten]},} which is exactly what I need. However, there is no (public) method QueryResponse.getHighlightingInfo()! What am I missing? Thanks, Britske

correct? impossible to filter / facet on ExternalFileField

2009-06-11 Thread Britske
f not possible, is it on the roadmap? Thanks, Britske -- View this message in context: http://www.nabble.com/correct--impossible-to-filter---facet-on-ExternalFileField-tp23985106p23985106.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
y ram usage is maxed all the time, before this will make any difference I guess. Thanks, and please keep the suggestions coming. Britske. Otis Gospodnetic wrote: > > > Britske, > > Here are a few quick ones: > > - Does that machine really have 10 CPU cores? If it has si

speeding up indexing with a LOT of indexed fields

2009-03-25 Thread Britske
- lastly: should I be able to get more out of this box or am I just complaining ;-) Thanks for making it to here, and hoping to receive some valuable info, Cheers, Britske

solr 1.4: multi-select for statscomponent

2009-02-25 Thread Britske
s to set the max-range of the slider. Is there any (undocumented) feature that makes this possible? If not, would it be easy to add? Thanks, Britske

Re: solr on raid 0 --> no performance gain while indexing?

2008-10-15 Thread Britske
As a 'workaround': would treating the available disks as N silos instead of striping them, and merging the indices afterwards, be an option? Britske wrote: > > Hi, > > I understand that this may not be a 100% related question to the forum > (perhaps it

solr on raid 0 --> no performance gain while indexing?

2008-10-15 Thread Britske
ices after each insert creates such a load between physical disks that the normal write scenario (of software raid 0) of writing sequential chunks in round-robin fashion to all the disks in the array no longer holds? Does this seem logical or does someone know another reason? Thanks, Britske --

Re: DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?

2008-09-29 Thread Britske
roductid. Currently I have a more or less working home-grown solution, but I would like to be able to set it up with DataImportHandler. thanks for your help, Britske Noble Paul നോബിള്‍ नोब्ळ् wrote: > > What is the basis on which you merge rows ? Then I may be able to > suggest an easy

DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?

2008-09-27 Thread Britske
rmers that take multiple db-rows and merge it to a single solr-row/document. If so, how? Thanks, Britske

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-31 Thread Britske
design in which these criteria pinpoint a specific field / column to use and the difference should be clear. regards, Britske Funtick wrote: > > > Yes, it should be extremely simple! I simply can't understand how you > describe it: > > Britske wrote: >> >> Row

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Funtick wrote: > > > Britske wrote: >> >> - Rows in solr represent productcategories. I will have up to 100k of >> them. >> - Each product category can have 10k products each. These are encoded as >> the 10k columns / fields (all 10k fields

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Hi Fuad, Funtick wrote: > > > Britske wrote: >> >> When performing these queries I notice a big difference between qTime >> (which is mostly in the 15-30 ms range due to caching) and total time >> taken to return the response (measured through SolrJ'

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-30 Thread Britske
Currently, I can't say what the data actually represents but the analogy of t Mike Klaas wrote: > > On 28-Jul-08, at 11:16 PM, Britske wrote: > >> >> That sounds interesting. Let me explain my situation, which may be a >> variant >> of what you are pro

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
part of Solr that I can use for this, or would it be all home-grown? Thanks, Britske Mike Klaas wrote: > > Another possibility is to partition the stored fields into a > frequently-accessed set and a full set. If the frequently-accessed > set is significantly smaller (in terms

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
I'm using the solr-nightly of 2008-04-05 Grant Ingersoll-6 wrote: > > What version of Solr/Lucene are you using? > > On Jul 28, 2008, at 4:53 PM, Britske wrote: > >> >> I'm on a development box currently and production servers will be >> bigger

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
Thanks for clearing that up for me. I'm going to investigate some more... Yonik Seeley wrote: > > On Mon, Jul 28, 2008 at 4:53 PM, Britske <[EMAIL PROTECTED]> wrote: >> Each query requests at most 20 stored fields. Why doesn't help >> lazyfieldloading in

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
between total elapsed time and qTime of about 15-30 ms. Doesn't this seem strange, since to me it would seem logical that the discrepancy would be at least 1/10th of fetching 100 documents. Hmm, hope you can shed some light on this. Thanks a lot, Britske Yonik Seeley wrote: > >

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
's the size > on disk of your index compared to your physical RAM? > > -Yonik > > On Mon, Jul 28, 2008 at 4:10 PM, Britske <[EMAIL PROTECTED]> wrote: >> >> Hi all, >> >> For some queries I need to return a lot of rows at once (say 100). >> When

big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Britske
hanks, Britske

reusing docset to limit new query

2008-04-16 Thread Britske
I've found a method SolrIndexSearcher.cacheDocSet(..) but am not entirely sure what it does (side effects?). Can someone please elaborate on this? Britske

indexing slow, IO-bound?

2008-04-05 Thread Britske
Hi, I have a schema with a lot of (about 1) non-stored indexed fields, which I use for sorting. (no really, that is needed). Moreover I have about 30 stored fields. Indexing of these documents takes a long time. Because of the size of the documents (because of the indexed fields) I am curr

batch indexing takes more time than shown on SOLR output --> something to do with IO?

2008-01-14 Thread Britske
I have a batch program which inserts items in a solr/lucene index. all is going fine and I get update messages in the console like: 14-jan-2008 16:40:52 org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {add=[10485, 10488, 10489, 10490, 10491, 10495, 10497, 10498, ...(42 more) ]}

Re: big perf-difference between solr-server vs. SOlrJ req.process(solrserver)

2007-12-31 Thread Britske
neral be considered small in relation to the abovementioned costs? Thanks, Geert-Jan Yonik Seeley wrote: > > On Dec 27, 2007 11:01 AM, Britske <[EMAIL PROTECTED]> wrote: >> after inspecting solrconfig.xml I see that I already have enabled lazy >> field >> lo

Re: big perf-difference between solr-server vs. SOlrJ req.process(solrserver)

2007-12-27 Thread Britske
ormance if you need to search the Wiki... > > Best > Erick > > On Dec 27, 2007 10:28 AM, Britske <[EMAIL PROTECTED]> wrote: > >> >> >> >> Yonik Seeley wrote: >> > >> > On Dec 27, 2007 9:45 AM, Britske <[EMAIL PROTECTED]>

Re: big perf-difference between solr-server vs. SOlrJ req.process(solrserver)

2007-12-27 Thread Britske
Yonik Seeley wrote: > > On Dec 27, 2007 9:45 AM, Britske <[EMAIL PROTECTED]> wrote: >> I am using SolrJ to communicate with SOLR. My Solr-queries perform within >> range (between 50 ms and 300 ms) by looking at the solr log as outputted >> on >> my (w

big perf-difference between solr-server vs. SOlrJ req.process(solrserver)

2007-12-27 Thread Britske
Hi, I am using SolrJ to communicate with SOLR. My Solr queries perform within range (between 50 ms and 300 ms), judging by the Solr log as outputted on my (Windows) command line. However, I discovered that the following command at all times takes significantly longer than the number outputted i

how to intersect a doclist with a docset and get a doclist back?

2007-12-14 Thread Britske
Is there a way to get a doclist based on intersecting an existing doclist with a docset? However doing doclist.intersection(docset) returns docset. Is there something I'm missing here? I figured this must be possible since the order of the returned doclist is the same as the order of the in
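
The operation being asked for can be sketched in plain Python: filter an ordered list of document IDs by membership in an unordered set, so that the result keeps the list's order. This only illustrates the intended semantics. It does not use Solr's DocList or DocSet classes, and the function name is hypothetical.

```python
from typing import Iterable, List, Set


def intersect_keep_order(doc_list: Iterable[int], doc_set: Set[int]) -> List[int]:
    """Keep the IDs from doc_list that also occur in doc_set, in doc_list order."""
    return [doc_id for doc_id in doc_list if doc_id in doc_set]


# The ranked list comes first; the set only decides membership.
print(intersect_keep_order([7, 3, 9, 1], {1, 7, 8}))  # [7, 1]
```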

how do do most efficient: collapsing facets into top-N results

2007-12-13 Thread Britske
I've subclassed StandardRequestHandler to be able to show top-N results for some of the facet-values that I'm interested in. The functionality resembles the solr-236 field collapsing a bit, with the difference that I can arbitrarily specify which facet-query to collapse and to what extent. (possib

Re: possible to set mincount on facetquery?

2007-12-05 Thread Britske
It seemed handy in the mentioned case where it's not certain if there are products in each of the budget categories, so you simply ask for them all and only get back the categories which contain at least 1 product. From a functional perspective, to me that's kind of on par with doing facet.mincount=1

possible to set mincount on facetquery?

2007-12-05 Thread Britske
Is it possible to set a mincount on a facetquery as well as on a facetfield? I have a situation in which I want to group facetqueries (price-ranges) but I obviously don't want to show ranges with 0 results. I tried things like: f.price:[0 TO 50].facet.mincount=1 and f.price:[0 TO 50].query.mincou
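
If no server-side option is available, the same effect can be approximated on the client. The sketch below is plain Python and assumes a parsed JSON response in which facet query counts sit under facet_counts / facet_queries as a mapping from query string to count. The function name and the sample data are made up for illustration.

```python
def facet_queries_with_mincount(response: dict, mincount: int = 1) -> dict:
    """Drop facet queries whose count is below mincount."""
    counts = response.get("facet_counts", {}).get("facet_queries", {})
    return {query: count for query, count in counts.items() if count >= mincount}


# Hypothetical response fragment with three price ranges.
sample = {
    "facet_counts": {
        "facet_queries": {
            "price:[0 TO 50]": 12,
            "price:[50 TO 100]": 0,
            "price:[100 TO *]": 3,
        }
    }
}
print(facet_queries_with_mincount(sample))
```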

Re: how to load custom valuesource as plugin

2007-11-14 Thread Britske
Yonik Seeley wrote: > > Unfortunately, the function query parser isn't currently pluggable. > > -Yonik > > On Nov 14, 2007 2:02 PM, Britske <[EMAIL PROTECTED]> wrote: >> >> I've created a simple valueSource which is supposed to calculate a >>

how to load custom valuesource as plugin

2007-11-14 Thread Britske
I've created a simple valueSource which is supposed to calculate a weighted sum over a list of supplied valuesources. How can I let Solr recognise this valuesource? I tried to simply upload it as a plugin and reference it by its name (wsum) in a functionquery, but got an "Unknown function wsum
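
As an illustration of the computation only, a weighted sum over value sources can be sketched in plain Python, with each value source modelled as a function from document ID to a number. This is not the Lucene/Solr ValueSource API and says nothing about plugin registration. The type alias and the sample data are assumptions.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a value source is modelled as a function from document id to value.
ValueSource = Callable[[int], float]


def weighted_sum(sources: Sequence[Tuple[float, ValueSource]]) -> ValueSource:
    """Build a value source that returns sum(weight * source(doc_id))."""
    def value(doc_id: int) -> float:
        return sum(weight * source(doc_id) for weight, source in sources)
    return value


# Hypothetical per-document values.
prices = {1: 10.0, 2: 20.0}
ratings = {1: 4.0, 2: 3.0}
wsum = weighted_sum([(0.5, prices.__getitem__), (2.0, ratings.__getitem__)])
print(wsum(1))  # 0.5 * 10.0 + 2.0 * 4.0 = 13.0
```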

Re: where to hook in to SOLR to read field-label from functionquery

2007-11-10 Thread Britske
hossman wrote: > > > : Say I have a custom functionquery MinFloatFunction which takes as its > : arguments an array of valuesources. > : > : MinFloatFunction(ValueSource[] sources) > : > : In my case all these valuesources are the values of a collection of > fields. > > a ValueSource isn't

where to hook in to SOLR to read field-label from functionquery

2007-11-05 Thread Britske
My question sounds strange I know, but I'll try to explain: Say I have a custom functionquery MinFloatFunction which takes as its arguments an array of valuesources. MinFloatFunction(ValueSource[] sources) In my case all these valuesources are the values of a collection of fields. What I need

Re: Solr-J: automatic url-escaping gives invalid uri exception. How to workaround?

2007-11-01 Thread Britske
I replaced { and } by (( resp. )). Not ideal (I like braces...) but it suffices for now. Still, if someone knows a general solution to the URL-escaping issue with Solr-J, I'd love to hear it. Cheers, Geert-Jan Britske wrote: > > I have a custom requesthandler which does some

Solr-J: automatic url-escaping gives invalid uri exception. How to workaround?

2007-11-01 Thread Britske
I have a custom requesthandler which does some very basic dynamic parameter substitution. Dynamic params are params which are enclosed in braces ({}). This means I can do something like this: q={order}... where {order} is substituted by the name of an existing order-column. Now this all w
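
The substitution described here can be sketched in a few lines of plain Python. This is not the custom request handler itself. The function name, the sample parameter values, and the choice to leave unknown placeholders untouched are assumptions.

```python
import re

# Matches a placeholder such as {order}; only word characters are allowed inside.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def substitute(template: str, params: dict) -> str:
    """Replace each {name} with params[name]; unknown names are left as-is."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(params.get(name, match.group(0)))
    return _PLACEHOLDER.sub(replace, template)


print(substitute("sort={order} asc", {"order": "price_eur"}))  # sort=price_eur asc
```

Because the pattern only accepts word characters between the braces, a Solr local-params prefix such as {!ex} is not treated as a placeholder.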

SOLR 1.3: defaultOperator always defaults to OR although AND is specifed.

2007-11-01 Thread Britske
experimenting with SOLR 1.3 and discovered that although I specified AND as the defaultOperator in schema.xml, q=a+b behaves as q=a OR b instead of q=a AND b. Obviously this is not correct. I used the nightly of 29 Oct. Cheers, Geert-Jan

solr-139: support for adding fields which are not known at design-time?

2007-10-26 Thread Britske
Is it / will it be possible to add previously non-existing fields to a document with the upcoming solr-139? For instance, would something like this work? 318127 12 with schema.xml: ... ... ... Btw: how is solr-139 coming along? Judging by the latest posts on jira, there was still a lot

copyField with functionquery as source

2007-10-26 Thread Britske
Is it possible to have a copyField with a functionquery as its source? for instance : If not, I think this would make a nice addition. Thanks, Geert-Jan

Re: quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-12 Thread Britske
As a related question: is there a way to inspect the queries currently in the filtercache? Britske wrote: > > Yeah, I meant filter-cache, thanks. > It seemed that the particular field (cityname) was using a > keywordtokenizer (which doesn't show at the front) which is why i mi

Re: quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-11 Thread Britske
Yeah, I meant filter-cache, thanks. It seemed that the particular field (cityname) was using a keywordtokenizer (which doesn't show at the front), which is why I missed it, I guess :-S. This means the term field is tokenized, so the termEnums approach is used. This results in about 10.000 inserts on face

quickie: do facetfields use same cached items in field cache as FQ-param?

2007-10-11 Thread Britske
Say I have the following (partial) querystring: ...&facet=true&facet.field=country. Field 'country' is not tokenized, not multi-valued, and not boolean, so the field-cache approach is used. Moreover, the following (partial) querystring is used as well: ..fq=country:france. Do these queries share ca

Re: showing results per facet-value efficiently

2007-10-11 Thread Britske
yup that clarifies things a lot, thanks. Mike Klaas wrote: > > On 10-Oct-07, at 4:16 AM, Britske wrote: > >> >> However, I realized that for calculating the count for each of the >> facetvalues, the original standardrequesthandler already loops the >>

implemented StandardReqeustHandler to show top-results per facet-value. Is this the fastest way?

2007-10-11 Thread Britske
Since the title of my original post may not have been so clear, here is a repost. //Geert-Jan Britske wrote: > > First of all, I just wanted to say that I just started working with Solr > and really like the results I'm getting from Solr (in terms of > performance, flexibilit

showing results per facet-value efficiently

2007-10-10 Thread Britske
First of all, I just wanted to say that I just started working with Solr and really like the results I'm getting from Solr (in terms of performance, flexibility) as well as the good responses I'm getting from this group. Hopefully I will be able to contribute in one way or another to this wonderfu

Re: extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
Thanks, that was the problem! I mistakenly thought the lib-folder containing the jetty.jar etc. was the folder to put the plugins into. After adding a lib-folder to solr-home everything is resolved. Geert-Jan hossman wrote: > > > : SEVERE: java.lang.ClassCastException: > : wrappt.solr.requ

Re: extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
Thanks, but I'm using the updated o.a.s.handler.StandardRequestHandler. I'm going to try on 1.2 instead to see if it changes things. Geert-Jan ryantxu wrote: > > >> It still seems odd that I have to include the jar, since the >> StandardRequestHandler should be picked up in the war right? I

Re: extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
> you're compiling against an older version. > > Erik > > > On Oct 9, 2007, at 9:04 AM, Britske wrote: > >> >> I'm trying to add a new requestHandler-plugin to Solr by extending >> StandardRequestHandler. >> However, when starting solr-s

extending StandardRequestHandler gives ClassCastException

2007-10-09 Thread Britske
I'm trying to add a new requestHandler-plugin to Solr by extending StandardRequestHandler. However, when starting solr-server after configuration I get a ClassCastException: SEVERE: java.lang.ClassCastException: wrappt.solr.requesthandler.TopListRequestHandler cannot be cast to org.apache.solr.r

Re: how to make sure a particular query is ALWAYS cached

2007-10-09 Thread Britske
Separating requests over 2 ports is a nice solution when having multiple user types. I like that, although I don't think I need it for this case. I'm just going to go the 'normal' caching route and see where that takes me, instead of thinking up front that it can't be done :-) Thanks! hossman wrot

RE: how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
ery strings to fetch the filter query. "field:[* TO *]" > will > do nicely. > > Cheers, > > Lance Norskog > > -Original Message- > From: Britske [mailto:[EMAIL PROTECTED] > Sent: Thursday, October 04, 2007 1:38 PM > To: solr-user@lucene.a

Re: how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
hossman wrote: > > > : I want a couple of costly queries to be cached at all times in the > : queryResultCache. (unless I have a new searcher of course) > > first off: you can ensure that certain queries are in the cache, even if > there is a newSearcher, just configure a newSearcher Event L

Re: how to make sure a particular query stays cached (and is not overwritten)

2007-10-04 Thread Britske
The title of my original post was misguided. // Geert-Jan Britske wrote: > > I want a couple of costly queries to be cached at all times in the > queryResultCache. (unless I have a new searcher of course) > > As far as I know the only parameters to be supplied to the >

how to make sure a particular query is ALWAYS cached

2007-10-04 Thread Britske
I want a couple of costly queries to be cached at all times in the queryResultCache (unless I have a new searcher, of course). As far as I know, the only parameters to be supplied to the LRU implementation of the queryResultCache are size-related, which doesn't give me this guarantee. What would