First I'd say I probably ought to learn Erlang. Anybody have any good tutorials/resources for a complete virgin? I really don't have the time for this, but what can you do -- it's inevitable, I guess -- why fight it?

On Mon, Jul 28, 2008 at 6:05 PM, Paul Davis <[EMAIL PROTECTED]> wrote:

> Not sure on feasibility, but what would you say to making an erlang
> function that would parse and do the boolean logic on the index? I
> mean it's kinda hackish, but it seems like it could be done fairly
> easily. Also, it could be the basis for work on merging indices.
>
> Paul
>
> On Mon, Jul 28, 2008 at 5:31 PM, Dean Landolt <[EMAIL PROTECTED]> wrote:
> > I updated http://wiki.apache.org/couchdb/FullTextIndexWithView with a
> > slightly more robust implementation. Still no boolean abilities though --
> > I'm combing the internets trying to figure out how google does it in m/r,
> > but my best guess is they just brute-force the merge (and probably track
> > some stats to guess a total). This doesn't seem like something that would
> > lend itself easily to couch -- but I could be wrong. I'm probably wrong.
> > Please, someone tell me I'm wrong...
> >
> > Dean
> >
> > On Mon, Jul 28, 2008 at 1:18 PM, Dean Landolt <[EMAIL PROTECTED]> wrote:
> >
> >> Gladly. I'll get it on the wiki and send a link after I clean it up.
> >>
> >> Regarding merging views, something like that would be fantastic, though
> >> I can't really comprehend the performance implications. If a view can
> >> peer into another view for its processing, I gather this would mean it
> >> would have to be updated every time a change happens in the referenced
> >> view(s), and an incremental update here may really mean a full update
> >> of the view in question, but I'm just guessing. Though this would allow
> >> real *joins* and end that whole question once and for all... :)
> >>
> >> On Sun, Jul 27, 2008 at 7:04 PM, Dan Reverri <[EMAIL PROTECTED]> wrote:
> >>
> >>> Dean,
> >>>
> >>> Any chance you want to share your view code?
> >>>
> >>> In regards to the query parsing, I am not sure how this will work.
> >>> Right now results for each term have to be pulled down to the client
> >>> and merged together. Perhaps we could add a query method to views that
> >>> allows different key values to be combined.
> >>>
> >>> A user could query a view with a set of keys and a merge function that
> >>> could define how the key values could be combined.
> >>>
> >>> On Fri, Jul 25, 2008 at 5:01 PM, Dean Landolt <[EMAIL PROTECTED]> wrote:
> >>>
> >>> > On Mon, Jul 21, 2008 at 11:45 AM, Dean Landolt <[EMAIL PROTECTED]> wrote:
> >>> >
> >>> > > On Mon, Jul 21, 2008 at 1:08 AM, Dan Reverri <[EMAIL PROTECTED]> wrote:
> >>> > >
> >>> > >> Is it worthwhile to implement a full text indexer on top of
> >>> > >> couchdb's map/reduce functionality?
> >>> > >>
> >>> > >> http://wiki.apache.org/couchdb/FullTextIndexWithView
> >>> > >
> >>> > > Interesting idea. There's definitely more to FTI than tokenization
> >>> > > alone, but then again there's an awful lot of power in m/r and
> >>> > > javascript -- it didn't take me a second to find a porter stemming
> >>> > > algorithm in js:
> >>> > > http://tartarus.org/~martin/PorterStemmer/js.txt
> >>> > >
> >>> > > I bet variable weighting would be pretty close to impossible in the
> >>> > > m/r paradigm though, and probably some other features (of course, I
> >>> > > could be wrong, and when it comes to couchdb, thus far I usually
> >>> > > am). For a straight-up word search, this is serviceable as is. I'm
> >>> > > going to see if I can't figure out how to shoehorn in some boolean
> >>> > > features.
> >>> >
> >>> > I gave this approach another look and I was able to get a view
> >>> > together that did a little more (stemming, optional
> >>> > case-insensitivity, min length for tokens, better whitespace
> >>> > handling). I'm working on an ngram view too and so far it's
> >>> > promising. But there's still one huge problem -- for the life of me I
> >>> > can't figure out a workable strategy for boolean operations that
> >>> > doesn't involve fully loading each piece of the query. Am I missing
> >>> > something? Is something like this even possible? I know there's no
> >>> > way to load a piece of a view from another view -- but I just can't
> >>> > help but really wish there were.
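
For reference, the client-side merge described in the thread (pull down the results for each term, then combine them) could look roughly like the sketch below. It assumes the client has already fetched, for each term, a sorted list of unique doc ids from the full text view. The function names and the sample ids are made up for illustration and are not part of CouchDB or of the wiki code.

```javascript
// Sketch only: boolean merges over per-term result lists on the client.
// Each argument is assumed to be a sorted array of unique doc ids.

// AND: ids present in both lists.
function intersect(a, b) {
  var result = [], i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { result.push(a[i]); i++; j++; }
    else if (a[i] < b[j]) { i++; }
    else { j++; }
  }
  return result;
}

// OR: ids present in either list, without duplicates.
function union(a, b) {
  var result = [], i = 0, j = 0;
  while (i < a.length || j < b.length) {
    if (j >= b.length || (i < a.length && a[i] < b[j])) { result.push(a[i++]); }
    else if (i >= a.length || b[j] < a[i]) { result.push(b[j++]); }
    else { result.push(a[i]); i++; j++; }
  }
  return result;
}

// AND NOT: ids in the first list that are absent from the second.
function difference(a, b) {
  var result = [], i = 0, j = 0;
  while (i < a.length) {
    if (j >= b.length || a[i] < b[j]) { result.push(a[i++]); }
    else if (a[i] === b[j]) { i++; j++; }
    else { j++; }
  }
  return result;
}

// Hypothetical per-term results, as if fetched one term at a time.
var postings = {
  couchdb: ["doc1", "doc2", "doc4"],
  erlang: ["doc2", "doc3", "doc4"],
  lucene: ["doc4"]
};

var both = intersect(postings.couchdb, postings.erlang);
console.log(both.join(", "));                              // doc2, doc4
console.log(union(postings.couchdb, postings.erlang).join(", ")); // doc1, doc2, doc3, doc4
console.log(difference(both, postings.lucene).join(", ")); // doc2
```

Each merge is a single pass over two sorted lists, but every term's full result list still has to be loaded first, which is exactly the cost the thread is trying to avoid.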
