What do we call Hadoop+HBase+Lucene+Zookeeper+etc....

2009-05-04 Thread Bradford Stephens
Hey all, I'm going to be speaking at OSCON about my company's experiences with Hadoop and Friends, but I'm having a hard time coming up with a name for the entire software ecosystem. I'm thinking of calling it the "Apache CloudStack". Does this sound legit to you all? :) Is there something more 'o

Lucene Index Encryption

2009-05-04 Thread Peter_Lenahan
I hope to make this a discussion rather than a request for a feature. In the database world, secure data is always encrypted in the database. Since I am interested in storing data from a database in the index, at times I want to encrypt the index when the file is on disk. Currently data stored

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Erick Erickson
Yes, I think SHOULD is what you want here. Best, Erick On Mon, May 4, 2009 at 6:41 PM, Christian Bongiorno wrote: > You mean to use > BooleanQuery bq = new BooleanQuery(); > bq.add(new TermQuery(new > Term("key","value")), BooleanClause.Occur.MUST); > // above is Erick's suggestion. > > If

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Christian Bongiorno
You mean to use BooleanQuery bq = new BooleanQuery(); bq.add(new TermQuery(new Term("key","value")), BooleanClause.Occur.MUST); // above is Erick's suggestion. If so, doesn't that mean if they don't all match I won't get a result? Wouldn't it be better to use SHOULD? The documentation d
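
The MUST versus SHOULD question in this thread can be illustrated with a toy model. The sketch below does not use Lucene at all: the class OccurSketch, its Clause type, and the score method are hypothetical names invented for illustration, and "score" here is just a count of matching clauses, not Lucene's relevance score. It assumes the usual boolean-query semantics: every MUST clause has to match, and when there are no MUST clauses at least one SHOULD clause has to match.

```java
import java.util.List;
import java.util.Map;

// Toy model of MUST vs SHOULD clause semantics (NOT the Lucene API).
final class OccurSketch {
    enum Occur { MUST, SHOULD }

    // One "field equals value" clause with an occurrence requirement.
    static final class Clause {
        final String field;
        final String value;
        final Occur occur;

        Clause(String field, String value, Occur occur) {
            this.field = field;
            this.value = value;
            this.occur = occur;
        }

        boolean matches(Map<String, String> doc) {
            return value.equals(doc.get(field));
        }
    }

    // Returns -1 when the document is rejected, otherwise the number of
    // clauses that matched (a stand-in for a relevance score).
    static int score(Map<String, String> doc, List<Clause> clauses) {
        int matched = 0;
        boolean hasMust = false;
        for (Clause c : clauses) {
            boolean hit = c.matches(doc);
            if (c.occur == Occur.MUST) {
                hasMust = true;
                if (!hit) {
                    return -1; // a single failed MUST clause rejects the document
                }
            }
            if (hit) {
                matched++;
            }
        }
        // With only SHOULD clauses, at least one of them has to match.
        if (!hasMust && matched == 0) {
            return -1;
        }
        return matched;
    }
}
```

In this model a document with color=red and size=large is rejected by two MUST clauses (color=red, size=small) but accepted with a count of 1 when the same clauses are SHOULD, which is the behaviour the poster is asking about.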

RE: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Uwe Schindler
In the case of such queries with keywords (not analyzed tokens), I would directly create the appropriate TermQuery objects and combine them with a BooleanQuery. QueryParser is normally meant for queries the user has entered rather than for program-internal queries. For your use-case, it seems better to just create the

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Erick Erickson
MultiFieldQuery essentially (if I have this right) forms a "cross product". I.e. it is NOT required to specify specific values for discrete fields. MFQ helps form queries expressing something like "does any term appear in any field in a hit" or "Does every term appear in some field of a hit, regard

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Christian Bongiorno
Yeah, you definitely got the idea. You're the second person to recommend putting each item in its own document and just storing the HTS code (which is easy for me). The HTS code actually comes with no extra info. I mean, there is info, but we don't store any of it. I will try as you and Paul have r

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Erick Erickson
Hmm, tricky. Let's see if I understand your problem. Basically, you have a bunch of HTS codes that have had some number of items arbitrarily assigned to them, and you want to see if you can make Lucene behave as a kind of expert system to help you classify the next item. I *think* you'd get better r

Re: multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Paul Elschot
Christian, I suppose each ASIN represents a product by key/value pairs and an HTS code? In that case you may want to denormalize to index each ASIN as a Lucene document. Then search for the most similar products in your queries by key/value pairs, using your key as a Lucene field. Such keys w

Re: Searching for partial matches

2009-05-04 Thread Erick Erickson
I have no clue, but that would really surprise me. Did you import org.apache.lucene.search.regex.RegexQuery ?? Have fun Erick On Mon, May 4, 2009 at 1:19 PM, Huntsman84 wrote: > > Ok, I will try that, just one more question. > > Do you know why there is a class called "RegexQuery" that app

Re: Searching for partial matches

2009-05-04 Thread Huntsman84
Ok, I will try that, just one more question. Do you know why there is a class called "RegexQuery" that appears in the API documentation but doesn't exist in the lucene-core-2.4.1.jar? I think that class would be very useful for my problem... Thank you so much!! Erick Erickson wrote: > > "the

multi-field index and search (Not MultiFieldQuery). Help setting up index and search

2009-05-04 Thread Christian Bongiorno
I am trying to build a search (I have been experimenting with Lucene) and someone suggested contacting your team. Background: Currently the service I am working on applies taxes/duties to products for international shipping by looking up something called an HTS code (a universally recognized t

Re: Searching for partial matches

2009-05-04 Thread Erick Erickson
"the guys" really helped me understand the issues with wildcards; it's harder than you think. Try looking over the searchable archive for a thread titled "I just don't get wildcards at all" from a couple of years ago. Note: Lucene has advanced significantly since then, but the underlying combinato

Re: Searching for partial matches

2009-05-04 Thread Huntsman84
My aim is to handle * phrases *, as you say, but I don't know how to build a WildcardQuery for that purpose... I read in the documentation that those kinds of queries can't start with '*' (e.g. * phrase *), so I tried MultiPhraseQuery instead. Forgive me if I am too much of a newbie; 10 days ago I didn't kn

Re: Searching for partial matches

2009-05-04 Thread Erick Erickson
Why are you using MultiPhraseQuery? It appears (warning, I haven't really used it) to be designed to handle *phrases*. Your problem statement isn't looking at phrases at all, just wildcarded single terms. And you're supposed to call the first MPQ.add with, say, the first word of the *phrase*, not

Re: Searching for partial matches

2009-05-04 Thread Huntsman84
Hi, I've tried this with MultiPhraseQuery, but it always returns all the documents in the index, no matter what expression I use. I've read that if you add a set of terms whose values all match the entered query (e.g. "str"), the search works as if the symbol "*" were appended (e.g. "str*"), so I tried that. My c

Re: lucene / hibernate search in cluster

2009-05-04 Thread Stephane Nicoll
On Mon, May 4, 2009 at 3:33 PM, no spam wrote: > 5 seconds seems short to me also but this is what our client wants and so I > need to get as close to this number as possible :) It's a system that > records live video 24x7 and up to date information is extremely important. > I have the hibernate

Re: lucene / hibernate search in cluster

2009-05-04 Thread no spam
5 seconds seems short to me also, but this is what our client wants and so I need to get as close to this number as possible :) It's a system that records live video 24x7 and up-to-date information is extremely important. I have the Hibernate Search in Action book as well. I didn't see other alter

Re: MultiFieldQueryParser - using a different analyzer per field...

2009-05-04 Thread theDude_2
Hey guys: original poster here, and I found a solution! I created a wrapper that could accept multiple analyzers and then combined them into a search: here is the code. --wrapper class- public class PositionalPorterStopAnalyzer extends Analyzer { private Set stopWords;

DC/NOVA Lucene&Solr meetup

2009-05-04 Thread Erik Hatcher
My company is co-sponsoring a Lucene/Solr meetup later this month in the Northern VA / DC area (Reston). Details will be coming out soon. We've got one night of talks planned and are considering adding another consecutive night. If you're in the area and have a Lucene (any of the Lucene fami

get the cosine similarity between two docs

2009-05-04 Thread Kamal Najib
Hi all, I am trying to get the cosine similarity between two docs. I first tried to create a document for a String like this: Document doc1=new Document(); doc1.add(new Field("term","nodular lesions over years responding kamal najib nodular lesions over years responding",Field.Store.YES,Field.Inde
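
For reference, the quantity being asked about can be sketched without Lucene. The example below is a minimal illustration under stated assumptions: it tokenizes on whitespace, lowercases, and uses raw term counts as vector weights. The class and method names (CosineSimilarity, termFrequencies, cosine) are hypothetical, and this is not how Lucene scores documents nor how its term vectors are read; it only shows the cosine formula, dot(a, b) / (|a| * |b|), applied to two strings.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java cosine similarity over raw term counts (no Lucene involved).
final class CosineSimilarity {

    // Counts how often each whitespace-separated, lowercased token occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> tf) {
        double sum = 0.0;
        for (int count : tf.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine of the angle between the two term-count vectors;
    // returns 0.0 when either text has no tokens.
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFrequencies(a);
        Map<String, Integer> tb = termFrequencies(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            Integer other = tb.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(ta);
        double nb = norm(tb);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }
}
```

Calling CosineSimilarity.cosine with two identical strings gives a value of 1 (up to floating-point rounding), and two strings with no tokens in common give 0. A real setup would more likely weight terms by TF-IDF and reuse the index's analyzer, which this sketch deliberately leaves out.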

Re: Re: I can't find the package org.apache.lucene.search.similar

2009-05-04 Thread Kamal Najib
Thanks Mike, I have found it. Kamal. Original Message: This is in the contrib-queries JAR. Mike On Mon, May 4, 2009 at 6:02 AM, Kamal Najib wrote: > Hi all, > I am trying to use the class MoreLikeThis in the package org.apache.lucene.search.similar but it can't be resolved in Eclipse. I imported the

Re: I can't find the package org.apache.lucene.search.similar

2009-05-04 Thread Michael McCandless
This is in the contrib-queries JAR. Mike On Mon, May 4, 2009 at 6:02 AM, Kamal Najib wrote: > Hi all, > I am trying to use the class MoreLikeThis in the package > org.apache.lucene.search.similar but it can't be resolved in Eclipse. I > imported the lucene-core-2.4.1.jar and lucene-demos-2.4.1.jar.an

I can't find the package org.apache.lucene.search.similar

2009-05-04 Thread Kamal Najib
Hi all, I am trying to use the class MoreLikeThis in the package org.apache.lucene.search.similar but it can't be resolved in Eclipse. I imported the lucene-core-2.4.1.jar and lucene-demos-2.4.1.jar. Any suggestions? Thanks in advance. Kamal.