My line "unstructured query: default_field: abc ^5 and xyz" seems to have caused confusion. What I meant was that when initializing the parser I pass "default_field" as the default text field. So the query should be:

QueryParser parser = new QueryParser("default_field", analyzer);
query = parser.parse("abc^5 and xyz");

so the query will be:

default_field:abc^5 and default_field:xyz^3

I am sorry for stating it incorrectly earlier.

To answer Erick's question: I'll be indexing around 10-20 million documents with an average size of 4 KB, but the number of documents could be more.

Now let me explain my problem clearly again. Say I have a set of Lucene documents:

Document 1:
name: Anshul ^10
organization: EPFL ^5
sex: Male

Document 2:
name: Rakesh ^10
organization: IIT-B ^5
sex: Male

Document 3:
name: erin brochowich ^10
organization: ABC law firm
sex: Female

Document 4:
title: lord of the rings ^10
directors: John ^2
actors: Kate

Document 5:
title: godfather ^10
directors: Kate ^2
actors: alpachino

Documents 1, 2 and 3 belong to the same class, so their boosting parameters will be the same. The same holds for documents 4 and 5.

If I give a query like:

name: "Erin Brochowich" and organization: "ABC law firm"

it works perfectly. But if the query is

QueryParser parser = new QueryParser("default_field", analyzer);
query = parser.parse("Erin Brochowich and ABC law firm");

it does not work.

What I want is for default_field to be connected to all of the text somehow, without taking extra space to store its own copy of that text.

I think that should be clear enough now. Thank you for your responses.

Regards,
Anshul

On Tue, Sep 23, 2008 at 4:55 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> On Sep 23, 2008, at 8:35 AM, Anshul jain wrote:
>
>> yes you are partly correct
>>
>> what I need is that lucene should support two type of queries for the
>> following document:
>> name: abc^10
>> organization: xyz^3
>>
>> structured query:
>> name: abc and organization: xyz
>>
>> unstructured query:
>> default_field: abc ^5 and xyz
>
> And what field(s) should "xyz" be searched against? Again, I ask, how do
> you know what fields "xyz" should go against and why does abc go against the
> default_field?
> You've said it shouldn't go against all fields (b/c there
> are thousands of them), and you've said it shouldn't go against a catch-all
> field, but otherwise I still have no clue your criteria for what fields xyz
> should search. Are you saying that you want it to intelligently know that
> when "xyz" comes in that it should search the organization field?
>
> Other than seconding Umesh's or Dino's suggestions of using machine learning
> or heuristics or using some type of templating system, I'm not sure what
> else to offer. You might look at Solr's Dismax Query Parser, which allows
> you to specify the field structure of queries in a multi-field way, but
> again, I doubt that is wholly what you are looking for.
>
>> But i do not want to create one more field(default_field) that will
>> contain all the values concatenated in it. Also, even if i get all the
>> fields during indexing and use it for multi field query parser, then
>> the query will become very inefficient as there can be thousands of
>> fields. I think it should clarify my point.
>>
>> On Tue, Sep 23, 2008 at 1:58 PM, Grant Ingersoll <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> So, the piece I'm missing is how do you know what field for which terms.
>>> In other words how do you know xyz goes against organization and abc
>>> against name. Your wording implies that you don't know this before hand,
>>> yet you are somehow suggesting that Lucene should be able to do it.
>>> Correct me if I'm wrong.
>>>
>>> -Grant
>>>
>>> On Sep 23, 2008, at 6:51 AM, Anshul jain wrote:
>>>
>>>> Here is what I'm trying to do:
>>>>
>>>> say a lucene document:
>>>> name: abc ^10
>>>> organization: xyz ^3
>>>>
>>>> ^10 and ^3 are boosts in the document.
>>>>
>>>> now if I query name: abc ^5 AND organization: xyz this will work.
>>>>
>>>> but if I query (default_field): abc^5 AND xyz this won't work.
>>>>
>>>> Now what I want is that a text can be associated with more than one
>>>> field.
>>>> i.e.
>>>>
>>>> (field1,field2,field3):value
>>>> name,(default_field),title: abc^10
>>>> organization,(default_field),institute: xyz^3
>>>>
>>>> then both of my queries will work.
>>>>
>>>> Is it possible to do so in lucene without changing the source?
>>>> If no then can anyone please explain the indexing and searching
>>>> mechanism for lucene, so that I can start working on it.
>>>>
>>>> The solution given by the java-users won't work for me as I do not
>>>> want to add all the contents of the document in a single field and
>>>> then search for that field, as this would increase the index size and
>>>> I've to index more than 10 million documents. Also
>>>> multifieldqueryparser will make it query execution inefficient, as
>>>> there will be thousands of fields.
>>>>
>>>> If I start storing just a single field as: (default_field): "name abc
>>>> organization xyz", then it is possible that some other documents might
>>>> get selected that are not relevant. Also i want to boost individual
>>>> fields in a document.
>>>>
>>>> Anshul
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>>
>>>
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com
>>>
>>> Lucene Helpful Hints:
>>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>>> http://wiki.apache.org/lucene-java/LuceneFAQ
>>
>> --
>> Anshul Jain
>

--
Anshul Jain
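
One way to get the unfielded query to match without a concatenated catch-all field is to rewrite each bare term into per-field clauses before parsing, along the lines of the multi-field parsing mentioned in the thread. The sketch below is only an illustration under several assumptions: the class DefaultFieldExpander and its expand method are hypothetical (not part of Lucene), the field list for a document class is small and known in advance, the index-time boosts from the example documents are reused as query-time boosts, and the terms are already analyzed and free of query-syntax special characters. It emits classic Lucene query syntax, where boolean operators must be upper case (AND, not and) and clauses inside parentheses are OR-ed under the default operator.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: expands unfielded terms over a small, known field list. */
class DefaultFieldExpander {

    /**
     * Builds a query string in which every term must match (AND) in at least
     * one of the given fields (implicit OR inside each parenthesized group).
     * Each clause is written as field:term^boost.
     */
    static String expand(String[] terms, Map<String, Integer> fieldBoosts) {
        StringJoiner allTerms = new StringJoiner(" AND ");
        for (String term : terms) {
            StringJoiner perField = new StringJoiner(" ", "(", ")");
            for (Map.Entry<String, Integer> e : fieldBoosts.entrySet()) {
                perField.add(e.getKey() + ":" + term + "^" + e.getValue());
            }
            allTerms.add(perField.toString());
        }
        return allTerms.toString();
    }

    public static void main(String[] args) {
        // Field names and boosts taken from the "person" documents in the post.
        Map<String, Integer> personFields = new LinkedHashMap<>();
        personFields.put("name", 10);
        personFields.put("organization", 5);

        // Prints: (name:erin^10 organization:erin^5) AND (name:abc^10 organization:abc^5)
        System.out.println(expand(new String[] {"erin", "abc"}, personFields));
    }
}
```

The resulting string can be handed to the parser in place of the bare terms. This keeps the index free of a duplicated field, but the query grows with the number of fields, so it only helps when each document class has a handful of searchable fields rather than thousands.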