On Sep 23, 2008, at 8:35 AM, Anshul jain wrote:
yes you are partly correct
what I need is that lucene should support two type of queries for the
following document:
name: abc^10
organization: xyz^3
structured query:
name: abc and organization: xyz
unstructured query:
default_field: abc ^5 and xyz
And what field(s) should "xyz" be searched against? Again, I ask, how
do you know what fields "xyz" should go against and why does abc go
against the default_field? You've said it shouldn't go against all
fields (b/c there are thousands of them), and you've said it shouldn't
go against a catch-all field, but otherwise I still have no clue your
criteria for what fields xyz should search. Are you saying that you
want it to intelligently know that when "xyz" comes in that it should
search the organization field?
Other than seconding Umesh's or Dino's suggestions of using machine
learning or heuristics or using some type of templating system, I'm
not sure what else to offer. You might look at Solr's Dismax Query
Parser, which allows you to specify the field structure of queries in
a multi-field way, but again, I doubt that is wholly what you are
looking for.
But i do not want to create one more field(default_field) that will
contain all the values concatenated in it. Also, even if i get all the
fields during indexing and use it for multi field query parser, then
the query will become very inefficient as there can be thousands of
fields. I think it should clarify my point.
On Tue, Sep 23, 2008 at 1:58 PM, Grant Ingersoll
<[EMAIL PROTECTED]> wrote:
So, the piece I'm missing is how do you know what field for which
terms. In
other words how do you know xyz goes against organization and abc
against
name. Your wording implies that you don't know this before hand,
yet you
are somehow suggesting that Lucene should be able to do it.
Correct me if
I'm wrong.
-Grant
On Sep 23, 2008, at 6:51 AM, Anshul jain wrote:
Here is what I'm trying to do:
say a lucene document:
name: abc ^10
organization: xyz ^3
^10 and ^3 are boosts in the document.
now if I query name: abc ^5 AND organization: xyz this will work.
but if I query (default_field): abc^5 AND xyz this won't work.
Now what I want is that a text can be associated with more than
one field.
i.e.
(field1,field2,field3):value
name,(default_field),title: abc^10
organization,(default_field),institute: xyz^3
then both of my queries will work.
Is it possible to do so in lucene without changing the source?
If no then can anyone please explain the indexing and searching
mechanism for lucene, so that I can start working on it.
The solution given by the java-users won't work for me as I do not
want to add all the contents of the document in a single field and
then search for that field, as this would increase the index size
and
I've to index more than 10 million documents. Also
multifieldqueryparser will make it query execution inefficient, as
there will be thousands of fields.
If I start storing just a single field as: (default_field): "name
abc
organization xyz", then it is possible that some other documents
might
get selected that are not relevant. Also i want to boost individual
fields in a document.
Anshul
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Anshul Jain
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]