by ( or , or at the beginning of the string, e.g. using a regex
like /(?:^|[\s(])[+-]/, and if you find a match, use default OR operator, and if not, use default AND operator?
Ok, thanks for the workaround.
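
For reference, the check suggested above could be sketched in Java roughly as follows. This is only a minimal sketch: the class and method names are hypothetical and not part of Lucene, and the returned operator name would still have to be mapped to whatever default-operator setting your query parser exposes.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: picks a default operator name
// depending on whether the raw query string contains an explicit + or - prefix.
class DefaultOperatorChooser {

    // Same pattern as the regex quoted above: a '+' or '-' at the start of
    // the string, or directly after whitespace or an opening parenthesis.
    private static final Pattern EXPLICIT_PREFIX =
            Pattern.compile("(?:^|[\\s(])[+-]");

    // Returns "OR" when the query uses explicit +/- prefixes, "AND" otherwise.
    static String choose(String query) {
        return EXPLICIT_PREFIX.matcher(query).find() ? "OR" : "AND";
    }

    public static void main(String[] args) {
        String[] samples = {"+lucene search", "lucene search", "state-of-the-art"};
        for (String q : samples) {
            System.out.println(q + " -> " + choose(q));
        }
    }
}
```

Note that a hyphen inside a word does not match, because the pattern only accepts a + or - at the start of the string or after whitespace or an opening parenthesis.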
--
Renaud Delbru
)
Is this the normal behaviour? A bug? Am I doing something wrong?
Thanks in advance for your help,
--
Renaud Delbru
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h
://lucidworks.lucidimagination.com/display/LWEUG/Boolean+Operators
Thanks
--
Renaud Delbru
On 20/05/11 13:21, Steven A Rowe wrote:
Hi Renaud,
That's normal behavior, since you have AND as default operator. This is equivalent to placing a
+ in front of every element of your query. In fact, if you removed the other two
this
information, so I will be able to check these segments and their encoded
blocks in order to find and understand the problem ?
Thanks in advance,
--
Renaud Delbru
, the index of the faulty
block), I will be able to display the content of the blocks, and see if
there are any problems in the PFor encoding.
Cheers,
--
Renaud Delbru
will be more suitable.
[1] http://siren.sindice.com/
--
Renaud Delbru
On 12/03/10 13:43, Erick Erickson wrote:
There's no requirement that all documents have the same
fields, Lucene is fine with different docs having different
fields.
There's no limit on the number of different fields allowed
interface? How does it currently work?
Are there any restrictions on how segments can be merged?
Is there an easy way to extend the mechanism by which segments are merged?
Cheers,
--
Renaud Delbru
is to create my own MultiReader that is
optimised for my codec. Is that right ? A MultiReader that just iterates
over the subreaders, checks if they are using my codec (and therefore
associated fields), and uses them to iterate over my own postings ?
--
Renaud Delbru
the
DocsEnum interface, and therefore do not know whether it manipulates a
segment-level enum or a Multi*Enum. Which searches (or query operators) in
Lucene use segment-level enums?
Cheers
--
Renaud Delbru
that the
way I am testing the postings (using termPositionsEnum on the top-level
reader) was not really the proper way to test it, and that the correct
way would instead be to use a TermQuery directly.
Thanks for the clarification.
--
Renaud Delbru
to extend
StandardCodec without having to deal with these classes?
Cheers
--
Renaud Delbru
except the problem related to MultiReaders.
Thanks
--
Renaud Delbru
index data structure are correctly
retrieved.
In that case, I got the previous errors (a MultiDocsAndPositionsEnum is
returned). However, when I am indexing only one or two documents, the
original DocsAndPositionsEnum is returned.
Hope that helps,
cheers
--
Renaud Delbru
reasons), you will have
to use MultiFields' static methods to get the enums.
In my previous example (registering my codec in IndexWriter, and then
using IndexReader), am I consuming the flex API directly on top of the
multi-readers? If so, how can I avoid that?
Cheers
--
Renaud Delbru
Hi Michael,
I have started to look at the PFOR codec. However, when I include the
codec files inside the flex_1458 branch, the
org.apache.lucene.util.pfor.PFor class, which is the core of the codec,
is missing. Where can I find this class?
Thanks,
Regards
--
Renaud Delbru
On 16/11/09 14:01
(correct me
if I am wrong) that this new version includes some optimisations for
dictionary lookups, which should minimize the overhead.
--
Renaud Delbru
On 30/12/09 16:18, Jason Tesser wrote:
I have a situation where I might have 1000 different types of Lucene
Documents each with 10 or so fields
period. I will continue to follow the progress of 1458, test it,
and keep reporting my feedback and experience with it.
Thanks,
Best Regards
[1] http://siren.sindice.com
--
Renaud Delbru
On 16/11/09 13:01, Michael McCandless wrote:
Yes, the branch is here:
https://svn.apache.org
,
in order to be able to plug my own chain, but I have the impression that
you've done something similar already (with the codec abstraction).
It would be a pity to waste my time doing something less convenient than
your approach.
Thanks.
--
Renaud Delbru
On 14/11/09 13:22, Michael McCandless
the
experience!
I will.
--
Renaud Delbru
your questions in our mailing list.
[1] http://siren.sindice.com
[2]
https://dev.deri.ie/confluence/display/SIREn/Indexing+and+Searching+Tabular+Data
Best Regards,
--
Renaud Delbru
Donal Murtagh wrote:
Hi,
I'm trying to use Lucene to query a domain that has the following structure
to
make this project happen ... but also to the Data Intensive
Infrastructure Group and DERI.
[1] http://di2.deri.ie/
--
Renaud Delbru
] http://videolectures.net/wsdm09_dean_cblirs/
--
Renaud Delbru
and define how to serialise
positions and payloads
I think other parts of the FreqProxTermsWriter can stay generic. What do
you think ?
Regards.
--
Renaud Delbru
these modifications? Make a branch of
Lucene, and add my new classes to the Lucene package
org.apache.lucene.index? Or is a more elegant solution possible?
Thanks in advance,
Regards.
--
Renaud Delbru
such as
ConjunctiveScorer). This will require a call to
IndexReader.docFreq(term) for each of the term queries. Does a docFreq call
mean another I/O access?
Thanks for the clarification,
Regards.
--
Renaud Delbru
Andrzej Bialecki wrote:
Renaud Delbru wrote:
Hi all,
I am wondering if Lucene implements the query
? If you take that approach, please report back so we can
learn how to improve Lucene for these very advanced customizations!
Ok, thanks for the reference. I will try this solution, and will report
any problems I encounter.
Regards.
--
Renaud Delbru
Andrzej Bialecki wrote:
Renaud Delbru wrote:
Hi Andrzej,
sorry for the late reply.
I have looked at the code. As far as I understand, you sort the
posting lists based on the first doc skip. The first posting list
will be the one that has the largest first document skip.
Do the sparseness
be a
not so bad predictor in general.
Regards.
--
Renaud Delbru
Yes, I know of two research projects that have implemented a triple store on
top of Lucene:
- Semplore [1]
- Sindice [2]
[1] http://apex.sjtu.edu.cn/apex_wiki/Demos/Semplore
[2] http://www.sindice.com
--
Renaud Delbru
Cam Bazz wrote:
Has anyone tried to implement a triplet store with lucene?
Best
Hi all,
I am wondering whether Lucene implements the query optimisation that consists
of ordering the posting lists based on the term frequency before
intersection?
If yes, could somebody point me to the Java class or method that
implements such a strategy?
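
To make the question concrete, here is a minimal sketch of the optimisation I have in mind. It is not Lucene code: plain sorted int arrays stand in for posting lists, the length of each array stands in for the frequency of its term, and all class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: sorted doc-id arrays stand in for posting lists.
// None of these names are Lucene classes.
class PostingIntersection {

    // Intersects two ascending doc-id arrays with a linear merge.
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out[n++] = a[i];
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i++;
            } else {
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Orders the lists by length (shortest first), then intersects them
    // left to right so the intermediate result stays as small as possible.
    static int[] intersectAll(List<int[]> postings) {
        if (postings.isEmpty()) {
            return new int[0];
        }
        List<int[]> ordered = new ArrayList<>(postings);
        ordered.sort(Comparator.comparingInt((int[] p) -> p.length));
        int[] result = ordered.get(0);
        // Stop early once the running intersection is empty.
        for (int k = 1; k < ordered.size() && result.length > 0; k++) {
            result = intersect(result, ordered.get(k));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] result = intersectAll(Arrays.asList(
                new int[] {1, 2, 3, 5, 8, 13},
                new int[] {2, 3, 5},
                new int[] {3, 5, 7, 9}));
        System.out.println(Arrays.toString(result));
    }
}
```

Starting with the shortest list bounds the size of every intermediate result by the rarest term, which is the point of the ordering.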
Thanks in advance,
Regards.
--
Renaud
(.fdt) file.
Would it then be possible to overwrite the old float value with a new
float value?
Thanks,
--
Renaud Delbru
(because tokenStream doesn't
return them). It's as if we need the ability to query a tokenStream
for its final offset or something.
One workaround might be to insert an end marker token, with the true
end offset, which is a term you would never search on?
Mike
Renaud Delbru wrote:
Hi,
I
instances will have their offset shifted back.
Is it a bug? Or is it the desired behavior (and in that case, why)?
Regards.
--
Renaud Delbru,
E.C.S., Ph.D. Student,
Semantic Information Systems and
Language Engineering Group (SmILE),
Digital Enterprise Research Institute,
National University