Hi everyone,
I have a question regarding the quadtree implementation of the spatial
module of Lucene. Does the quadtree implementation (QuadPrefixTree)
explicitly build a tree structure and store this information? I have gone
over the QuadPrefixTree class, but from what I understand it mainly
Hi Steve,
I have to admit I also find it frequently useful to include
punctuation as tokens (even if it's filtered out by subsequent token
filters for indexing, it's a useful to-have for other NLP tasks). Do
you think it'd be possible (read: relatively easy) to create an
analyzer (or a
On 01/10/2014 08:08, Dawid Weiss wrote:
Hi Steve,
I have to admit I also find it frequently useful to include
punctuation as tokens (even if it's filtered out by subsequent token
filters for indexing, it's a useful to-have for other NLP tasks). Do
you think it'd be possible (read: relatively
I played with this possibility on the extremely experimental
https://issues.apache.org/jira/browse/LUCENE-5012 which I haven't
gotten back to for a long time...
The changes on that branch add the idea of a deleted token, by just
setting a new DeletedAttribute marking whether the token is deleted
Hi Parth,
Lucene’s “terms dictionary” (an inverted index) is the physical
instantiation of the actual PrefixTree/Trie for numeric and spatial data.
It doesn’t know it is — it’s just a sorted list of keys pointing to
matching documents — it just so happens that the keys aren’t textual words
in
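
To make the "sorted list of keys" point concrete, here is a minimal sketch in plain Java. It is not Lucene code and does not use any Lucene API: the class name PrefixTermsSketch, the sample cell tokens, and the document IDs are all made up for illustration, and a TreeMap merely stands in for the terms dictionary. It assumes each document is indexed under a single quadtree cell token built from the quadrant letters A to D, so that a longer token is a smaller cell nested inside its prefix.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class PrefixTermsSketch {

    // Hypothetical sample data: sorted cell token -> IDs of matching documents.
    static TreeMap<String, List<Integer>> sampleIndex() {
        TreeMap<String, List<Integer>> terms = new TreeMap<>();
        terms.put("ABD", List.of(1));
        terms.put("ABA", List.of(2));
        terms.put("AC", List.of(3));
        terms.put("DAA", List.of(4));
        return terms;
    }

    // Collects every document whose cell token starts with the given cell.
    // Because the keys are sorted, all tokens sharing a prefix are adjacent,
    // so "descend into a subtree" becomes a single range scan.
    static TreeSet<Integer> docsUnderCell(TreeMap<String, List<Integer>> terms, String cell) {
        // Upper bound assumes tokens never contain the character \uffff.
        SortedMap<String, List<Integer>> range =
                terms.subMap(cell, cell + Character.MAX_VALUE);
        TreeSet<Integer> docs = new TreeSet<>();
        for (List<Integer> postings : range.values()) {
            docs.addAll(postings);
        }
        return docs;
    }

    public static void main(String[] args) {
        TreeMap<String, List<Integer>> terms = sampleIndex();
        System.out.println(docsUnderCell(terms, "AB")); // prints [1, 2]
        System.out.println(docsUnderCell(terms, "A"));  // prints [1, 2, 3]
    }
}
```

No node objects or child pointers are stored anywhere in this sketch; the tree shape is implied entirely by the ordering of the keys, which is the property the explanation above relies on.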
Paul,
Boilerplate upgrade recommendation: consider using the most recent Lucene
release (4.10.1) - it’s the most stable, performant, and featureful release
available, and many bugs have been fixed since the 4.1 release.
FYI, StandardTokenizer doesn’t find word boundaries for Chinese, Japanese,
I was helping to look into this with Nick, and I think we may have figured out
the core of the problem...
The problem is easily reproducible by starting replication on the slave and
then sending a shutdown command to tomcat (e.g. catalina.sh stop).
With a debugger attached, it looks like the
On 01/10/2014 18:42, Steve Rowe wrote:
Paul,
Boilerplate upgrade recommendation: consider using the most recent Lucene
release (4.10.1) - it’s the most stable, performant, and featureful release
available, and many bugs have been fixed since the 4.1 release.
Yeah sure, I did try this and hit