Conceptually this use case is similar to what translation memories do.
For an open-source TM engine, have a look at http://okapi.opentag.com/, and its
default TM engine (Pensieve TM).
Cheers, Oli
-Original Message-
From: Barry Coughlan [mailto:b.coughl...@gmail.com]
Sent: Wednesday,
The hard way may be to use the standard Analyzing Suggester but to add each
(analyzed) suffix of the surface string (mapping to the full surface form)
during automaton generation.
I.e. when adding "Donau...", you add all analyzed suffixes "donau...",
"onau...", "nau...", ... - all mapping to
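A minimal sketch of the suffix idea, not of the Analyzing Suggester itself: a sorted map stands in for the automaton, and a lowercase-only `analyze` method stands in for the Analyzer. The class name, the method names, and the sample strings "Donau" and "Nautilus" are made up for illustration.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class SuffixSuggestSketch {
    // Stand-in for the automaton: analyzed suffix -> full surface forms.
    private final TreeMap<String, Set<String>> index = new TreeMap<>();

    // Hypothetical stand-in for an Analyzer: lowercases only.
    static String analyze(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Register every analyzed suffix of the surface string,
    // each one mapping back to the full surface form.
    void add(String surface) {
        String analyzed = analyze(surface);
        for (int i = 0; i < analyzed.length(); i++) {
            index.computeIfAbsent(analyzed.substring(i), k -> new TreeSet<>())
                 .add(surface);
        }
    }

    // Prefix search over the suffix keys, which amounts to an infix
    // search over the original strings.
    Set<String> lookup(String typed) {
        String key = analyze(typed);
        Set<String> hits = new TreeSet<>();
        for (Set<String> surfaces
                : index.subMap(key, true, key + Character.MAX_VALUE, false).values()) {
            hits.addAll(surfaces);
        }
        return hits;
    }

    public static void main(String[] args) {
        SuffixSuggestSketch sketch = new SuffixSuggestSketch();
        sketch.add("Donau");
        sketch.add("Nautilus");
        System.out.println(sketch.lookup("nau"));
    }
}
```

The cost of this approach is visible even in the sketch: a string of length n contributes n keys, so the index grows roughly quadratically in characters stored unless the underlying structure shares suffixes.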
Hi Michael,
There may be several entry points, I'm not sure which one still works - the
suggester data processing chain has changed quite a bit since I looked at it
about two years ago, maybe Mike or Robert can chime in if I'm totally off.
One way I experimented with was to implement a custom
Hi Clemens,
I haven't yet built a suggester which combines all three, and am not aware of
one. I'd love to have one though ;-)
Case- and diacritics insensitivity is supported out-of-the-box by the analyzing
suggesters, including the FuzzySuggester. The logic is in the Analyzer.
I haven't yet
Can you split your corpus across multiple Lucene instances?
Cheers, Oli
-Original Message-
From: Artem Gayardo-Matrosov [mailto:ar...@gayardo.com]
Sent: Friday, March 21, 2014 12:29 PM
To: java-user@lucene.apache.org
Subject: maxDoc/numDocs int fields
Hi all,
I am using lucene to
Hi,
It's great to see support for payloads in the suggesters - this is really
helpful, and pretty much addresses LUCENE-4516. Are there any plans to also
support them for WFSTs? We have some cases where we don't need the Analyzer's
capabilities (we look up the completion using the payload
Hi,
I've seen some changes in trunk regarding the data format of Lucene's
FST-based suggesters, and wonder whether the automata created by trunk
builds/next Lucene version are/will be binary-compatible to the ones
created with the current release, or whether any magic versioning is
taking
at 4:27 PM, Oliver Christ ochr...@ebscohost.com
wrote:
Hi,
Has anyone tried to implement circumfix suggesters, where the
suggestion is a circumfix of the lookup string?
E.g. "sox rumor" suggests "boston red sox rumors" (try it on
google.com).
I think there are several ways
If I remember correctly it was Baeza-Yates or someone in his group at U
Santiago who came up with the rotated term indexing.
Indexing abc, you explicitly mark end of string and index all rotations using
a data structure which supports prefix search (such as a trie):
abc$
bc$a
c$ab
$abc
This
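The rotation scheme listed above can be sketched in a few lines. This is an illustration under assumptions, not code from the thread: a `TreeMap` stands in for the trie, and the class and method names are hypothetical. It assumes indexed terms do not themselves contain the `$` marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class RotatedIndexSketch {
    // Stand-in for a trie: rotation -> original terms.
    private final TreeMap<String, Set<String>> index = new TreeMap<>();

    // All rotations of term + "$", e.g. abc -> abc$, bc$a, c$ab, $abc.
    static List<String> rotations(String term) {
        String marked = term + "$";
        List<String> result = new ArrayList<>();
        for (int i = 0; i < marked.length(); i++) {
            result.add(marked.substring(i) + marked.substring(0, i));
        }
        return result;
    }

    void add(String term) {
        for (String rotation : rotations(term)) {
            index.computeIfAbsent(rotation, k -> new TreeSet<>()).add(term);
        }
    }

    // A prefix search over the rotations finds every term that
    // contains the query as a substring.
    Set<String> lookup(String query) {
        Set<String> hits = new TreeSet<>();
        for (Set<String> terms
                : index.subMap(query, true, query + Character.MAX_VALUE, false).values()) {
            hits.addAll(terms);
        }
        return hits;
    }

    public static void main(String[] args) {
        RotatedIndexSketch sketch = new RotatedIndexSketch();
        sketch.add("abc");
        sketch.add("boston red sox rumors");
        System.out.println(rotations("abc"));
        System.out.println(sketch.lookup("sox rumor"));
    }
}
```

Because the end marker is part of every rotation, a query that includes `$` can also anchor a match: a prefix of the form `Y$X` matches terms that start with `X` and end with `Y`.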
Hi,
I've added
LUCENE-4516 - Suggesters: allow to associate a user-specified key (int)
with a string
LUCENE-4517 - Suggesters: allow to pass a user-defined predicate/filter
to the completion searcher
LUCENE-4518 - Suggesters: highlighting (explicit markup of user-typed
portions vs. generated
Hi,
I'm currently researching using a WFST suggester on e.g. book titles.
While our basic use cases are well covered, there seem to be at least
three which aren't:
* The possibility to associate a foreign key with a string
(rather: final node) in the WFST (in addition to the
11 matches