Conceptually this use case is similar to what translation memories do.
For an open-source TM engine, have a look at http://okapi.opentag.com/, and its
default TM engine (Pensieve TM).
Cheers, Oli
-Original Message-
From: Barry Coughlan [mailto:b.coughl...@gmail.com]
Sent: Wednesday,
automaton.
Could you give me a hint on how to start?
Thank you for your kind support
Michael
Oliver Christ<mailto:ochr...@ebsco.com>
Monday, 27 October 2014 12:47
The hard way may be to use the standard Analyzing Suggester but to add each
(analyzed) suffix of the surface string (mapping to the full surface form)
during automaton generation.
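
A minimal sketch of the suffix idea, in plain Python rather than Lucene. It assumes a sorted list stands in for the automaton and that lowercasing stands in for the Analyzer. The names `analyze`, `build_suffix_index` and `suggest`, and the sample strings, are made up for illustration and are not Lucene API.

```python
from bisect import bisect_left


def analyze(text: str) -> str:
    """Stand-in for an Analyzer (assumption: lowercasing only)."""
    return text.lower()


def build_suffix_index(surface_forms):
    """Map every analyzed suffix of each string to the full surface form.

    Returns a sorted list of (analyzed_suffix, surface_form) pairs, which
    supports prefix lookup by binary search.
    """
    entries = set()
    for surface in surface_forms:
        analyzed = analyze(surface)
        for start in range(len(analyzed)):
            entries.add((analyzed[start:], surface))
    return sorted(entries)


def suggest(index, query, limit=5):
    """Return surface forms with an analyzed suffix that starts with query."""
    key = analyze(query)
    pos = bisect_left(index, (key, ""))
    results = []
    # All suffixes sharing the prefix `key` are contiguous in sorted order.
    while pos < len(index) and index[pos][0].startswith(key):
        surface = index[pos][1]
        if surface not in results:
            results.append(surface)
            if len(results) == limit:
                break
        pos += 1
    return results


# Hypothetical sample data.
index = build_suffix_index(["Donaudampfschiff", "Dampfschiff"])
print(suggest(index, "dampf"))
```

The cost is visible here: a string of length n contributes n entries, so the structure grows much faster than a plain prefix suggester would.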
I.e. when adding "Donau...", you add all analyzed suffixes "donau...",
"onau...", "nau...", ... - all mapping to "
Hi Clemens,
I haven't yet built a suggester which combines all three, and am not aware of
one. I'd love to have one though ;-)
Case- and diacritics insensitivity is supported out-of-the-box by the analyzing
suggesters, including the FuzzySuggester. The logic is in the Analyzer.
I haven't yet t
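
To illustrate why case- and diacritics-insensitivity can live entirely in the analysis step, here is a rough Python sketch. It is not Lucene's Analyzer; `normalize` and `prefix_lookup` are hypothetical helpers, and the normalization (Unicode decomposition, dropping combining marks, case folding) is only an assumption about what such an analysis chain might do.

```python
import unicodedata


def normalize(text: str) -> str:
    """Fold case and strip diacritics (assumed stand-in for an Analyzer)."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def prefix_lookup(surface_forms, query):
    """Match on normalized keys, but return the original surface forms."""
    key = normalize(query)
    return [s for s in surface_forms if normalize(s).startswith(key)]


# Hypothetical sample data.
titles = ["Zürich", "Zurich Insurance", "Zagreb"]
print(prefix_lookup(titles, "ZUR"))
```

Both the indexed strings and the lookup string go through the same `normalize` function, which is what makes the match insensitive while the suggestions keep their original spelling.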
Can you split your corpus across multiple Lucene instances?
Cheers, Oli
-Original Message-
From: Artem Gayardo-Matrosov [mailto:ar...@gayardo.com]
Sent: Friday, March 21, 2014 12:29 PM
To: java-user@lucene.apache.org
Subject: maxDoc/numDocs int fields
Hi all,
I am using lucene to index
Hi,
It's great to see support for payloads in the suggesters - this is really
helpful, and pretty much addresses LUCENE-4516. Are there any plans to also
support them for WFSTs? We have some cases where we don't need the Analyzer's
capabilities (we look up the completion using the payload infor
Hi,
I've seen some changes in trunk regarding the data format of Lucene's
FST-based suggesters, and wonder whether the automata created by trunk
builds/next Lucene version are/will be binary-compatible with the ones
created with the current release, or whether any magic versioning is
taking plac
.. if you have strong
priors / boost (e.g. you have a good source of "popularity" or
something) then you could sort by that ...
Mike McCandless
http://blog.mikemccandless.com
On Wed, Jan 16, 2013 at 4:27 PM, Oliver Christ
wrote:
> Hi,
>
> Has anyone tried to implement circumfix suggeste
Hi,
Has anyone tried to implement circumfix suggesters, where the suggestion
is a circumfix of the lookup string?
E.g. "sox rumor" suggests "boston red sox rumors" (try it on
google.com).
I think there are several ways to implement this:
* Given some multiword term, ad
If I remember correctly it was Baeza-Yates or someone in his group at U
Santiago who came up with the rotated term indexing.
Indexing "abc", you explicitly mark end of string and index all rotations using
a data structure which supports prefix search (such as a trie):
abc$
bc$a
c$ab
$abc
This
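
A small Python sketch of the rotation scheme shown above, assuming a sorted list with binary search in place of a trie and assuming that "$" never occurs in the indexed terms. The helper names (`rotations`, `build_index`, `wildcard`) and the sample terms are illustrative only. The lookup uses the usual trick for rotated indexes: a query for terms starting with P and ending with S becomes a prefix search for S + "$" + P.

```python
from bisect import bisect_left

END = "$"  # assumed end-of-string marker; must not occur in the terms


def rotations(term):
    """Return all rotations of term + END, e.g. abc$, bc$a, c$ab, $abc."""
    marked = term + END
    return [marked[i:] + marked[:i] for i in range(len(marked))]


def build_index(terms):
    """Sorted (rotation, term) pairs; sorted order enables prefix search."""
    return sorted((rot, term) for term in terms for rot in rotations(term))


def wildcard(index, prefix="", suffix=""):
    """Find terms that start with prefix and end with suffix."""
    key = suffix + END + prefix
    pos = bisect_left(index, (key, ""))
    found = []
    while pos < len(index) and index[pos][0].startswith(key):
        if index[pos][1] not in found:
            found.append(index[pos][1])
        pos += 1
    return found


# Hypothetical sample data.
index = build_index(["abc", "abd", "xbc"])
print(rotations("abc"))
print(wildcard(index, prefix="a", suffix="c"))
```

With an empty suffix this degrades to a plain prefix search, and with an empty prefix to a suffix search, so one structure covers all three cases.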
Hi,
I've added
LUCENE-4516 - Suggesters: allow to associate a user-specified key (int)
with a string
LUCENE-4517 - Suggesters: allow to pass a user-defined predicate/filter
to the completion searcher
LUCENE-4518 - Suggesters: highlighting (explicit markup of user-typed
portions vs. generated
Hi,
I'm currently researching using a WFST suggester on e.g. book titles.
While our basic use cases are well covered, there seem to be at least
three which aren't:
* The possibility to associate a "foreign key" with a string
(rather: final node) in the WFST (in addition to the rank