Hello again
Joachim Daiber said that "If you do not provide the models to the
training, the statistical backend will learn a dictionary-based spotting
model." So if we give the spots to the system, it isn't necessary to
build the OpenNLP models for spotting?
And will the statistical disambiguation step not be affected at all? One
of the probabilities used in disambiguation is context-based, so it will
use the OpenNLP models to tokenize ...
Knowing this, will the disambiguation step also be dictionary-based?
We think that in the end, it will be a light version for Basque, without
the context knowledge.
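To make the tokenization concern concrete, here is a toy Python sketch of
a context-based score. It is not Spotlight's actual probability model: it
swaps in plain cosine similarity over token counts, and the names
simple_tokenize and context_score are hypothetical. The only point is that
such a score depends on some tokenizer being available. Whether Spotlight
falls back to a simpler tokenizer when no OpenNLP models are given is not
confirmed in this thread.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def simple_tokenize(text):
    """Stand-in tokenizer (assumption): lowercase, whitespace split, trim punctuation."""
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]


def context_score(mention_context, candidate_context, tokenize=simple_tokenize):
    """Cosine similarity between the token-count vectors of two texts."""
    a = Counter(tokenize(mention_context))
    b = Counter(tokenize(candidate_context))
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Any callable that maps a string to a list of tokens can be passed as
tokenize, so a custom Basque tokenizer could replace the stand-in in this
sketch.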
thanks in advance ;)
ander
On Wed, Jan 29, 2014 at 22:07, Joachim Daiber wrote:
Hi Ander,
the statistical backend currently only supports OpenNLP models. This
is simply because they were readily available. So from my point of
view there are 2 things you can do:
1. change Spotlight to additionally accept your tool (assuming it's
JVM based)
2. retrain your models with OpenNLP
But regardless, you do not need those necessarily. If you do not
provide the models to the training, the statistical backend will learn
a dictionary-based spotting model. Depending on the size of the
Wikipedia input, this should work equally well (if the Wikipedia is
too small, it might be a bit sparse).
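To illustrate what dictionary-based spotting amounts to, here is a minimal
Python sketch. It is not Spotlight's code: the function names
build_lexicon and spot, the whitespace tokenization, and the greedy
longest-match strategy are all assumptions made for illustration. In
practice the surface forms would be extracted from Wikipedia rather than
listed by hand.

```python
def build_lexicon(surface_forms):
    """Store surface forms as lowercase token tuples; return them with the longest length."""
    lexicon = set()
    max_len = 0
    for form in surface_forms:
        tokens = tuple(form.lower().split())
        if tokens:
            lexicon.add(tokens)
            max_len = max(max_len, len(tokens))
    return lexicon, max_len


def spot(text, lexicon, max_len):
    """Greedy longest-match lookup; returns (start, end, surface form) token spans."""
    tokens = text.split()
    spots = []
    i = 0
    while i < len(tokens):
        match_end = None
        # Try the longest possible span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in lexicon:
                match_end = i + n
                break
        if match_end is None:
            i += 1
        else:
            spots.append((i, match_end, " ".join(tokens[i:match_end])))
            i = match_end
    return spots


# Toy token sequence, not a full sentence.
lexicon, max_len = build_lexicon(["Bilbao", "Donostia", "Euskal Herria"])
print(spot("Euskal Herria Donostia eta Bilbao", lexicon, max_len))
```

With a small lexicon many mentions are simply never found, which is the
sparsity issue mentioned above.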
Hope that helps,
Jo
On Wed, Jan 29, 2014 at 3:11 PM, [email protected] wrote:
Hi spotlight users,
Our main idea is to apply NED to Basque documents. For this purpose, we
want to use the DBpedia Spotlight statistical backend.
We want to create a Spotlight model for the Basque language, but we have
a "little" problem. We have seen that there isn't any OpenNLP model for
Basque. We have all the resources, such as a tokenizer, chunker, POS
tagger, stopwords... but none of the pre-trained OpenNLP models for
this language.
Our questions are:
Is there any other way to use these resources instead of OpenNLP
models? For example, integrating our resources into the system code and
passing their output to DBpedia Spotlight (without OpenNLP models).
Has anyone done something like this before?
Or
do we have to build an OpenNLP model?
thanks in advance,
Ander
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users