Hi Rupert, all,
I am building a Speech to Text engine (see [1] for those who need an introduction).
The engine requires the DataFileProvider infrastructure to handle the configuration
files for the acoustic and language models. Basically, the client
provides the *acoustic model folder*, the *dictionary file* and the *language
model file* in a jar file in the following format.
E.g. sphinx4-data-1.0-SNAPSHOT.jar, the default model jar, contains:
/edu/cmu/sphinx/models/language/en-us.lm.dmp *File* for the language model
/edu/cmu/sphinx/models/acoustic/wsj/dict/cmudict.0.6d *File* for the dictionary
/edu/cmu/sphinx/models/acoustic/wsj/ *Folder* for the acoustic model
This jar can be added to a project using the following dependency:
<dependency>
  <groupId>edu.cmu.sphinx</groupId>
  <artifactId>sphinx4-data</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>
However, when a client wants to use their own model files, Stanbol
has the DataFileProvider infrastructure for handling such large binary
configuration files.
I went through the DataFileProvider documentation [2] and the source code of
some enhancement engines that use the DataFileProvider service, such as the
Sentiment Word Classifier, to see how it is used in practice,
but I am still not clear on how to use it.
Maybe you can provide some *insights* or *links* that describe it
better. It would save a lot of time.
Regards,
Suman Saurabh
[1] https://sites.google.com/site/gsoc2014stanbol/home/abstract
[2] http://stanbol.apache.org/docs/trunk/utils/datafileprovider