Hi Rupert, all,

I am building a Speech to Text engine (see [1] for an introduction).
The engine needs the DataFileProvider infrastructure to handle the
configuration files for the acoustic and language models. Basically, the
client provides the *acoustic model folder*, the *dictionary file* and the
*language model file* in a jar file, laid out as follows.
For example, sphinx4-data-1.0-SNAPSHOT.jar, the default model jar, contains:
/edu/cmu/sphinx/models/language/en-us.lm.dmp  *File* for the language model
/edu/cmu/sphinx/models/acoustic/wsj/dict/cmudict.0.6d  *File* for the dictionary
/edu/cmu/sphinx/models/acoustic/wsj/  *Folder* for the acoustic model

This jar can be added to a project using the following dependency:
<dependency>
        <groupId>edu.cmu.sphinx</groupId>
        <artifactId>sphinx4-data</artifactId>
        <version>1.0-SNAPSHOT</version>
</dependency>

However, when a client wants to use their own model files, Stanbol has the
DataFileProvider infrastructure for handling such large binary
configuration files.

I went through the DataFileProvider documentation [2] and the source code
of some enhancement engines that use the DataFileProvider service, such as
the Sentiment Word Classifier, to see how it is implemented, but it is
still not clear to me how to use it.

Maybe you can provide some *insights* or *links* that describe it in more
detail. That would save a lot of time.

Regards,
Suman Saurabh

[1] https://sites.google.com/site/gsoc2014stanbol/home/abstract
[2] http://stanbol.apache.org/docs/trunk/utils/datafileprovider
