[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338628#comment-17338628 ]

ASF GitHub Bot commented on TIKA-94:
------------------------------------

lewismc opened a new pull request #406:
URL: https://github.com/apache/tika/pull/406


   This is a WIP for the work we are doing in fulfillment of the HackIllinois
   program.
   We will keep adding to this PR, and I will post comments here.
   Great work so far, team.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Speech-to-text transcription
> ----------------------------
>
>                 Key: TIKA-94
>                 URL: https://issues.apache.org/jira/browse/TIKA-94
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>              Labels: new-parser
>
> Like OCR for image files (TIKA-93), we could try using speech recognition to 
> extract text content (where available) from audio (and video!) files.
> The CMU Sphinx engine (http://cmusphinx.sourceforge.net/) looks promising and 
> comes with a friendly license.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
