Hi,

ocr is new in ffmpeg 4.0.


1)
I built ffmpeg 4.0 with tesseract-3.05.01 and copied the English training data to 
/usr/local/share/tessdata/eng.traineddata.


I use the following command to run the ocr filter in ffmpeg, but it fails to find 
eng.traineddata even though I have set TESSDATA_PREFIX:
TESSDATA_PREFIX=/usr/local/share LD_LIBRARY_PATH=/usr/local/lib 
/usr/local/bin/ffmpeg -f lavfi -i "movie=test_ocr.png, 
ocr=datapath=tessdata:language=eng, drawgraph=lavfi.ocr.text" test_ocr_out.png


ffmpeg version 4.0 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-16)
  configuration: --enable-version3 --enable-asm --enable-x86asm 
--enable-avfilter --disable-static --enable-shared --enable-gpl 
--enable-nonfree --prefix=/usr/local/ --enable-libvidstab --enable-libass 
--enable-libfreetype --extra-libs=-lfreetype --enable-libtesseract 
--enable-libfdk_aac --enable-libmp3lame --enable-libx264 --enable-libopenjpeg 
--enable-libwebp --enable-libx265 --enable-libvorbis 
--extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib 
--enable-stripping --enable-libmfx
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Error opening data file /tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent 
directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
[Parsed_ocr_1 @ 0x1e03d80] failed to init tesseract
[lavfi @ 0x1df77c0] Error initializing filter 'ocr' with args 
'datapath=tessdata:language=eng'
movie=test_ocr.png, ocr=datapath=tessdata:language=eng, 
drawgraph=lavfi.ocr.text: Invalid argument
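
The error message asks for TESSDATA_PREFIX to be the parent directory of the 
"tessdata" directory. That layout can be checked with a small shell sketch. 
The helper name check_tessdata is hypothetical (it is not part of ffmpeg or 
tesseract), and /usr/local/share is only the fallback path taken from this 
post. It tests nothing more than whether PREFIX/tessdata/LANG.traineddata is 
readable:

```shell
#!/bin/sh
# check_tessdata: hypothetical helper, not shipped with ffmpeg or tesseract.
# Reports whether PREFIX/tessdata/LANG.traineddata exists and is readable,
# i.e. the layout described by the tesseract error message.
check_tessdata() {
    prefix=$1
    lang=$2
    # Strip one trailing slash so "/usr/local/share/" and "/usr/local/share"
    # produce the same path.
    file="${prefix%/}/tessdata/${lang}.traineddata"
    if [ -r "$file" ]; then
        echo "found: $file"
    else
        echo "missing: $file"
        return 1
    fi
}

# Uses TESSDATA_PREFIX if it is set, otherwise the path used in this post.
check_tessdata "${TESSDATA_PREFIX:-/usr/local/share}" eng || true
```

This only confirms that the file is where the message says it should be. It 
does not show which directory the ocr filter actually passes to tesseract 
when datapath is set.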




If I set TESSDATA_PREFIX to /usr/local/share/tessdata, the same error occurs:


TESSDATA_PREFIX=/usr/local/share/tessdata LD_LIBRARY_PATH=/usr/local/lib 
/usr/local/bin/ffmpeg -f lavfi -i "movie=test_ocr.png, 
ocr=datapath=tessdata:language=eng, drawgraph=lavfi.ocr.text" test_ocr_out.png


Error opening data file /tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent 
directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
[Parsed_ocr_1 @ 0x18a8d80] failed to init tesseract
[lavfi @ 0x189c7c0] Error initializing filter 'ocr' with args 
'datapath=tessdata:language=eng'
movie=test_ocr.png, ocr=datapath=tessdata:language=eng, 
drawgraph=lavfi.ocr.text: Invalid argument


If I run /usr/local/bin/tesseract directly to do the OCR, there is no problem.


Why can't ffmpeg with tesseract-3.05.01 find eng.traineddata?




2)
How can I use the ocr filter to extract text from a PNG file and write the text 
to a specified text file?




Thanks


Regards


Andrew
_______________________________________________
ffmpeg-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user
