Hi,

Last week I started writing a simple OCR tool for Okular. So far it has generally been well received by KDE users [1-3].

What do you think about adding such a tool to Okular? Is it possible? If so, I'd be happy to help as much as I can, though I should say that I'm not experienced in KDE/Qt development.

Currently my code (mostly copied and pasted from other projects) takes part of the active document as an image and saves it to the OS's temporary directory. It then runs the executable of a particular OCR application (for now only Tesseract) to convert the image into a text file. Finally, the code opens the text file, copies its contents to the clipboard, and deletes the temporary files.

Before going any further, I think it would be better to clarify some issues that I have encountered.

API vs Executable
-------------------

Which one would be better to use? Using the executable is easier, but using the API seems like the more correct approach. As far as I can see, Tesseract [4] and Cuneiform [5] provide an API, but I don't know about other OCR software. Instead of trying to support more than one OCR engine, we could just choose a default one, but that would restrict users.

If we use the API, Okular will link against the OCR libraries, which means more dependencies for the Okular package. If we use the executable, we can check for it before running it and, if it is not installed, show the user an info message along the lines of "additional packages must be installed to use this feature". If we go the API way, these CMake find modules [6-9] may help.

OCR Output's Accuracy
-----------------------

OCR accuracy isn't good enough yet, at least for comics: the success rate is almost 50%. My current code uses the image directly from the comic. Maybe it would be better to first convert the image to black and white (or 2-bit) and apply some other image operations to make the letters clearer. Do you have any suggestions about this?

Icon for OCR Tool
-------------------

Currently I use the scanner icon from Oxygen [10], but if there is a better option we can use it.

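To make the executable-based flow described above more concrete, here is a rough sketch of it. It is written in Python only for brevity (inside Okular this would be C++/Qt code), and the function names `build_ocr_command` and `ocr_image` are made up for illustration. It assumes the usual Tesseract command line, `tesseract imagename outputbase [-l lang]`, where Tesseract itself appends ".txt" to the output base. Copying the result to the clipboard is left out.

```python
import os
import shutil
import subprocess
import tempfile


def build_ocr_command(image_path, output_base, language=None, executable="tesseract"):
    """Build the argument list for a Tesseract-style command line."""
    command = [executable, image_path, output_base]
    if language:
        # Language code as understood by the OCR engine, e.g. "eng" or "tur".
        command += ["-l", language]
    return command


def ocr_image(image_path, language=None, executable="tesseract"):
    """Return the recognized text, or None if the OCR executable is not installed."""
    if shutil.which(executable) is None:
        # The caller can tell the user that additional packages are needed.
        return None

    # The temporary directory and everything in it are removed on exit.
    with tempfile.TemporaryDirectory() as temp_dir:
        output_base = os.path.join(temp_dir, "ocr_result")
        command = build_ocr_command(image_path, output_base, language, executable)
        # check=True raises CalledProcessError if the OCR run fails.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        with open(output_base + ".txt", encoding="utf-8") as result:
            return result.read()
```

If `ocr_image` returns None, the caller can show the "additional packages must be installed" message instead of failing. The optional `language` argument is where a value chosen in a configuration dialog could be passed in.
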
Document Language
-------------------

To give the OCR software the correct parameters, we must know the document's language. For now, Okular can't determine the language of opened documents [11]. Until this feature is implemented, we can add a new section for the OCR tool to Okular's configuration, where users can select the language for the OCR process as well as which OCR software to use.

Links
-------

[1] http://wklej.org/id/995982/
[2] http://www.youtube.com/watch?v=duSTyByIPLc
[3] https://plus.google.com/113435503145887565355/posts/RqzC3hMcGcd
[4] https://code.google.com/p/tesseract-ocr/
[5] https://launchpad.net/cuneiform-linux
[6] https://raw.github.com/ruediger/VobSub2SRT/master/CMakeModules/FindTesseract.cmake
[7] https://raw.github.com/ck1125/sikuli/master/cmake_modules/FindTesseract.cmake
[8] https://projects.kde.org/projects/playground/libs/kolena/repository/revisions/master/entry/cmake/modules/FindTesseract.cmake
[9] https://raw.github.com/uliss/quneiform_tests/master/cmake/FindCuneiform.cmake
[10] http://i.imgur.com/xn8iyDw.png
[11] https://bugs.kde.org/show_bug.cgi?id=317486

Regards,
--
Anıl Özbek
_______________________________________________
Okular-devel mailing list
Okular-devel@kde.org
https://mail.kde.org/mailman/listinfo/okular-devel