Following the instructions at https://tesseract-ocr.github.io/tessdoc/Compiling, I'm trying to compile, build and install Tesseract 5 on Ubuntu 24.04:

(base) raphy@raohy:~$ git clone --recursive https://github.com/tesseract-ocr/tesseract.git
(base) raphy@raohy:~/tesseract$ ./autogen.sh
(base) raphy@raohy:~/tesseract$ ./configure --prefix=/home/raphy/Grasp/src/tesseract
checking for g++... g++
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C++... yes
checking whether g++ accepts -g... yes
checking for g++ option to enable C++11 features... none needed
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a race-free mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports the include directive... yes (GNU style)
checking whether make supports nested variables... yes
checking dependency style of g++... gcc3
checking for a sed that does not truncate output... /usr/bin/sed
checking Major version... 5
checking Minor version... 5
checking Point version... 0-48-gf96c
checking whether make supports nested variables... (cached) yes
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking whether C++ compiler accepts -Werror=unused-command-line-argument... no
checking whether C++ compiler accepts -mavx... yes
checking whether C++ compiler accepts -mavx2... yes
checking whether C++ compiler accepts -mavx512f... yes
checking whether C++ compiler accepts -mfma... yes
checking whether C++ compiler accepts -msse4.1... yes
checking for feenableexcept... yes
checking whether C++ compiler accepts -fopenmp-simd... yes
checking --enable-float32 argument...
checking --enable-graphics argument...
checking --enable-legacy argument...
checking for g++ option to support OpenMP... -fopenmp
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for tiffio.h... yes
checking --enable-visibility argument...
checking whether to use tessdata-prefix... yes
checking if compiling with clang... no
checking whether to enable debugging...
checking how to print strings... printf
checking for gcc... gcc
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... none needed
checking whether gcc understands -c and -o together... yes
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... (cached) /usr/bin/sed
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for fgrep... /usr/bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking how to convert x86_64-pc-linux-gnu file names to x86_64-pc-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-pc-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for file... file
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for a working dd... /usr/bin/dd
checking how to truncate binary pipes... /usr/bin/dd bs=4096 count=1
checking for mt... mt
checking if mt is a manifest tool... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... ./configure: line 14056: warning: command substitution: ignored null byte in input
GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld -m elf_x86_64
checking if the linker (/usr/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... yes
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) ./configure: line 18060: warning: command substitution: ignored null byte in input
GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether C++ compiler accepts -std=c++17... yes
checking whether C++ compiler accepts -std=c++20... yes
checking for library containing pthread_create... none required
checking for brew... false
checking for asciidoc... false
checking for xsltproc... true
checking for wchar_t... yes
checking for long long int... yes
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for libcurl... yes
checking for lept >= 1.74... yes
checking for libarchive... yes
checking for icu-uc >= 52.1... yes
checking for icu-i18n >= 52.1... yes
checking for pango >= 1.38.0... yes
checking for cairo... yes
checking for pangocairo... yes
checking for pangoft2... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating include/tesseract/version.h
config.status: creating Makefile
config.status: creating tesseract.pc
config.status: creating tessdata/Makefile
config.status: creating tessdata/configs/Makefile
config.status: creating tessdata/tessconfigs/Makefile
config.status: creating java/Makefile
config.status: creating java/com/Makefile
config.status: creating java/com/google/Makefile
config.status: creating java/com/google/scrollview/Makefile
config.status: creating java/com/google/scrollview/events/Makefile
config.status: creating java/com/google/scrollview/ui/Makefile
config.status: creating nsis/Makefile
config.status: creating include/config_auto.h
config.status: executing depfiles commands
config.status: executing libtool commands

Configuration is done.

(base) raphy@raohy:~/tesseract$ cmake -B builddir
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring tesseract version 5.5.0-48-gf96c...
-- Setting build type to 'Release' as none was specified.
-- IPO / LTO supported
-- CMAKE_SYSTEM_PROCESSOR=<x86_64>
-- Performing Test HAVE_AVX
-- Performing Test HAVE_AVX - Success
-- Performing Test HAVE_AVX2
-- Performing Test HAVE_AVX2 - Success
-- Performing Test HAVE_AVX512F
-- Performing Test HAVE_AVX512F - Success
-- Performing Test HAVE_FMA
-- Performing Test HAVE_FMA - Success
-- Performing Test HAVE_SSE4_1
-- Performing Test HAVE_SSE4_1 - Success
-- Performing Test OPENMP_SIMD
-- Performing Test OPENMP_SIMD - Success
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.8.1")
-- Could NOT find Leptonica (missing: Leptonica_DIR)
-- Checking for module 'lept>=1.74'
--   Found lept, version 1.82.0
-- Found leptonica version: 1.82.0
-- Leptonica was build with TIFF support.
-- Found TIFF: /usr/lib/x86_64-linux-gnu/libtiff.so (found version "4.5.1")
-- Found LibArchive: /usr/lib/x86_64-linux-gnu/libarchive.so (found version "3.7.2")
-- Found CURL: /usr/local/lib/cmake/CURL/CURLConfig.cmake (found version "8.8.0-DEV")
-- Looking for feenableexcept
-- Looking for feenableexcept - found
--
-- General configuration for Tesseract 5.5.0-48-gf96c
-- --------------------------------------------------------
-- Build type: Release 64 bits
-- Compiler: GNU
-- Compiler version: 14.2.0
-- Used standard: C++20
-- CXX compiler options: -O3 -DNDEBUG
-- Compile definitions = HAVE_AVX;HAVE_AVX2;HAVE_AVX512F;HAVE_FMA;HAVE_SSE4_1;OPENMP_SIMD;CMAKE_BUILD;HAVE_CONFIG_H
-- Linker options:
-- Install directory: /usr/local
-- HAVE_AVX: 1
-- HAVE_AVX2: 1
-- HAVE_AVX512F: 1
-- HAVE_FMA: 1
-- HAVE_SSE4_1: 1
-- MARCH_NATIVE_OPT: OFF
-- HAVE_NEON: FALSE
-- Link-time optimization: FALSE
-- --------------------------------------------------------
-- Build with sw [SW_BUILD]: OFF
-- Build with openmp support [OPENMP_BUILD]: OFF
-- Build with libarchive support [HAVE_LIBARCHIVE]: ON
-- Build with libcurl support [HAVE_LIBCURL]: ON
-- Enable float for LSTM [FAST_FLOAT]: ON
-- Enable optimization for host CPU (could break HW compatibility) [ENABLE_NATIVE]: OFF
-- Disable disable graphics (ScrollView) [GRAPHICS_DISABLED]: OFF
-- Disable the legacy OCR engine [DISABLED_LEGACY_ENGINE]: OFF
-- Build training tools [BUILD_TRAINING_TOOLS]: ON
-- Build tests [BUILD_TESTS]: OFF
-- Use system ICU Library [USE_SYSTEM_ICU]: OFF
-- Install tesseract configs [INSTALL_CONFIGS]: ON
-- --------------------------------------------------------
--
-- Checking for modules 'icu-uc;icu-i18n'
--   Found icu-uc, version 74.2 // <-----------------------------------------------------
--   Found icu-i18n, version 74.2 // <-----------------------------------------------------
>> ICU_FOUND 1 icui18n;icuuc;icudata /usr/include // <-----------------------------------------------------
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Checking for modules 'pango>=1.38.0;cairo;pangoft2;pangocairo;fontconfig'
--   Found pango, version 1.52.1
--   Found cairo, version 1.18.0
--   Found pangoft2, version 1.52.1
--   Found pangocairo, version 1.52.1
--   Found fontconfig, version 2.15.0
-- Configuring done (1.8s)
-- Generating done (0.1s)
-- Build files have been written to: /home/raphy/tesseract/builddir

But in the build phase the linker reports undefined references to ICU 72 symbols:

(base) raphy@raohy:~/tesseract$ cmake --build builddir/
[ 93%] Linking CXX executable ../../bin/combine_lang_model
/usr/bin/ld: libunicharset_training.a(normstrngs.cpp.o): warning: relocation against `_ZTVN6icu_7213UnicodeStringE' in read-only section `.text'
/usr/bin/ld: libunicharset_training.a(unicharset_training_utils.cpp.o): in function `tesseract::SetupBasicProperties(bool, bool, tesseract::UNICHARSET*)':
unicharset_training_utils.cpp:(.text+0xf6): undefined reference to `u_isalpha_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x105): undefined reference to `u_islower_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x114): undefined reference to `u_isupper_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x123): undefined reference to `u_isdigit_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x133): undefined reference to `u_ispunct_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x1ca): undefined reference to `uscript_getScript_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x1d1): undefined reference to `uscript_getName_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x2a5): undefined reference to `u_charMirror_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x2c1): undefined reference to `u_charDirection_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x78a): undefined reference to `u_tolower_72'
/usr/bin/ld: unicharset_training_utils.cpp:(.text+0x842): undefined reference to `u_toupper_72'
/usr/bin/ld: libunicharset_training.a(normstrngs.cpp.o): in function `tesseract::StripJoiners(std::vector<int, std::allocator<int> >*)':
normstrngs.cpp:(.text+0x2c): undefined reference to `u_isalpha_72'
/usr/bin/ld: libunicharset_training.a(normstrngs.cpp.o): in function `tesseract::NormalizeUTF8ToUTF32(tesseract::UnicodeNormMode, tesseract::OCRNorm, char const*, std::vector<int, std::allocator<int> >*)':
normstrngs.cpp:(.text+0x1f8): undefined reference to `icu_72::UnicodeString::UnicodeString(char const*, char const*)'
/usr/bin/ld: normstrngs.cpp:(.text+0x239): undefined reference to `icu_72::Normalizer2::getInstance(char const*, char const*, UNormalization2Mode, UErrorCode&)'
/usr/bin/ld: normstrngs.cpp:(.text+0x244): undefined reference to `icu_72::ErrorCode::assertSuccess() const'
/usr/bin/ld: normstrngs.cpp:(.text+0x24c): undefined reference to `icu_72::ErrorCode::reset()'
/usr/bin/ld: normstrngs.cpp:(.text+0x253): undefined reference to `vtable for icu_72::UnicodeString'
/usr/bin/ld: normstrngs.cpp:(.text+0x282): undefined reference to `icu_72::ErrorCode::assertSuccess() const'
/usr/bin/ld: normstrngs.cpp:(.text+0x2d7): undefined reference to `icu_72::UnicodeString::char32At(int) const'
/usr/bin/ld: normstrngs.cpp:(.text+0x32c): undefined reference to `icu_72::UnicodeString::moveIndex32(int, int) const'
/usr/bin/ld: normstrngs.cpp:(.text+0x353): undefined reference to `icu_72::UnicodeString::~UnicodeString()'
/usr/bin/ld: normstrngs.cpp:(.text+0x363): undefined reference to `icu_72::UnicodeString::~UnicodeString()'
/usr/bin/ld: normstrngs.cpp:(.text+0x3d1): undefined reference to `icu_72::UnicodeString::char32At(int) const'
/usr/bin/ld: normstrngs.cpp:(.text+0x52c): undefined reference to `icu_72::UnicodeString::moveIndex32(int, int) const'

I do not understand why this happens, since the configuration step found ICU 74.2. How can I make it work?
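
For anyone triaging a log like this: ICU appends its major version to exported names (`u_isalpha_72`, namespace `icu_72`), so the suffix in an "undefined reference" shows which ICU headers the object files were compiled against. Here the objects expect ICU 72 while the configure output reports 74.2. The sketch below only extracts those suffixes from a saved linker log and compares them with the configured version. The function names and the `build.log` file name are made up for illustration, and the script does not say where the ICU 72 headers come from (a second ICU install on the include path, for example from the active conda environment, is one guess that would need checking).

```python
import re

# ICU appends its major version to exported names: C functions look like
# u_isalpha_72, C++ classes live in a namespace like icu_72::.
# The leading \b skips mangled names such as _ZTVN6icu_7213UnicodeStringE.
ICU_SUFFIX = re.compile(r"\b(?:icu_(\d+)::|(?:u|uscript)_\w+?_(\d+)\b)")


def icu_versions_in_log(text):
    """Return the set of ICU major versions referenced by symbols in a log."""
    versions = set()
    for match in ICU_SUFFIX.finditer(text):
        versions.add(int(match.group(1) or match.group(2)))
    return versions


def describe_mismatch(log_text, configured_version):
    """Compare symbol suffixes with the version reported at configure time.

    configured_version is a string such as "74.2" (what pkg-config printed).
    """
    configured_major = int(configured_version.split(".")[0])
    found = icu_versions_in_log(log_text)
    unexpected = sorted(v for v in found if v != configured_major)
    if not unexpected:
        return "symbols match ICU %d" % configured_major
    return "objects expect ICU %s, but ICU %d was configured" % (
        ", ".join(str(v) for v in unexpected),
        configured_major,
    )


if __name__ == "__main__":
    # Hypothetical file: output of `cmake --build builddir/ 2> build.log`.
    with open("build.log", encoding="utf-8", errors="replace") as handle:
        print(describe_mismatch(handle.read(), "74.2"))
```

If the two numbers differ, as they do above, the next thing to look at is which `unicode/uvernum.h` the compiler actually includes when building the training tools.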

