URL: <http://puszcza.gnu.org.ua/bugs/?620>
Summary: tex4ht breaks URL text leading to empty spaces between words generated in final HTML when \urldef
Project: tex4ht
Submitted by: nma123
Submitted on: Mon Jan 15 23:27:58 2024
Category: None
Priority: 5 - Normal
Severity: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Email:
Open/Closed: Open
Discussion Lock: Any

_______________________________________________________

Details:

Reference and screenshot at
https://tex.stackexchange.com/questions/707149/tex4ht-breaks-url-text-leading-to-empty-spaces-between-words-generated-in-final

I use \urldef to build the link text for \href, because the names are
folder/file paths that can contain many unusual characters.

This works fine in PDF with lualatex. But I noticed that the HTML generated
by tex4ht breaks each name across two lines, which causes a blank space to
show inside the name when the page is viewed in a browser. This sometimes
makes the name hard to read.

Here is an MWE

------------------------
\documentclass[12pt,oneside]{book}
\usepackage{hyperref}
\usepackage{url}
\begin{document}
\section{Tests completed}
\begin{enumerate}
\item \urldef\mytarget\nolinkurl{test_cases/rubi_tests/0_Independent_test_suites/1_Apostol_Problems}
\href{test_cases/rubi_tests/0_Independent_test_suites/1_Apostol_Problems/output/report.htm}{\mytarget} \hspace{5pt} [175]
\item \urldef\mytarget\nolinkurl{test_cases/rubi_tests/0_Independent_test_suites/2_Bondarenko_Problems}
\href{test_cases/rubi_tests/0_Independent_test_suites/2_Bondarenko_Problems/output/report.htm}{\mytarget} \hspace{5pt} [35]
\end{enumerate}
\end{document}
--------------------------

When compiled using

make4ht -ulm default -a debug index.tex "mathjax,htm,nostyle"

it gives the output shown in the screenshot at the link above.

This happens because tex4ht breaks the name when it sees _.
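Until the cause is fixed in tex4ht itself, the hand edit described in this report could be scripted as a post-processing step. The following is only a minimal sketch, not a tex4ht or make4ht feature: the function name join_broken_names is hypothetical, and it assumes the unwanted break always appears as whitespace directly after an underscore inside a span of class ec-lmtt-12, as in the raw HTML in this report.

```python
import re

# Assumption: the link text is wrapped in <span class='ec-lmtt-12'>...</span>
# and the stray break is whitespace that directly follows an underscore.
SPAN_RE = re.compile(r"(<span class='ec-lmtt-12'>)(.*?)(</span>)", re.DOTALL)


def join_broken_names(html: str) -> str:
    """Remove whitespace that follows an underscore inside ec-lmtt-12 spans."""

    def fix(match: re.Match) -> str:
        opening, text, closing = match.groups()
        # Only the span content is touched; text outside the span is kept as-is.
        return opening + re.sub(r"_\s+", "_", text) + closing

    return SPAN_RE.sub(fix, html)


if __name__ == "__main__":
    sample = (
        "<span class='ec-lmtt-12'>test_suites/1_\n"
        "Apostol_Problems</span> [175]"
    )
    print(join_broken_names(sample))
```

A name that genuinely contains an underscore followed by a space would also be joined, so this is a stopgap for paths like the ones in the MWE rather than a general fix.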
Here is the raw HTML

--------------------------
<!DOCTYPE html>
<html lang='en-US' xml:lang='en-US'>
<head><title></title>
<meta charset='utf-8' />
<meta content='TeX4ht (https://tug.org/tex4ht/)' name='generator' />
<meta content='width=device-width,initial-scale=1' name='viewport' />
<link href='index.css' rel='stylesheet' type='text/css' />
<meta content='index.tex' name='src' />
<script>window.MathJax = { tex: { tags: "ams", }, }; </script>
<script async='async' id='MathJax-script' src='https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js' type='text/javascript'></script>
</head><body>
<h3 class='sectionHead' id='tests-completed'><span class='titlemark'>0.1 </span> <a id='x1-10000.1'></a>Tests completed</h3>
<!-- l. 9 --><p class='noindent'>
</p>
<ol class='enumerate1'>
<li class='enumerate' id='x1-1002x1'>
<a href='test_cases/rubi_tests/0_Independent_test_suites/1_Apostol_Problems/output/report.htm'><span class='ec-lmtt-12'>test_cases/rubi_tests/0_Independent_test_suites/1_
Apostol_Problems</span></a> [175]
</li>
<li class='enumerate' id='x1-1004x2'>
<a href='test_cases/rubi_tests/0_Independent_test_suites/2_Bondarenko_Problems/output/report.htm'><span class='ec-lmtt-12'>test_cases/rubi_tests/0_Independent_test_suites/2_
Bondarenko_Problems</span></a> [35]</li></ol>
</body>
</html>
--------------------

If I edit index.htm by hand and put each name back on one line by removing
the extra CR that tex4ht added, the HTML becomes

----------------------------------
<!DOCTYPE html>
<html lang='en-US' xml:lang='en-US'>
<head><title></title>
<meta charset='utf-8' />
<meta content='TeX4ht (https://tug.org/tex4ht/)' name='generator' />
<meta content='width=device-width,initial-scale=1' name='viewport' />
<link href='index.css' rel='stylesheet' type='text/css' />
<meta content='index.tex' name='src' />
<script>window.MathJax = { tex: { tags: "ams", }, }; </script>
<script async='async' id='MathJax-script' src='https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js' type='text/javascript'></script>
</head><body>
<h3 class='sectionHead' id='tests-completed'><span class='titlemark'>0.1 </span> <a id='x1-10000.1'></a>Tests completed</h3>
<!-- l. 9 --><p class='noindent'>
</p>
<ol class='enumerate1'>
<li class='enumerate' id='x1-1002x1'>
<a href='test_cases/rubi_tests/0_Independent_test_suites/1_Apostol_Problems/output/report.htm'><span class='ec-lmtt-12'>test_cases/rubi_tests/0_Independent_test_suites/1_Apostol_Problems</span></a> [175]
</li>
<li class='enumerate' id='x1-1004x2'>
<a href='test_cases/rubi_tests/0_Independent_test_suites/2_Bondarenko_Problems/output/report.htm'><span class='ec-lmtt-12'>test_cases/rubi_tests/0_Independent_test_suites/2_Bondarenko_Problems</span></a> [35]</li></ol>
</body>
</html>
--------------

and on screen the names now display without the blank space (see the second
screenshot at the link above).

How can tex4ht be fixed so that it does not break long names in href and
keeps each name on one line?

TL 2023, installed a few days ago on Linux.

_______________________________________________________

Reply to this item at:

  <http://puszcza.gnu.org.ua/bugs/?620>

_______________________________________________
Message sent via/by Puszcza
http://puszcza.gnu.org.ua/