[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

خالد حسني changed:

What    |Removed |Added
Blocks  |        |99746

Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=99746
[Bug 99746] [META] PDF import filter in Draw

--
You are receiving this mail because:
You are the assignee for the bug.
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

V Stuart Foote changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |DUPLICATE

--- Comment #12 from V Stuart Foote ---
The primary issue of the reversed text runs is corrected for the 7.4.3 release, with additional work in master against a 7.5 release.

Any residual formatting or conversion problems with extracted RTL text runs should be opened as new issues against 7.4.3.

*** This bug has been marked as a duplicate of bug 104597 ***
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

V Stuart Foote changed:

What       |Removed |Added
Depends on |104597  |
See Also   |        |https://bugs.documentfoundation.org/show_bug.cgi?id=104597

Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=104597
[Bug 104597] RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457
Bug 149457 depends on bug 104597, which changed state.

Bug 104597 Summary: RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=104597

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |FIXED
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457
Bug 149457 depends on bug 104597, which changed state.

Bug 104597 Summary: RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=104597

What       |Removed  |Added
Status     |RESOLVED |NEW
Resolution |FIXED    |---
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457
Bug 149457 depends on bug 104597, which changed state.

Bug 104597 Summary: RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=104597

What       |Removed  |Added
Status     |REOPENED |RESOLVED
Resolution |---      |FIXED
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #11 from Eyal Rozenberg ---
The first attachment ("PDF sample file with Arabic text") is already kind of scrambled to begin with. Specifically, observe how, on line 2, the % sign overlaps the two alef characters. Also, the text is not in the Arabic language, and I doubt it is properly in any language.

So, let's please start with a proper PDF document (with Arabic, or Farsi, or whatever), then analyze any problems.
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457
Bug 149457 depends on bug 104597, which changed state.

Bug 104597 Summary: RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=104597

What       |Removed  |Added
Status     |RESOLVED |REOPENED
Resolution |FIXED    |---
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

V Stuart Foote changed:

What       |Removed  |Added
Status     |REOPENED |NEW

--- Comment #10 from V Stuart Foote ---
@Khaldoun, thanks for the analysis.

I did notice the first issue. I don't know if that is a font fallback, or just a manifestation of the way the glyphs are being extracted from the PDF, where the logic for handling the glyph transformations is probably not present.

For the second, it is best to think of them as partial text runs or snippets. Glyphs are encoded into the PDF with no sense of source script. We filter import them (using poppler libs) into LibreOffice as just a run of text; all lexical context is missing. Normal break iterators are not parsed even if present. The runs end up recorded on the Draw canvas as text box objects, disjointed according to which glyphs get strung together.

So, given the coarseness of the filter import, just getting them into the correct RTL sequence (for bug 104597) is a great improvement.

Assembling them into lexically useful strings, sentences and paragraphs is work still to be done. The work done for bug 118370 is not doing well with assembling the RTL text boxes; I suspect it needs additional logic to do so.

I'm interested in Khaled's take on things at this juncture.
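The assembly step mentioned in comment #10 (stringing disjoint text boxes back into readable lines) can be illustrated with a small Python sketch. This is not the LibreOffice import filter code: the `Fragment` type, the `merge_rtl_fragments` function, and both tolerance parameters are hypothetical names chosen for the example. It assumes every fragment already holds its text in logical order and that the whole page is RTL.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One imported text box (hypothetical model, not a LibreOffice type)."""
    text: str     # text of the box, assumed to be in logical order already
    x: float      # left edge of the box
    y: float      # baseline position
    width: float  # box width


def merge_rtl_fragments(fragments, y_tol=1.0, gap_tol=2.0):
    """Group fragments by baseline and join each group right to left.

    A space is inserted only when the horizontal gap between two
    neighbouring boxes exceeds gap_tol; otherwise the boxes are treated
    as pieces of the same word.
    """
    # Cluster fragments whose baselines are within y_tol of each other.
    lines = []
    for frag in sorted(fragments, key=lambda f: f.y):
        if lines and abs(lines[-1][0].y - frag.y) <= y_tol:
            lines[-1].append(frag)
        else:
            lines.append([frag])

    merged = []
    for line in lines:
        # RTL reading order: the rightmost box comes first.
        line.sort(key=lambda f: f.x, reverse=True)
        text = line[0].text
        for prev, cur in zip(line, line[1:]):
            gap = prev.x - (cur.x + cur.width)
            text += (" " if gap > gap_tol else "") + cur.text
        merged.append(text)
    return merged
```

A real importer would also have to handle mixed LTR/RTL lines, rotated text, and font-size-dependent tolerances, none of which this sketch attempts.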
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #9 from Khaldoun ---
Forgot to mention:

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: a09c5c69e3b5fbf448cae1d6c476f39067e40023
CPU threads: 8; OS: Linux 6.0; UI render: default; VCL: gtk3
Locale: en-US (en_US.utf8); UI: en-US
Calc: threaded
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #8 from Khaldoun ---
Created attachment 183055
  --> https://bugs.documentfoundation.org/attachment.cgi?id=183055&action=edit
Lam-Alef and Lam-Hamza issue and Splitting singles words
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #7 from Khaldoun ---
Hello,

I agree, Draw is not a PDF editor. But Draw should still handle the RTL/Arabic letters properly, which is not yet 100% fixed by this fix.

In Arabic, when a "Lam" letter is followed by an "Alef" letter or a "Hamza" letter, both letters are combined into a new form/shape. This does not appear to be handled properly yet in this fix.

Also, another issue is that Draw sometimes splits the same word into multiple blocks. NB: I call it a "block", but it could also be called a frame, box, etc.

I am attaching a new file that describes both issues.

IMPORTANT: This commit fixes a big portion of the issue. It deserves to go live.
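To illustrate the Lam-Alef point from comment #7: Unicode encodes these combined shapes as separate "presentation form" code points, and compatibility normalization maps them back to the two underlying letters. The Python sketch below shows this with the standard `unicodedata` module. It is an assumption that the imported text contains such presentation-form code points at all, and the function name `decompose_presentation_forms` is hypothetical; nothing here reflects how the LibreOffice filter actually works.

```python
import unicodedata

# Two of the Lam-Alef ligature code points in Arabic Presentation Forms-B.
LAM_ALEF_ISOLATED = "\uFEFB"        # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
LAM_ALEF_HAMZA_ISOLATED = "\uFEF7"  # ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM


def decompose_presentation_forms(text: str) -> str:
    """Map presentation-form glyph code points back to base letters.

    NFKC normalization applies the Unicode compatibility decompositions,
    so a single ligature code point becomes the Lam + Alef letter pair
    that a shaping engine can then render as a ligature again.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for ligature in (LAM_ALEF_ISOLATED, LAM_ALEF_HAMZA_ISOLATED):
        letters = decompose_presentation_forms(ligature)
        print([unicodedata.name(ch) for ch in letters])
```

Note that the decomposed pair comes out in logical order (Lam first), so a naive character-by-character reversal applied after this step would put the two letters in the wrong order.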
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #6 from Khaldoun ---
Hello @V Stuart Foote,

Thanks a lot for the link to the build.

I am not asking for LO to be a PDF editor, only for it to display RTL text (Arabic in my case) properly. I can assure you that many Arabic users are not using Arabic because of such issues, which they do not face with other apps.

Anyway, I downloaded the 2022-10-14 build:

Version: 7.5.0.0.alpha0+ / LibreOffice Community
Build ID: a09c5c69e3b5fbf448cae1d6c476f39067e40023
CPU threads: 8; OS: Linux 6.0; UI render: default; VCL: gtk3
Locale: en-US (en_US.utf8); UI: en-US
Calc: threaded

The text rendering is much better, but the reordering still does not handle all the letters properly. Please note the added attachment, which describes an issue in handling specific two-letter combinations. Also, there is an issue of the same word being split over multiple blocks rather than kept in one block.
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #5 from V Stuart Foote ---
The 2022-10-14 nightly [1] imports the sample PDF to Draw pretty well. There are some font glitches and obvious spots where combining glyphs get separated from their root glyph.

Overall greatly improved, but please consider that LibreOffice is *NOT* a PDF editor; the filter import to Draw produces an ODF holding sdraw text objects arranged on a document canvas.

Version: 7.5.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 8991cbb7986d3967bc6c3719d95254ff04428d1a
CPU threads: 8; OS: Windows 10.0 Build 19044; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded

=-ref-=
[1] https://dev-builds.libreoffice.org/daily/master/
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

--- Comment #4 from Khaldoun ---
Hello,

How can I check the 104597 fix and decide whether this bug is solved as well? How will the new commit be delivered in a new LO version?

@Eyal Rozenberg
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457
Bug 149457 depends on bug 104597, which changed state.

Bug 104597 Summary: RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=104597

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |FIXED
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

Eyal Rozenberg changed:

What       |Removed   |Added
Depends on |          |104597
Blocks     |43808     |112810
Status     |VERIFIED  |REOPENED
Resolution |DUPLICATE |---

--- Comment #3 from Eyal Rozenberg ---
While this bug is about PDF import of RTL language text runs, it is not the same problem described in 104597. There, the problem is the reversal of order in text runs. Here we have additional problems, like character repetitions, shifting, and excessive or insufficient (horizontal) spacing.

So, this is not clearly a dupe. Perhaps the fix for 104597 will resolve this one as well, but perhaps not. I think the more careful relation between the bugs is dependence.

Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=43808
[Bug 43808] [META] Right-To-Left and Complex Text Layout language issues (RTL/CTL)
https://bugs.documentfoundation.org/show_bug.cgi?id=104597
[Bug 104597] RTL script text runs are reversed on PDF import, PDFIProcessor::mirrorString misbehaving
https://bugs.documentfoundation.org/show_bug.cgi?id=112810
[Bug 112810] [META] Arabic & Farsi language-specific RTL issues
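The run-reversal problem that comment #3 attributes to bug 104597 can be pictured with a minimal Python sketch: a PDF stores glyphs in visual (left-to-right on the page) order, so a purely RTL run has to be reversed to recover logical order, while an LTR run must be left alone. The function names `is_rtl_run` and `visual_to_logical` are hypothetical. This is not `PDFIProcessor::mirrorString` and not the Unicode Bidirectional Algorithm, just a simplification that looks at the strong bidi class of each character.

```python
import unicodedata

# Strong bidi classes: "L" is left-to-right, "R" and "AL" are right-to-left.
STRONG_CLASSES = {"L", "R", "AL"}
RTL_CLASSES = {"R", "AL"}


def is_rtl_run(run: str) -> bool:
    """Return True if every strongly-typed character in the run is RTL."""
    strong = [
        cls
        for cls in (unicodedata.bidirectional(ch) for ch in run)
        if cls in STRONG_CLASSES
    ]
    return bool(strong) and all(cls in RTL_CLASSES for cls in strong)


def visual_to_logical(run: str) -> str:
    """Reverse a purely RTL run; leave LTR, mixed, and neutral runs untouched.

    Mixed runs need a real bidi implementation, and combining marks would
    end up before their base letter after a naive reversal, so this sketch
    deliberately handles only the simplest case.
    """
    return run[::-1] if is_rtl_run(run) else run
```

The other symptoms listed in comment #3 (repeated characters, shifting, bad spacing) are outside what a reversal step like this can address, which is consistent with treating the two bugs as related by dependence rather than as duplicates.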
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

Eyal Rozenberg changed:

What    |Removed |Added
Blocks  |        |43808

Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=43808
[Bug 43808] [META] Right-To-Left and Complex Text Layout language issues (RTL/CTL)
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

Eyal Rozenberg changed:

What    |Removed  |Added
Status  |RESOLVED |VERIFIED
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

V Stuart Foote changed:

What       |Removed |Added
Resolution |---     |DUPLICATE
Status     |NEW     |RESOLVED
CC         |        |vstuart.fo...@utsa.edu

--- Comment #2 from V Stuart Foote ---
Thanks for filing, but this is a known and long-running PDF import filter issue for RTL text runs.

*** This bug has been marked as a duplicate of bug 104597 ***
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

Khaldoun changed:

What           |Removed     |Added
Status         |UNCONFIRMED |NEW
Ever confirmed |0           |1
[Libreoffice-bugs] [Bug 149457] Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw
https://bugs.documentfoundation.org/show_bug.cgi?id=149457

Khaldoun changed:

What    |Removed                                                 |Added
Summary |Arabic Text Scrambled and Unreadable in LibreOffice Draw |Arabic Text Scrambled and Unreadable in PDF Files Opened by LibreOffice Draw