[O] Bug: text export and multi-word link descriptions with line breaks
Dear Maintainers,

I just stumbled over Org's plain text export and how it works on links with descriptions consisting of multiple words and line breaks between them. I'm running Org stable version 8.2.5h.

Org source (spaces at the end of line 1 and 2 don't matter):

snip
OpenPGP Message Format ([[https://tools.ietf.org/html/rfc4880][RFC
4880]] which obsoletes [[https://tools.ietf.org/html/rfc1991][RFC
1991]] and [[https://tools.ietf.org/html/rfc2440][RFC 2440]])...
...
foo [[https://tools.ietf.org/html/rfc4880][RFC 4880]] bar
baz [[https://tools.ietf.org/html/rfc1991][RFC 1991]] foo
bar [[https://tools.ietf.org/html/rfc2440][RFC 2440]] baz
snip

Text export result:

snip
OpenPGP Message Format ([RFC 4880] which obsoletes [RFC 1991] and [RFC 2440])...
...
foo [RFC 4880] bar baz [RFC 1991] foo bar [RFC 2440] baz

[RFC 4880] https://tools.ietf.org/html/rfc4880
[RFC 1991] https://tools.ietf.org/html/rfc1991
[RFC 2440] https://tools.ietf.org/html/rfc2440
[RFC 4880] https://tools.ietf.org/html/rfc4880
[RFC 1991] https://tools.ietf.org/html/rfc1991
snip

These multiple references look quite bad. Is it possible to normalize the descriptions in some way *before* checking them for uniqueness and output them thereafter?

Thanks for considering this issue.

Kind regards
Mathias
Re: [O] Bug: text export and multi-word link descriptions with line breaks
Hello Nicolas,

* Nicolas Goaziou wrote on 2014-04-03 at 17:25 (+0200):
> Mathias Bauer mba...@gmx.org writes:
>> I just stumbled over Org's plain text export and how it works on links with descriptions consisting of multiple words and line breaks between them. I'm running Org stable version 8.2.5h.
>>
>> Org source (spaces at the end of line 1 and 2 don't matter):
>>
>> snip
>> OpenPGP Message Format ([[https://tools.ietf.org/html/rfc4880][RFC
>> 4880]] which obsoletes [[https://tools.ietf.org/html/rfc1991][RFC
>> 1991]] and [[https://tools.ietf.org/html/rfc2440][RFC 2440]])...
>> ...
>> foo [[https://tools.ietf.org/html/rfc4880][RFC 4880]] bar
>> baz [[https://tools.ietf.org/html/rfc1991][RFC 1991]] foo
>> bar [[https://tools.ietf.org/html/rfc2440][RFC 2440]] baz
>> snip
>>
>> Text export result:
>>
>> snip
>> OpenPGP Message Format ([RFC 4880] which obsoletes [RFC 1991] and [RFC 2440])...
>> ...
>> foo [RFC 4880] bar baz [RFC 1991] foo bar [RFC 2440] baz
>>
>> [RFC 4880] https://tools.ietf.org/html/rfc4880
>> [RFC 1991] https://tools.ietf.org/html/rfc1991
>> [RFC 2440] https://tools.ietf.org/html/rfc2440
>> [RFC 4880] https://tools.ietf.org/html/rfc4880
>> [RFC 1991] https://tools.ietf.org/html/rfc1991
>> snip
>>
>> These multiple references look quite bad. Is it possible to normalize the descriptions in some way *before* checking them for uniqueness and output them thereafter?
>
> Could you be more explicit? What does look quite bad? What did you expect instead? How is it related to line breaks in the descriptions?

Ok, let's go into more detail. See the Org source text:

1. There are three links and each of them appears twice. The two occurrences of each link have identical targets.

2. Both [...][RFC 2440] links sit entirely on one line; the links [...][RFC 4880] and [...][RFC 1991] each have one occurrence with a newline in its description. They are in fact [...][RFC\n4880] and [...][RFC 4880] and, respectively, [...][RFC\n1991] and [...][RFC 1991].
So, now let's examine the Org text export:

The final reference part (the five links below the paragraph) lists two links, [RFC 4880] and [RFC 1991], twice each, but the link [RFC 2440] only once. This is, at the very least, inconsistent.

The point is that Org apparently treats [...][RFC 4880] and [...][RFC\n4880] as two different links internally and lists both of them in the reference part. For this listing, the \n is removed. This is what I called normalization in my first post. Human eyes, however, won't see any difference between these two forms and will be surprised by the duplicates.

I expect Org to do the following steps while parsing the source text:

1. Normalize or clean the link description, i.e. remove any newlines and any leading and trailing spaces, and replace every occurrence of [ \t]+ in the interior by a single space. (To be done.)

2. Check the tuple (description,target) for duplicates and drop them. (Seems ok to me.)

3. Below the paragraph, list the tuples as [description] target in the order of occurrence in the original text. (Also seems ok to me.)

I hope this makes the issue a little clearer now.

Kind regards,
Mathias
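For illustration only, the three steps proposed above can be sketched in a few lines of Python. This is not Org's implementation (Org is written in Emacs Lisp), and the names `LINK_RE`, `normalize_description` and `collect_references` are hypothetical. The sketch also assumes that a newline inside a description counts as whitespace to be collapsed into a single space rather than deleted outright, and the regular expression only covers simple `[[target][description]]` links.

```python
import re
from typing import List, Tuple

# Hypothetical pattern for simple bracket links: [[target][description]].
# The negated character classes also match newlines, so a description
# that is wrapped across two lines is still found.
LINK_RE = re.compile(r"\[\[([^\]\[]+)\]\[([^\]\[]+)\]\]")


def normalize_description(description: str) -> str:
    """Step 1: collapse each whitespace run (spaces, tabs, newlines)
    into one space and strip leading and trailing whitespace."""
    return re.sub(r"\s+", " ", description).strip()


def collect_references(org_text: str) -> List[Tuple[str, str]]:
    """Steps 2 and 3: return unique (description, target) tuples
    in the order of their first occurrence in the text."""
    seen = set()
    references = []
    for match in LINK_RE.finditer(org_text):
        target = match.group(1)
        description = normalize_description(match.group(2))
        key = (description, target)
        if key not in seen:  # drop duplicates after normalization
            seen.add(key)
            references.append(key)
    return references


if __name__ == "__main__":
    sample = (
        "([[https://tools.ietf.org/html/rfc4880][RFC \n4880]] which obsoletes\n"
        "foo [[https://tools.ietf.org/html/rfc4880][RFC 4880]] bar\n"
    )
    for description, target in collect_references(sample):
        print("[%s] %s" % (description, target))
```

Because the description is normalized before the duplicate check, `[RFC\n4880]` and `[RFC 4880]` pointing at the same target end up as a single entry in the reference list.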
[O] Bug: 3 bugs and 2 proposals on ascii/html export [7.8.03]
Hi!

I just played with Org's export functionality using the following minimal Org file. This turned up three minor bugs and two proposals/questions on Org's behavior.

file1.org
#+STARTUP: showeverything
#+OPTIONS: author:nil email:nil timestamp:nil
* Some section
Some text.
* TODO Some section with a TODO keyword
Some text.
* DONE Some section with another TODO keyword
Some text.
* Some section with TAG at the end :some_tag:
Some text.
* TODO Some section with TODO keyword and TAG at the end :another_tag:
Some text.
---

* ASCII/Latin-1/UTF-8 export

** Bug 1: Underlining the headlines

Headlines without tags are underlined incorrectly: the underline is one character too long.

** Question/Proposal

By default, all level 1 headlines are underlined with - characters and level 2 headlines with =. Wouldn't it be more logical the other way round: the lower the level, the more important the headline and hence the heavier its underlining? (Of course the user can change the variable org-export-ascii-underline.)

* HTML export

** Question/Proposal

--snip--
<h2>...Some section with TAG at the end &nbsp;&nbsp;&nbsp;<span class="tag"><span class="some_tag">some_tag...
--snip--

Isn't a single space enough for separating the heading's text and the tag? Besides their number, the additional three (why not five or n?) &nbsp; seem a little bit freaky to me...

To keep things even more flexible, couldn't both the blank and the &nbsp; be dropped and the CSS tag class be adjusted instead? Unfortunately, I don't know enough CSS yet to check whether that is possible at all.

For the table of contents we get a similar phenomenon if an additional #+OPTIONS: tags:t is added. The separation between text and tag in this case consists of three &nbsp; and *no* space before them.
** Bug 2: Exporting the tag into the toc

Adding #+OPTIONS: tags:t results in the following exported toc:

--snip--
<li>...Some section with TAG at the end&nbsp;&nbsp;&nbsp;<span class="tag"> some_tag</span></a></li>
--snip--

There is a space inside the <span>...</span> just before the tag name which should not be there.

** Bug 3: Exporting the TODO keywords

--snip--
<h2>...<span class="todo TODO"> TODO</span> Some section with a TODO keyword</h2>
--snip--

There is a space inside the <span>...</span> just before the TODO keyword which should not be there.

Could you please consider fixing these bugs?

Thanks for this wonderful piece of software :-)

Mathias

P.S. To verify the points above I used the proposed minimal Org installation, so nothing in the following settings report has anything to do with my personal configuration. But if it's the will of (org-submit-bug-report)... :-)

Emacs  : GNU Emacs 23.2.1 (i486-pc-linux-gnu, GTK+ Version 2.20.0)
 of 2010-12-11 on raven, modified by Debian
Package: Org-mode version 7.8.03

current state:
==
(setq
 org-export-latex-after-initial-vars-hook '(org-beamer-after-initial-vars)
 org-speed-command-hook '(org-speed-command-default-hook org-babel-speed-command-hook)
 org-metaup-hook '(org-babel-load-in-session-maybe)
 org-after-todo-state-change-hook '(org-clock-out-if-current)
 org-export-latex-format-toc-function 'org-export-latex-format-toc-default
 org-tab-first-hook '(org-hide-block-toggle-maybe org-src-native-tab-command-maybe)
 org-src-mode-hook '(org-src-babel-configure-edit-buffer org-src-mode-configure-edit-buffer)
 org-confirm-shell-link-function 'yes-or-no-p
 org-export-first-hook '(org-beamer-initialize-open-trackers)
 org-agenda-before-write-hook '(org-agenda-add-entry-text)
 org-blank-before-new-entry nil
 org-babel-pre-tangle-hook '(save-buffer)
 org-cycle-hook '(org-cycle-hide-archived-subtrees org-cycle-hide-drawers org-cycle-show-empty-lines org-optimize-window-after-visibility-change)
 org-export-preprocess-before-normalizing-links-hook '(org-remove-file-link-modifiers)
 org-mode-hook '(#[nil \300\301\302\303\304$\207 [org-add-hook change-major-mode-hook org-show-block-all append local] 5] org-babel-hide-all-hashes)
 org-ctrl-c-ctrl-c-hook '(org-babel-hash-at-point org-babel-execute-safely-maybe)
 org-confirm-elisp-link-function 'yes-or-no-p
 org-export-interblocks '((lob org-babel-exp-lob-one-liners) (src org-babel-exp-inline-src-blocks))
 org-clock-out-hook '(org-clock-remove-empty-clock-drawer)
 org-occur-hook '(org-first-headline-recenter)
 org-export-preprocess-before-selecting-backend-code-hook '(org-beamer-select-beamer-code)
 org-export-latex-final-hook '(org-beamer-amend-header
Re: [O] Bug: 3 bugs and 2 proposals on ascii/html export [7.8.03]
Hi Bastien,

* Bastien wrote on 2012-03-09 at 03:09 (+0100):
> Mathias Bauer mba...@gmx.org writes:
>
> Thanks for this report -- next time, please consider sending one mail per bug/request, it makes issues easier to track.

ok, I'll do so - even for small bugs. Promised :-)

>> Headlines without tags are underlined in a wrong manner. It's one character too long.
>
> It's a matter of taste. I like this additional character and I think Carsten added it intentionally.

Hm, yes it is. I just wondered because the title and the toc headline are underlined differently.

>> --snip--
>> <h2>...Some section with TAG at the end &nbsp;&nbsp;&nbsp;<span class="tag"><span class="some_tag">some_tag...
>> --snip--
>>
>> Isn't a single space enough for separating the heading's text and the tag? Beside their number, the additional three (why not five or n?) &nbsp; seem a little bit freaky to me...
>
> They _are_ freaky :) But they are also needed. Even if the tags display is taken care of by the CSS, we must prevent collapsing the tags with the previous strings in case the CSS is not available -- just think of what the HTML page should look like with w3m/lynx.

Thanks for your explanation. I completely missed text-based browsers. And ordinary spaces are _really_ not enough for them?

Concerning CSS I'm digging into the docs ... but tomorrow :-)

Regards,
Mathias
Re: [O] Bug: 2nd, 3rd, ... ext link in normal text NOT exported [7.7]
On Sun, Dec 11, 2011 at 11:01:13AM +0100, David Maus wrote:
> At Mon, 5 Dec 2011 17:19:29 +0100, M. Bauer wrote:
>> as in the last paragraph of the Org v7.7 manual section 4.3 about external links, Org also finds external links in the normal text and activates them as links. While editing, this completely works as expected.
>>
>> But when it comes to exporting, Org will *not* recognize the second, third, etc. external link in normal text if it is *not* marked by square brackets. See below for some tests that will fail in ASCII, UTF8, and HTML export.
>>
>> Can you please consider this issue for one of the next versions of Org?
>
> Pushed a fix for this to master, all links in the example file are now exported as expected.

Thanks very much.

Mathias
Re: [O] Bug: HTML export broken [7.8]
Hello Bastien,

On Tue, Dec 13, 2011 at 01:07:15AM +0100, Bastien wrote:
> Mathias Bauer mba...@gmx.org writes:
>> Can you please check this issue?
>
> This is now fixed in 7.8.01. Sorry for this.

Never mind. Thanks for your quick response.

But... there is still a little flaw on the website's org-mode-download.html page: the download links for the zip file and the gzipped tar archive both point to the v7.6 files. However, v7.8.01 is directly accessible via

http://www.orgmode.org/org-7.8.01.tar.gz
or
http://www.orgmode.org/org-7.8.01.zip

respectively.

Above all, with the new design the website really looks great and gives the text-oriented Org a modern first impression. :-) Many thanks for the renovation.

Best,
Mathias