Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-22 Thread Kaushal Modi
On Thu, Dec 21, 2017 at 9:22 AM Nicolas Goaziou wrote:

> (let ((no-thrill (lambda (o c _) (or c (org-element-property :value o)))))
>   (org-export-create-backend
>    :parent 'ascii   ;or `hugo', depending on what you mean
>    :transcoders (mapcar (lambda (type) (cons type no-thrill))
>                         '(bold code italic strike-through underline verbatim))))
>
> Five locs. Not bad either.
>

Thank you. That also looks like a cleaner way to implement what I want.

> You're basically describing `ox-ascii' with stripped emphasis markers.
>

Exactly. That's why I suggested extending ox-ascii from this "raw" backend.

> At this point, I'm not convinced we need this in Org proper.
>

That's understood. No problem. The snippet you suggested above serves the
purpose very well for now.

Thanks!
-- 

Kaushal Modi


Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-21 Thread Nicolas Goaziou
Hello,

Kaushal Modi writes:

> Thank you! That function is educational. I'll play more with that idea. It
> will be a lot more verbose than the 3 line solution I have right now..

(let ((no-thrill (lambda (o c _) (or c (org-element-property :value o)))))
  (org-export-create-backend
   :parent 'ascii   ;or `hugo', depending on what you mean
   :transcoders (mapcar (lambda (type) (cons type no-thrill))
                        '(bold code italic strike-through underline verbatim))))

Five locs. Not bad either.
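
For readers who want to try the back-end above on a plain string, here is a
minimal sketch. The helper name `my/org-strip-emphasis' is hypothetical, and
two things are assumptions of this sketch rather than part of the thread: the
string is parsed with the object restriction for headlines, and nil is passed
as the INFO plist because the call happens outside a real export. Inside an
exporter, the current `info' would be passed instead.

```elisp
(require 'ox)
(require 'ox-ascii)

(defun my/org-strip-emphasis (string)
  "Return STRING with Org emphasis markers removed.
`my/org-strip-emphasis' is a hypothetical helper, not part of Org."
  (let* (;; Transcoder: keep the contents, or the :value of code/verbatim.
         (no-thrill (lambda (o c _) (or c (org-element-property :value o))))
         ;; Anonymous back-end derived from `ascii', as suggested above.
         (backend (org-export-create-backend
                   :parent 'ascii
                   :transcoders
                   (mapcar (lambda (type) (cons type no-thrill))
                           '(bold code italic strike-through
                                  underline verbatim))))
         ;; Assumption: STRING is a title, so parse it as headline objects.
         (data (org-element-parse-secondary-string
                string (org-element-restriction 'headline))))
    ;; Assumption: nil INFO is enough for this stand-alone sketch.
    (org-export-data-with-backend data backend nil)))
```

It could be called as (my/org-strip-emphasis "*abc* /def/ =ghi= ~jkl~"). In
an exporter, the same back-end object can be handed to
`org-export-data-with-backend' together with (plist-get info :title).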

> It can be used wherever just the element content is needed without
> formatting properties, like in my case where the element title is needed to
> be extracted without any formatting.

So far, no major back-end needs this. Also, it is very simple to provide
the back-end above.

> I haven't yet invested any time into serious development of this "base
> class" backend. The idea of this exporter is to give formatting-free output
> (like when you select plain text option in an email client).. so at whim,
> entities will be translated to the correct unicode chars, footnotes
> behavior could be the same as ox-ascii, and latex-snippets can stay in the
> raw ascii form.

You're basically describing `ox-ascii' with stripped emphasis markers.
At this point, I'm not convinced we need this in Org proper.


Regards,

-- 
Nicolas Goaziou                                                0x80A93738



Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-20 Thread Kaushal Modi
On Wed, Dec 20, 2017 at 5:27 PM Nicolas Goaziou wrote:

> You must be kidding. It must be around 8 locs. See for example
> `org-export-toc-entry-backend'.
>

Thank you! That function is educational. I'll play more with that idea. It
will be a lot more verbose than the 3 line solution I have right now.. but
I am intrigued enough to still try that out to see how it turns out.

> I fail to see how it could be generally useful.
>

It can be used wherever just the element content is needed without
formatting properties, like in my case where the element title is needed to
be extracted without any formatting.


> What are you doing with entities, footnotes, latex-snippets...?
>

I haven't yet invested any time into serious development of this "base
class" backend. The idea of this exporter is to give formatting-free output
(like when you select plain text option in an email client).. so at whim,
entities will be translated to the correct unicode chars, footnotes
behavior could be the same as ox-ascii, and latex-snippets can stay in the
raw ascii form.

If there is interest in moving this forward, I can come up with a "raw"
backend spec, and we can discuss the details.
-- 

Kaushal Modi


Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-20 Thread Nicolas Goaziou
Kaushal Modi writes:

> Thanks! I feared so. Then the strip-HTML-tags approach seems to be the
> quickest.

You must be kidding. It must be around 8 locs. See for example
`org-export-toc-entry-backend'.

> Would there be any interest in adding something like that to the core, as a
> "base class" for exporter backends?

I fail to see how it could be generally useful.

> "strip off all markup" simply means export something like "*abc* /def/
> =ghi= ~jkl~ +mno+ _pqr_" as "abc def ghi jkl mno pqr". Think of that as a
> backend without even the minimal adornment that ox-ascii has... and
> ox-ascii can be a derived backend from this one.

What are you doing with entities, footnotes, latex-snippets...?



Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-20 Thread Kaushal Modi
On Wed, Dec 20, 2017 at 5:04 PM Nicolas Goaziou wrote:

> You could write a dedicated (anonymous) back-end for that,


Thanks! I feared so. Then the strip-HTML-tags approach seems to be the
quickest.


> if you have a clear idea about what "strip off all markup" means.
>

Would there be any interest in adding something like that to the core, as a
"base class" for exporter backends?

"strip off all markup" simply means export something like "*abc* /def/
=ghi= ~jkl~ +mno+ _pqr_" as "abc def ghi jkl mno pqr". Think of that as a
backend without even the minimal adornment that ox-ascii has... and
ox-ascii can be a derived backend from this one.
-- 

Kaushal Modi


Re: [O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-20 Thread Nicolas Goaziou
Hello,

Kaushal Modi writes:

> What's the canonical way to strip off all markup from an element in an Org
> exporter backend?

You could write a dedicated (anonymous) back-end for that, if you have
a clear idea about what "strip off all markup" means.

Regards,

-- 
Nicolas Goaziou



[O] Canonical way to strip off all markup from an element in Org exporter backend?

2017-12-20 Thread Kaushal Modi
Hello,

What's the canonical way to strip off all markup from an element in an Org
exporter backend?

I do it in this roundabout way in ox-hugo. It works, but feels convoluted.
The trick is to remove all markup chars from an element while retaining the
*, /, `, etc. chars *not* used for any markup.

I export Org subtrees to individual posts, where the subtree headline will
become the post title. So I need to sanitize that headline of any markup.

Step 1: Get the HTMLized version of the title.

(org-export-data-with-backend (plist-get info :title) 'html info)

By getting the HTMLized version of the title, it becomes easy to strip off
the HTML tags, which are inserted basically for formatting (bold, italics,
etc.).

Step 2: Strip off the HTML tags.

(while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
  (setq title (replace-match "\\2" nil nil title)))
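
As a self-contained illustration of Step 2, the loop can be wrapped in a
small function. This is only a sketch: the name `my/strip-html-tags' is
hypothetical, the pattern assumes that the tag name captured in group 1 is
matched again by a closing tag via a back-reference, and FIXEDCASE is set to
t (unlike the snippet above) so that `replace-match' never alters the case of
the title.

```elisp
(defun my/strip-html-tags (title)
  "Return TITLE with simple paired HTML tags removed.
`my/strip-html-tags' is a hypothetical helper for illustration only."
  ;; Each pass unwraps one innermost <tag ...>text</tag> pair; looping
  ;; until nothing matches also handles nested tags from the inside out.
  (while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
    ;; Replace the whole tagged span with its text (group 2).
    (setq title (replace-match "\\2" t nil title)))
  title)
```

Because only text wrapped in a tag pair is touched, a literal * or ` that
survived the HTML export stays in the result.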

If I do any other exporter like md, I will lose the ability to distinguish
a literal * in the title from a * meant for bold/italics markup in
Markdown. Even ascii is not good because then I'd need to do some intensive
parsing to figure out if ` is meant to be a literal ` or part of `code'.

So the question: is this the best way, or is there a canonical way to
export an element without any markup chars?

Full actual code[1].

[1]:
https://github.com/kaushalmodi/ox-hugo/blob/dffb7e970f33959a0b97fb8df267a54d01a98a2a/ox-hugo.el#L1769-L1802
-- 

Kaushal Modi