Re: [O] Tweaking the export

2012-02-17 Thread Nicolas Goaziou
Hello,

Christian Wittern cwitt...@gmail.com writes:

3. If all went well, you now have an impressive Org to Org converter.
   You can even test it with:

   #+begin_src emacs-lisp
   (switch-to-buffer (org-export-to-buffer 'translator "*Translation*"))
   #+end_src

   Obviously, there is not much to see.

 It worked wonderfully up to here.

 Now, we're going to redefine `org-translator-paragraph' to properly
 ignore one language or the other, depending on `:translator-side' value.

 #+begin_src emacs-lisp
 (defun org-translator-paragraph (paragraph contents info)
   "Convert PARAGRAPH to Org, ignoring one language.
 Language kept is determined by `:translator-side' value."
   (let ((leftp (eq (plist-get info :translator-side) 'left)))
     (replace-regexp-in-string
      (if leftp "\t+.*$" "^.*\t+") "" contents)))
 #+end_src

 With a little tweaking, I got rid of errors when running this code.
 However, no changes in the output were observable.  Finally, I looked
 at the output from step 3 above and realized that the parser
 normalizes my tab characters away.  Only a bunch of spaces in the
 output!  Ouch!!
 So I guess I would need an option on the parser to switch tab expansion off.

 I also intended to implement my transformer by first defining the
 general org-e-org transformer and then deriving a specialized
 transformer that inherits from it and overrides only the paragraph
 transformation.  This does not seem to be possible at the moment, but
 I think it would be worth considering, since it would make defining
 new exporters or even org-file tweakers a breeze.

In fact the problem is subtle.  For example, you don't want include
keywords to be expanded and Babel blocks to be executed when exporting
from Org to Org.  I've added a `noexpand' argument for that.  Hence, you
will need to call your converter with:

#+begin_src emacs-lisp
(switch-to-buffer
 (org-export-to-buffer 'translator "*Translation*" nil nil nil nil 'noexpand))
#+end_src

The TAB problem is different.  I expand tab early because the machine
creating the parse-tree and the machine exporting it may not be the
same.  Tab widths may differ, and it could lead to subtle bugs.  I may
add a :tab-width property in the initial environment.  I'm not sure
about it yet.

Anyway, your tabs have been replaced with spaces, for now. `tab-width'
of them.  Your paragraph translator may then become something like:

#+begin_src emacs-lisp
(defun org-translator-paragraph (paragraph contents info)
  "Convert PARAGRAPH to Org, ignoring one language.
Language kept is determined by `:translator-side' value."
  (let ((leftp (eq (plist-get info :translator-side) 'left)))
    (replace-regexp-in-string
     (format (if leftp " \\{%d,\\}.*$" "^.* \\{%d,\\}") tab-width)
     "" contents)))
#+end_src

Is it better?
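
As a quick sanity check, the two regular expressions can be tried on
a plain string outside the exporter.  This is only a sketch: the
helper name `my/keep-side' and the sample text are made up for
illustration, and it assumes that every tab was expanded to at least
WIDTH spaces.

#+begin_src emacs-lisp
;; Hypothetical helper, not part of the export engine: apply the same
;; regexps as `org-translator-paragraph' to a plain string.
(defun my/keep-side (text side &optional width)
  "Return TEXT with only the SIDE column kept on each line.
SIDE is the symbol `left' or `right'.  Columns are assumed to be
separated by at least WIDTH spaces (default: `tab-width')."
  (let ((width (or width tab-width)))
    (replace-regexp-in-string
     (format (if (eq side 'left) " \\{%d,\\}.*$" "^.* \\{%d,\\}") width)
     "" text)))

;; The first line has two columns separated by eight spaces; the
;; second line has no separator and is left untouched.
(my/keep-side "text A        text A'\nno columns here" 'left 8)
;; => "text A\nno columns here"
(my/keep-side "text A        text A'\nno columns here" 'right 8)
;; => "text A'\nno columns here"
#+end_src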


Regards,

-- 
Nicolas Goaziou



Re: [O] Tweaking the export

2012-02-03 Thread Christian Wittern

Hi Nicolas,

Thank you very much for taking the time for such a detailed recipe.  Today I 
finally found time to go over it and try to implement my transformer.  It 
turned out to be really easy to get going, but in the end, I hit a roadblock.



On 2012-01-29 18:07, Nicolas Goaziou wrote:


   3. If all went well, you now have an impressive Org to Org converter.
  You can even test it with:

  #+begin_src emacs-lisp
  (switch-to-buffer (org-export-to-buffer 'translator "*Translation*"))
  #+end_src

  Obviously, there is not much to see.


It worked wonderfully up to here.


Now, we're going to redefine `org-translator-paragraph' to properly
ignore one language or the other, depending on `:translator-side' value.

#+begin_src emacs-lisp
(defun org-translator-paragraph (paragraph contents info)
  "Convert PARAGRAPH to Org, ignoring one language.
Language kept is determined by `:translator-side' value."
  (let ((leftp (eq (plist-get info :translator-side) 'left)))
    (replace-regexp-in-string
     (if leftp "\t+.*$" "^.*\t+") "" contents)))
#+end_src


With a little tweaking, I got rid of errors when running this code.  
However, no changes in the output were observable.  Finally, I looked at
the output from step 3 above and realized that the parser normalizes my 
tab characters away.  Only a bunch of spaces in the output!  Ouch!!

So I guess I would need an option on the parser to switch tab expansion off.

I also intended to implement my transformer by first defining the general
org-e-org transformer and then deriving a specialized transformer that
inherits from it and overrides only the paragraph transformation.  This does
not seem to be possible at the moment, but I think it would be worth
considering, since it would make defining new exporters or even org-file
tweakers a breeze.


Anyhow, again thanks for writing the new parser /  exporter and for your 
help with my problem!


All the best,

Christian


--
Christian Wittern, Kyoto




Re: [O] Tweaking the export

2012-01-27 Thread Nicolas Goaziou
Hello,

Christian Wittern cwitt...@gmail.com writes:

 For the last couple of years, I have used org-mode more and more for
 working with and translating texts from classical Chinese.  Over time,
 some special conventions have crept in, like the fact that I like (for
 the draft translation) to work in a way that has a short chunk of
 Chinese text on the left and, separated by a tab character, the
 translation of that piece following on the same line (there are other
 special conventions like specialized drawers etc., but I don't need to
 discuss these here now.)

 While this setup is extremely pleasant to work with, at some point
 I need to separate these two parts into separate texts; the stuff to the
 left of the tab has to go into one file, the stuff to the right to
 some other file, while at the same time merging the chunks of text
 into paragraphs.   For quite some while I have thought about how
 to automate that, but until now, I have usually done it by hand with
 a couple of regex search-and-replace operations.

 Now, with the new export engine, it looks like all I would need to do
 would be to tweak the way paragraphs are handled, while leaving the
 rest intact, some kind of org to org transform that simply tweaks one
 single aspect of the text.  However, I am a bit baffled on where to
 start with this.  I would be glad if you or somebody else could give
 me some pointers at how to tackle this problem.  (And please be kind,
 since my elisp fu is pretty insignificant:-(  )

While I understand the shape of your input, I fail to see what your
output should look like. For example, given the following paragraph,

--8<---------------cut here---------------start------------->8---
text A                  text A'
line 2                  line 2 bis
A line with *emphasis*  A translated line with *emphasis*
--8<---------------cut here---------------end--------------->8---

what exactly do you want to obtain?


Regards,

-- 
Nicolas Goaziou



Re: [O] Tweaking the export

2012-01-27 Thread Jambunathan K
Nicolas

I will let Christian answer for himself.

 [Nicolas]
 While I understand the shape of your input, I fail to see what your
 output should look like. For example, given the following paragraph,

 text A                  text A'
 line 2                  line 2 bis
 A line with *emphasis*  A translated line with *emphasis*


 [Christian]
 I need to separate these two parts in separate texts; the stuff to the
 left of the tab has to go into one file, the stuff to the right to
 some other file, 

 while at the same time merging the chunks of texts
 into paragraphs.

If I interpret the above lines, I imagine his request more along the
following lines:

text A                          text A'
line 2                          line 2

My name is Jambunathan. I live  Mon nom est Jambunathan. Je vis
in India.                       en India...

He wants the English column to be collected into an English file and
the French column to be collected into a French file.

It is possible that English column constitutes a poem and the French
column is a line-by-line translation of the column to the left.

In some sense, he wants to tangle the English column, let's say as
verse_en.org and French column to verse_fr.org and later include them
as a table cell or a column of a 2-C section. 

Notionally something like:
|-------------------------+-------------------------|
| #+INCLUDE: verse_en.org | #+INCLUDE: verse_fr.org |
|-------------------------+-------------------------|

Put another way, collect Column-X into Paragraph-X and do whatever.

ps: French translation is courtesy google.
-- 



Re: [O] Tweaking the export

2012-01-27 Thread Sebastien Vauban
Hi all,

Jambunathan K wrote:
 Nicolas

 I will let Christian answer for himself.

 [Nicolas]
 While I understand the shape of your input, I fail to see what your
 output should look like. For example, given the following paragraph,

 text A                  text A'
 line 2                  line 2 bis
 A line with *emphasis*  A translated line with *emphasis*


 [Christian]
 I need to separate these two parts in separate texts; the stuff to the
 left of the tab has to go into one file, the stuff to the right to
 some other file, 

 while at the same time merging the chunks of texts
 into paragraphs.

 If I interpret the above lines, I imagine his request more along the
 following lines:

 text A                          text A'
 line 2                          line 2

 My name is Jambunathan. I live  Mon nom est Jambunathan. Je vis
 in India.                       en India...

 He wants the English column to be collected into an English file and
 the French column to be collected into a French file.

 It is possible that English column constitutes a poem and the French
 column is a line-by-line translation of the column to the left.

 In some sense, he wants to tangle the English column, let's say as
 verse_en.org and French column to verse_fr.org and later include them
 as a table cell or a column of a 2-C section. 

 Notionally something like:
 |-------------------------+-------------------------|
 | #+INCLUDE: verse_en.org | #+INCLUDE: verse_fr.org |
 |-------------------------+-------------------------|

 Put another way, collect Column-X into Paragraph-X and do whatever.

 ps: French translation is courtesy google.

Just a side comment: isn't it easier to work in two different files or
buffers (possibly within the same file) and use some sort of parallel
follow-mode?  I thought such a thing existed, but I can't find it again
right now.

Anyway, it would be quite easy to implement: it more or less amounts to
making C-v/M-v scroll two parallel buffers at the same time, instead of
just one.

Best regards,
  Seb

-- 
Sebastien Vauban




Re: [O] Tweaking the export

2012-01-27 Thread Christian Wittern

Hi, Jambunathan and Nicolas,

On 2012-01-27 22:47, Jambunathan K wrote:

Nicolas

I will let Christian answer for himself.
Thanks Jambunathan, you are not only an excellent coder, but also an expert 
mind reader:-)

What you describe is exactly what I want to achieve.


text A                          text A'
line 2                          line 2

My name is Jambunathan. I live  Mon nom est Jambunathan. Je vis
in India.                       en India...

He wants the English column to be collected into an English file and
the French column to be collected into a French file.



In some sense, he wants to tangle the English column, let's say as
verse_en.org and French column to verse_fr.org


Exactly.  The reason for wanting to do this is that the above is my setup
for translating, but in some cases the publication will have only the
translation; for such cases, I want to extract just the translation.  This
should then produce a new org file that simply has either everything before
the tab (the original) or everything after the tab (the translation), while
leaving all lines that do not contain a tab character as they are.
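
To make the intended result concrete, here is a rough sketch of the
split on a plain string, independent of the exporter.  The function
name `my/split-columns' is invented for this example, and the code
assumes literal tab characters are still present in the text.

#+begin_src emacs-lisp
;; Hypothetical helper: split tab-separated lines into two texts.
(defun my/split-columns (text)
  "Return (LEFT . RIGHT) for TEXT, splitting each line at its first tab run.
Lines without a tab character go unchanged into both results."
  (let (left right)
    (dolist (line (split-string text "\n"))
      (if (string-match "\t+" line)
          (progn
            ;; Everything before the tabs is the original ...
            (push (substring line 0 (match-beginning 0)) left)
            ;; ... and everything after them is the translation.
            (push (substring line (match-end 0)) right))
        (push line left)
        (push line right)))
    (cons (mapconcat #'identity (nreverse left) "\n")
          (mapconcat #'identity (nreverse right) "\n"))))

(my/split-columns "* Heading\ntext A\ttext A'\nline 2\tline 2 bis")
;; => ("* Heading\ntext A\nline 2" . "* Heading\ntext A'\nline 2 bis")
#+end_src

The car of the result would go into the file for the original and the
cdr into the file for the translation.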


I assume this would be an easy task with the new exporter -- but I am still
a bit at a loss as to where to start...


All the best,

Christian




--
Christian Wittern, Kyoto




Re: [O] Tweaking the export

2012-01-27 Thread Christian Wittern

Hi Sebastien,

On 2012-01-27 23:03, Sebastien Vauban wrote:
Just a side comment: isn't it easier to work in two different files or
buffers (possibly within the same file) and use some sort of parallel
follow-mode?  I thought such a thing existed, but I can't find it again
right now.

Anyway, it would be quite easy to implement: it more or less amounts to
making C-v/M-v scroll two parallel buffers at the same time, instead of
just one.

Best regards, Seb

What you describe is Two-Column mode, and this was suggested by Jambunathan
before.  I did try this avenue, but for me org-mode works better.  One of the
reasons for this is that there are some structural aspects that are common
to both files.  Another reason is that I want to be able to grep through the
files and see matching lines in both languages -- this helps me ensure
a consistent translation.  So the current setup is really nice for me
for doing the work, but now I need to construct the pipeline for
publication.  As Jambunathan put it, this is really a problem of tangling
the output.


BTW, I think the general exporter should also be able to do an org-mode to
org-mode conversion.  This would provide a general framework to
systematically correct little problems in files.  I guess here it shows that 
I am coming from the XML world, where a conversion from one XML file to 
another XML file with slight alterations of some aspects is a very common 
pattern.


All the best,

Christian

--
Christian Wittern, Kyoto




Re: [O] Tweaking the export

2012-01-27 Thread Eric Abrahamsen
On Sat, Jan 28 2012, Christian Wittern wrote:

 Hi, Jambunathan and Nicolas,

 On 2012-01-27 22:47, Jambunathan K wrote:
 Nicolas

 I will let Christian answer for himself.
 Thanks Jambunathan, you are not only an excellent coder, but also an
 expert mind reader:-)
 What you describe is exactly what I want to achieve.

 text A                          text A'
 line 2                          line 2

 My name is Jambunathan. I live  Mon nom est Jambunathan. Je vis
 in India.                       en India...

 He wants the English column to be collected into an English file and
 the French column to be collected into a French file.

 In some sense, he wants to tangle the English column, let's say as
 verse_en.org and French column to verse_fr.org

 Exactly.  The reason for wanting to do this is that the above is my
 setup for translating, but in some cases the publication will have
 only the translation; for such cases, I want to extract just the
 translation.  This should then produce a new org file that simply has
 either everything before the tab (the original) or everything after
 the tab (the translation), while leaving all lines that do not contain
 a tab character as they are.

I also use org mode for translating (from modern Chinese,
coincidentally), and as Sebastien mentioned, I find it easiest to split
a single file into two subtrees, source and target, then split the
window so that I've got the two subtrees side-by-side. You could use
follow-mode at this point, though I don't. Selective export then becomes
trivial, though you'd have a harder time getting it into a two-column
table.

It's always annoying to ask how to do something and then be told to do
something else, so I'm not going to do that, but I do think you might
encounter fewer difficulties making the above setup do what you want,
rather than the TAB arrangement.

Of course, classical Chinese (particularly poetry) lends itself better
to doing discrete chunks one at a time… modern prose would be a
nightmare with TABs, though.

I've toyed with a home-made follow-type setup, where the two subtrees
are displayed in split windows as above, and the sub-headings of the two
subtrees have properties pointing to the IDs of their corresponding
sub-heading (ie, source chapters are linked to target chapters and vice
versa). I got about halfway to implementing something where
corresponding paragraphs are highlighted in the non-active window,
before getting distracted by an actual translation deadline.

(The pie-in-the-sky next step would be to use org-mode to maintain a
TMX-formatted translation database
(http://en.wikipedia.org/wiki/Translation_Memory_eXchange), and allow
for automatic insertion of translations of known terms, a library I
expect to have written some time before the obsolescence of Emacs itself.)

Anyway, I'm not sure I had much of a point, but if there are any other
translators using org-mode, it might be interesting to discuss how we
could make it more useful, perhaps in a separate thread.

Eric

-- 
Gnu Emacs 24.0.92.1 (i686-pc-linux-gnu, GTK+ Version 2.24.9)
 of 2012-01-26 on pellet
Org-mode version 7.8.03 (release_7.8.03.249.g742c4e9)




[O] Tweaking the export (was: Re: [ANN] ASCII back-end for new export engine)

2012-01-25 Thread Christian Wittern

Hi Nicolas, hello everybody,

I am extremely excited about this new export engine; it seems to fill a need 
I have felt for quite a while now.  What I need it for is the following:


For the last couple of years, I have used org-mode more and more for working 
with and translating texts from classical Chinese.  Over time, some special 
conventions have crept in, like the fact that I like (for the draft 
translation) to work in a way that has a short chunk of Chinese text on the 
left and, separated by a tab character, the translation of that piece 
following on the same line (there are other special conventions like 
specialized drawers etc., but I don't need to discuss these here now.)


While this setup is extremely pleasant to work with, at some point I need
to separate these two parts into separate texts; the stuff to the left of the
tab has to go into one file, the stuff to the right to some other file,
while at the same time merging the chunks of text into paragraphs.   For
quite some while I have thought about how to automate that, but until now,
I have usually done it by hand with a couple of regex search-and-replace
operations.


Now, with the new export engine, it looks like all I would need to do would 
be to tweak the way paragraphs are handled, while leaving the rest intact, 
some kind of org to org transform that simply tweaks one single aspect of 
the text.  However, I am a bit baffled on where to start with this.  I would 
be glad if you or somebody else could give me some pointers at how to tackle 
this problem.  (And please be kind, since my elisp fu is pretty 
insignificant:-(  )


All the best,

Christian Wittern, Kyoto