Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-11 Thread Rainer M Krug


On 02/10/14, 21:47 , stefano franchi wrote:
> 
> 
> 
> On Mon, Feb 10, 2014 at 2:36 PM, Rainer M Krug <rai...@krugs.de> wrote:
> 
> I have the feeling, this discussion can be summed up in two lines:
> 
> 1) we would like to have a round-trip (docx backend)
> 2) but we need a semantic export (to docx)
> 
> 
> Or perhaps:
> 1) we would like to have a round-trip (docx backend)
> 2) but we need a semantic export (to docx)
> 3) so let's begin with (2)
> 4) and hopefully we will be halfway through to (1)

Agreed - We can work on that.

> 
> The important question is:
> 
> - which design decisions in (2) could prevent a successful roundtrip?
> How do we avoid those?

Exactly - and also, which elements in LyX should be (initially) included
in the semantic export and how they should be mapped.

I won't be able to help with any programming questions, as I
know neither the inner workings of LyX nor of docx, but I would be more
than interested in participating in and supporting this kind of discussion.

Cheers,

Rainer

> 
> 
> S.  
> 
> 
> 
> -- 
> __
> Stefano Franchi
> Associate Research Professor
> Department of Hispanic Studies Ph:   +1 (979) 845-2125
> Texas A&M University  Fax:  +1 (979) 845-6421
> College Station, Texas, USA
> 
> stef...@tamu.edu 
> http://stefano.cleinias.org

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: Copying from pdf (was Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion)

2014-02-11 Thread stefano franchi
On Mon, Feb 10, 2014 at 2:25 PM, Andrew Parsloe wrote:

>
>
> On 11/02/2014 2:59 a.m., stefano franchi wrote:
>
>>
>>
>> <rant>I had to convert a ~50,000 words book from LyX to Word last month
>> and it took me 2 full days. I think I tried all exporters known to men
>> (and women). They all failed to various degrees. In the end, I had
>> better luck converting the file from the pdf (!) output to word and
>> then reinserting manually all footnotes (all 450 of them).  I am facing
>> the prospect of converting a 200,000 words manuscript in a few months
>> and I am already sweating at night at the very idea. <\rant>
>>
>>
>> Cheers,
>>
>> Stefano
>>
>>
> Did you just copy & paste from the pdf? That's something I've done before.
> The main problem is always that each line on the page in the pdf ends up as
> a separate paragraph in the pasted text in Word. How did you handle that?
>
> I wrote a macro in Word to join up the lines into paragraphs, judging the
> end of a paragraph by the existence of a shorter line -- which obviously
> fails sometimes. (Copying a pdf followed by paste special has the same
> problem in LyX. I have an unfinished script, for the pLyX system, to do the
> same in LyX.)
>
>

No, I didn't copy and paste. That would have been even worse. In addition
to the problem of line-paragraphs, you also face the problem of headers and
footers, hyphenation, etc. I guess I could have produced a pdf with no
hyphenation, no headers, no footers, etc. before trying the conversion, but I
didn't do that. I used one of the many pdf-to-word or pdf-to-odt utilities
available online. I cannot actually remember which one, to be frank. I
tried several until I got a reasonable output. I still had to do some
cleaning, as some (but not all) apostrophes were lost and, as I mentioned,
all footnotes came through, but as plain text and not as footnotes.
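
The line-joining heuristic Andrew describes (a line clearly shorter than
the others ends a paragraph) can be sketched in a few lines of Python. This
is only an illustration, not his Word macro: the name join_lines and the
0.85 threshold are assumptions made for the example, and hyphenation,
headers and footers are not handled.

```python
def join_lines(lines, ratio=0.85):
    """Merge hard-wrapped lines into paragraphs.

    Assumption for this sketch: a line shorter than `ratio` times the
    longest line in the input is the last line of a paragraph.
    """
    stripped = [ln.strip() for ln in lines]
    width = max((len(ln) for ln in stripped), default=0)
    paragraphs, current = [], []
    for ln in stripped:
        if not ln:
            # A blank line always closes the paragraph collected so far.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(ln)
        if len(ln) < ratio * width:
            # Noticeably short line: treat it as the end of the paragraph.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

As in the macro, this fails whenever the last line of a paragraph happens
to fill the full width, or when a short line occurs in mid-paragraph.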


S.


-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas A&M University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Guenter Milde
On 2014-02-09, Georg Baum wrote:
> Rainer M Krug wrote:
>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug <rai...@krugs.de> wrote:

> One feature where additional metadata would
> definitely help are branches.

Not necessarily: with the comment package, which provides not only
comments but also branches (as named comment environments),
http://www.ctan.org/pkg/comment
all branches could be exported to the LaTeX file and the selection of active
branches made in the preamble.

This feature should be customizable, because sometimes you don't want 
disabled branches in an exported file.

On the LaTeX-LyX route, the use of the comment package and custom (named)
comment environments should translate to LyX branches.
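
A minimal sketch of what such an export could emit, written in Python for
illustration. \includecomment and \excludecomment are the macros the
comment package provides for named environments; the function names
branch_preamble and wrap_branch are hypothetical, and the sketch assumes
branch names that are valid LaTeX environment names (letters only).

```python
def branch_preamble(branches):
    """Return preamble lines selecting the active branches.

    `branches` maps a branch name to True (active) or False (inactive).
    """
    lines = ["\\usepackage{comment}"]
    for name, active in branches.items():
        macro = "includecomment" if active else "excludecomment"
        lines.append("\\" + macro + "{" + name + "}")
    return "\n".join(lines)


def wrap_branch(name, body):
    """Wrap branch content in its named comment environment.

    The comment package expects \\begin{...} and \\end{...} on lines
    of their own, hence the explicit newlines.
    """
    return "\\begin{" + name + "}\n" + body + "\n\\end{" + name + "}"
```

With this, switching a branch on or off only touches the preamble; the
branch content itself stays in the exported file either way.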

Günter



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/09/14, 20:25 , Georg Baum wrote:
> Rainer M Krug wrote:
>
>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug <rai...@krugs.de> wrote:
>>>>
>>>> The idea would be that a round-trip framework is envisaged, which
>>>> provides the facilities to easily expand it from one export backend
>>>> (docx) to another (possibly odt? markdown?).
>>>
>>> This sounds like a sort of testing framework which would indicate for
>>> each export backend which features are exported and imported
>>> successfully. It would be cool to have some matrix showing how mature
>>> each of the supported formats is.
>>
>> Nicely put! That would be brilliant. Not only formats, but converters:
>> different converters convert different features.
>
> Yes, such a matrix would indeed be a nice tool.
>
>>> Would this also solve some of the LyX-LaTeX-LyX roundtrip issues?
>
> Some, but I believe not many. The main LyX-LaTeX-LyX problems come from
> the fact that LaTeX as a macro language is really ugly to parse. Only some
> of them come from the fact that the exported LaTeX contains less information
> than the original LyX file. One feature where additional metadata would
> definitely help are branches.
>
>> Partly - if the export to LaTeX is split from the round trip LyX -
>> LaTeX I would say yes, with the caveat, that only a subset of features
>> would be supported by the round trip. In contrast, export - import would
>> (hopefully sometime in the case of import from LaTeX) the full set of
>> LyX and LaTeX features with (possibly ugly in LyX) the export / import.
>
> This is not possible. There are LyX features that simply do not appear in
> the exported LaTeX, so they can't be imported (e.g. branches or notes). It
> might be possible to support all LaTeX features, but the cost would be
> extremely high, so there will always be LaTeX files which can't be imported
> (usually the stuff found in .cls or .sty files).

OK - in this regard you are right - I hadn't considered branches. But LyX
notes could be exported as LaTeX comments starting with %%LyX-Note%%.

Branches: isn't there conditional compilation in LaTeX? That way,
branches could be kept and switched on or off in the preamble?
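
To make the %%LyX-Note%% idea concrete, here is a small Python sketch of
writing a note out as marked comment lines and recovering it on re-import.
Only the marker string comes from the suggestion above; the function names
and the one-note-per-comment-run convention are assumptions made for the
example, not existing LyX behaviour.

```python
MARKER = "%%LyX-Note%%"


def export_note(text):
    """Turn a (possibly multi-line) note into marked LaTeX comment lines."""
    return "\n".join(MARKER + " " + line for line in text.splitlines())


def import_notes(latex):
    """Collect notes from LaTeX source.

    Assumption: consecutive marked lines form one note; any other line
    ends the current note.
    """
    notes, current = [], []
    for line in latex.splitlines():
        if line.startswith(MARKER):
            rest = line[len(MARKER):]
            # Drop the single separator space added by export_note.
            current.append(rest[1:] if rest.startswith(" ") else rest)
        elif current:
            notes.append("\n".join(current))
            current = []
    if current:
        notes.append("\n".join(current))
    return notes
```

Because the marked lines start with %, LaTeX ignores them, so the exported
file compiles unchanged. Two notes placed directly after each other would
be merged on import; a real implementation would need an explicit
end-of-note marker.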

 
>> So: yes, the round-trip framework could be used for a subset of features
>> initially for LyX - LaTeX, which can then be extended over time - I
>> guess this would be the easiest to start with, actually.
>
> This does not make sense IMHO. Why artificially restrict the roundtrip?

Because, as you said above, some features in LyX cannot be exported
into LaTeX and the other way round? In addition, the round-trip would be
needed mainly to edit content, and not so much formatting - how a
section header looks in Word or in LyX is irrelevant, as long as it is
recognised in the re-import / re-export for round-trip as a section
header. In contrast, when exporting (non-round trip) one wants a
document as similar as possible to the LyX / LaTeX pdf (in most cases).

 
> The LyX-LaTeX-LyX roundtrip is special in the sense that the LaTeX-LyX
> step is very tightly integrated with LyX. Therefore it is indeed a good
> starting point, but not in the way of splitting off a separate roundtrip,
> but by extending the existing export/import with the additional metadata
> file you mentioned. The advantage would be that you would not need to put
> too much stuff into the metadata file, so it would be clear quickly the
> general approach works.

You are right - LaTeX is a special case, as it is the default backend
for LyX. So there are stricter requirements for the round-trip, and
all improvements in the round-trip should go immediately into the LaTeX
importer as well. But the story is different with other backends, e.g.
docx, where, if you try to replicate the LaTeX look, you might end up with
"painted" documents which cannot easily be re-imported into LyX. But
for round trip, the look is not that relevant, as long as the content
and the structure can be re-imported.

Rainer

 
 
> Georg
 



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/07/14, 17:35 , Rob Oakes wrote:
> Hi all,

Hi Rob,

good to hear from you on this topic.

 
> I have been following conversations with significant interest. I
> started a new job in the middle of last year. That combined with an
> 18-month old little boy have left me almost no time for
> extracurriculars. There is no way I can realistically mentor a student
> this year. I tried really hard last year, and the effort didn't turn out
> quite so well.
>
> With that said, I would be happy to provide some support to a student
> who might be trying to work on a round-trip Python module, or on the
> non-linear writing project. If someone else were to be primary mentor,
> I could certainly answer questions.
>
> Re: Round-Tripping
>
> As others in the thread have mentioned, I think the magic happens in
> the export from LyX to Word. There are places where meta-data can be
> placed in DOCX so that we can try and preserve LyX-specific features
> through a round trip; and if we focus on semantic markup only, I think
> we can do a pretty good job of maintaining content.

This is exactly what I think the round-trip has to focus on: semantic
markup to maintain the content and to have it available after edits for
re-import.

 
> I think this means that we would need our own library to handle the
> export. Most of the other modules out there do a very poor job of
> handling semantic markup, and instead try and get the styling right.
> That way madness lies. The Python module I started hacking on for the
> import provides some support for generating docx, but this would need
> to be beefed up pretty heavily. Moreover, it would be a really good
> idea to make use of XSLT stylesheets to handle the transformation from
> docx to LyX, instead of the Python approach I took.

This is why I also think that round-trip is a different story than
export - import. Different aspects of the document are relevant and need
to be preserved.

 
> On another note, you should also probably have a discussion about how
> you want to handle maths. The math XML vocabulary in docx is pretty
> well contained, but it would still be an enormous job to translate it
> to LyX/LaTeX. For export, we might make use of the MathML support LyX
> already supports, and then translate that to docx math XML. There may
> even be XSLT to do that.

I can't comment here, but I think that it would be nice to have this,
and in many cases this might be absolutely essential, but it could be
added at a later stage, when the basic round-trip features are in place.
But it should definitely be considered.

Cheers,

Rainer

 
> Cheers,
>
> Rob
 



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/09/14, 19:44 , Georg Baum wrote:
> Rainer M Krug wrote:
>
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
>>
>> IMPORTANT: this would NOT change ANYTHING in the existing export /
>> import features, as these are geared to export / import the documents as
>> good as possible, with maintaining as many features as possible in the
>> document.
>>
>> The round-trip would guarantee that:
>>
>> A document authored in LyX would result in a e.g. docx with a LIMITED
>> set of features, but that a re-import would result in the SAME .lyx
>> file. features and formats not supported by the backend should be stored
>> in a metadata file.
>>
>> The important point here is *limited set of features*!
>>
>> In addition, the framework should be easily, possibly only by using
>> config files, able to be extended to other formats.
>
> I don't understand the difference between round trip and the existing
> export/import here. Why is it important? If the additional metadata is
> stored in a different file, it could simply be generated for the standard
> export, and be used by the standard import (if it exists).
>
> The goal of the export/import is to support as many features as possible.
> This is needed for round trip as well. The only difference I see is the
> additional metadata file, so the roundtrip framework vs. export/import
> difference reduces to a switch whether the metadata file should be generated
> (for export) or used (for import). Or did I understand anything wrong?

The difference is that for round-trip, i.e. working together with
co-authors and getting comments back, a different set of features is
relevant. These are mainly concerned with content and not so much with
formatting. The import - export is concerned with both. In addition, a
round trip has to be symmetric, i.e. exported features have to be
available in the re-import as well - this is not the case in the export
and import. Lastly, round-trip is for editing, and export - import is
for editing and final consumption (reading).
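
The symmetry requirement, together with the per-backend feature matrix
suggested earlier in the thread, could be checked along the following
lines. This is a sketch under stated assumptions: documents are modelled
as lists of (kind, text) pairs, and toy_export / toy_import are a made-up
backend for illustration, not real LyX converters.

```python
def roundtrip_matrix(backends, samples):
    """For each backend and feature, report whether export followed by
    import reproduces the sample document exactly.

    backends: name -> (export_function, import_function)
    samples:  feature name -> small document exercising that feature
    """
    matrix = {}
    for backend, (export, import_) in backends.items():
        matrix[backend] = {
            feature: import_(export(doc)) == doc
            for feature, doc in samples.items()
        }
    return matrix


def toy_export(doc):
    # Hypothetical plain-text backend: sections get a "# " prefix,
    # everything else (including footnotes) becomes a bare line.
    return "\n".join("# " + text if kind == "section" else text
                     for kind, text in doc)


def toy_import(text):
    return [("section", ln[2:]) if ln.startswith("# ") else ("paragraph", ln)
            for ln in text.split("\n")]


SAMPLES = {
    "section": [("section", "Intro")],
    "paragraph": [("paragraph", "Hello")],
    "footnote": [("footnote", "A note")],
}

matrix = roundtrip_matrix({"toy": (toy_export, toy_import)}, SAMPLES)
```

In this toy case the footnote comes back as an ordinary paragraph, so its
entry in the matrix is False: the feature is exported but not symmetric,
which is exactly the situation a round trip has to rule out.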

 
>> Yes - although I see one problem which I could not find in any of the
>> .lyx - .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
>
> I agree. Unfortunately pandoc is written in Haskell which reduces the number
> of possible contributors significantly (which does not mean that Haskell is
> a bad language, but that it is much less known than e.g. C++ or python).

True.

Cheers,

Rainer


 
 
> Georg
 
 



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 3:41 AM, Rainer M Krug <rai...@krugs.de> wrote:



> On 02/09/14, 19:44 , Georg Baum wrote:
> > Rainer M Krug wrote:
>
> >> The idea would be that a round-trip framework is envisaged, which
> >> provides the facilities to easily expand it from one export backend
> >> (docx) to another (possibly odt? markdown?).
>
> >> IMPORTANT: this would NOT change ANYTHING in the existing export /
> >> import features, as these are geared to export / import the documents as
> >> good as possible, with maintaining as many features as possible in the
> >> document.
>
> >> The round-trip would guarantee that:
>
> >> A document authored in LyX would result in a e.g. docx with a LIMITED
> >> set of features, but that a re-import would result in the SAME .lyx
> >> file. features and formats not supported by the backend should be stored
> >> in a metadata file.
>
> >> The important point here is *limited set of features*!
>
> >> In addition, the framework should be easily, possibly only by using
> >> config files, able to be extended to other formats.
>
> > I don't understand the difference between round trip and the existing
> > export/import here. Why is it important? If the additional metadata is
> > stored in a different file, it could simply be generated for the standard
> > export, and be used by the standard import (if it exists).
>
> > The goal of the export/import is to support as many features as possible.
> > This is needed for round trip as well. The only difference I see is the
> > additional metadata file, so the roundtrip framework vs. export/import
> > difference reduces to a switch whether the metadata file should be generated
> > (for export) or used (for import). Or did I understand anything wrong?
>
> The difference is that for round-trip, i.e. working together with
> co-authors and getting comments back, a different set of features are
> relevant. These are mainly concerned about content and not that much
> formating. The import - export is concerned with both. In addition, a
> round trip has to be symmetric, i.e. that exported features have to be
> available in the re-importd as well - this is not the case in the export
> and import. Lastly, round-trip is for editing, and export - import is
> for editing and final consumption (reading).



I actually disagree on this point: the most useful doc-export facility for
LyX would be equally focused on semantic content and not on formatting. In
other words, it would be just half (or slightly less than half) of the
round-trip project. The rationale is simple: exporting to doc(x) makes
sense and is actually required when working with a third party (typically,
for LyX's main audience, with a publisher) who will then either provide
final formatting directly with Word (the worst case) or will use the doc(x)
file as import into a real typesetting program (InDesign, etc). In neither
case are formatting instructions relevant. I think it is a losing
proposition to aim for the preservation of format when exporting to
Word---and in fact it is the reason why, in my experience, *all*
latex-to-word (or to-odt) or lyx-to-word exporters actually fail in
practice. It is impossibly hard to provide the same pdf look that (la)tex
produces with Word. And the use cases in which this conversion is required
are exceedingly rare. Far more pressing for our user base is the need to
guarantee a hassle-free 100% valid export to a sanitized word format
which is narrowly restricted, on both sides, to the semantic information
contained in LyX.

To put it more bluntly (and to repeat what I and others have stated many
times in the past): LyX is barely usable right now for any academic work
in the Humanities, due to the necessity to deliver doc documents to
virtually any publisher. If you are a student, you are similarly asked by
professors to submit drafts in Word format.

<rant>I had to convert a ~50,000 words book from LyX to Word last month
and it took me 2 full days. I think I tried all exporters known to men (and
women). They all failed to various degrees. In the end, I had better luck
converting the file from the pdf (!) output to word and then
reinserting manually all footnotes (all 450 of them).  I am facing the
prospect of converting a 200,000 words manuscript in a few months and I am
already sweating at night at the very idea. <\rant>

Anyway: I am willing to "mentor" a student through the process of producing
a LyX-to-Word semantic-only exporter. Scare quotes are necessary, because I
would have to learn as much as the student. If Rob can provide some
guidance and expert advice (both as a previous mentor and obviously as an
expert in the area) I think we may have something working by the end of the
summer. What I know is that *I* will absolutely need a solid Word exporter by
that time.


Cheers,

Stefano



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/10/14, 14:59 , stefano franchi wrote:
 
 
 
> On Mon, Feb 10, 2014 at 3:41 AM, Rainer M Krug <rai...@krugs.de> wrote:
>
> On 02/09/14, 19:44 , Georg Baum wrote:
> > Rainer M Krug wrote:
>
> >> The idea would be that a round-trip framework is envisaged, which
> >> provides the facilities to easily expand it from one export backend
> >> (docx) to another (possibly odt? markdown?).
>
> >> IMPORTANT: this would NOT change ANYTHING in the existing export /
> >> import features, as these are geared to export / import the documents as
> >> good as possible, with maintaining as many features as possible in the
> >> document.
>
> >> The round-trip would guarantee that:
>
> >> A document authored in LyX would result in a e.g. docx with a LIMITED
> >> set of features, but that a re-import would result in the SAME .lyx
> >> file. features and formats not supported by the backend should be stored
> >> in a metadata file.
>
> >> The important point here is *limited set of features*!
>
> >> In addition, the framework should be easily, possibly only by using
> >> config files, able to be extended to other formats.
>
> > I don't understand the difference between round trip and the existing
> > export/import here. Why is it important? If the additional metadata is
> > stored in a different file, it could simply be generated for the standard
> > export, and be used by the standard import (if it exists).
>
> > The goal of the export/import is to support as many features as possible.
> > This is needed for round trip as well. The only difference I see is the
> > additional metadata file, so the roundtrip framework vs. export/import
> > difference reduces to a switch whether the metadata file should be generated
> > (for export) or used (for import). Or did I understand anything wrong?
>
> The difference is that for round-trip, i.e. working together with
> co-authors and getting comments back, a different set of features are
> relevant. These are mainly concerned about content and not that much
> formating. The import - export is concerned with both. In addition, a
> round trip has to be symmetric, i.e. that exported features have to be
> available in the re-importd as well - this is not the case in the export
> and import. Lastly, round-trip is for editing, and export - import is
> for editing and final consumption (reading).
>
> I actually disagree on this point: the most useful doc-export facility
> for LyX would be equally focused on semantic content and not on
> formatting.

Only partial disagreement - in the case you describe below, the semantic
export, which is the same as the round-trip export, would be the end
product.

So let's call it a semantic exporter versus a complete exporter (like
the one used for export to LaTeX).

> In other words, it would be just half (or slightly less than
> half) of the round-trip project. The rationale is simple: exporting to
> doc(x) makes sense and is actually required when working with a third
> party (typically, for  Lyx's main audience, with a publisher) who will
> then either provide final formatting directly with Word (the worst case)
> or will use the doc(x) file as import into a real typesetting program
> (InDesign, etc). In neither case formatting instructions are relevant. I
> think it is a losing proposition to aim for the preservation of format
> when exporting to Word---and in fact it is the reason why, in my
> experience, *all* latex-to-word (or to-odt) or lyx-to-word exporters
> actually fail in practice. It is impossibly hard to provide the same pdf
> look that (la)tex produces with Word. And the use cases in which this
> conversion is required are exceedingly rare. Far more pressing for our
> user base is the need to guarantee a hassle-free 100% valid export to a
> sanitized word format which is narrowly restricted, on both sides, to
> the semantic information contained in LyX.
>
> To put it more bluntly (and to repeat what I and others have stated many
> times in the past): LyX is  barely usable right now for any academic
> work in the Humanities, due to the necessity to deliver doc documents to
> virtually any publisher. If you are a student, you are similarly asked
> by professor to submit drafts in Word format.
>
> <rant>I had to convert a ~50,000 words book from LyX to Word last month
> and it took me 2 full days. I think I tried all exporters known to men
> (and women). They all failed to various degrees. In the end, I had
> better luck converting the file from the pdf (!) output to word and
> then reinserting manually all footnotes (all 450 of them).  I am facing
> the prospect of converting a 200,000 words manuscript in a few months
> and I am already sweating at night at the very idea. <\rant>
>
> Anyway: I am willing to mentor a student through the process of
> producing a LyX-to-Word semantic-only exporter. Scare quotes

Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Guenter Milde wrote:

> On 2014-02-09, Georg Baum wrote:
>> Rainer M Krug wrote:
>>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug <rai...@krugs.de> wrote:
>
>> One feature where additional metadata would
>> definitely help are branches.
>
> Not necessarily: With use the comment package, that not only provides
> comments, but also branches (as named comment environments)
> http://www.ctan.org/pkg/comment
> all branches could be exported to the LaTeX file and the selection of
> active branches done in the preamble.

Sure. In a more general sense this would be metadata as well: It would not 
be used by LaTeX, but preserve LyX stuff through roundtrip.

> This feature should be customizable, because sometimes you don't want
> disabled branches in an exported file.
>
> On the LaTeX-LyX route, the use of the comment package and custom (named)
> comment environments should translate to LyX branches.

Yes, this would be one method to transport metadata through LaTeX which does
not need an additional file. It could probably be used for other stuff
besides branches or notes as well. When this project is started, it will
probably need a thorough investigation of whether an additional file or
special comments are preferable.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Guenter Milde
On 2014-02-10, Georg Baum wrote:
> Guenter Milde wrote:

>> On 2014-02-09, Georg Baum wrote:
>>> Rainer M Krug wrote:
>>>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug <rai...@krugs.de> wrote:

>>> One feature where additional metadata would
>>> definitely help are branches.

>> Not necessarily: With use the comment package, that not only provides
>> comments, but also branches (as named comment environments)
>> http://www.ctan.org/pkg/comment
>> all branches could be exported to the LaTeX file and the selection of
>> active branches done in the preamble.

> Sure. In a more general sense this would be metadata as well: It would not
> be used by LaTeX, but preserve LyX stuff through roundtrip.

>> This feature should be customizable, because sometimes you don't want
>> disabled branches in an exported file.

>> On the LaTeX-LyX route, the use of the comment package and custom (named)
>> comment environments should translate to LyX branches.

> Yes, this would be one method to transport metadata through LaTeX which does
> not need an additional file. It could probably be used for other stuff
> besides branches or notes as well. When this project is started it needs
> probably a thorough investigation whether an additional file or special
> comments are preferrable.

My point was somewhat different: The comment package is a LaTeX equivalent
to LyX's branches. Semantic export means that we use this feature to get
all the data into the LaTeX file instead of solving the active/inactive
state during export.

This is similar to either hard-printing chapter/section/... numbers in the
LaTeX file or using the auto-numbering LaTeX commands
(i.e. \section*{0.1 my beer}  vs. \section{my beer}).
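
A minimal sketch of what such a semantic branch export could look like. The
\includecomment and \excludecomment commands are the ones provided by the
comment package; the helper export_branches and its data model (a dict of
branch name to text) are hypothetical and are not LyX's actual exporter.

```python
def export_branches(branches, active):
    """Return (preamble, body) LaTeX fragments for the given branches.

    branches: dict mapping branch name -> text (hypothetical data model)
    active:   set of branch names that are currently activated
    """
    preamble = ["\\usepackage{comment}"]
    body = []
    for name, text in branches.items():
        # The active/inactive state is recorded in the preamble only.
        command = "\\includecomment" if name in active else "\\excludecomment"
        preamble.append(f"{command}{{{name}}}")
        # Every branch is written to the body, whether active or not.
        body.append(f"\\begin{{{name}}}\n{text}\n\\end{{{name}}}")
    return "\n".join(preamble), "\n".join(body)


preamble, body = export_branches(
    {"draft": "Work in progress.", "final": "Polished text."},
    active={"final"},
)
print(preamble)
print(body)
```

Switching a branch on or off then means changing one line in the preamble,
and an importer could map each named environment back to a branch. The sketch
assumes branch names that are valid LaTeX environment names.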

Günter







Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Rainer M Krug wrote:

 On 02/09/14, 20:25 , Georg Baum wrote:
 
 This is not possible. There are LyX features that simply do not appear in
 the exported LaTeX, so they can't be imported (e.g. branches or notes).
 It might be possible to support all LaTeX features, but the cost would be
 extremely high, so there will always be LaTeX files which can't be
 imported (usually the stuff found in .cls or .sty files).
 
 OK - in this regard you are right - haven't considered branches. But LyX
 notes could be exported as LaTeX comments starting with %%LyX-Note%%.
 
 Branches: isn't there conditional compiling in LaTeX? In this way
 branches could be kept and switched by activating these in the preamble?

Yes. IIRC there was even a discussion about how to translate branches into 
LaTeX if-statements some time ago, so branches may not be the best 
example. Anyway, there will always be features without direct LaTeX 
representation.

 So: yes, the round-trip framework could be used for a subset of features
 initially for LyX - LaTeX, which can then be extended over time - I
 guess this would be the easiest to start with, actually.
 
 This does not make sense IMHO. Why artificially restrict the roundtrip?
 
 Because, as you said above, some features in LyX can not be exported
 into LaTeX and the other way round?

OK, if you meant these features I agree, then I probably misunderstood you 
in the first place.

 In addition, the round-trip would be
 needed to mainly edit content, and not that much formating - how a
 section header looks in word or in LyX is irrelevant, as long as it is
 recognised in the re-import / re-export for round-trip as a section
 header. In Contrast, when exporting (non-round trip) one wants a
 document as similar as possible to the LyX / LaTeX pdf (in most cases).

This is a useful feature as well, but IMHO not restricted to roundtrip: Even 
if you want to do a one-way export (e.g. because you know that somebody else 
will continue to work on the document and it will never come back to you), a 
switch similar to the clean option of writer2latex would be a good thing 
to have.

 You are right - LaTeX is a special case, as it is the default backend
 for LyX. So there are more strict requirements for the round-trip, and
 all improvements in the round-trip should be immediately in the LaTeX
 importer as well. But the story is different with other backends, e.g.
 docx, where, if you go to replicating the LaTeX view, you might end with
 painted documents which are not easily to be re-imported into LyX. But
 for round trip, the look is not that relevant, as long as the content
 and the structure can be re-imported.

I believe this all boils down to "semantic" export, as Stefano called it, vs. 
"painted" export, and semantic export would be useful with roundtrip and 
without.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
stefano franchi wrote:

 Anyway: I am willing to mentor a student through the process of
 producing a LyX-to-Word semantic-only exporter. Scare quotes are
 necessary, because I would have to learn as much as the student. If Rob
 can provide some guidance and expert advice (both as a previous mentor and
 obviously as an expert in the area) I think we may have something working
 by the end of the summer. What I know is that *I* will absolutely a solid
 word-exporter by that time.

I don't have the time to be an official mentor, but I could help with 
technical advice as well. I also have some personal motivation, since my 
wife faces similar problems when exchanging documents with coworkers. And if 
nobody works on this for GSOC we may also be able to hack a quick and dirty 
version together that just works for you so that you don't need to go 
through pdf again. This is BTW exactly how I got involved with 
tex2lyx.


Georg



Copying from pdf (was Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion)

2014-02-10 Thread Andrew Parsloe



On 11/02/2014 2:59 a.m., stefano franchi wrote:



<rant>I had to convert a ~50,000 words book from LyX to Word last month
and it took me 2 full days. I think I tried all exporters known to men
(and women). They all failed to various degrees. In the end, I had
better luck converting the file from the pdf (!) output to word and
then reinserting manually all footnotes (all 450 of them).  I am facing
the prospect of converting a 200,000 words manuscript in a few months
and I am already sweating at night at the very idea. <\rant>


Cheers,

Stefano



Did you just copy & paste from the pdf? That's something I've done 
before. The main problem is always that each line on the page in the pdf 
ends up as a separate paragraph in the pasted text in Word. How did you 
handle that?


I wrote a macro in Word to join up the lines into paragraphs, judging 
the end of a paragraph by the existence of a shorter line -- which 
obviously fails sometimes. (Copying a pdf followed by paste special has 
the same problem in LyX. I have an unfinished script, for the pLyX 
system, to do the same in LyX.)
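
The heuristic described above (a shorter line marks the end of a paragraph)
can be sketched roughly as follows. This is not the Word macro or the pLyX
script themselves; the function name join_lines and the threshold ratio=0.85
are assumptions chosen for illustration.

```python
def join_lines(lines, ratio=0.85):
    """Join hard-wrapped lines (e.g. pasted from a pdf) into paragraphs.

    A line noticeably shorter than the longest line is taken to be the
    last line of a paragraph. Blank lines also end a paragraph.
    """
    stripped = [line.strip() for line in lines]
    non_empty = [line for line in stripped if line]
    if not non_empty:
        return []
    full_width = max(len(line) for line in non_empty)

    paragraphs, current = [], []
    for line in stripped:
        if not line:
            # A blank line always closes the paragraph collected so far.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(line)
        if len(line) < ratio * full_width:
            # Short line: assume the paragraph ends here.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


pasted = [
    "aaaaaaaaaa bbbbbbbbb",
    "cccccccccc ddddddddd",
    "eee.",
    "ffffffffff ggggggggg",
    "hh.",
]
for paragraph in join_lines(pasted):
    print(paragraph)
```

As noted, this fails whenever the last line of a paragraph happens to fill
the full width, and the sketch makes no attempt to undo hyphenation at line
ends.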





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Rainer M Krug wrote:

 On 02/07/14, 17:35 , Rob Oakes wrote:
 
 On another note, you should also probably have a discussion about how
 you want to handle maths. The math XML vocabulary in docx is pretty
 well contained, but it would still be an enormous job to translate it
 to LyX/LaTeX. For export, we might make use of the MathML support LyX
 already supports, and then translate thtat to docx math XML. There may
 even be XSLT to do that.

Yes, a good part of math export to docx is already there.

 I can't comment here, but I think that it would be nice to have this,
 and in many cases this might be absolutely essential, but it could be
 addad at a later stage, when the basic round-trip features are in place.
 But it should definitely be considered.

This depends a bit on the future of the LyX file format: If the plans to go 
to XML are revived it might be a good idea to use MathML for math in .lyx 
files, and in this case one would implement the MathML reader in C++ in 
LyX. If MathML will not be used in .lyx files then it is probably better to 
implement a MathML2LaTeX python module and use that.
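
To give an idea of the second option, here is a very small sketch of how such
a Python module might start. It only uses xml.etree.ElementTree from the
standard library; the function name mathml_to_latex and the handful of
presentation MathML elements it covers are assumptions for illustration, far
from a complete converter.

```python
import xml.etree.ElementTree as ET


def mathml_to_latex(node):
    """Translate a tiny subset of presentation MathML to LaTeX."""
    tag = node.tag.split("}")[-1]  # drop an XML namespace prefix, if any
    children = [mathml_to_latex(child) for child in node]

    if tag in ("mi", "mn", "mo"):
        # Identifiers, numbers and operators are copied verbatim.
        return (node.text or "").strip()
    if tag in ("math", "mrow"):
        return "".join(children)
    if tag == "mfrac":
        return "\\frac{%s}{%s}" % (children[0], children[1])
    if tag == "msup":
        return "{%s}^{%s}" % (children[0], children[1])
    if tag == "msub":
        return "{%s}_{%s}" % (children[0], children[1])
    if tag == "msqrt":
        return "\\sqrt{%s}" % "".join(children)
    raise ValueError("unsupported MathML element: %s" % tag)


source = (
    "<math><mfrac><mn>1</mn>"
    "<mrow><mi>x</mi><mo>+</mo><mn>2</mn></mrow>"
    "</mfrac></math>"
)
print(mathml_to_latex(ET.fromstring(source)))
```

Unsupported elements raise an error instead of being dropped silently, so
gaps in the mapping would show up immediately. Symbols that need a LaTeX
macro (Greek letters, special operators) are not mapped here.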


Georg


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/10/14, 21:09 , Georg Baum wrote:
 Rainer M Krug wrote:
 
 On 02/09/14, 20:25 , Georg Baum wrote:

 This is not possible. There are LyX features that simply do not appear in
 the exported LaTeX, so they can't be imported (e.g. branches or notes).
 It might be possible to support all LaTeX features, but the cost would be
 extremely high, so there will always be LaTeX files which can't be
 imported (usually the stuff found in .cls or .sty files).

 OK - in this regard you are right - haven't considered branches. But LyX
 notes could be exported as LaTeX comments starting with %%LyX-Note%%.

 Branches: isn't there conditional compiling in LaTeX? In this way
 branches could be kept and switched by activating these in the preamble?
 
 Yes. IIRC there was even a discussion about how to translate branches into 
 LaTeX if-statements some time ago, so branches are may not be the best 
 example. Anyway, there will always be features without direct LaTeX 
 representation,
 
 So: yes, the round-trip framework could be used for a subset of features
 initially for LyX - LaTeX, which can then be extended over time - I
 guess this would be the easiest to start with, actually.

 This does not make sense IMHO. Why artificially restrict the roundtrip?

 Because, as you said above, some features in LyX can not be exported
 into LaTeX and the other way round?
 
 OK, if you meant these features I agree, then I probably misunderstood you 
 in the first place.
 
 In addition, the round-trip would be
 needed to mainly edit content, and not that much formating - how a
 section header looks in word or in LyX is irrelevant, as long as it is
 recognised in the re-import / re-export for round-trip as a section
 header. In Contrast, when exporting (non-round trip) one wants a
 document as similar as possible to the LyX / LaTeX pdf (in most cases).
 
 This is a useful feature as well, but IMHO not restricted to roundtrip: Even 
 if you want to do a one-way export (e.g. because you know that somebody else 
 will continue to work on the document and it will never come back to you), a 
 switch similar to the clean option of writer2latex would be a good thing 
 to have.

I agree - there would be nothing stopping you from using the round-trip
export for a semantic export, as you define it below.

 
 You are right - LaTeX is a special case, as it is the default backend
 for LyX. So there are more strict requirements for the round-trip, and
 all improvements in the round-trip should be immediately in the LaTeX
 importer as well. But the story is different with other backends, e.g.
 docx, where, if you go to replicating the LaTeX view, you might end with
 painted documents which are not easily to be re-imported into LyX. But
 for round trip, the look is not that relevant, as long as the content
 and the structure can be re-imported.
 
 I believe this alls boils down to semantic export as Stefano called it vs. 
 painted export, and semantic export would be useful with roundtrip and 
 without.

Nothing to add here - a semantic export would really be a very useful
addition to LyX.

 
 
 Georg
 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug
I have the feeling, this discussion can be summed up in two lines:

1) we would like to have a round-trip (docx backend)
2) but we need a semantic export (to docx)

Rainer

On 02/10/14, 21:09 , Georg Baum wrote:
 Rainer M Krug wrote:
 
 On 02/09/14, 20:25 , Georg Baum wrote:

 This is not possible. There are LyX features that simply do not appear in
 the exported LaTeX, so they can't be imported (e.g. branches or notes).
 It might be possible to support all LaTeX features, but the cost would be
 extremely high, so there will always be LaTeX files which can't be
 imported (usually the stuff found in .cls or .sty files).

 OK - in this regard you are right - haven't considered branches. But LyX
 notes could be exported as LaTeX comments starting with %%LyX-Note%%.

 Branches: isn't there conditional compiling in LaTeX? In this way
 branches could be kept and switched by activating these in the preamble?
 
 Yes. IIRC there was even a discussion about how to translate branches into 
 LaTeX if-statements some time ago, so branches are may not be the best 
 example. Anyway, there will always be features without direct LaTeX 
 representation,
 
 So: yes, the round-trip framework could be used for a subset of features
 initially for LyX - LaTeX, which can then be extended over time - I
 guess this would be the easiest to start with, actually.

 This does not make sense IMHO. Why artificially restrict the roundtrip?

 Because, as you said above, some features in LyX can not be exported
 into LaTeX and the other way round?
 
 OK, if you meant these features I agree, then I probably misunderstood you 
 in the first place.
 
 In addition, the round-trip would be
 needed to mainly edit content, and not that much formating - how a
 section header looks in word or in LyX is irrelevant, as long as it is
 recognised in the re-import / re-export for round-trip as a section
 header. In Contrast, when exporting (non-round trip) one wants a
 document as similar as possible to the LyX / LaTeX pdf (in most cases).
 
 This is a useful feature as well, but IMHO not restricted to roundtrip: Even 
 if you want to do a one-way export (e.g. because you know that somebody else 
 will continue to work on the document and it will never come back to you), a 
 switch similar to the clean option of writer2latex would be a good thing 
 to have.
 
 You are right - LaTeX is a special case, as it is the default backend
 for LyX. So there are more strict requirements for the round-trip, and
 all improvements in the round-trip should be immediately in the LaTeX
 importer as well. But the story is different with other backends, e.g.
 docx, where, if you go to replicating the LaTeX view, you might end with
 painted documents which are not easily to be re-imported into LyX. But
 for round trip, the look is not that relevant, as long as the content
 and the structure can be re-imported.
 
 I believe this alls boils down to semantic export as Stefano called it vs. 
 painted export, and semantic export would be useful with roundtrip and 
 without.
 
 
 Georg
 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 2:36 PM, Rainer M Krug rai...@krugs.de wrote:

 I have the feeling, this discussion can be summed up in two lines:

 1) we would like to have a round-trip (docx backend)
 2) but we need a sematic export (to docx)


Or perhaps:
1) we would like to have a round-trip (docx backend)
2) but we need a semantic export (to docx)
3) so let's begin with (2)
4) and hopefully we will be halfway through to (1)

The important question is:

- which design decisions in (2) could prevent a successful roundtrip? How
do we avoid those?


S.



-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas A&M University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 2:22 PM, Georg Baum
georg.b...@post.rwth-aachen.de wrote:

 stefano franchi wrote:

  Anyway: I am willing to mentor a student through the process of
  producing a LyX-to-Word semantic-only exporter. Scare quotes are
  necessary, because I would have to learn as much as the student. If Rob
  can provide some guidance and expert advice (both as a previous mentor
 and
  obviously as an expert in the area) I think we may have something working
  by the end of the summer. What I know is that *I* will absolutely a solid
  word-exporter by that time.

 I don't have the time to be an official mentor, but I could help with
 technical advice as well. I also have some personal motivation, since my
 wife faces similar problems when exchanging documents with coworkers. And
 if
 nobody works on this for GSOC we may also be able to hack a quick and dirty
 version together that just works for you so that you don't need to go
 through pdf again. This is BTW exactly the way how I got involved with
 tex2lyx.



Thanks Georg. I will definitely need all the help I can get. And I fully
agree with your last point. In fact, disgusted by my lyx--pdf--doc
experience, I started reading on docx and odt formats and was trying to
come up with a strategy. I am glad we are having this discussion!


Cheers,

S.




-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas AM University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Guenter Milde
On 2014-02-09, Georg Baum wrote:
> Rainer M Krug wrote:
>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:

> One feature where additional metadata would 
> definitely help are branches.

Not necessarily: With the use of the "comment" package, which provides not
only comments but also branches (as named comment environments),
http://www.ctan.org/pkg/comment
all branches could be exported to the LaTeX file and the selection of active
branches done in the preamble.

This feature should be customizable, because sometimes you don't want 
disabled branches in an exported file.

On the LaTeX->LyX route, the use of the comment package and custom (named)
comment environments should translate to LyX branches.

Günter



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/09/14, 20:25 , Georg Baum wrote:
> Rainer M Krug wrote:
> 
>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:
>>>
>>>> The idea would be that a round-trip framework is envisaged, which
>>>> provides the facilities to easily expand it from one export backend
>>>> (docx) to another (possibly odt? markdown?).
>>>
>>> This sounds like a sort of testing framework which would indicate for
>>> each export backend which features are exported and imported
>>> successfully. It would be cool to have some matrix showing how mature
>>> each of the supported formats is.
>>
>> Nicely put! That would be brilliant. Not only formats, but converters:
>> different converters convert different features.
> 
> Yes, such a matrix would indeed be a nice tool.
> 
>>> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?
> 
> Some, but I believe not many. The main LyX->LaTeX->LyX problems come from 
> the fact that LaTeX as a macro language is really ugly to parse. Only some 
> of them come from the fact that the exported LaTeX contains less information 
> than the original LyX file. One feature where additional metadata would 
> definitely help are branches.
> 
>> Partly - if the export to LaTeX is split from the round trip LyX <->
>> LaTeX I would say yes, with the caveat, that only a subset of features
>> would be supported by the round trip. In contrast, export - import would
>> (hopefully sometime in the case of import from LaTeX) the full set of
>> LyX and LaTeX features with (possibly ugly in LyX) the export / import.
> 
> This is not possible. There are LyX features that simply do not appear in 
> the exported LaTeX, so they can't be imported (e.g. branches or notes). It 
> might be possible to support all LaTeX features, but the cost would be 
> extremely high, so there will always be LaTeX files which can't be imported 
> (usually the stuff found in .cls or .sty files).

OK - in this regard you are right - haven't considered branches. But LyX
notes could be exported as LaTeX comments starting with %%LyX-Note%%.
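
A rough sketch of how this could work in both directions. The marker
%%LyX-Note%% is the one suggested above; the function names export_note and
import_notes are made up for illustration and do not exist in LyX or tex2lyx.

```python
MARKER = "%%LyX-Note%%"


def export_note(text):
    """Write a LyX note as LaTeX comment lines, one per line of the note."""
    return "\n".join(MARKER + " " + line for line in text.splitlines())


def import_notes(latex):
    """Collect notes from LaTeX source; consecutive marked lines form one note."""
    notes, current = [], []
    for line in latex.splitlines():
        if line.startswith(MARKER):
            content = line[len(MARKER):]
            # Drop the single separator space written by export_note.
            if content.startswith(" "):
                content = content[1:]
            current.append(content)
        elif current:
            # An unmarked line ends the note collected so far.
            notes.append("\n".join(current))
            current = []
    if current:
        notes.append("\n".join(current))
    return notes


latex = "Intro.\n" + export_note("check this\ncitation") + "\nBody text.\n"
print(import_notes(latex))
```

Since LaTeX ignores everything after a percent sign, the exported file
compiles as before, while the importer can recover the notes.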

Branches: isn't there conditional compiling in LaTeX? In this way
branches could be kept and switched by activating these in the preamble?

> 
>> So: yes, the round-trip framework could be used for a subset of features
>> initially for LyX <-> LaTeX, which can then be extended over time - I
>> guess this would be the easiest to start with, actually.
> 
> This does not make sense IMHO. Why artificially restrict the roundtrip?

Because, as you said above, some features in LyX can not be exported
into LaTeX and the other way round? In addition, the round-trip would be
needed mainly to edit content, and not so much formatting - how a
section header looks in Word or in LyX is irrelevant, as long as it is
recognised in the re-import / re-export for round-trip as a section
header. In contrast, when exporting (non-round trip) one wants a
document as similar as possible to the LyX / LaTeX pdf (in most cases).

> 
> The LyX->LaTeX->LyX roundtrip is special in the sense that the LaTeX->LyX 
> step is very tightly integrated with LyX. Therefore it is indeed a good 
> starting point, but not in the way of splitting off a separate roundtrip, 
> but by extending the existing export/import with the additional metadata 
> file you mentioned. The advantage would be that you would not need to put 
> too much stuff into the metadata file, so it would be clear quickly the 
> general approach works.

You are right - LaTeX is a special case, as it is the default backend
for LyX. So there are stricter requirements for the round-trip, and
all improvements in the round-trip should immediately be in the LaTeX
importer as well. But the story is different with other backends, e.g.
docx, where, if you try to replicate the LaTeX view, you might end up with
"painted" documents which cannot easily be re-imported into LyX. But
for round trip, the look is not that relevant, as long as the content
and the structure can be re-imported.

Rainer

> 
> 
> Georg
> 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/07/14, 17:35 , Rob Oakes wrote:
> Hi all,

Hi Rob,

good to hear from you on this topic.

> 
> I have been following conversations with significant interest. I 
> started a new job in the middle of last year. That combined with an 
> 18-month old little boy have left me almost no time for 
> extracurriculars. There is no way I can realistically mentor a student 
> this year. I tried really hard last year, and the effor didn't turn out 
> quite so well.
> 
> With that said, I would be happy to provide some support to a student 
> who might be trying to work on a round-trip Python module, or on the 
> non-linear writing project. If someone else were to be primary mentor, 
> I could certainly answer questions.
> 
> Re: Round-Tripping
> 
> As others in the thread have mentioned, I think the magic happens in 
> the export from LyX to Word. There are places where meta-data can be 
> placed in DOCX so that we can try and preserve LyX-specific features 
> through a round trip; and if we focus on semantic markup only, I think 
> we can do a pretty good job of maintaining content.

This is exactly what I think the round-trip has to focus on: semantic
markup to maintain the content and to have it available after edits for
re-import.

> 
> I think this means that we would need our own library to handle the 
> export. Most of the other modules out there do a very poor job of 
> handling semantic markup, and instead try and get the styling right. 
> That way madness lies. The Python module I started hacking on for the 
> import provides some support for generating docx, but this would need 
> to be beefed up pretty heavily. Moreover, it would be a really good 
> idea to make use of XSLT stylesheets to handle the transformation from 
> docx to LyX, instead of the Python approach I took.

This is why I also think that round-trip is a different story than
export - import. Different aspects of the document are relevant and need
to be preserved.

> 
> On another note, you should also probably have a discussion about how 
> you want to handle maths. The math XML vocabulary in docx is pretty 
> well contained, but it would still be an enormous job to translate it 
> to LyX/LaTeX. For export, we might make use of the MathML support LyX 
> already supports, and then translate thtat to docx math XML. There may 
> even be XSLT to do that.

I can't comment here, but I think that it would be nice to have this,
and in many cases this might be absolutely essential, but it could be
added at a later stage, when the basic round-trip features are in place.
But it should definitely be considered.

Cheers,

Rainer

> 
> Cheers,
> 
> Rob
> 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/09/14, 19:44 , Georg Baum wrote:
> Rainer M Krug wrote:
> 
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
>>
>> IMPORTANT: this would NOT change ANYTHING in the existing export /
>> import features, as these are geared to export / import the documents as
>> good as possible, with maintaining as many features as possible in the
>> document.
>>
>> The round-trip would guarantee that:
>>
>> A document authored in LyX would result in a e.g. docx with a LIMITED
>> set of features, but that a re-import would result in the SAME .lyx
>> file. features and formats not supported by the backend should be stored
>> in a metadata file.
>>
>> The important point here is *limited set of features*!
>>
>> In addition, the framework should be easily, possibly only by using
>> config files, able to be extended to other formats.
> 
> I don't understand the difference between round trip and the existing 
> export/import here. Why is it important? If the additional metadata is 
> stored in a different file, it could simply be generated for the standard 
> export, and be used by the standard import (if it exists).
> 
> The goal of the export/import is to support as many features as possible. 
> This is needed for round trip as well. The only difference I see is the 
> additional metadata file, so the roundtrip framework vs. export/import 
> difference reduces to a switch whether the metadata file should be generated 
> (for export) or used (for import). Or did I understand anything wrong?

The difference is that for round-trip, i.e. working together with
co-authors and getting comments back, a different set of features is
relevant. These are mainly concerned with content and not so much with
formatting. The import - export is concerned with both. In addition, a
round trip has to be symmetric, i.e. exported features have to be
available in the re-import as well - this is not the case in the export
and import. Lastly, round-trip is for editing, and export - import is
for editing and final consumption (reading).
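
To make the symmetry requirement concrete, here is a toy sketch. Everything
in it is hypothetical: the document is modelled as a list of (kind, text)
pairs, SUPPORTED stands for the limited feature set of a backend, and the
metadata file is represented by a JSON string.

```python
import json

# Hypothetical subset of element kinds the backend can represent.
SUPPORTED = {"section", "paragraph", "footnote"}


def export_doc(elements):
    """Split a document into exported content and a metadata sidecar."""
    exported, sidecar = [], []
    for index, (kind, text) in enumerate(elements):
        if kind in SUPPORTED:
            exported.append([kind, text])
        else:
            # Unsupported features are parked together with their position.
            sidecar.append({"index": index, "kind": kind, "text": text})
    return exported, json.dumps(sidecar)


def import_doc(exported, sidecar_json):
    """Rebuild the document from exported content plus the sidecar."""
    elements = [tuple(item) for item in exported]
    # Entries are stored in ascending index order, so inserting them one
    # after another puts every element back at its original position.
    for entry in json.loads(sidecar_json):
        elements.insert(entry["index"], (entry["kind"], entry["text"]))
    return elements


original = [
    ("section", "Introduction"),
    ("branch", "only in the long version"),
    ("paragraph", "Some text."),
    ("note", "check this claim"),
]
exported, sidecar = export_doc(original)
print(import_doc(exported, sidecar) == original)
```

The check at the end is the round-trip guarantee in miniature. The sketch
only holds as long as the exported part is not restructured; once co-authors
add or remove elements, plain positions are no longer enough and some kind of
anchor in the exported file would be needed.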

> 
>> Yes - although I see one problem which I could not find in any of the
>> .lyx <-> .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
> 
> I agree. Unfortunately pandoc is written in Haskell which reduces the number 
> of possible contributors significantly (which does not mean that Haskell is 
> a bad language, but that it is much less known than e.g. C++ or python).

True.

Cheers,

Rainer


> 
> 
> Georg
> 
> 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 3:41 AM, Rainer M Krug  wrote:

>
>
> On 02/09/14, 19:44 , Georg Baum wrote:
> > Rainer M Krug wrote:
> >
> >> The idea would be that a round-trip framework is envisaged, which
> >> provides the facilities to easily expand it from one export backend
> >> (docx) to another (possibly odt? markdown?).
> >>
> >> IMPORTANT: this would NOT change ANYTHING in the existing export /
> >> import features, as these are geared to export / import the documents as
> >> good as possible, with maintaining as many features as possible in the
> >> document.
> >>
> >> The round-trip would guarantee that:
> >>
> >> A document authored in LyX would result in a e.g. docx with a LIMITED
> >> set of features, but that a re-import would result in the SAME .lyx
> >> file. features and formats not supported by the backend should be stored
> >> in a metadata file.
> >>
> >> The important point here is *limited set of features*!
> >>
> >> In addition, the framework should be easily, possibly only by using
> >> config files, able to be extended to other formats.
> >
> > I don't understand the difference between round trip and the existing
> > export/import here. Why is it important? If the additional metadata is
> > stored in a different file, it could simply be generated for the standard
> > export, and be used by the standard import (if it exists).
> >
> > The goal of the export/import is to support as many features as possible.
> > This is needed for round trip as well. The only difference I see is the
> > additional metadata file, so the roundtrip framework vs. export/import
> > difference reduces to a switch whether the metadata file should be
> generated
> > (for export) or used (for import). Or did I understand anything wrong?
>
> The difference is that for round-trip, i.e. working together with
> co-authors and getting comments back, a different set of features are
> relevant. These are mainly concerned about content and not that much
> formating. The import - export is concerned with both. In addition, a
> round trip has to be symmetric, i.e. that exported features have to be
> available in the re-importd as well - this is not the case in the export
> and import. Lastly, round-trip is for editing, and export - import is
> for editing and final consumption (reading).
>
>

I actually disagree on this point: the most useful doc-export facility for
LyX would be equally focused on semantic content and not on formatting. In
other words, it would be just half (or slightly less than half) of the
round-trip project. The rationale is simple: exporting to doc(x) makes
sense and is actually required when working with a third party (typically,
for LyX's main audience, with a publisher) who will then either provide
final formatting directly with Word (the worst case) or will use the doc(x)
file as import into a real typesetting program (InDesign, etc). In neither
case are formatting instructions relevant. I think it is a losing
proposition to aim for the preservation of format when exporting to
Word---and in fact it is the reason why, in my experience, *all* latex-to-
word- (or to-odt) or lyx-to-word exporters actually fail in practice. It is
impossibly hard to provide the same pdf look that (la)tex produces with
Word. And the use cases in which this conversion is required are
exceedingly rare. Far more pressing for our user base is the need to
guarantee a hassle-free 100% valid export to a "sanitized word format"
which is narrowly restricted, on both sides, to the semantic information
contained in LyX.

To put it more bluntly (and to repeat what I and others have stated many
times in the past): LyX is barely usable right now for any academic work
in the Humanities, due to the necessity to deliver doc documents to
virtually any publisher. If you are a student, you are similarly asked by
professors to submit drafts in Word format.

I had to convert a ~50,000-word book from LyX to Word last month
and it took me 2 full days. I think I tried all exporters known to men (and
women). They all failed to various degrees. In the end, I had better luck
converting the file from the pdf (!) output to Word and then
manually reinserting all footnotes (all 450 of them). I am facing the
prospect of converting a 200,000-word manuscript in a few months and I am
already sweating at night at the very idea. </rant>

Anyway: I am willing to "mentor" a student through the process of producing
a LyX-to-Word semantic-only exporter. Scare quotes are necessary, because I
would have to learn as much as the student. If Rob can provide some
guidance and expert advice (both as a previous mentor and obviously as an
expert in the area) I think we may have something working by the end of the
summer. What I know is that *I* will absolutely need a solid Word exporter by
that time.


Cheers,

Stefano


-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies

Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/10/14, 14:59 , stefano franchi wrote:
> 
> 
> 
> On Mon, Feb 10, 2014 at 3:41 AM, Rainer M Krug wrote:
> 
> 
> 
> On 02/09/14, 19:44 , Georg Baum wrote:
> > Rainer M Krug wrote:
> >
> >> The idea would be that a round-trip framework is envisaged, which
> >> provides the facilities to easily expand it from one export backend
> >> (docx) to another (possibly odt? markdown?).
> >>
> >> IMPORTANT: this would NOT change ANYTHING in the existing export /
> >> import features, as these are geared to export / import the
> documents as
> >> good as possible, with maintaining as many features as possible
> in the
> >> document.
> >>
> >> The round-trip would guarantee that:
> >>
> >> A document authored in LyX would result in a e.g. docx with a LIMITED
> >> set of features, but that a re-import would result in the SAME .lyx
> >> file. features and formats not supported by the backend should be
> stored
> >> in a metadata file.
> >>
> >> The important point here is *limited set of features*!
> >>
> >> In addition, the framework should be easily, possibly only by using
> >> config files, able to be extended to other formats.
> >
> > I don't understand the difference between round trip and the existing
> > export/import here. Why is it important? If the additional metadata is
> > stored in a different file, it could simply be generated for the
> standard
> > export, and be used by the standard import (if it exists).
> >
> > The goal of the export/import is to support as many features as
> possible.
> > This is needed for round trip as well. The only difference I see
> is the
> > additional metadata file, so the roundtrip framework vs. export/import
> > difference reduces to a switch whether the metadata file should be
> generated
> > (for export) or used (for import). Or did I understand anything wrong?
> 
> The difference is that for round-trip, i.e. working together with
> co-authors and getting comments back, a different set of features is
> relevant. These are mainly concerned with content and not that much with
> formatting. Import/export is concerned with both. In addition, a
> round trip has to be symmetric, i.e. exported features have to be
> available in the re-import as well - this is not the case for export
> and import. Lastly, round-trip is for editing, and export/import is
> for editing and final consumption (reading).
> 
> 
> 
> I actually disagree on this point: the most useful doc-export facility
> for LyX would be equally focused on semantic content and not on
> formatting. 

Only partial disagreement - In the case you describe below, the export
to semantic, which is equal to the round-trip export, would be the end
product.

So let's call it a "semantic exporter" versus a "complete exporter" (like
the one used to export to LaTeX).

> In other words, it would be just half (or slightly less than
> half) of the round-trip project. The rationale is simple: exporting to
> doc(x) makes sense and is actually required when working with a third
> party (typically, for LyX's main audience, a publisher) who will
> then either provide final formatting directly with Word (the worst case)
> or will import the doc(x) file into a real typesetting program
> (InDesign, etc.). In neither case are formatting instructions relevant. I
> think it is a losing proposition to aim for the preservation of format
> when exporting to Word---and in fact it is the reason why, in my
> experience, *all* latex-to-word (or to-odt) or lyx-to-word exporters
> actually fail in practice. It is impossibly hard to reproduce with Word
> the same pdf look that (la)tex produces. And the use cases in which this
> conversion is required are exceedingly rare. Far more pressing for our
> user base is the need to guarantee a hassle-free, 100% valid export to a
> "sanitized word format" which is narrowly restricted, on both sides, to
> the semantic information contained in LyX.
> 
> To put it more bluntly (and to repeat what I and others have stated many
> times in the past): LyX is barely usable right now for any academic
> work in the Humanities, due to the necessity to deliver doc documents to
> virtually any publisher. If you are a student, you are similarly asked
> by professors to submit drafts in Word format.
> 
> I had to convert a ~50,000-word book from LyX to Word last month
> and it took me 2 full days. I think I tried all exporters known to men
> (and women). They all failed to various degrees. In the end, I had
> better luck converting the file from the pdf (!) output to Word and
> then manually reinserting all footnotes (all 450 of them). I am facing
> the prospect of converting a 200,000-word manuscript in a few months
> and I am already sweating at night at 

Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Guenter Milde wrote:

> On 2014-02-09, Georg Baum wrote:
>> Rainer M Krug wrote:
>>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug wrote:
> 
>> One feature where additional metadata would
>> definitely help are branches.
> 
> Not necessarily: With the "comment" package, which provides not only
> comments but also branches (as named comment environments),
> http://www.ctan.org/pkg/comment
> all branches could be exported to the LaTeX file and the selection of
> active branches done in the preamble.

Sure. In a more general sense this would be metadata as well: It would not 
be used by LaTeX, but preserve LyX stuff through roundtrip.

> This feature should be customizable, because sometimes you don't want
> disabled branches in an exported file.
> 
> On the LaTeX->LyX route, the use of the comment package and custom (named)
> comment environments should translate to LyX branches.

Yes, this would be one method to transport metadata through LaTeX which does 
not need an additional file. It could probably be used for other stuff 
besides branches or notes as well. When this project is started, it probably 
needs a thorough investigation of whether an additional file or special 
comments are preferable.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Guenter Milde
On 2014-02-10, Georg Baum wrote:
> Guenter Milde wrote:

>> On 2014-02-09, Georg Baum wrote:
>>> Rainer M Krug wrote:
>>>> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>>>>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug wrote:

>>> One feature where additional metadata would
>>> definitely help are branches.

>> Not necessarily: With the "comment" package, which provides not only
>> comments but also branches (as named comment environments),
>> http://www.ctan.org/pkg/comment
>> all branches could be exported to the LaTeX file and the selection of
>> active branches done in the preamble.

> Sure. In a more general sense this would be metadata as well: It would not 
> be used by LaTeX, but preserve LyX stuff through roundtrip.

>> This feature should be customizable, because sometimes you don't want
>> disabled branches in an exported file.

>> On the LaTeX->LyX route, the use of the comment package and custom (named)
>> comment environments should translate to LyX branches.

> Yes, this would be one method to transport metadata through LaTeX which does 
> not need an additional file. It could probably be used for other stuff 
> besides branches or notes as well. When this project is started, it probably 
> needs a thorough investigation of whether an additional file or special 
> comments are preferable.

My point was somewhat different: The "comment" package is a LaTeX equivalent
to LyX's "branches". "Semantic" export means that we use this feature to get
all the data into the LaTeX file instead of "solving" the active/inactive
state during export.

This is similar to either hard-printing chapter/section/... numbers in the
LaTeX file or using the auto-numbering LaTeX commands
(i.e. \section*{0.1 my beer}  vs. \section{my beer}).
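
A minimal sketch of what this could look like on the export side, assuming
the "comment" package macros \includecomment and \excludecomment; the Python
function names are hypothetical and not part of LyX. Every branch is written
to the LaTeX file as a named environment, and only the preamble decides which
branches are active:

```python
def branch_preamble(branch_names, active):
    """Build preamble lines that switch each branch on or off.

    Hypothetical helper: uses \\includecomment / \\excludecomment from the
    LaTeX "comment" package, one line per LyX branch.
    """
    lines = [r"\usepackage{comment}"]
    for name in branch_names:
        macro = "includecomment" if name in active else "excludecomment"
        lines.append("\\%s{%s}" % (macro, name))
    return "\n".join(lines)


def branch_body(name, text):
    """Wrap branch content in its named environment.

    The content is always exported, whether or not the branch is active,
    so nothing is lost on the way back to LyX.
    """
    return "\\begin{%s}\n%s\n\\end{%s}" % (name, text, name)


if __name__ == "__main__":
    print(branch_preamble(["draft", "final"], active={"final"}))
    print(branch_body("draft", "Text that only appears in the draft."))
```

The point of the design is that the active/inactive state stays a one-line
switch in the preamble, so an importer can map each named environment back to
a branch without guessing.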

Günter







Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Rainer M Krug wrote:

> On 02/09/14, 20:25 , Georg Baum wrote:
> 
>> This is not possible. There are LyX features that simply do not appear in
>> the exported LaTeX, so they can't be imported (e.g. branches or notes).
>> It might be possible to support all LaTeX features, but the cost would be
>> extremely high, so there will always be LaTeX files which can't be
>> imported (usually the stuff found in .cls or .sty files).
> 
> OK - in this regard you are right - haven't considered branches. But LyX
> notes could be exported as LaTeX comments starting with %%LyX-Note%%.
> 
> Branches: isn't there conditional compiling in LaTeX? In this way
> branches could be kept and switched by activating these in the preamble?

Yes. IIRC there was even a discussion about how to translate branches into 
LaTeX if-statements some time ago, so branches may not be the best 
example. Anyway, there will always be features without a direct LaTeX 
representation.
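
As a rough illustration of the %%LyX-Note%% idea quoted above (a sketch only;
the function names are made up and this is not how LyX or tex2lyx work today),
a note could be written as prefixed comment lines and collected again on
import:

```python
PREFIX = "%%LyX-Note%%"  # marker proposed in the quoted message


def export_note(text):
    """Write a LyX note as LaTeX comment lines carrying the marker."""
    return "\n".join(PREFIX + " " + line for line in text.splitlines())


def import_notes(latex):
    """Collect notes back from a LaTeX source.

    Consecutive marked lines are treated as one note; any other line
    ends the current note.
    """
    notes = []
    current = None
    for line in latex.splitlines():
        if line.startswith(PREFIX):
            text = line[len(PREFIX):]
            if text.startswith(" "):
                text = text[1:]
            if current is None:
                current = []
            current.append(text)
        elif current is not None:
            notes.append("\n".join(current))
            current = None
    if current is not None:
        notes.append("\n".join(current))
    return notes
```

This only survives editing as long as nobody strips or reflows the comment
lines in the .tex file, which is exactly the kind of fragility discussed in
this thread.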

>>> So: yes, the round-trip framework could be used for a subset of features
>>> initially for LyX <-> LaTeX, which can then be extended over time - I
>>> guess this would be the easiest to start with, actually.
>> 
>> This does not make sense IMHO. Why artificially restrict the roundtrip?
> 
> Because, as you said above, some features in LyX can not be exported
> into LaTeX and the other way round?

OK, if you meant these features I agree, then I probably misunderstood you 
in the first place.

> In addition, the round-trip would be
> needed mainly to edit content, and not that much formatting - how a
> section header looks in Word or in LyX is irrelevant, as long as it is
> recognised in the re-import / re-export for round-trip as a section
> header. In contrast, when exporting (non-round trip) one wants a
> document as similar as possible to the LyX / LaTeX pdf (in most cases).

This is a useful feature as well, but IMHO not restricted to roundtrip: Even 
if you want to do a one-way export (e.g. because you know that somebody else 
will continue to work on the document and it will never come back to you), a 
switch similar to the "clean" option of writer2latex would be a good thing 
to have.

> You are right - LaTeX is a special case, as it is the default backend
> for LyX. So there are stricter requirements for the round-trip, and
> all improvements in the round-trip should immediately be in the LaTeX
> importer as well. But the story is different with other backends, e.g.
> docx, where, if you try to replicate the LaTeX view, you might end up with
> "painted" documents which are not easy to re-import into LyX. But
> for round trip, the look is not that relevant, as long as the content
> and the structure can be re-imported.

I believe this all boils down to semantic export, as Stefano called it, vs. 
"painted" export, and semantic export would be useful with roundtrip and 
without.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
stefano franchi wrote:

> Anyway: I am willing to "mentor" a student through the process of
> producing a LyX-to-Word semantic-only exporter. Scare quotes are
> necessary, because I would have to learn as much as the student. If Rob
> can provide some guidance and expert advice (both as a previous mentor and
> obviously as an expert in the area) I think we may have something working
> by the end of the summer. What I know is that *I* will absolutely need a
> solid Word exporter by that time.

I don't have the time to be an official mentor, but I could help with 
technical advice as well. I also have some personal motivation, since my 
wife faces similar problems when exchanging documents with coworkers. And if 
nobody works on this for GSOC we may also be able to hack a quick and dirty 
version together that just works for you so that you don't need to go 
through pdf again. This is BTW exactly how I got involved with 
tex2lyx.


Georg



Copying from pdf (was Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion)

2014-02-10 Thread Andrew Parsloe



On 11/02/2014 2:59 a.m., stefano franchi wrote:



> I had to convert a ~50,000-word book from LyX to Word last month
> and it took me 2 full days. I think I tried all exporters known to men
> (and women). They all failed to various degrees. In the end, I had
> better luck converting the file from the pdf (!) output to Word and
> then manually reinserting all footnotes (all 450 of them). I am facing
> the prospect of converting a 200,000-word manuscript in a few months
> and I am already sweating at night at the very idea. </rant>
>
>
> Cheers,
>
> Stefano



Did you just copy & paste from the pdf? That's something I've done 
before. The main problem is always that each line on the page in the pdf 
ends up as a separate paragraph in the pasted text in Word. How did you 
handle that?


I wrote a macro in Word to join up the lines into paragraphs, judging 
the end of a paragraph by the existence of a shorter line -- which 
obviously fails sometimes. (Copying a pdf followed by paste special has 
the same problem in LyX. I have an unfinished script, for the pLyX 
system, to do the same in LyX.)
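
For illustration, the heuristic could look roughly like this in Python (a
sketch, not my actual Word macro; the 0.8 threshold is an arbitrary
assumption). A line that is clearly shorter than the widest line is taken to
end a paragraph:

```python
def join_lines(lines, threshold=0.8):
    """Join pasted pdf lines into paragraphs.

    A line shorter than `threshold` times the widest line is assumed to be
    the last line of a paragraph. Blank lines also end a paragraph.
    This misfires when a paragraph happens to end on a full-width line.
    """
    stripped = [line.strip() for line in lines]
    width = max((len(s) for s in stripped), default=0)
    paragraphs = []
    current = []
    for s in stripped:
        if not s:
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(s)
        if len(s) < threshold * width:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Words hyphenated at a line end are not rejoined here; that would need an
extra step.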





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Georg Baum
Rainer M Krug wrote:

> On 02/07/14, 17:35 , Rob Oakes wrote:
>> 
>> On another note, you should also probably have a discussion about how
>> you want to handle maths. The math XML vocabulary in docx is pretty
>> well contained, but it would still be an enormous job to translate it
>> to LyX/LaTeX. For export, we might make use of the MathML support LyX
>> already has, and then translate that to docx math XML. There may
>> even be XSLT to do that.

Yes, a good part of math export to docx is already there.

> I can't comment here, but I think that it would be nice to have this,
> and in many cases this might be absolutely essential, but it could be
> added at a later stage, when the basic round-trip features are in place.
> But it should definitely be considered.

This depends a bit on the future of the LyX file format: If the plans to go 
to XML are revived it might be a good idea to use MathML for math in .lyx 
files, and in this case one would implement the MathML reader in C++ in 
LyX. If MathML will not be used in .lyx files then it is probably better to 
implement a MathML2LaTeX python module and use that.
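
To give an idea of the size of such a module, here is a very small sketch of
the recursive translation, assuming only a handful of presentation MathML
elements (mi, mn, mo, mrow, mfrac, msup, msub, msqrt). The function names are
hypothetical; a real module would need many more elements plus operator and
entity tables:

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"


def _local(tag):
    """Strip the MathML namespace from an element tag, if present."""
    return tag[len(MATHML_NS):] if tag.startswith(MATHML_NS) else tag


def _to_latex(node):
    tag = _local(node.tag)
    kids = [_to_latex(child) for child in node]
    if tag in ("mi", "mn", "mo"):
        # Token elements: the text is copied as is (no symbol mapping here).
        return (node.text or "").strip()
    if tag in ("math", "mrow"):
        return " ".join(kids)
    if tag == "mfrac":
        return r"\frac{%s}{%s}" % (kids[0], kids[1])
    if tag == "msup":
        return "{%s}^{%s}" % (kids[0], kids[1])
    if tag == "msub":
        return "{%s}_{%s}" % (kids[0], kids[1])
    if tag == "msqrt":
        return r"\sqrt{%s}" % " ".join(kids)
    raise ValueError("unsupported MathML element: %s" % tag)


def mathml_to_latex(xml_text):
    """Translate a small subset of presentation MathML to LaTeX."""
    return _to_latex(ET.fromstring(xml_text))
```

Unsupported elements raise an error instead of being dropped silently, so 
it is visible which constructs a document uses that the converter lacks.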


Georg


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug


On 02/10/14, 21:09 , Georg Baum wrote:
> Rainer M Krug wrote:
> 
>> On 02/09/14, 20:25 , Georg Baum wrote:
>>
>>> This is not possible. There are LyX features that simply do not appear in
>>> the exported LaTeX, so they can't be imported (e.g. branches or notes).
>>> It might be possible to support all LaTeX features, but the cost would be
>>> extremely high, so there will always be LaTeX files which can't be
>>> imported (usually the stuff found in .cls or .sty files).
>>
>> OK - in this regard you are right - haven't considered branches. But LyX
>> notes could be exported as LaTeX comments starting with %%LyX-Note%%.
>>
>> Branches: isn't there conditional compiling in LaTeX? In this way
>> branches could be kept and switched by activating these in the preamble?
> 
> Yes. IIRC there was even a discussion about how to translate branches into 
> LaTeX if-statements some time ago, so branches may not be the best 
> example. Anyway, there will always be features without a direct LaTeX 
> representation.
> 
>>>> So: yes, the round-trip framework could be used for a subset of features
>>>> initially for LyX <-> LaTeX, which can then be extended over time - I
>>>> guess this would be the easiest to start with, actually.
>>>
>>> This does not make sense IMHO. Why artificially restrict the roundtrip?
>>
>> Because, as you said above, some features in LyX can not be exported
>> into LaTeX and the other way round?
> 
> OK, if you meant these features I agree, then I probably misunderstood you 
> in the first place.
> 
>> In addition, the round-trip would be
>> needed mainly to edit content, and not that much formatting - how a
>> section header looks in Word or in LyX is irrelevant, as long as it is
>> recognised in the re-import / re-export for round-trip as a section
>> header. In contrast, when exporting (non-round trip) one wants a
>> document as similar as possible to the LyX / LaTeX pdf (in most cases).
> 
> This is a useful feature as well, but IMHO not restricted to roundtrip: Even 
> if you want to do a one-way export (e.g. because you know that somebody else 
> will continue to work on the document and it will never come back to you), a 
> switch similar to the "clean" option of writer2latex would be a good thing 
> to have.

I agree - there would be nothing stopping you from using the round-trip
export for a semantic export, as you define it below.

> 
>> You are right - LaTeX is a special case, as it is the default backend
>> for LyX. So there are stricter requirements for the round-trip, and
>> all improvements in the round-trip should immediately be in the LaTeX
>> importer as well. But the story is different with other backends, e.g.
>> docx, where, if you try to replicate the LaTeX view, you might end up with
>> "painted" documents which are not easy to re-import into LyX. But
>> for round trip, the look is not that relevant, as long as the content
>> and the structure can be re-imported.
> 
> I believe this all boils down to semantic export, as Stefano called it, vs. 
> "painted" export, and semantic export would be useful with roundtrip and 
> without.

Nothing to add here - a semantic export would be really a very useful
addition to LyX.

> 
> 
> Georg
> 

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread Rainer M Krug
I have the feeling, this discussion can be summed up in two lines:

1) we would like to have a round-trip (docx backend)
2) but we need a semantic export (to docx)

Rainer

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug





Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 2:36 PM, Rainer M Krug  wrote:

> I have the feeling, this discussion can be summed up in two lines:
>
> 1) we would like to have a round-trip (docx backend)
> 2) but we need a semantic export (to docx)
>

Or perhaps:
1) we would like to have a round-trip (docx backend)
2) but we need a semantic export (to docx)
3) so let's begin with (2)
4) and hopefully we will be halfway through to (1)

The important question is:

- which design decisions in (2) could prevent a successful roundtrip? How
do we avoid those?


S.



-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas A&M University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-10 Thread stefano franchi
On Mon, Feb 10, 2014 at 2:22 PM, Georg Baum
wrote:

> stefano franchi wrote:
>
> > Anyway: I am willing to "mentor" a student through the process of
> > producing a LyX-to-Word semantic-only exporter. Scare quotes are
> > necessary, because I would have to learn as much as the student. If Rob
> > can provide some guidance and expert advice (both as a previous mentor
> > and obviously as an expert in the area) I think we may have something
> > working by the end of the summer. What I know is that *I* will absolutely
> > need a solid Word exporter by that time.
>
> I don't have the time to be an official mentor, but I could help with
> technical advice as well. I also have some personal motivation, since my
> wife faces similar problems when exchanging documents with coworkers. And
> if nobody works on this for GSOC we may also be able to hack a quick and
> dirty version together that just works for you so that you don't need to
> go through pdf again. This is BTW exactly how I got involved with
> tex2lyx.
>
>

Thanks Georg. I will definitely need all the help I can get. And I fully
agree with your last point. In fact, disgusted by my lyx--pdf--doc
experience, I started reading on docx and odt formats and was trying to
come up with a strategy. I am glad we are having this discussion!


Cheers,

S.




-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas A&M University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Jerry wrote:

> One can hope (right?) that since a commonality between LyX and docx is
> math that this would be included on the feature set(s). On OS X, the new
> versions of Word have a built-in math typesetting capability (and thus no
> longer depends on MathType). Presumably this is allowed by the docx format
> and presumably this is also an aspect of Windows Word. Autonumbering would
> also be hoped for.

Of course this would be a nice feature, but I guess it could easily be a 
GSOC project on its own.

The native equations in docx use a format called OMML (Office Math Markup 
Language). According to 
http://blogs.msdn.com/b/brian_jones/archive/2006/10/12/comparison-of-openxml-math-and-mathml.aspx?Redirected=true 
there exists an XSLT style sheet to convert OMML to MathML. So, a docx->LyX 
converter could either directly translate OMML to LaTeX, or use the XSLT and 
translate MathML to LaTeX (I write LaTeX since math equations in LyX are 
directly stored in LaTeX syntax). Maybe some MathML->LaTeX converters are 
already available, but a quick search did not turn up anything. Fortunately 
this direction is much easier to implement than LaTeX->MathML!


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Rainer M Krug wrote:

> The idea would be that a round-trip framework is envisaged, which
> provides the facilities to easily expand it from one export backend
> (docx) to another (possibly odt? markdown?).
> 
> IMPORTANT: this would NOT change ANYTHING in the existing export /
> import features, as these are geared to export / import the documents as
> well as possible, maintaining as many features as possible in the
> document.
> 
> The round-trip would guarantee that:
> 
> A document authored in LyX would result in e.g. a docx with a LIMITED
> set of features, but that a re-import would result in the SAME .lyx
> file. Features and formats not supported by the backend should be stored
> in a metadata file.
> 
> The important point here is *limited set of features*!
> 
> In addition, the framework should be easily, possibly only by using
> config files, able to be extended to other formats.

I don't understand the difference between round trip and the existing 
export/import here. Why is it important? If the additional metadata is 
stored in a different file, it could simply be generated for the standard 
export, and be used by the standard import (if it exists).

The goal of the export/import is to support as many features as possible. 
This is needed for round trip as well. The only difference I see is the 
additional metadata file, so the roundtrip framework vs. export/import 
difference reduces to a switch whether the metadata file should be generated 
(for export) or used (for import). Or did I understand anything wrong?

> Yes - although I see one problem which I could not find in any of the
> .lyx <-> .docx : comments and track changes. These *have to be handled*.
> I somehow have the feeling, that an inclusion of comments and track
> changes into pandoc would be the best way forward...

I agree. Unfortunately pandoc is written in Haskell which reduces the number 
of possible contributors significantly (which does not mean that Haskell is 
a bad language, but that it is much less known than e.g. C++ or python).


Georg




Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Rainer M Krug wrote:

> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug rai...@krugs.de wrote:
>> 
>>> The idea would be that a round-trip framework is envisaged, which
>>> provides the facilities to easily expand it from one export backend
>>> (docx) to another (possibly odt? markdown?).
>> 
>> This sounds like a sort of testing framework which would indicate for
>> each export backend which features are exported and imported
>> successfully. It would be cool to have some matrix showing how mature
>> each of the supported formats is.
> 
> Nicely put! That would be brilliant. Not only formats, but converters:
> different converters convert different features.

Yes, such a matrix would indeed be a nice tool.
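
Just to make the idea concrete, a sketch of how such a matrix could be 
printed from per-converter test results. The input format (a dict of 
converter name to feature name to pass/fail) and the function name are 
assumptions for illustration; nothing like this exists in LyX as far as I know:

```python
def feature_matrix(results):
    """Render a plain-text matrix of round-trip results.

    `results` maps a converter name to a dict of feature -> bool, where
    True means the feature survived export and re-import. A feature that
    was never tested for a converter is reported as "no".
    """
    converters = sorted(results)
    features = sorted({f for r in results.values() for f in r})
    rows = [["feature"] + converters]
    for feature in features:
        cells = ["yes" if results[c].get(feature) else "no" for c in converters]
        rows.append([feature] + cells)
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    for row in rows:
        padded = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("  ".join(padded).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(feature_matrix({
        "docx": {"footnote": True, "branch": False},
        "odt": {"footnote": True},
    }))
```

The interesting work is of course in producing the True/False values, i.e. 
in comparing the original .lyx file with the re-imported one feature by 
feature.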

>> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?

Some, but I believe not many. The main LyX->LaTeX->LyX problems come from 
the fact that LaTeX as a macro language is really ugly to parse. Only some 
of them come from the fact that the exported LaTeX contains less information 
than the original LyX file. One feature where additional metadata would 
definitely help is branches.

> Partly - if the export to LaTeX is split from the round trip LyX <->
> LaTeX I would say yes, with the caveat that only a subset of features
> would be supported by the round trip. In contrast, export / import would
> (hopefully, sometime, in the case of import from LaTeX) support the full
> set of LyX and LaTeX features (possibly ugly in LyX).

This is not possible. There are LyX features that simply do not appear in 
the exported LaTeX, so they can't be imported (e.g. branches or notes). It 
might be possible to support all LaTeX features, but the cost would be 
extremely high, so there will always be LaTeX files which can't be imported 
(usually the stuff found in .cls or .sty files).

> So: yes, the round-trip framework could be used for a subset of features
> initially for LyX <-> LaTeX, which can then be extended over time - I
> guess this would be the easiest to start with, actually.

This does not make sense IMHO. Why artificially restrict the roundtrip?

The LyX-LaTeX-LyX roundtrip is special in the sense that the LaTeX-LyX 
step is very tightly integrated with LyX. Therefore it is indeed a good 
starting point, but not in the way of splitting off a separate roundtrip, 
but by extending the existing export/import with the additional metadata 
file you mentioned. The advantage would be that you would not need to put 
too much stuff into the metadata file, so it would be clear quickly the 
general approach works.


Georg



Re: GSOC 2014 project list: on LyX--docx roundtrip conversion

2014-02-09 Thread Georg Baum
Cyrille Artho wrote:

 The issue is that the target file will be edited before it's re-imported
 (otherwise there is no point in exporting the data to being with). This
 can make a clean re-import very challenging.
 
 For example:
 
 Good: lyx - latex: Store extra data as special LaTeX comments.

This is very difficult to get right (see 
http://www.lyx.org/trac/ticket/6059)

 Bad: lyx - something rather alien (.docx or such): If you need to
 store information in other files, how are the parts going to be
 reconstituted after the .docx file has been changed?

I don't believe that docx would be more difficult than .tex here. I bet that 
it is possible to insert some custom (nonprinting) marks, where the metadata 
file could refer to.

 Regardless of the number of files, the problem is much harder than just
 a reversible mapping. It has to survive a certain amount of editing. The
 same edits in the original and in the exported version should map to the
 same result after re-importing:
 
 file -  lyx2target - target
 
  |  |
  | edit | edit
  V  V
 
 file' - target2lyx - target'
 
 
 At least for some editing, this should be supported. I don't think it is
 necessary to be perfect here, so it can probably be achieved for many
 useful practical cases, but I also think it's harder than just
 converting back and forth.

This is indeed a very challenging part of the task.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Jerry wrote:

> One can hope (right?) that since a commonality between LyX and docx is
> math that this would be included on the feature set(s). On OS X, the new
> versions of Word have a built-in math typesetting capability (and thus no
> longer depends on MathType). Presumably this is allowed by the docx format
> and presumably this is also an aspect of Windows Word. Autonumbering would
> also be hoped for.

Of course this would be a nice feature, but I guess it could easily be a 
GSOC project on its own.

The native equations in docx use a format called OMML (Office MathML). 
According to 
http://blogs.msdn.com/b/brian_jones/archive/2006/10/12/comparison-of-openxml-math-and-mathml.aspx?Redirected=true
there exists an XSLT style sheet to convert OMML to MathML. So, a docx->LyX 
converter could either translate OMML to LaTeX directly, or use the XSLT and 
translate MathML to LaTeX (I write LaTeX since math equations in LyX are 
stored directly in LaTeX syntax). Maybe some MathML->LaTeX converters are 
already available, but a quick search did not turn up anything. Fortunately 
this direction is much easier to implement than LaTeX->MathML!
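
To give an idea of why the MathML->LaTeX direction is comparatively easy, here is a minimal sketch in Python. It is an illustration only, not an existing converter: it assumes the OMML has already been turned into presentation MathML, it handles only a handful of elements, and the names `to_latex`, `mathml_to_latex` and the `OPERATORS` table are made up for this example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Hypothetical, deliberately tiny operator table; a real converter needs far more.
OPERATORS = {"\u00d7": r"\times ", "\u2212": "-", "\u2264": r"\leq ", "\u2265": r"\geq "}


def local_name(element):
    """Strip the MathML namespace (if any) from an element tag."""
    return element.tag.replace(MATHML_NS, "")


def to_latex(element):
    """Recursively translate a small subset of presentation MathML to LaTeX."""
    name = local_name(element)
    children = [to_latex(child) for child in element]
    text = (element.text or "").strip()
    if name in ("math", "mrow"):
        return "".join(children)
    if name in ("mi", "mn"):
        return text
    if name == "mo":
        return OPERATORS.get(text, text)
    if name == "mfrac":
        return r"\frac{%s}{%s}" % (children[0], children[1])
    if name == "msup":
        return "%s^{%s}" % (children[0], children[1])
    if name == "msub":
        return "%s_{%s}" % (children[0], children[1])
    if name == "msqrt":
        return r"\sqrt{%s}" % "".join(children)
    # Anything outside the subset is reported instead of silently dropped.
    raise ValueError("unsupported MathML element: " + name)


def mathml_to_latex(source):
    """Parse a MathML string and return the LaTeX for the supported subset."""
    return to_latex(ET.fromstring(source))
```

Because MathML is already a tree, the translation is a plain recursive walk. The opposite direction would first have to parse LaTeX, macros included, which is where the real difficulty lies.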


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Rainer M Krug wrote:

> The idea would be that a round-trip framework is envisaged, which
> provides the facilities to easily expand it from one export backend
> (docx) to another (possibly odt? markdown?).
> 
> IMPORTANT: this would NOT change ANYTHING in the existing export /
> import features, as these are geared to export / import the documents as
> good as possible, with maintaining as many features as possible in the
> document.
> 
> The round-trip would guarantee that:
> 
> A document authored in LyX would result in a e.g. docx with a LIMITED
> set of features, but that a re-import would result in the SAME .lyx
> file. features and formats not supported by the backend should be stored
> in a metadata file.
> 
> The important point here is *limited set of features*!
> 
> In addition, the framework should be easily, possibly only by using
> config files, able to be extended to other formats.

I don't understand the difference between round trip and the existing 
export/import here. Why is it important? If the additional metadata is 
stored in a different file, it could simply be generated for the standard 
export, and be used by the standard import (if it exists).

The goal of the export/import is to support as many features as possible. 
This is needed for round trip as well. The only difference I see is the 
additional metadata file, so the roundtrip framework vs. export/import 
difference reduces to a switch whether the metadata file should be generated 
(for export) or used (for import). Or did I misunderstand something?

> Yes - although I see one problem which I could not find in any of the
> .lyx <-> .docx : comments and track changes. These *have to be handled*.
> I somehow have the feeling, that an inclusion of comments and track
> changes into pandoc would be the best way forward...

I agree. Unfortunately pandoc is written in Haskell, which reduces the number 
of possible contributors significantly (which does not mean that Haskell is 
a bad language, but that it is much less known than e.g. C++ or Python).


Georg




Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Rainer M Krug wrote:

> On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:
>> 
>>> The idea would be that a round-trip framework is envisaged, which
>>> provides the facilities to easily expand it from one export backend
>>> (docx) to another (possibly odt? markdown?).
>> 
>> This sounds like a sort of testing framework which would indicate for
>> each export backend which features are exported and imported
>> successfully. It would be cool to have some matrix showing how mature
>> each of the supported formats is.
> 
> Nicely put! That would be brilliant. Not only formats, but converters:
> different converters convert different features.

Yes, such a matrix would indeed be a nice tool.
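
As a rough illustration of what such a matrix could look like, here is a small Python sketch. The backend names, the feature list and the results are invented for the example; in practice they would come from actually exporting and re-importing test documents.

```python
# Hypothetical results: for each backend, the features that came back
# unchanged after an export followed by a re-import.
ROUNDTRIP_RESULTS = {
    "latex": {"sections", "lists", "math", "footnotes"},
    "docx": {"sections", "lists", "footnotes"},
    "odt": {"sections", "lists"},
}

FEATURES = ["sections", "lists", "footnotes", "math", "branches"]


def maturity_matrix(results, features):
    """Return rows of (backend, per-feature flags, fraction supported)."""
    rows = []
    for backend in sorted(results):
        flags = [feature in results[backend] for feature in features]
        rows.append((backend, flags, sum(flags) / len(features)))
    return rows


def format_matrix(rows, features):
    """Render the matrix as plain text, one backend per line."""
    lines = ["backend  " + "  ".join(features)]
    for backend, flags, score in rows:
        cells = "  ".join(
            ("yes" if ok else "no").ljust(len(feature))
            for ok, feature in zip(flags, features)
        )
        lines.append(f"{backend:<7}  {cells}  ({score:.0%})")
    return "\n".join(lines)
```

Calling `print(format_matrix(maturity_matrix(ROUNDTRIP_RESULTS, FEATURES), FEATURES))` gives one line per backend. The same structure could hold one row per converter instead of per format, as suggested in the quoted text.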

>> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?

Some, but I believe not many. The main LyX->LaTeX->LyX problems come from 
the fact that LaTeX as a macro language is really ugly to parse. Only some 
of them come from the fact that the exported LaTeX contains less information 
than the original LyX file. One feature where additional metadata would 
definitely help is branches.

> Partly - if the export to LaTeX is split from the round trip LyX <->
> LaTeX I would say yes, with the caveat, that only a subset of features
> would be supported by the round trip. In contrast, export - import would
> (hopefully sometime in the case of import from LaTeX) the full set of
> LyX and LaTeX features with (possibly ugly in LyX) the export / import.

This is not possible. There are LyX features that simply do not appear in 
the exported LaTeX, so they can't be imported (e.g. branches or notes). It 
might be possible to support all LaTeX features, but the cost would be 
extremely high, so there will always be LaTeX files which can't be imported 
(usually the stuff found in .cls or .sty files).

> So: yes, the round-trip framework could be used for a subset of features
> initially for LyX <-> LaTeX, which can then be extended over time - I
> guess this would be the easiest to start with, actually.

This does not make sense IMHO. Why artificially restrict the roundtrip?

The LyX->LaTeX->LyX roundtrip is special in the sense that the LaTeX->LyX 
step is very tightly integrated with LyX. Therefore it is indeed a good 
starting point, but not in the way of splitting off a separate roundtrip, 
but by extending the existing export/import with the additional metadata 
file you mentioned. The advantage would be that you would not need to put 
too much stuff into the metadata file, so it would quickly become clear 
whether the general approach works.


Georg



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-09 Thread Georg Baum
Cyrille Artho wrote:

> The issue is that the target file will be edited before it's re-imported
> (otherwise there is no point in exporting the data to begin with). This
> can make a clean re-import very challenging.
> 
> For example:
> 
> "Good": lyx -> latex: Store extra data as special LaTeX comments.

This is very difficult to get right (see 
http://www.lyx.org/trac/ticket/6059)

> "Bad": lyx -> something rather alien (.docx or such): If you need to
> store information in other files, how are the parts going to be
> reconstituted after the .docx file has been changed?

I don't believe that docx would be more difficult than .tex here. I bet that 
it is possible to insert some custom (nonprinting) marks that the metadata 
file could refer to.
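
A minimal sketch of that idea in Python, under loud assumptions: the exported document is modelled as a list of paragraphs, each carrying an invisible anchor id (in docx this might be a bookmark, but that is not verified here), and the sidecar is a JSON object keyed by anchor. The function names and the `lyx_only` field are hypothetical.

```python
import json


def export_with_anchors(lyx_paragraphs):
    """Split LyX paragraphs into exportable text plus a sidecar.

    Each paragraph is a dict with "text" and optional "lyx_only" data
    (a hypothetical stand-in for branches, notes, etc.).
    """
    exported, sidecar = [], {}
    for index, paragraph in enumerate(lyx_paragraphs):
        anchor = f"lyx-{index}"
        exported.append({"anchor": anchor, "text": paragraph["text"]})
        if paragraph.get("lyx_only"):
            sidecar[anchor] = paragraph["lyx_only"]
    return exported, json.dumps(sidecar)


def import_with_anchors(edited, sidecar_json):
    """Re-attach sidecar data to whatever anchors survived the editing."""
    sidecar = json.loads(sidecar_json)
    result = []
    for paragraph in edited:
        restored = {"text": paragraph["text"]}
        # New paragraphs have no anchor; pop() then simply finds nothing.
        extra = sidecar.pop(paragraph.get("anchor"), None)
        if extra is not None:
            restored["lyx_only"] = extra
        result.append(restored)
    # Anything left in the sidecar lost its anchor during editing.
    return result, sorted(sidecar)
```

Reordered or reworded paragraphs keep their metadata because the lookup goes through the anchor and not through the position. Deleted paragraphs show up in the second return value, so the importer can at least warn about them.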

> Regardless of the number of files, the problem is much harder than just
> a reversible mapping. It has to survive a certain amount of editing. The
> same edits in the original and in the exported version should map to the
> same result after re-importing:
> 
> file  -> lyx2target -> target
> 
>   |                      |
>   | edit                 | edit
>   V                      V
> 
> file' <- target2lyx <- target'
> 
> 
> At least for some editing, this should be supported. I don't think it is
> necessary to be perfect here, so it can probably be achieved for many
> useful practical cases, but I also think it's harder than just
> converting back and forth.

This is indeed a very challenging part of the task.
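
The quoted diagram can be read as a property that a test could check. The following Python sketch uses a toy model that is purely an assumption for illustration: a "LyX document" is a list of (style, text) pairs, the "target" is a list of text lines, and an edit is a simple search and replace.

```python
def lyx2target(document):
    """Toy export: headings become '# ...' lines, everything else plain lines."""
    return [("# " + text) if style == "section" else text for style, text in document]


def target2lyx(lines):
    """Toy import, the inverse of lyx2target for this two-style subset."""
    return [
        ("section", line[2:]) if line.startswith("# ") else ("standard", line)
        for line in lines
    ]


def edit_lyx(document, old, new):
    """The edit as performed inside LyX."""
    return [(style, text.replace(old, new)) for style, text in document]


def edit_target(lines, old, new):
    """The same edit as performed in the exported file."""
    return [line.replace(old, new) for line in lines]


def roundtrip_commutes(document, old, new):
    """True if editing in LyX equals editing the export and re-importing it."""
    via_lyx = edit_lyx(document, old, new)
    via_target = target2lyx(edit_target(lyx2target(document), old, new))
    return via_lyx == via_target
```

An ordinary text edit commutes in this toy model, but an edit that touches the exported markup itself (for example replacing "# " by nothing) does not, because that markup has no counterpart in the text of the LyX document. This is the kind of case that makes the task harder than a reversible mapping.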


Georg



Re: GSOC 2014 project list: on LyX--docx roundtrip conversion

2014-02-07 Thread Rainer M Krug


On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug rai...@krugs.de wrote:
> 
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
> 
> This sounds like a sort of testing framework which would indicate for
> each export backend which features are exported and imported
> successfully. It would be cool to have some matrix showing how mature
> each of the supported formats is.

Nicely put! That would be brilliant. Not only formats, but converters:
different converters convert different features.

>>> Perhaps we could define the goals as:
>>>
>>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
>>
>> Agreed.
>>
>>> 2. Write a corresponding lyx-layout
>>
>> As I said, non-supported formats / features should be available to the
>> user and handled gracefully, i.e. stored in a metadata file which will
>> be re-applied when re-importing the round-trip file.
> 
> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?

Partly - if the export to LaTeX is split from the round trip LyX <->
LaTeX I would say yes, with the caveat that only a subset of features
would be supported by the round trip. In contrast, export / import would
(hopefully, at some point, in the case of import from LaTeX) support the
full set of LyX and LaTeX features, even if the result is possibly ugly
in LyX.

So: yes, the round-trip framework could be used for a subset of features
initially for LyX <-> LaTeX, which can then be extended over time - I
guess this would be the easiest to start with, actually.

>> Yes - although I see one problem which I could not find in any of the
>> .lyx <-> .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
> 
> What is the problem you see ?

With pandoc? Actually none, only that the development work would need to
be done in pandoc and not LyX.

> Vincent



Re: GSOC 2014 project list: on LyX--docx roundtrip conversion

2014-02-07 Thread Cyrille Artho

Dear all,
I think we have had earlier discussions (a few months ago) that touched 
some of the problems, but I see some major challenges when having a 
roundtrip conversion where extra data cannot be stored in the target 
file itself.


The issue is that the target file will be edited before it's re-imported 
(otherwise there is no point in exporting the data to begin with). This 
can make a clean re-import very challenging.


For example:

"Good": lyx -> latex: Store extra data as special LaTeX comments.

A comment at the beginning of the file can let a LaTeX user know that 
some features (starting with %%LyX or such) should not be edited, 
because details are lost otherwise.


The reverse conversion should work well even if the LaTeX file is 
changed a lot, as a normal user can be expected to leave the extra 
comments where they belong.



"Bad": lyx -> something rather alien (.docx or such): If you need to 
store information in other files, how are the parts going to be 
reconstituted after the .docx file has been changed?



Regardless of the number of files, the problem is much harder than just 
a reversible mapping. It has to survive a certain amount of editing. The 
same edits in the original and in the exported version should map to the 
same result after re-importing:


   file  -> lyx2target -> target

     |                      |
     | edit                 | edit
     V                      V

   file' <- target2lyx <- target'


At least for some editing, this should be supported. I don't think it is 
necessary to be perfect here, so it can probably be achieved for many 
useful practical cases, but I also think it's harder than just 
converting back and forth.




Vincent van Ravesteijn wrote:
> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug rai...@krugs.de wrote:
>
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
>
> This sounds like a sort of testing framework which would indicate for
> each export backend which features are exported and imported
> successfully. It would be cool to have some matrix showing how mature
> each of the supported formats is.
>
>>> Perhaps we could define the goals as:
>>>
>>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
>>
>> Agreed.
>>
>>> 2. Write a corresponding lyx-layout
>>
>> As I said, non-supported formats / features should be available to the
>> user and handled gracefully, i.e. stored in a metadata file which will
>> be re-applied when re-importing the round-trip file.
>
> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?
>
>> Yes - although I see one problem which I could not find in any of the
>> .lyx <-> .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
>
> What is the problem you see ?
>
> Vincent



--
Regards,
Cyrille Artho - http://artho.com/
Those who will not reason, are bigots, those who cannot,
are fools, and those who dare not, are slaves.
-- George Gordon Noel Byron


Re: GSOC 2014 project list: on LyX--docx roundtrip conversion

2014-02-07 Thread Rainer M Krug
On 02/07/14, 11:11 , Cyrille Artho wrote:
> Dear all,
> I think we have had earlier discussions (a few months ago) that touched
> some of the problems, but I see some major challenges when having a
> roundtrip conversion where extra data cannot be stored in the target
> file itself.

To quote Coldplay and Willie Nelson (From The Scientist):

   Nobody said it was easy
   No one ever said it would be this hard

True - we had this discussion before and this is what the entry on the
wiki is based on.

I agree that there are big difficulties, and especially on the dark
side (read: the side we are exporting to) as we have no control over it.

 
> The issue is that the target file will be edited before it's re-imported
> (otherwise there is no point in exporting the data to begin with). This
> can make a clean re-import very challenging.

Absolutely.

 
> For example:
> 
> "Good": lyx -> latex: Store extra data as special LaTeX comments.
> 
> A comment at the beginning of the file can let a LaTeX user know that
> some features (starting with %%LyX or such) should not be edited,
> because details are lost otherwise.
> 
> The reverse conversion should work well even if the LaTeX file is
> changed a lot, as a normal user can be expected to leave the extra
> comments where they belong.
 

Exactly - that is what one can hope for.

 
> "Bad": lyx -> something rather alien (.docx or such): If you need to
> store information in other files, how are the parts going to be
> reconstituted after the .docx file has been changed?

Same way probably? Use notes and comments and convert these to LyX; as
they have a location, they will stay where they are. Then post-process
the LyX file by applying the %%LyX notes?

 
 
> Regardless of the number of files, the problem is much harder than just
> a reversible mapping. It has to survive a certain amount of editing. The
> same edits in the original and in the exported version should map to the
> same result after re-importing:
> 
>    file  -> lyx2target -> target
> 
>      |                      |
>      | edit                 | edit
>      V                      V
> 
>    file' <- target2lyx <- target'
 

I have no idea how this can be done elegantly, but I hope this can be
made easier by

a) using some type of tags in comments / notes / et al which all
programs have and
b) a certain discipline by the person editing the documents by not
deleting these. This applies to the LaTeX user as well as the Word user.

Nothing is foolproof.

 
> At least for some editing, this should be supported. I don't think it is
> necessary to be perfect here, so it can probably be achieved for many
> useful practical cases, but I also think it's harder than just
> converting back and forth.
 

You are definitely right here, and that is the reason why this proposal
/ idea sounds quite vague: it is a difficult problem (especially as I
think one should keep the structure very flexible, so that it will be
easy to support different backends) and there is no control over the
other side of the roundtrip. In LyX one could restrict the usable
features, display warnings on export for roundtrip, etc. But there is no
way this can be done in e.g. Word or LaTeX.

And I agree: one has to start with a small subset of features supported,
and then they can be extended.

That is why on http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014#toc1

there is a list of *suggested* features:


Features to be included in the round trip are:

sections, headers, ...
lists
emphasis, bold, ...
comments
track changes
tables and figures?
footnotes
bibliographic references
math?
cross-references


So I see the difficulties, but a system like this would be tremendously
useful to support roundtrips to many different backends.

Cheers,

Rainer

 
 

Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-07 Thread Rainer M Krug
As I was one of the ones who initially raised this issue, let me comment
on it.

On 02/06/14, 18:35 , stefano franchi wrote:
> The first project in our wiki page for GSOC 2014:
> 
> http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014
> 
> is, at the same time, the most ambitious and the least well-defined
> (possibly because the former implies the latter).

Exactly - this was thought of as a starting point to develop a GSoC
project - so it is not complete, but could also be split into different
aspects which can be treated separately.

> 
> It is not clear to me if this a coding project or if it defines the
> outline of a preliminary non-coding feasibility study that would, at
> most, produce a document describing the minimal-lyx and minimal-docx
> feature sets (plus, possibly, a minimal-lyx-layout and a
> minimal-doc-template).

Well - the entry is about the mainly non-coding framework, and the
project would be about implementing it.

> 
> Any thought on how to make it more focused?

The idea would be that a round-trip framework is envisaged, which
provides the facilities to easily expand it from one export backend
(docx) to another (possibly odt? markdown?).

IMPORTANT: this would NOT change ANYTHING in the existing export /
import features, as these are geared to export / import the documents as
well as possible, while maintaining as many features as possible in the
document.

The round-trip would guarantee that:

A document authored in LyX would result in, e.g., a docx with a LIMITED
set of features, but that a re-import would result in the SAME .lyx
file. Features and formats not supported by the backend should be stored
in a metadata file.

The important point here is *limited set of features*!

In addition, the framework should be easily extensible to other formats,
possibly only by using config files.

> 
> Perhaps we could define the goals as:
> 
> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)

Agreed.

> 2. Write a corresponding lyx-layout

As I said, non-supported formats / features should be available to the
user and handled gracefully, i.e. stored in a metadata file which will
be re-applied when re-importing the round-trip file.

> 3. Define a minimal-doc feature set  (Word/ODF features corresponding to (1)

Yes - but these have to be the same as in 1).

> 4, Write a Word/OO template (the set of styles corresponding to 2)

Might be a good idea.

> 5. Provide an automated path from 1 to 4 and back using glue-code and
> existing internal and external tools (e.g.: LyX export functions to
> XHTML/EPub, eLyxer, pandoc, writer2latex, etc).

Yes - although I see one problem: comments and track changes, which I
could not find supported in any of the .lyx <-> .docx converters. These
*have to be handled*. I somehow have the feeling that an inclusion of
comments and track changes into pandoc would be the best way forward...

> 
> I am not sure points 1-5 above capture the existing description, partly
> because I am not sure about what is meant by "develop a framework".
> Perhaps my summary captures the subgoal only?

Well - "framework" in the sense that the coded "framework" would not be
specific to .lyx <-> .docx but that it would be applicable to other
export backends as well:

Roundtrip from lyx:

.lyx
--> extract non-maintained formats / features and store these in
metadata / sidecar file (based on converter X)
--> convert .lyx to .??? using converter X
.???

and back after the .??? has been edited:

.???
--> extract non-maintained formats / features and store these in
metadata / sidecar file (based on converter Y)
--> convert .??? to .lyx
--> apply formats from metadata / sidecar file
.lyx

and back to .??? after editing:


.lyx
--> extract non-maintained formats / features and store these in
metadata / sidecar file (based on converter X)
--> convert .lyx to .??? using converter X
--> it would be great if there is a way of applying the sidecar file,
but I think this won't be possible
.???
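
The steps above could be sketched roughly as follows in Python. This is only a toy model under stated assumptions: a document is a list of (feature, content) pairs, each backend is described by the set of features its converter preserves, and the names `BACKENDS`, `export_roundtrip` and `import_roundtrip` are invented for the illustration.

```python
# Hypothetical backend descriptions: which features each converter preserves.
BACKENDS = {
    "docx": {"sections", "emphasis", "footnotes"},
    "markdown": {"sections", "emphasis"},
}


def export_roundtrip(document, backend):
    """Split a document into the part the backend keeps and a sidecar.

    `document` is a list of (feature, content) pairs in document order.
    """
    supported = BACKENDS[backend]
    exported, sidecar = [], []
    for feature, content in document:
        if feature in supported:
            exported.append((feature, content))
        else:
            # Remember the item and how many exported items precede it.
            sidecar.append({"after": len(exported), "item": [feature, content]})
    return exported, sidecar


def import_roundtrip(exported, sidecar):
    """Re-apply the sidecar to an (unedited) exported document."""
    result = list(exported)
    # Insert from the back so that earlier offsets stay valid.
    for entry in reversed(sidecar):
        feature, content = entry["item"]
        result.insert(entry["after"], (feature, content))
    return result
```

Adding another backend is then only a matter of adding an entry to the table, which is the "config file" idea. Note that the positional offsets used here only work as long as the exported file has not been edited; surviving edits would need something sturdier than positions.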

Hope this clarifies some questions,

Rainer

> 
> Stefano
> 
> 
> -- 
> __
> Stefano Franchi
> Associate Research Professor
> Department of Hispanic Studies Ph:   +1 (979) 845-2125
> Texas A&M University  Fax:  +1 (979) 845-6421
> College Station, Texas, USA
> 
> stef...@tamu.edu 
> http://stefano.cleinias.org



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-07 Thread Vincent van Ravesteijn
On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:

> The idea would be that a round-trip framework is envisaged, which
> provides the facilities to easily expand it from one export backend
> (docx) to another (possibly odt? markdown?).

This sounds like a sort of testing framework which would indicate for
each export backend which features are exported and imported
successfully. It would be cool to have some matrix showing how mature
each of the supported formats is.


>>
>> Perhaps we could define the goals as:
>>
>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
>
> Agreed.
>
>> 2. Write a corresponding lyx-layout
>
> As I said, non-supported formats / features should be available to the
> user and handled gracefully, i.e. stored in a metadata file which will
> be re-applied when re-importing the round-trip file.
>

Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?

>
> Yes - although I see one problem which I could not find in any of the
> .lyx <-> .docx : comments and track changes. These *have to be handled*.
> I somehow have the feeling, that an inclusion of comments and track
> changes into pandoc would be the best way forward...

What is the problem you see ?

Vincent


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-07 Thread Rainer M Krug


On 02/07/14, 10:49 , Vincent van Ravesteijn wrote:
> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:
> 
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
> 
> This sounds like a sort of testing framework which would indicate for
> each export backend which features are exported and imported
> successfully. It would be cool to have some matrix showing how mature
> each of the supported formats is.

Nicely put! That would be brilliant. Not only formats, but converters:
different converters convert different features.

> 
> 
>>>
>>> Perhaps we could define the goals as:
>>>
>>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
>>
>> Agreed.
>>
>>> 2. Write a corresponding lyx-layout
>>
>> As I said, non-supported formats / features should be available to the
>> user and handled gracefully, i.e. stored in a metadata file which will
>> be re-applied when re-importing the round-trip file.
>>
> 
> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?

Partly - if the export to LaTeX is split from the LyX <-> LaTeX round
trip, I would say yes, with the caveat that only a subset of features
would be supported by the round trip. In contrast, plain export / import
would cover the full set of LyX and LaTeX features (hopefully sometime
in the case of import from LaTeX), even if the result is possibly ugly
in LyX.

So: yes, the round-trip framework could be used for a subset of features
initially for LyX <-> LaTeX, which can then be extended over time - I
guess this would be the easiest to start with, actually.

> 
>>
>> Yes - although I see one problem which I could not find in any of the
>> .lyx <-> .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
> 
> What is the problem you see ?

With pandoc? Actually none, only that the development work would need to
be done in pandoc and not LyX.

> 
> Vincent
> 



Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-07 Thread Cyrille Artho

Dear all,
I think we have had earlier discussions (a few months ago) that touched 
some of the problems, but I see some major challenges when having a 
roundtrip conversion where extra data cannot be stored in the target 
file itself.


The issue is that the target file will be edited before it's re-imported 
(otherwise there is no point in exporting the data to begin with). This 
can make a clean re-import very challenging.


For example:

"Good": lyx -> latex: Store extra data as special LaTeX comments.

A comment at the beginning of the file can let a LaTeX user know that 
some features (starting with %%LyX or such) should not be edited, 
because details are lost otherwise.


The reverse conversion should work well even if the LaTeX file is 
changed a lot, as a normal user can be expected to leave the extra 
comments where they belong.
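
A minimal sketch of the comment approach, assuming one "%%LyX" comment line placed directly before the LaTeX line it belongs to. The payload format and the function names are made up for illustration; this is not how LyX exports today.

```python
MARKER = "%%LyX "  # prefix as suggested above; the payload after it is hypothetical


def export_with_comments(items):
    """items: list of (latex_line, lyx_only_info or None) pairs."""
    out = []
    for latex_line, info in items:
        if info is not None:
            out.append(MARKER + info)  # comment line travels with the next line
        out.append(latex_line)
    return "\n".join(out)


def import_with_comments(latex_text):
    """Pair each %%LyX comment with the line that follows it."""
    items, pending = [], None
    for line in latex_text.split("\n"):
        if line.startswith(MARKER):
            pending = line[len(MARKER):]
        else:
            items.append((line, pending))
            pending = None
    return items
```

As long as the LaTeX user edits the text but leaves the comment lines in place, the extra data comes back attached to the right line.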



"Bad": lyx -> something rather alien (.docx or such): If you need to 
store information in other files, how are the parts going to be 
reconstituted after the .docx file has been changed?



Regardless of the number of files, the problem is much harder than just 
a reversible mapping. It has to survive a certain amount of editing. The 
same edits in the original and in the exported version should map to the 
same result after re-importing:


   file  -> lyx2target -> target
    |                        |
    | edit                   | edit
    V                        V
   file' <- target2lyx <- target'


At least for some editing, this should be supported. I don't think it is 
necessary to be perfect here, so it can probably be achieved for many 
useful practical cases, but I also think it's harder than just 
converting back and forth.
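
The requirement in the diagram can be written down as a check. The sketch below uses a deliberately trivial toy format (paragraphs joined by blank lines); `lyx2target` and `target2lyx` are stand-ins named after the diagram, not real converters.

```python
def lyx2target(paragraphs):
    """Toy export: a list of paragraphs becomes one blank-line separated string."""
    return "\n\n".join(paragraphs)


def target2lyx(text):
    """Toy import: split on blank lines again."""
    return text.split("\n\n")


def commutes(doc, edit_lyx, edit_target):
    """True if editing the original and editing the exported file agree on re-import."""
    via_lyx = edit_lyx(doc)
    via_target = target2lyx(edit_target(lyx2target(doc)))
    return via_lyx == via_target
```

Even in this toy setting an innocent edit on the target side, such as inserting a blank line inside a paragraph, changes the structure seen on re-import.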




Vincent van Ravesteijn wrote:
> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:
>
>> The idea would be that a round-trip framework is envisaged, which
>> provides the facilities to easily expand it from one export backend
>> (docx) to another (possibly odt? markdown?).
>
> This sounds like a sort of testing framework which would indicate for
> each export backend which features are exported and imported
> successfully. It would be cool to have some matrix showing how mature
> each of the supported formats is.
>
>>>
>>> Perhaps we could define the goals as:
>>>
>>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
>>
>> Agreed.
>>
>>> 2. Write a corresponding lyx-layout
>>
>> As I said, non-supported formats / features should be available to the
>> user and handled gracefully, i.e. stored in a metadata file which will
>> be re-applied when re-importing the round-trip file.
>>
>
> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?
>
>>
>> Yes - although I see one problem which I could not find in any of the
>> .lyx <-> .docx : comments and track changes. These *have to be handled*.
>> I somehow have the feeling, that an inclusion of comments and track
>> changes into pandoc would be the best way forward...
>
> What is the problem you see ?
>
> Vincent



--
Regards,
Cyrille Artho - http://artho.com/
Those who will not reason, are bigots, those who cannot,
are fools, and those who dare not, are slaves.
-- George Gordon Noel Byron


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-07 Thread Rainer M Krug
On 02/07/14, 11:11 , Cyrille Artho wrote:
> Dear all,
> I think we have had earlier discussions (a few months ago) that touched
> some of the problems, but I see some major challenges when having a
> roundtrip conversion where extra data cannot be stored in the target
> file itself.

To quote Coldplay and Willie Nelson (From "The Scientist"):

   Nobody said it was easy
   No one ever said it would be this hard

True - we had this discussion before and this is what the entry on the
wiki is based on.

I agree that there are big difficulties, and especially on the "dark
side" (read: the side we are exporting to) as we have no control over it.

> 
> The issue is that the target file will be edited before it's re-imported
> (otherwise there is no point in exporting the data to begin with). This
> can make a clean re-import very challenging.

Absolutely.

> 
> For example:
> 
> "Good": lyx -> latex: Store extra data as special LaTeX comments.
> 
> A comment at the beginning of the file can let a LaTeX user know that
> some features (starting with %%LyX or such) should not be edited,
> because details are lost otherwise.
> 
> The reverse conversion should work well even if the LaTeX file is
> changed a lot, as a normal user can be expected to leave the extra
> comments where they belong.
> 

Exactly - that is what one can hope for.

> 
> "Bad": lyx -> something rather alien (.docx or such): If you need to
> store information in other files, how are the parts going to be
> reconstituted after the .docx file has been changed?

Probably the same way? Use notes and comments: as they have a location,
they will stay where they are. Convert these to LyX, then post-process
the LyX file by applying the %%LyX notes?

> 
> 
> Regardless of the number of files, the problem is much harder than just
> a reversible mapping. It has to survive a certain amount of editing. The
> same edits in the original and in the exported version should map to the
> same result after re-importing:
> 
>    file  -> lyx2target -> target
>     |                        |
>     | edit                   | edit
>     V                        V
>    file' <- target2lyx <- target'
>
> 

I have no idea how this can be done elegantly, but I hope this can be
made easier by

a) using some type of tags in comments / notes / et al which all
programs have and
b) a certain discipline by the person editing the documents by not
deleting these. This applies to the LaTeX user as well as the Word user.

Nothing is foolproof.
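
Since the tags can get deleted, the import side could at least notice which ones went missing. A small sketch, assuming a made-up inline tag syntax `[[lyx:ID]]` (in a real docx the tags would more likely live in comments or notes):

```python
import re

# Hypothetical tag syntax; any marker that survives editing would do.
TAG = re.compile(r"\[\[lyx:(\w+)\]\]")


def missing_tags(exported_text, edited_text):
    """Return the ids of tags present at export time but gone after editing."""
    before = set(TAG.findall(exported_text))
    after = set(TAG.findall(edited_text))
    return sorted(before - after)
```

The importer could then warn about the features it cannot restore instead of silently dropping them.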

> 
> At least for some editing, this should be supported. I don't think it is
> necessary to be perfect here, so it can probably be achieved for many
> useful practical cases, but I also think it's harder than just
> converting back and forth.
> 

You are definitely right here, and that is the reason why this proposal
/ idea sounds quite vague: it is a difficult problem (especially as I
think one should keep the structure very flexible, so that it will be
easy to support different backends) and there is no control over the
other side of the roundtrip. In LyX one could restrict the features
usable, display warnings on export for roundtrip, etc. But there is no
way this can be done in e.g. Word or LaTeX.

And I agree: one has to start with a small subset of features supported,
and then they can be extended.

That is why on http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014#toc1

there is a list of *suggested* features:


Features to be included in the round trip are:

sections, headers, ...
lists
emphasis, bold, ...
comments
track changes
tables and figures?
footnotes
bibliographic references
math?
cross-references
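
A warning on export, as mentioned further up, could start from this list. The sketch below is only an illustration: the feature names are shortened from the list, the two entries with a question mark are treated as not supported (an assumption), and `export_warnings` is a hypothetical helper, not a LyX function.

```python
# Shortened names from the list above; "tables and figures?" and "math?"
# are left out because they carry a question mark (assumption).
ROUNDTRIP_FEATURES = {
    "sections",
    "lists",
    "emphasis",
    "comments",
    "track changes",
    "footnotes",
    "bibliographic references",
    "cross-references",
}


def export_warnings(used_features):
    """Return one warning per feature used in the document but not in the round trip."""
    unsupported = sorted(set(used_features) - ROUNDTRIP_FEATURES)
    return [f"'{name}' is not covered by the round trip" for name in unsupported]
```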


So I see the difficulties, but a system like this would be tremendously
useful to support roundtrips to many different backends.

Cheers,

Rainer

> 
> 
> Vincent van Ravesteijn wrote:
>> On Fri, Feb 7, 2014 at 10:02 AM, Rainer M Krug  wrote:
>>
>>> The idea would be that a round-trip framework is envisaged, which
>>> provides the facilities to easily expand it from one export backend
>>> (docx) to another (possibly odt? markdown?).
>>
>> This sounds like a sort of testing framework which would indicate for
>> each export backend which features are exported and imported
>> successfully. It would be cool to have some matrix showing how mature
>> each of the supported formats is.
>>
>>

>>>> Perhaps we could define the goals as:
>>>>
>>>> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX
>>>> features)
>>>
>>> Agreed.
>>>
>>>> 2. Write a corresponding lyx-layout
>>>
>>> As I said, non-supported formats / features should be available to the
>>> user and handled gracefully, i.e. stored in a metadata file which will
>>> be re-applied when re-importing the round-trip file.
>>>
>>
>> Would this also solve some of the LyX->LaTeX->LyX roundtrip issues ?
>>
>>>
>>> Yes - although I see one problem which I could not find in any of the
>>> .lyx <-> .docx : comments 


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-06 Thread Liviu Andronic
On Thu, Feb 6, 2014 at 6:35 PM, stefano franchi
 wrote:
> The first project in our wiki page for GSOC 2014:
>
> http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014
>
> is, at the same time, the most ambitious and the least well-defined
> (possibly because the former implies the latter).
>
> It is not clear to me if this a coding project or if it defines the outline
> of a preliminary non-coding feasibility study that would, at most, produce a
> document describing the minimal-lyx and minimal-docx feature sets (plus,
> possibly, a minimal-lyx-layout and a minimal-doc-template).
>
> Any thought on how to make it more focused?
>
Rob has already done some work on this, with a working first release:
http://blog.oak-tree.us/index.php/2012/03/08/word2lyx01-2

Maybe that could be a starting point, and perhaps he has some pointers
on what can/needs to be done.

Liviu


> Perhaps we could define the goals as:
>
> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
> 2. Write a corresponding lyx-layout
> 3. Define a minimal-doc feature set (Word/ODF features corresponding to (1))
> 4. Write a Word/OO template (the set of styles corresponding to 2)
> 5. Provide an automated path from 1 to 4 and back using glue-code and
> existing internal and external tools (e.g.: LyX export functions to
> XHTML/EPub, eLyxer, pandoc, writer2latex, etc).
>
> I am not sure points 1-5 above capture the existing description, partly
> because I am not sure about what is meant by "develop a framework". Perhaps
> my summary captures the subgoal only?
>
> Stefano
>
>
> --
> __
> Stefano Franchi
> Associate Research Professor
> Department of Hispanic Studies Ph:   +1 (979) 845-2125
> Texas A&M University  Fax:  +1 (979) 845-6421
> College Station, Texas, USA
>
> stef...@tamu.edu
> http://stefano.cleinias.org



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-06 Thread stefano franchi
On Thu, Feb 6, 2014 at 12:36 PM, Liviu Andronic wrote:

> On Thu, Feb 6, 2014 at 6:35 PM, stefano franchi
>  wrote:
> > The first project in our wiki page for GSOC 2014:
> >
> > http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014
> >
> > is, at the same time, the most ambitious and the least well-defined
> > (possibly because the former implies the latter).
> >
> > It is not clear to me if this a coding project or if it defines the
> outline
> > of a preliminary non-coding feasibility study that would, at most,
> produce a
> > document describing the minimal-lyx and minimal-docx feature sets (plus,
> > possibly, a minimal-lyx-layout and a minimal-doc-template).
> >
> > Any thought on how to make it more focused?
> >
> Rob has already done some work on this, with a working first release:
> http://blog.oak-tree.us/index.php/2012/03/08/word2lyx01-2
>
> Maybe that could be a starting point, and perhaps he has some pointers
> on what can/needs to be done.
>
>
This is great news. I was unaware of Rob's work in this area. I guess it
takes care of both points (3) and (4) (and half of (5)) in my list, at
least in a preliminary way. Perhaps we could focus on Python skills and on
familiarity with LyX/LaTeX and Word formats?
If so, I'd be willing to be a potential mentor for the project, unless Rob
wants to step in.
The conversion to Word and the correlated problems when cooperating with
non-LyX-using colleagues is by far the biggest problem I have with LyX and
I would love to see some progress on it.
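
For the remaining half of point (5), the glue code around an external tool might look roughly like this. It is a sketch under assumptions: it only covers a LaTeX <-> docx step via pandoc (using pandoc's `-f`, `-t` and `-o` options), it assumes the .tex file has already been exported from LyX, and the helper names are hypothetical.

```python
import subprocess


def pandoc_command(src, dst, from_fmt, to_fmt):
    """Build the pandoc command line for one conversion step."""
    return ["pandoc", "-f", from_fmt, "-t", to_fmt, "-o", dst, src]


def convert(src, dst, from_fmt, to_fmt):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(src, dst, from_fmt, to_fmt), check=True)


# Example (requires pandoc to be installed):
#   convert("paper.tex", "paper.docx", "latex", "docx")
#   convert("paper.docx", "paper_back.tex", "docx", "latex")
```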

Cheers,

Stefano


-- 
__
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies Ph:   +1 (979) 845-2125
Texas A&M University  Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org


Re: GSOC 2014 project list: on LyX<-->docx roundtrip conversion

2014-02-06 Thread Jerry

On Feb 6, 2014, at 10:35 AM, stefano franchi  wrote:

> The first project in our wiki page for GSOC 2014:
> 
> http://wiki.lyx.org/GSoC/GSoCProjectIdeasFor2014
> 
> is, at the same time, the most ambitious and the least well-defined (possibly 
> because the former implies the latter).
> 
> It is not clear to me if this a coding project or if it defines the outline 
> of a preliminary non-coding feasibility study that would, at most, produce a 
> document describing the minimal-lyx and minimal-docx feature sets (plus, 
> possibly, a minimal-lyx-layout and a minimal-doc-template).
> 
> Any thought on how to make it more focused?
> 
> Perhaps we could define the goals as:
> 
> 1. Define a minimal-lyx feature set (I.e. the supported LyX/LaTeX features)
> 2. Write a corresponding lyx-layout
> 3. Define a minimal-doc feature set (Word/ODF features corresponding to (1))
> 4. Write a Word/OO template (the set of styles corresponding to 2)
> 5. Provide an automated path from 1 to 4 and back using glue-code and 
> existing internal and external tools (e.g.: LyX export functions to 
> XHTML/EPub, eLyxer, pandoc, writer2latex, etc).
> 
> I am not sure points 1-5 above capture the existing description, partly 
> because I am not sure about what is meant by "develop a framework". Perhaps 
> my summary captures the subgoal only?
> 
> Stefano

One can hope (right?) that, since math is a commonality between LyX and docx, 
it would be included in the feature set(s). On OS X, the new versions of 
Word have a built-in math typesetting capability (and thus no longer depend on 
MathType). Presumably this is allowed by the docx format and presumably this is 
also an aspect of Windows Word. Autonumbering would also be hoped for.

Jerry

> 
> 
> -- 
> __
> Stefano Franchi
> Associate Research Professor
> Department of Hispanic Studies Ph:   +1 (979) 845-2125
> Texas A&M University  Fax:  +1 (979) 845-6421
> College Station, Texas, USA
> 
> stef...@tamu.edu
> http://stefano.cleinias.org