On Monday, 6 March 2017 22.27.25 WET Enrico Forestieri wrote:
> The problem is that the file starts with a U+FEFF BOM, and most
> probably lyx2lyx doesn't expect to deal with BOMs. Actually, the
> conversion succeeds, but the BOM is left in place (it now appears
> on the second line, though). Consequently, LyX then chokes on reading it.
> It suffices to remove the bogus second line of the converted file to make
> it perfectly readable by LyX 2.2.
> LyX 2.1 is able to read it because no conversion is necessary, and the
> lexer skips the BOM when it is at the beginning of the file.

Thank you for noticing that. :-)

The next question is how it got there. I assume, of course, that Windows
was involved somewhere. :-)

Is this a valid process that we should support?

If so, the following patch fixes this for me, for both Python 2 and 3.

-- 
José Abílio
diff --git a/lib/lyx2lyx/LyX.py b/lib/lyx2lyx/LyX.py
index 77ccdd0..b70fa3f 100644
--- a/lib/lyx2lyx/LyX.py
+++ b/lib/lyx2lyx/LyX.py
@@ -29,6 +29,7 @@ import sys
 import re
 import time
 import io
+import codecs
 
 try:
     import lyx2lyx_version
@@ -525,6 +526,11 @@ class LyX_base:
         initial_comment = " ".join(["#LyX %s created this file." % version__,
                                     "For more info see http://www.lyx.org/"])
 
+        # Remove UTF8 BOM marker if present
+        text = unicode if PY2 else str
+        if text(self.header[0]).encode("utf-8").startswith(codecs.BOM_UTF8):
+            self.header[0] = self.header[0][1:]
+
         # Simple heuristic to determine the comment that always starts
         # a lyx file
         if self.header[0].startswith("#"):
@@ -547,7 +553,7 @@ class LyX_base:
             if PY2:
                 result = fileformat.match(line)
             else:
-                result = fileformat.match(line.decode('ascii'))
+                result = fileformat.match(line.decode('ascii','ignore'))
             if result:
                 return self.lyxformat(result.group(1))
         else:
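
For anyone who wants to try the idea outside lyx2lyx, here is a minimal
standalone sketch of the two things the patch does: dropping a leading UTF-8
BOM from the first line, and decoding as ASCII while ignoring non-ASCII bytes.
The helper name strip_utf8_bom and the sample "#LyX 2.1" and "\lyxformat 474"
lines are only illustrative assumptions; none of this is part of lyx2lyx
itself.

```python
import codecs


def strip_utf8_bom(first_line):
    """Return first_line without a leading UTF-8 BOM (hypothetical helper).

    Accepts either bytes or text, so it behaves the same way under
    Python 2 and Python 3.
    """
    if isinstance(first_line, bytes):
        # Raw bytes: the BOM is the three-byte sequence EF BB BF.
        if first_line.startswith(codecs.BOM_UTF8):
            return first_line[len(codecs.BOM_UTF8):]
        return first_line
    # Decoded text: the BOM is the single code point U+FEFF.
    if first_line.startswith(u"\ufeff"):
        return first_line[1:]
    return first_line


# Sample inputs (made up for illustration).
raw_header = codecs.BOM_UTF8 + b"#LyX 2.1 created this file."
clean_header = strip_utf8_bom(raw_header)

# Decoding with errors="ignore" silently drops the three BOM bytes,
# since none of them is valid ASCII.
format_line = (codecs.BOM_UTF8 + b"\\lyxformat 474").decode("ascii", "ignore")
```

Note that decoding with "ignore" drops every non-ASCII byte, not only a BOM,
so it is only appropriate for lines such as the format line that are expected
to be pure ASCII anyway.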
