Support of csv files with .xls extension

2013-10-24 Thread Maxim Monastirsky
Hi All,
Starting with 4.1, csv files with an xls extension, as well as tsv files, are
opened in Writer instead of Calc when opened from the Start Center or from the
CLI without '--calc'. While it's clear that the tsv part should be fixed (BTW
it's a trivial fix [1]), I'm not sure about csv files with an xls extension.
Users shouldn't expect a file to still be recognized as csv by default after
its extension has been changed to something arbitrary. On the other hand, the
fact that a user opened a bug for it might indicate that some app or web app
generates such files. So what do you think? Should it be fixed, or should we
close the bug as WONTFIX?

The relevant bug reports are:
https://bugs.freedesktop.org/show_bug.cgi?id=68903 (csv with .xls extension)
https://bugs.freedesktop.org/show_bug.cgi?id=69290 (tsv)

Best Regards,
Maxim Monastirsky

(When responding, please be aware that I'm not subscribed to the list)

[1] https://bugs.freedesktop.org/show_bug.cgi?id=68903#c8
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: Support of csv files with .xls extension

2013-10-24 Thread Kohei Yoshida
On Thu, 2013-10-24 at 10:38 +0300, Maxim Monastirsky wrote:
> Hi All,
> Starting with 4.1 csv files with xls extension & also tsv files are opened
> using writer instead of calc, when opening from the start center or from cli
> without '--calc'. While it's clear that the tsv part should be fixed (BTW
> it's a trivial fix [1]), I'm not sure about csv files with xls extension.
> Users shouldn't expect that after changing the file extension to some random
> extension, it still would be recognized as a csv by default. But on the other
> hand the fact that some user opened a bug for it, might indicate that some
> app\web app generates such files. So what do you think? Should it be fixed,
> or should we close the bug as WONTFIX?
> 
> The relevant bug reports are:
> https://bugs.freedesktop.org/show_bug.cgi?id=68903 (csv with .xls extension)
> https://bugs.freedesktop.org/show_bug.cgi?id=69290 (tsv)

Well, when the app name is not given, the current detection code relies
on the extension to decide which app to open the file with (as you
already noticed).  I believe we associate "csv" to Calc in this case, 

http://cgit.freedesktop.org/libreoffice/core/tree/filter/source/textfilterdetect/filterdetect.cxx#n75

so I would say it's reasonable to associate "tsv" and "xls" with Calc as
well.
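
As a rough illustration of the extension-based association described above,
here is a minimal C++ sketch. It is not the code in filterdetect.cxx: the
names App, extensionOf and appForPlainText are made up for this example, and
it only looks at the file name, on the assumption that the content has
already been identified as plain text.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical type for this sketch; not a LibreOffice identifier.
enum class App { Writer, Calc };

// Return the lowercased extension of a file name, without the dot.
// Returns an empty string when the name contains no dot.
std::string extensionOf(const std::string& fileName)
{
    const auto dot = fileName.find_last_of('.');
    if (dot == std::string::npos)
        return {};
    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Choose an application for a plain-text file when no app name was given.
// "csv" is the existing association mentioned in the thread; "tsv" and
// "xls" are the additions under discussion.
App appForPlainText(const std::string& fileName)
{
    const std::string ext = extensionOf(fileName);
    if (ext == "csv" || ext == "tsv" || ext == "xls")
        return App::Calc;
    return App::Writer;
}

// Usage: the three extensions from the thread map to Calc, others to Writer.
void demoAppForPlainText()
{
    assert(appForPlainText("export.xls") == App::Calc);
    assert(appForPlainText("table.tsv") == App::Calc);
    assert(appForPlainText("notes.txt") == App::Writer);
}
```

The comparison is done on a lowercased extension so that "DATA.TSV" and
"data.tsv" are treated alike; whether the real detection code is
case-insensitive is not stated in this thread.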

Feel free to propose a patch for this and push it to gerrit.  It should
be a very easy fix.

Kohei



Re: Support of csv files with .xls extension

2013-10-24 Thread Maxim Monastirsky
Hi Kohei, great that you responded!

Could you also take a look at
https://bugs.freedesktop.org/show_bug.cgi?id=70100 ? There is a file which
Excel 2010 detects as an 'Excel 2' sheet. LO also sets the excel4 filter for
it, but detection fails in the isExcel40 function from
sc/source/ui/unoobj/exceldetect.cxx because the file has a value like in BIFF5.

Thanks in advance,
Maxim Monastirsky


Re: Support of csv files with .xls extension

2013-10-24 Thread Kohei Yoshida
On Thu, 2013-10-24 at 16:34 +0300, Maxim Monastirsky wrote:
> Hi Kohei, great that you responded!
> 
> Could you also take a look at 
> https://bugs.freedesktop.org/show_bug.cgi?id=70100. There is a file which is 
> detected by Excel 2010 as 'Excel 2' sheet and LO also sets the excel4 filter, 
> but it fails in isExcel40 function from sc/source/ui/unoobj/exceldetect.cxx 
> because it has a value like in BIFF5.

The key here is that the bug report mentions that this xls file is
generated by third-party software, which may write an incorrect BOF ID
in the files it generates.

Ultimately the answer lies here:
http://opengrok.libreoffice.org/xref/core/sc/source/filter/excel/read.cxx#145

Looks like our import filter code itself ignores this BOF ID and lumps
0x0809 together with the IDs for BIFF2-4.  If that's what the import
filter does, then we should probably do the same in the file format
detection code as well, i.e. add 0x0809 to the list of "correct" BOF IDs
for Excel 4.0 format or earlier.
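
To make the idea concrete, here is a minimal C++ sketch of such a BOF ID
check. It is not the actual isExcel40 code, which may check more than the
ID; the names readLE16 and isLegacyBofId are made up for this example. The
ID values are the standard BIFF BOF record IDs, read as a little-endian
16-bit value from the first two bytes of the stream.

```cpp
#include <cassert>
#include <cstdint>

// BOF record IDs used by the BIFF formats.
constexpr std::uint16_t BOF_BIFF2 = 0x0009;
constexpr std::uint16_t BOF_BIFF3 = 0x0209;
constexpr std::uint16_t BOF_BIFF4 = 0x0409;
constexpr std::uint16_t BOF_BIFF5 = 0x0809; // normally BIFF5 and later

// Combine two bytes into a little-endian 16-bit value.
constexpr std::uint16_t readLE16(unsigned char lo, unsigned char hi)
{
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Accept the BIFF2-4 IDs and, to match the import filter's leniency,
// also 0x0809 as written by some third-party generators.
constexpr bool isLegacyBofId(std::uint16_t id)
{
    return id == BOF_BIFF2 || id == BOF_BIFF3 || id == BOF_BIFF4
        || id == BOF_BIFF5;
}

// Usage: a BIFF4 header and the "wrong" 0x0809 header both pass, while
// the first two bytes of an OLE compound file (D0 CF) do not.
void demoLegacyBofId()
{
    assert(isLegacyBofId(readLE16(0x09, 0x04)));
    assert(isLegacyBofId(readLE16(0x09, 0x08)));
    assert(!isLegacyBofId(readLE16(0xD0, 0xCF)));
}
```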

This shouldn't affect detection of more recent Excel file formats, since
those formats use OLE to store workbook and other streams, and we use
that to detect those versions, not the BOF IDs.

As with the earlier bug you cited, this one should be an easy fix as
well, and I believe you already have a fix for this?

Kohei

