Eike Rathke wrote:
Hi William,

On Saturday, 2008-11-08 23:48:43 +0000, William S Fulton wrote:

> I've had a good look at many of these and have posted a new patch fixing
> various multiline problems.

This is great! I think that patch does it, other opinions?

> It includes some
> subtle changes which I hope are okay as they bring these non-Calc
> formats in line with other spreadsheet programs, I've been looking
> closely at Excel, Gnumeric and Quattro Pro. Excel is definitely the most
> polished and I've mostly based compatibility on this program. Of
> particular note is the unformatted text and SYLK quoting convention
> change.

Bringing SYLK closer to what Excel reads/writes is always good. I'm not
sure about the unformatted text, is the quoting convention now what
Excel writes to the clipboard if requested?

The quoting convention for unformatted text copied to the clipboard is now identical to Excel's, as far as I can see. The patch adds quoting of any text that contains newline characters.

> The biggest area for change though is DDE links and I need some help
> here before implementing them. Firstly, tabs within a cell are broken in
> the current versions of Calc and the problems are closely related to
> newline characters within cells. Excel deals with both tabs and newlines
> in cells and as this is a working solution, I'd like to know if there is
> anyone who can provide some information as to how it works. Somehow it
> is doing the impossible, here is why...
>
> If a cell contains either a newline (\n) or a tab (\t), it escapes the
> entire contents with an opening and a closing quote ("). If a cell is
> quoted like this and it contains a quote character in the contents, then
> the quote is escaped by double quoting, ie " is replaced by "".

So far these are the CSV conventions, as if tab instead of comma were
used as the field separator (TSV).

Yes, I can see CSV is similar to TSV. I didn't make any changes for CSV because when Calc saves to CSV format all text is always quoted (with the quote character chosen in the export text files dialog). There is a slight difference to Excel here, as Excel will only quote when necessary, e.g. when text contains a comma or newline.

> Note that within cells, a newline is represented by \n, not \r\n, even
> though this is Windows. The end of a line, however, is designated by
> \r\n and cells are separated by \t. My latest patch has replicated this
> protocol when copying text. With this info in mind, consider two
> adjacent cells both containing three single quotes and another cell
> containing a tab within quotes, so visually where | indicates the
> division between cells, the contents are:
>
> 1) """|"""
> 2) "\t"
>
> When copying or dde linking using unformatted text, we get the following
> for both:
>
> """\t"""\r\n
>
> So it is impossible to distinguish these two sets of contents.
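
The collision described above can be reproduced with a minimal Python sketch. It assumes the quoting rule exactly as stated in this mail (quote a cell only if it contains \t or \n, and double embedded quotes only inside a quoted cell). The function names are made up for illustration and are not taken from Calc or Excel.

```python
def quote_cell(text):
    """Quote one cell following the rule described in the mail (assumed)."""
    if "\t" in text or "\n" in text:
        # Only quoted cells get their embedded quotes doubled.
        return '"' + text.replace('"', '""') + '"'
    # Unquoted cells are emitted verbatim, quote characters included.
    return text


def encode_row(cells):
    """Join cells with tabs and terminate the row with CRLF."""
    return "\t".join(quote_cell(c) for c in cells) + "\r\n"


two_cells = ['"""', '"""']  # case 1: two cells, three quote characters each
one_cell = ['"\t"']         # case 2: one cell holding quote, tab, quote

print(repr(encode_row(two_cells)))
print(repr(encode_row(one_cell)))
print(encode_row(two_cells) == encode_row(one_cell))  # True: same bytes
```

Both rows serialise to the same text, so a receiver that sees only this payload cannot tell which layout was sent.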

Following CSV rules, quotes would be escaped by doubling them. This
would give:

1) """"""""\t""""""""
2) """\t"""

But that does not seem to be what Excel delivers with DDE?

Correct. Excel's quoting convention for unformatted text differs from its CSV convention. The DDE format used when Excel is communicating with Calc seems to be unformatted text, although to confuse matters I see that a SYLK request is received by Calc. There is a little test program that uses the unformatted text and CSV protocols for DDE links to Excel in xlddec.zip <http://www.angelfire.com/biz/rhaminisys/binaries/xlddec.zip> at http://www.angelfire.com/biz/rhaminisys/ddeinfo.html. Using this, Excel seems to apply inconsistent quoting conventions depending on whether it acts as DDE client or server, so I don't think Excel is a good model for CSV quoting.
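
For comparison, the CSV-style convention Eike describes (every field quoted, embedded quotes doubled, tab as separator) can be sketched with Python's standard csv module. This illustrates the CSV rules only. It is not what Excel sends over DDE, and the helper names are hypothetical.

```python
import csv
import io


def encode_tsv(rows):
    """Write rows as tab-separated text, quoting every field CSV-style."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter="\t", quoting=csv.QUOTE_ALL,
                        lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()


def decode_tsv(text):
    """Parse tab-separated, CSV-quoted text back into rows of cells."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))


case1 = [['"""', '"""']]  # two cells, three quote characters each
case2 = [['"\t"']]        # one cell holding quote, tab, quote

print(repr(encode_tsv(case1)))
print(repr(encode_tsv(case2)))
print(decode_tsv(encode_tsv(case1)) == case1)  # round trip is lossless
print(decode_tsv(encode_tsv(case2)) == case2)
```

Because every field is quoted, the two cases serialise differently and both survive a round trip.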

> However, Excel always distinguishes them correctly when dde linking,
> though not when pasting. Initially I was wondering if it uses the 'item'
> information that comes alongside the dde data, eg "R1C1:R1C2" to help
> determine the number of rows/columns. However, that cannot be the
> explanation, as it also copes when the number of cells is fixed, as in
> this 3 cell case:
>
>
> 1) "\t"|"""|"""
> 2) """|"""|"\t"
>
> both of which result in the following dde data:
>
> """\t"""\t"""\t"""\r\n
>
> When simply copying into Excel, it does not always get it right, which I
> would expect. Also dde linking unformatted text from Word gives Excel
> problems, so the question is how does it solve it for dde linking, which
> contains the same textual data?
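
To see why knowing the cell count does not help, here is a rough Python sketch that enumerates every cell layout that could have produced a given row. It assumes the quoting rule as described in this mail. The functions are hypothetical and written only for this illustration.

```python
def read_cell(line, i):
    """Return every (value, end) reading of one cell starting at index i.

    `end` is the index of the following tab separator, or len(line).
    """
    readings = []
    # Reading 1: an unquoted cell runs up to the next tab or end of line.
    end = line.find("\t", i)
    if end == -1:
        end = len(line)
    if "\n" not in line[i:end]:  # a cell with a newline would be quoted
        readings.append((line[i:end], end))
    # Reading 2: a quoted cell, where "" stands for one quote character.
    if line.startswith('"', i):
        chars, j = [], i + 1
        while j < len(line):
            if line[j] != '"':
                chars.append(line[j])
                j += 1
            elif line.startswith('"', j + 1):
                chars.append('"')
                j += 2
            else:
                value = "".join(chars)
                at_boundary = j + 1 == len(line) or line[j + 1] == "\t"
                # The assumed encoder only quotes cells with a tab or newline.
                if at_boundary and ("\t" in value or "\n" in value):
                    readings.append((value, j + 1))
                break
    return readings


def all_parses(line, n_cells, start=0):
    """Return every tuple of n_cells values that could encode to `line`."""
    results = []
    for value, end in read_cell(line, start):
        if n_cells == 1:
            if end == len(line):
                results.append((value,))
        elif end < len(line):
            for rest in all_parses(line, n_cells - 1, end + 1):
                results.append((value,) + rest)
    return results


data = '"""\t"""\t"""\t"""'  # the payload from the mail, without the \r\n
for cells in all_parses(data, 3):
    print(cells)
```

For the 3 cell payload this yields three candidate layouts, including both from the mail, so a fixed cell count alone cannot disambiguate the data.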

I think it simply uses a different protocol when linking between two
Excel documents, as you noticed and mentioned in your other mail.

> I have a hunch it uses dde links using the SYLK format instead as when
> debugging paste linking unformatted text from Excel into Calc, a SYLK
> request arrives in addition to unformatted text. In my patch, I've fixed
> SYLK quoting, however, Calc's version of SYLK still does not match the
> standard approach used by Excel and I presume the original Multiplan, so
> I *think* the SYLK format is incorrect, so when dde linking to Excel
> from OOo, Excel doesn't get it right, but Excel to Excel it does.
>
> I've arrived at a juncture. Firstly, does anyone have a good insight
> into all this? Secondly, assuming the dde links are done using SYLK, is
> it okay to change this in OOo to match?

No, not unconditionally. You'd have to somehow distinguish between
a Calc-Calc link and other links that only understand unformatted text,
in sending or receiving.

Okay, I wasn't too clear on this and have answered in my reply to Niklas' mail.

> Finally, how does this relate to adding in the newline support? Well,
> Calc uses \n as the line terminator on Unix and \r\n on Windows for
> unformatted text copy/paste and linking. If \n exists within cells and
> is escaped with quotes as on Windows, then the same problem arises as I
> showed above with tabs in not being able to determine if \n is the end
> of the line or a new line within a cell. That would mean for dde
> linking, \r\n would need to be used on Unix (\n is used at the moment),
> but this may not be such a surprise given that dde linking is a Windows
> protocol.
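
The line terminator issue can be illustrated with a small Python sketch, again assuming the quoting rule described earlier in this mail. The names are hypothetical. It compares two rows of one cell each against a single cell containing a newline, once with \n and once with \r\n as the row terminator.

```python
def quote_cell(text):
    """Quote a cell only if it holds a tab or newline (assumed rule)."""
    if "\t" in text or "\n" in text:
        return '"' + text.replace('"', '""') + '"'
    return text


def encode(rows, eol):
    """Serialise rows of cells, ending each row with the given terminator."""
    return "".join("\t".join(quote_cell(c) for c in row) + eol
                   for row in rows)


two_rows = [['"""'], ['"""']]  # two rows, one cell of three quotes each
one_row = [['"\n"']]           # one cell holding quote, newline, quote

for eol in ("\n", "\r\n"):
    same = encode(two_rows, eol) == encode(one_row, eol)
    print(repr(eol), "collide" if same else "distinct")
```

With \n as the terminator the two layouts collide, while with \r\n they differ. This only removes the row terminator ambiguity. The tab ambiguity shown earlier is unaffected.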

Indeed. I'm also not aware of any application other than OOo supporting
DDE on other platforms, except OS/2 that has the same line end
convention.


> A couple more related queries...
> - Does anyone know of any other unix DDE clients, in particular
> spreadsheets? If not the impact of changing the end of line terminator
> from \n to \r\n won't be so big.
> - Calc implies that it supports DDE links as a server using SYLK, DIF,
> HTML, etc in addition to plain text

Erm.. what are you referring to with "Calc implies that it supports"?

Copy some cells in Calc, then go to Excel, Paste Special, paste link. The list shows numerous formats that Excel thinks Calc supports. However, only Text/Unicode Text is actually used.

> but in actual fact they all end up calling the text format. This can be
> observed by debugging impex.cpp when doing a copy from Calc and paste
> link in something like Excel. Is this a known quirk? There seem to be
> other DDE problems eg some of the paste link graphic formats into Word
> give errors.

AFAIK it was not intended to support anything other than cell content
with DDE.

If you go to any other Windows program that supports linking, you get a list of supposedly supported formats. In Word, for example, many of the formats give an error when attempting paste link. The html format seems to be accepted, however, if you debug it, it is using the rtf format!! So maybe unformatted text is meant to be the only DDE format, but in reality some of the other formats are half working.

> - It has been getting progressively harder to get a tab character into a
> cell, from 2.4 to DEV300_m16 to OOO300_m7. Is this an accident or are
> there some bug fixes related to this behaviour? I can't see any pattern,
> eg aa\tbb\n will keep the tab in OOO300_m7, but aa\tbb will not and
> aa\tbb will keep the tab in 2.4, but aa\tb will not.

I don't see any difference in behavior with OOO300_m9, given that a tab
can be entered into a cell only when copied from the clipboard. Or what
did you try?

Yes, I get the tab in via the clipboard too. If I use an official OOO300_m8 and paste a\tb, the tab is replaced with a space, but pasting aa\tbb keeps the tab. I think this strange bug is manifesting itself slightly differently with my own builds, which include the multiline patch. All in all rather odd.

> The patch is getting rather big with numerous knock on fixes. Is this
> the sort of point in time that a child workspace should be created for
> it to ease development?

That is probably best if this is going to take longer, both for rebasing
and for other developers to jump in. Do you want me to create one?

I don't fully understand your procedures, but if that makes sense, it sounds good. I'm familiar with svn, so hopefully that is half the battle of learning CWS. Working from source code control would be less painful than putting together patches, and if it makes it easier for other developers to help out, I'd be keen: my free time is limited, so progress might be quicker that way.

William
