https://bugs.documentfoundation.org/show_bug.cgi?id=144576

Hossein <hoss...@libreoffice.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needsDevEval                |difficultyMedium, easyHack,
                   |                            |skillCpp

--- Comment #8 from Hossein <hoss...@libreoffice.org> ---
(In reply to Stéphane Guillou (stragu) from comment #7)
> Hossein, could this qualify as an easy hack? Merged cells should be tested
> too.
Yes, I think this can be an EasyHack with difficultyMedium.

Code pointers:

Copy/paste involves many steps, including data/format conversion and clipboard
format handling. Here, you have to know that the document is converted to plain
text via the "text" filter.

The plaintext (ascii) filter is located here in the LibreOffice core source
code:

sw/source/filter/ascii

Therefore, to change the copy/paste output, you have to fix the ascii filter.
This has the added benefit that plain text export will also be fixed, as
requested here.

In this folder, there are a few files:

$ ls sw/source/filter/ascii/
ascatr.cxx  parasc.cxx  wrtasc.cxx  wrtasc.hxx

To change the output, you have to edit this file:

sw/source/filter/ascii/wrtasc.cxx

In this file, there is a loop that creates the output:

 // Output all areas of the pam into the ASC file
 do {
     bool bTstFly = true;
    ...
 }

Inside this loop, the code iterates over the nodes in the document structure
and extracts text from them. To check for yourself, add the one line marked
with + below to the code, build LO, and then test. You will see that a * is
written before each node.

 SwTextNode* pNd = m_pCurrentPam->GetPoint()->GetNode().GetTextNode();
 if( pNd )
 {
+   Strm().WriteUChar('*');
  ...
 }

For example, given this table, with one blank paragraph above and below it:

A | B
--|--
C | D

You will get this after copy/paste into a plain text editor:

*
*A
*B
*C
*D
*

To fix the bug, you have to differentiate between table cells and other nodes.
Then, you should take care of the table columns and print a tab between them.
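
To make the target output concrete, here is a minimal standalone sketch of the
expected result. It is not LibreOffice code: TableRows and tableToPlainText are
hypothetical names, and the table is modelled as plain rows of strings rather
than Writer nodes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a Writer table: rows of cell texts.
using TableRows = std::vector<std::vector<std::string>>;

// Join the cells of each row with a tab and end each row with a newline.
std::string tableToPlainText(const TableRows& rRows)
{
    std::string aOut;
    for (const auto& rRow : rRows)
    {
        for (std::size_t nCol = 0; nCol < rRow.size(); ++nCol)
        {
            if (nCol > 0)
                aOut += '\t'; // separator between columns, not after the last
            aOut += rRow[nCol];
        }
        aOut += '\n'; // one line per table row
    }
    return aOut;
}
```

For the 2x2 table above, this yields the two lines "A<tab>B" and "C<tab>D",
which is the layout a spreadsheet or text editor can paste back as columns.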

As a next step, you can add the star only before table cells:

 if( pNd )
 {
     SwTableNode *pTableNd = pNd->FindTableNode();
     if (pTableNd)
     {
         Strm().WriteUChar('*');
     }
     ...
 }
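
Once table cells can be told apart, the separator can be chosen per node. The
following is only a sketch under assumptions: NodeInfo and writeNodes are
hypothetical, the row index is assumed to be available somehow, and nothing
here uses the real Writer classes. It just shows the decision between tab and
newline when nodes arrive one by one, as they do in the loop above.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of one text node as the export loop sees it.
struct NodeInfo
{
    std::string aText;
    bool bInTable; // would come from a check like FindTableNode()
    int nRow;      // assumed row index, only meaningful inside a table
};

// Write nodes in document order. A cell that continues the row of the
// previous node is preceded by a tab; every other node starts a new line.
std::string writeNodes(const std::vector<NodeInfo>& rNodes)
{
    std::string aOut;
    const NodeInfo* pPrev = nullptr;
    for (const NodeInfo& rNode : rNodes)
    {
        if (pPrev)
        {
            const bool bSameRow = pPrev->bInTable && rNode.bInTable
                                  && pPrev->nRow == rNode.nRow;
            aOut += bSameRow ? '\t' : '\n';
        }
        aOut += rNode.aText;
        pPrev = &rNode;
    }
    if (pPrev)
        aOut += '\n'; // terminate the last line
    return aOut;
}
```

Note the limits of this sketch: a cell with several paragraphs, or two tables
directly after each other, would need the column index and the table identity
as well, not only the row.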

You can look into how other filters handle tables. For example, inside
sw/source/filter/html/htmltab.cxx you will see how a table is managed: the
first cell is tracked, and the appropriate functions to handle the HTML table
are called.

For merged cells, I suggest that the EasyHacker first check the behavior in
other software, then design and implement the appropriate behavior.
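
As an illustration of one candidate behavior only (not a decision, and not
taken from any existing filter), a horizontally merged cell could be followed
by empty fields so that later columns stay aligned. Cell, nColSpan and
rowToPlainText are hypothetical names in this standalone sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical cell model: text plus the number of columns it covers.
struct Cell
{
    std::string aText;
    int nColSpan; // 1 for an ordinary cell, >1 for a horizontally merged one
};

// Emit one row. Each column covered by a merged cell becomes an empty field,
// so the cell after it lands in the same column as in unmerged rows.
std::string rowToPlainText(const std::vector<Cell>& rRow)
{
    std::string aOut;
    bool bFirst = true;
    for (const Cell& rCell : rRow)
    {
        if (!bFirst)
            aOut += '\t';
        aOut += rCell.aText;
        for (int n = 1; n < rCell.nColSpan; ++n)
            aOut += '\t'; // empty field for each covered column
        bFirst = false;
    }
    aOut += '\n';
    return aOut;
}
```

With a row made of a cell "AB" spanning two columns followed by "C", this
gives "AB<tab><tab>C". Whether that is the right choice is exactly what the
comparison with other software should settle.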

To gain a better understanding of the Writer document model / layout, please
see this document:

Writer/Core And Layout
https://wiki.openoffice.org/wiki/Writer/Core_And_Layout

And also this presentation:

Introduction to Writer Development - LibreOffice 2023 Conference Workshop
Miklos Vajna
https://www.youtube.com/watch?v=oM0tB1A0JHA

-- 
You are receiving this mail because:
You are the assignee for the bug.
