[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-03 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

--- Comment #14 from Czesław Wolański  ---
(In reply to damjan from comment #13)
>
> That must be an awesome laptop then, it takes closer to 25 minutes on my PC.
> 
Acer Aspire A317-53 (Intel Core i7 @ 2.80 GHz, 16 GB RAM, 512 GB SSD)
Windows 11 Home

Nothing out of the ordinary, I guess.

-- 
You are receiving this mail because:
You are on the CC list for the issue.
You are the assignee for the issue.

[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-03 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

Matthias Seidel  changed:

   What|Removed |Added

 CC||msei...@apache.org


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-03 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

dam...@apache.org changed:

   What|Removed |Added

   Keywords||performance

--- Comment #13 from dam...@apache.org ---
(In reply to Czesław Wolański from comment #12)
> (In reply to damjan from comment #8)
> > Calc opens the file perfectly, but Writer hangs in an infinite loop.
> 
> If "the file" above means a file from the archive
> attached to this report ("html_table.html"),
> Writer opens it on my laptop after about 4 to 5 minutes.

That must be an awesome laptop then, it takes closer to 25 minutes on my PC.

Nice find, thank you.

So it's a performance bug then. Maybe we should replace that ::std::vector with
a dictionary of some kind, ::std::map or whatever? Or it could also be
something in the calling code that does too many lookups.
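
To make the std::map idea above concrete, here is a minimal sketch, not a patch. It assumes the marks stay in a vector and a name index is kept next to it. Mark, IndexedMarks, add() and find() are hypothetical names and not part of the Writer code; std::string stands in for ::rtl::OUString.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Writer mark; only the name matters here.
struct Mark {
    std::string name;
};

// Hypothetical container: the marks stay in a vector, and a std::map from
// name to vector index makes a lookup by name O(log n) instead of O(n).
class IndexedMarks {
public:
    // Returns false and leaves the container unchanged if the name is taken.
    bool add(const std::string& name) {
        if (byName_.count(name) != 0)
            return false;
        byName_[name] = marks_.size();
        marks_.push_back(std::make_shared<Mark>(Mark{name}));
        return true;
    }

    // Returns the mark with the given name, or nullptr if there is none.
    std::shared_ptr<Mark> find(const std::string& name) const {
        std::map<std::string, std::size_t>::const_iterator it = byName_.find(name);
        if (it == byName_.end())
            return nullptr;
        return marks_[it->second];
    }

    std::size_t size() const { return marks_.size(); }

private:
    std::vector<std::shared_ptr<Mark>> marks_;   // all marks, in insertion order
    std::map<std::string, std::size_t> byName_;  // name -> index into marks_
};
```

The sketch leaves out removal and renaming. A real change would have to keep the index in sync with the vector in those cases too.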


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-03 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

Czesław Wolański  changed:

   What|Removed |Added

 CC||czeslaw.wolan...@gmail.com

--- Comment #12 from Czesław Wolański  ---
(In reply to damjan from comment #8)
> Calc opens the file perfectly, but Writer hangs in an infinite loop.

If "the file" above means a file from the archive
attached to this report ("html_table.html"),
Writer opens it on my laptop after about 4 to 5 minutes.


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-02 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

--- Comment #11 from dam...@apache.org ---
(In reply to Peter from comment #10)
> what is the return value of getAllMarksEnd()?

IDocumentMarkAccess::const_iterator_t MarkManager::getAllMarksEnd() const
{ return m_vAllMarks.end(); }

where OpenGrok tells us that ::sw::mark::MarkManager's m_vAllMarks is defined
in main/sw/source/core/inc/MarkManager.hxx as:

// container for all marks
container_t m_vAllMarks;

and container_t is defined in main/sw/inc/IDocumentMarkAccess.hxx as:

typedef ::std::vector< pMark_t > container_t;
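
In other words, the value returned is the end iterator of a std::vector, which acts as the "not found" sentinel. As a rough illustration, a lookup by name over such a container could look like the sketch below. Mark and findByName are hypothetical stand-ins, std::string replaces ::rtl::OUString, and the real findMark() may be implemented differently.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Writer types; only the name matters here.
struct Mark {
    std::string name;
};
typedef std::shared_ptr<Mark> pMark_t;
typedef std::vector<pMark_t> container_t;

// Linear search by name. Returns marks.end() when nothing matches, which is
// the value a caller compares against to detect "not found".
container_t::const_iterator findByName(const container_t& marks,
                                       const std::string& name) {
    return std::find_if(marks.begin(), marks.end(),
                        [&name](const pMark_t& mark) { return mark->name == name; });
}
```

With an unsorted vector, a lookup like this has to visit every element before it can report a miss.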


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-02 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

Peter  changed:

   What|Removed |Added

 CC||pe...@apache.org

--- Comment #10 from Peter  ---
what is the return value of getAllMarksEnd()?


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-02 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

dam...@apache.org changed:

   What|Removed |Added

   Keywords|needmoreinfo|
 Latest Confirmation in|--- |4.2.0-dev
 OS|Windows Vista   |All

--- Comment #9 from dam...@apache.org ---
Reproduced on a FreeBSD PC, so changing OS to "All", setting the latest
confirmation version, and clearing "needmoreinfo".


[Issue 96958] Problem in importing big HTML files in Writer 3.0

2023-01-02 Thread bugzilla
https://bz.apache.org/ooo/show_bug.cgi?id=96958

dam...@apache.org changed:

   What|Removed |Added

 CC||dam...@apache.org

--- Comment #8 from dam...@apache.org ---
Calc opens the file perfectly, but Writer hangs in an infinite loop. Attaching
a debugger and backtracing a few times, I saw it's often running code in this
function from main/sw/source/core/doc/docbm.cxx:

---snip---
::rtl::OUString MarkManager::getUniqueMarkName(const ::rtl::OUString& rName) const
{
    OSL_ENSURE(rName.getLength(),
        " - a name should be proposed");
    if ( findMark(rName) == getAllMarksEnd() )
    {
        return rName;
    }

    ::rtl::OUStringBuffer sBuf;
    ::rtl::OUString sTmp;
    for(sal_Int32 nCnt = 1; nCnt < SAL_MAX_INT32; nCnt++)
    {
        sTmp = sBuf.append(rName).append(nCnt).makeStringAndClear();
        if ( findMark(sTmp) == getAllMarksEnd() )
        {
            break;
        }
    }
    return sTmp;
}
---snip---

That "for" loop has a limit of SAL_MAX_INT32 (over 2 billion), and the
condition that would cause it to "break" seems to never be met, thus it just
spins there.

Putting a breakpoint on that "if" statement within the "for" loop and printing
the contents of "sTmp" on each loop run, I get:

__tmpTD1547
__tmpTD1548
__tmpTD1549
...

and the "break" is never reached.
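
To get a feel for how much work a loop of this shape can do, here is a small standalone model that counts string comparisons. It is only a sketch under assumptions this report does not confirm: that every mark is proposed under the same name, and that each lookup is a linear scan over the vector. NameModel and its members are hypothetical, and std::string stands in for ::rtl::OUString.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified model of the lookup pattern quoted above.
struct NameModel {
    std::vector<std::string> names;  // plays the role of m_vAllMarks
    std::size_t comparisons = 0;     // string comparisons performed so far

    // Linear search: stops at the first match, scans everything on a miss.
    bool contains(const std::string& name) {
        for (const std::string& existing : names) {
            ++comparisons;
            if (existing == name)
                return true;
        }
        return false;
    }

    // Same shape as the quoted loop: try the proposed name, then
    // name1, name2, ... always restarting the counter at 1.
    std::string uniqueName(const std::string& proposed) {
        if (!contains(proposed))
            return proposed;
        for (int n = 1;; ++n) {
            std::string candidate = proposed + std::to_string(n);
            if (!contains(candidate))
                return candidate;
        }
    }

    // Inserts `count` marks that all propose the same name and returns
    // the total number of comparisons performed so far.
    std::size_t insertMany(const std::string& proposed, int count) {
        for (int i = 0; i < count; ++i)
            names.push_back(uniqueName(proposed));
        return comparisons;
    }
};
```

In this model the k-th insertion probes about k candidate names, and each probe scans up to k stored names, so the total grows roughly with the cube of the number of marks. A loop like that can look stuck in a debugger while it is still making slow progress.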
