On 06.06.2011 19:43, tora - Takamichi Akiyama wrote:
And also, please cover the underlying memory allocation mechanism which
would be another key factor for the performance improvement.

On 2011/06/07 3:04, Niklas Nebel wrote:
There's an old suggestion to treat small strings differently, see 
http://wiki.services.openoffice.org/wiki/Uno/Binary/Analysis/String_Performance.

Thank you for the information!

In addition to that, I wonder whether the following ideas might help.

As many already know, malloc() is a general-purpose allocator and is
expensive for that reason. Moreover, free() can be even more expensive
than malloc(). See, for example, the source code of malloc() in glibc:
http://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c

Not only the large number of machine instructions but also the memory
it wastes affects system-wide performance.
For example, malloc(1) consumes 32 bytes on CentOS 5.4 with a 64-bit kernel:
http://tora-japan.com/wiki/Boundaries_of_memory_allocation_with_malloc%28%29
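
The 32-byte figure is consistent with the way glibc rounds requests on
64-bit builds. Below is a minimal sketch of that arithmetic, assuming an
8-byte size field, 16-byte alignment and a 32-byte minimum chunk. The
constants and the names round_up() and chunk_size() are illustrative
assumptions, not taken from the page linked above.

```cpp
#include <cstddef>

// Assumed constants for a 64-bit glibc build (illustrative only).
constexpr std::size_t kSizeField = 8;   // per-chunk bookkeeping header
constexpr std::size_t kAlignment = 16;  // chunk sizes are multiples of this
constexpr std::size_t kMinChunk  = 32;  // smallest chunk ever handed out

// Round n up to the next multiple of kAlignment.
constexpr std::size_t round_up( std::size_t n )
{
    return ( n + kAlignment - 1 ) & ~( kAlignment - 1 );
}

// Bytes actually occupied by a request of nRequest bytes under this model.
constexpr std::size_t chunk_size( std::size_t nRequest )
{
    return round_up( nRequest + kSizeField ) < kMinChunk
               ? kMinChunk
               : round_up( nRequest + kSizeField );
}
```

Under these assumptions every request from 0 to 24 bytes occupies the
same 32 bytes, so very short strings pay the largest relative overhead.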


Even though the current OpenOffice.org is a multi-threaded process,
it behaves largely as if it were single-threaded. So we have several
options for implementing an underlying memory allocation mechanism
tailored to the specific needs of OpenOffice.org.

1) Memory allocation mechanism used in a kernel
For temporary use, employ a memory allocation mechanism similar to
the one commonly used in kernels. A single bit holds the status of each
memory chunk, e.g. 0 means vacant and 1 means occupied. The size of a
memory chunk could be 128, 256, 512, 1024, ...
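
To make idea 1) concrete, here is a rough sketch of a pool of fixed-size
chunks with one status bit per chunk. The class name BitmapPool and its
interface are hypothetical. It uses a simple linear scan for a vacant
bit and is not thread-safe, which is only acceptable if each pool is
used by a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size chunk pool: one bit per chunk, 0 = vacant, 1 = occupied.
class BitmapPool
{
public:
    BitmapPool( std::size_t nChunkSize, std::size_t nChunks )
        : m_nChunkSize( nChunkSize )
        , m_aStorage( nChunkSize * nChunks )
        , m_aOccupied( nChunks, false )
    {
    }

    // Returns the first vacant chunk, or nullptr when all chunks are occupied.
    void* allocate()
    {
        for ( std::size_t i = 0; i < m_aOccupied.size(); ++i )
        {
            if ( !m_aOccupied[ i ] )
            {
                m_aOccupied[ i ] = true;
                return &m_aStorage[ i * m_nChunkSize ];
            }
        }
        return nullptr;
    }

    // Freeing a chunk only clears its status bit.
    void release( void* p )
    {
        const std::size_t nOffset = static_cast< std::size_t >(
            static_cast< unsigned char* >( p ) - m_aStorage.data() );
        assert( nOffset % m_nChunkSize == 0 );
        assert( nOffset / m_nChunkSize < m_aOccupied.size() );
        m_aOccupied[ nOffset / m_nChunkSize ] = false;
    }

private:
    std::size_t                  m_nChunkSize;
    std::vector< unsigned char > m_aStorage;   // backing memory for all chunks
    std::vector< bool >          m_aOccupied;  // status flags, one per chunk
};
```

One pool per chunk size (128, 256, 512, ...) would then serve requests
rounded up to the nearest size.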

2) Slicing cheese and throwing it all out at once
Internal tasks such as "Save as" and "Export to" could benefit greatly.
Such a task starts from the framework, calls thousands of methods, and
finally leaves only a single value meaning SUCCESS or FAILURE. No String
instance created during the task needs to persist.

#define ALLOCATION_SIZE ( 1024 * 1024 ) // 1 MB
#define ALIGNMENT       4

void* SCATTOAO::xmalloc( size_t nSize )
{
    // Round the request up to a multiple of ALIGNMENT. Treat 0 as 1 so
    // that nSize - 1 cannot wrap around.
    if ( nSize == 0 )
        nSize = 1;
    nSize = ( ( nSize - 1 ) / ALIGNMENT + 1 ) * ALIGNMENT;
    if ( m_nRest < nSize ) {
        // Not enough room left: get a new block, a multiple of ALLOCATION_SIZE
        size_t nAllocationSize = ( ( nSize - 1 ) / ALLOCATION_SIZE + 1 ) * ALLOCATION_SIZE;
        char* p = (char *) memory_page_allocation( nAllocationSize, PRIVATE|ANONYMOUS );
        m_vector.append( Entry( p, nAllocationSize ) );
        m_pNose = p;
        m_nRest = nAllocationSize;
    }
    char* ret = m_pNose;
    m_pNose += nSize;  // Slice a block of cheese
    m_nRest -= nSize;
    return (void *) ret;
}

void SCATTOAO::xfree( void* )
{
    // do nothing at all
}

SCATTOAO::~SCATTOAO()
{
    // Throw all the blocks away at once
    for ( auto it = m_vector.begin(); it != m_vector.end(); ++it )
        memory_page_deallocation( it->m_pAddress, it->m_nSize );
}

Each instance of the allocator class SCATTOAO is a thread-specific object
used only by the thread that owns it. Therefore, no mutex lock is required.
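
For reference, a self-contained variant of the same idea that compiles
as it stands might look like the sketch below. The class name Arena and
the blockCount() helper are hypothetical. The sketch also makes two
substitutions that are assumptions on my side: it obtains its blocks
with std::malloc() instead of anonymous private page mappings, purely
for portability, and it aligns to alignof(std::max_align_t) rather than
4 so that the returned memory is usable for any object type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical "slice and throw away" arena; not thread-safe by design.
class Arena
{
public:
    explicit Arena( std::size_t nBlockSize = 1024 * 1024 )
        : m_nBlockSize( nBlockSize )
    {
        assert( nBlockSize > 0 );
    }

    Arena( const Arena& ) = delete;
    Arena& operator=( const Arena& ) = delete;

    ~Arena()
    {
        for ( void* p : m_aBlocks )  // throw everything away at once
            std::free( p );          // free( nullptr ) is harmless
    }

    void* allocate( std::size_t nSize )
    {
        const std::size_t nAlign = alignof( std::max_align_t );
        if ( nSize == 0 )
            nSize = 1;
        nSize = ( nSize + nAlign - 1 ) / nAlign * nAlign;
        if ( m_nRest < nSize )
        {
            // New block: the smallest multiple of m_nBlockSize that fits.
            const std::size_t nBlock =
                ( nSize + m_nBlockSize - 1 ) / m_nBlockSize * m_nBlockSize;
            m_aBlocks.push_back( nullptr );  // reserve the slot first
            m_aBlocks.back() = std::malloc( nBlock );
            if ( m_aBlocks.back() == nullptr )
                throw std::bad_alloc();
            m_pNose = static_cast< char* >( m_aBlocks.back() );
            m_nRest = nBlock;
        }
        char* pRet = m_pNose;
        m_pNose += nSize;  // slice a block of cheese
        m_nRest -= nSize;
        return pRet;
    }

    std::size_t blockCount() const { return m_aBlocks.size(); }

private:
    std::size_t          m_nBlockSize;
    std::vector< void* > m_aBlocks;
    char*                m_pNose = nullptr;
    std::size_t          m_nRest = 0;
};
```

A task such as "Save as" would create one Arena on the thread that runs
it, allocate from it freely, and let the destructor release everything
when the task returns. Objects with non-trivial destructors would still
need their destructors run by hand, which this sketch does not handle.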

I think the above is just the tip of the iceberg of potentially brilliant
ideas. Let's discuss this kind of topic later, once the surrounding
situation has settled.

Best regards,
Tora
