[dev] Re: refactoring OUString
On Fri, Jun 10, 2011 at 6:51 AM, tora - Takamichi Akiyama <t...@openoffice.org> wrote:

> Sorry, this mail is too long...

No problem, I'll briefly go through your five items one by one:

1. Delegation of the responsibility to choose a type of memory allocator

> To achieve both stability and performance at the same time, I would like to propose "Don't do all of it in the SAL", rather "Delegate certain responsibility to its users, i.e. programmers."
>
> Who knows the type of life time of data? SAL does? No. The programmers do.

Do they? Theoretically, they do have the information available that is necessary to choose the most efficient approach that is still correct (depending on how early data can be destroyed again, whether it needs to be multi-thread safe or async safe, etc.). However, in practice, the necessary information is often so extensive that programmers will make mistakes and choose approaches that are incorrect. If these mistakes are not caught at compile time and lead to program failure, I think this is a problem (and history shows it to be a rather severe one).

> 2. Potential deadlock
> The crash reporter code has a potential deadlock problem.
> http://hg.services.openoffice.org/DEV300/file/tip/sal/osl/unx/backtrace.c
> Asynchronous-unsafe functions such as fprintf() are used in the context of a signal handler.
>
> Consider this situation:
> 1. A "Segmentation violation, aka SEGV" occurs in malloc() or free() due to memory corruption. Such a function holds the global mutex lock.
> 2. The first call of fprintf() internally calls malloc() to obtain a memory area as a text buffer. Then a deadlock occurs.
>
> For that topic, I would be posing a question later.

Yes, OOo's signal handler code is horribly broken. I do not know whether its original authors were unaware of the gross violations of correct programming that are taking place here, but general consensus appears to be that the code happens to work (by accident, I would say) most of the time, and sometimes just locks up even worse than the crash that caused the signal handler to be called in the first place. Anyway, this should be cleaned up one day (and is indeed a topic for a thread of its own).

> Please have a look at an additional code fragment in the destructor above:
>
>     if ( Application::IsMemoryCheckRequested() )
>         for ( iterator m_vector ) // Turn them to be a trap
>             alter_page_attribute( *it, NO_READ_ACCESS|NO_WRITE_ACCESS|NO_EXEC );
>
> 1. soffice.bin is invoked with a new command line option such as "-memorycheck"
> 2. Application::IsMemoryCheckRequested() returns TRUE.
> 3. The memory pages being freed turn into a trap.
> 4. Problematic code mistakenly attempts to read or write data in the already-freed memory area.
> 5. The trap sets off the alarm and an interruption is sent by the OS.
> 6. A signal handler in the SAL catches the interruption.
> 7. A crash report that reveals the exact location of the code is made.
>
> We have been cultivating thousands of test scenarios for more than a decade.
> Just leave the qatesttool running for a day and night with the option -memorycheck.
>
> 4. Utilizing the cutting-edge technology invented in the 21st century.
>
> solaris$ cat attempt-of-accessing-the-already-freed-memory-area.c
>
>     #include <stdlib.h>
>
>     int main()
>     {
>         char *p = (char *) malloc(10);
>         free(p);
>         *p = 1;
>         return 0;
>     }
>
> $ cc -g attempt-of-accessing-the-already-freed-memory-area.c
>
> $ LD_PRELOAD=watchmalloc.so.1 MALLOC_DEBUG=WATCH,RW ./a.out
> Trace/Breakpoint Trap (core dumped)
>
> $ dbx ./a.out core
> ...
> program terminated by signal TRAP (write access watchpoint trap)
> Current function is main
>     7       *p = 1;
>
> Is it easy enough?

Both approaches above (Application::IsMemoryCheckRequested and watchmalloc) are good for debugging buggy software, but I do not think they are very good answers to the question: "When designing classes like OUString etc., how should efficiency be balanced against safety and maintainability?" I understand that you argue that efficiency should be a priority, and safety can be guaranteed (more or less thoroughly) by testing the code with the mechanisms outlined above. I rather argue that the abstractions available to the programmers should be as safe as possible (even if that costs some efficiency), as programmers will invariably make mistakes, so the potential for mistakes should be minimized. Testing code is all well and important (very much so!), but the tests cannot find all problems (let alone the fact that test coverage for OOo is still rather small).

This is probably just another facet of the everlasting dispute between the dynamic and static typing camps. I confess I am sold on the benefits of type theory.

5. 99.9% use cases could be the default.

> [...]
> In the case above, i.e. in the typical, 99.9% code of OpenOffice.org, I don't think multithread awareness is required.
>
> Therefore, the current implem
[dev] Re: refactoring OUString
Sorry, this mail is too long...

On Thu, Jun 9, 2011 at 9:20 AM, tora - Takamichi Akiyama <t...@openoffice.org> wrote:

>> That is why I would like to encourage programmers to take care of the life time of data.

I know that that statement is controversial.

On 2011/06/09 18:02, Stephan Bergmann wrote:

> First off, I am doubtful that encouraging manual memory management is a good idea. Errors in manual memory management probably are the cause for the vast majority of severe failures in C/C++ programs.

Please note that I am not saying programmers should explicitly call memory management functions such as malloc() or free(). Rather, I would like to suggest thinking about the characteristics of the data in question.

1. Delegation of the responsibility to choose a type of memory allocator

To achieve both stability and performance at the same time, I would like to propose "Don't do all of it in the SAL", rather "Delegate certain responsibility to its users, i.e. programmers."

Who knows the type of life time of data? SAL does? No. The programmers do.

Life time of data
 (1) data lasting until the soffice.bin quits.
 (2) data lasting until a document is closed.
 (3) data lasting until a current thread ends.
 (4) data lasting until a certain task finishes.
 (5) data lasting until a current function call returns.
 (6) data lasting until a current block ends.

Multithread awareness
 (a) data that is shared between more than one thread.
 (b) data that is used only in this thread.

Asynchronous awareness
 (i) data that is used in an asynchronously called function such as a signal handler.
 (ii) data that is used in a normal function.

2. Potential deadlock

The crash reporter code has a potential deadlock problem.
http://hg.services.openoffice.org/DEV300/file/tip/sal/osl/unx/backtrace.c
Asynchronous-unsafe functions such as fprintf() are used in the context of a signal handler.

Consider this situation:
 1. A "Segmentation violation, aka SEGV" occurs in malloc() or free() due to memory corruption. Such a function holds the global mutex lock.
 2. The first call of fprintf() internally calls malloc() to obtain a memory area as a text buffer. Then a deadlock occurs.

For that topic, I would be posing a question later.

> Hence, I would always try to abstract from actual memory as much as possible. (Performance considerations are of course valid, but they must be balanced against safety and maintainability considerations.)

3. Come up with the exciting measures

There is no need to keep relying on the traditional approaches invented in the 20th century.

With my experience going back to 8-bit processors, I firmly believe that the programmers' awareness of how memory is treated is the crucial factor in achieving performance, safety, and maintainability at the same time. I do not have an objection to your idea of "abstraction," though.

    // Slicing cheese and throwing them out at once

    #define ALLOCATION_SIZE ( 1024 * 1024 ) // 1MB
    #define ALIGNMENT 4

    void* SCATTOAO::xmalloc( size_t nSize )
    {
        // Round the request up to a multiple of ALIGNMENT
        nSize = ( ( nSize - 1 ) / ALIGNMENT + 1 ) * ALIGNMENT;
        if ( m_nRest < nSize )
        {
            // Not enough cheese left: obtain a new block of whole pages
            nAllocationSize = ( ( nSize - 1 ) / ALLOCATION_SIZE + 1 ) * ALLOCATION_SIZE;
            p = memory_page_allocation( nAllocationSize, PRIVATE|ANONYMOUS );
            m_vector.append( Entry( p, nAllocationSize ) );
            m_pNose = p;
            m_nRest = nAllocationSize;
        }
        ret = m_pNose;
        m_pNose += nSize; // Slice a block of cheese
        m_nRest -= nSize;
        return (void *) ret;
    }

    void SCATTOAO::xfree( void* )
    {
        // do nothing at all
    }

    SCATTOAO::~SCATTOAO()
    {
        if ( Application::IsMemoryCheckRequested() )
            for ( iterator m_vector ) // Turn them to be a trap
                alter_page_attribute( *it, NO_READ_ACCESS|NO_WRITE_ACCESS|NO_EXEC );
        else
            for ( iterator m_vector ) // Throw them at once
                memory_page_deallocation( it->m_pAddress, it->m_nSize );
    }

Please have a look at an additional code fragment in the destructor above:

    if ( Application::IsMemoryCheckRequested() )
        for ( iterator m_vector ) // Turn them to be a trap
            alter_page_attribute( *it, NO_READ_ACCESS|NO_WRITE_ACCESS|NO_EXEC );

 1. soffice.bin is invoked with a new command line option such as "-memorycheck"
 2. Application::IsMemoryCheckRequested() returns TRUE.
 3. The memory pages being freed turn into a trap.
 4. Problematic code mistakenly attempts to read or write data in the already-freed memory area.
 5. The trap sets off the alarm and an interruption is sent by the OS.
 6. A signal handler in the SAL catches the interruption.
 7. A crash report that reveals the exact location of the code is made.

We have been cultivating thousands of test scenarios for more than a decade.
Just leave the qatesttool running for a day and night with the option -memorycheck.

4. Utilizing the cutting-edge technology invented in the 21st century.

solaris$ cat attempt-of
[dev] Re: refactoring OUString
On Thu, Jun 9, 2011 at 9:20 AM, tora - Takamichi Akiyama <t...@openoffice.org> wrote:

> That is why I would like to encourage programmers to take care of the life time of data.

First off, I am doubtful that encouraging manual memory management is a good idea. Errors in manual memory management probably are the cause for the vast majority of severe failures in C/C++ programs. Hence, I would always try to abstract from actual memory as much as possible. (Performance considerations are of course valid, but they must be balanced against safety and maintainability considerations.)

What you describe with "Slicing cheese and throwing them out at once" can be done, but I would not want to do it manually. There are systems more clever than C++, building on effect types and region-based memory management, that exploit such optimizations. But there, it is the language implementation---and not the programmer writing a program in that language---that carries out the proof that keeping data in a region of memory that is discarded wholesale at a certain point in time is sound.

That said, it might work to map your various levels of data---from "data lasting until the soffice.bin quits" to "data lasting until a current function call returns"---to different C++ types with appropriate conversion functions that potentially need to copy data. This would statically ensure sound memory access, on the one hand allowing optimized memory management strategies to be exploited, and on the other hand remaining safe if data does escape from its anticipated level. Would be a nice experiment.

-Stephan

--
-
To unsubscribe send email to dev-unsubscr...@openoffice.org
For additional commands send email to sy...@openoffice.org
with Subject: help
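The experiment suggested at the end of this message (one C++ type per lifetime level, with conversions that copy when data moves to a longer-lived level) might look roughly like the sketch below. Everything in it is an assumption made for illustration: TaskRegion, TaskString, DocString and computeAndKeep are hypothetical names that do not exist in SAL, and the "region" simply owns heap-allocated std::string objects instead of slicing one large block.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical region: everything stored in it dies when the task ends.
class TaskRegion {
public:
    // Copy the text into the region and return a reference to the copy.
    const std::string& store(const std::string& text) {
        blocks_.push_back(std::unique_ptr<std::string>(new std::string(text)));
        return *blocks_.back();
    }
private:
    std::vector<std::unique_ptr<std::string>> blocks_;
};

// Level "data lasting until a certain task finishes":
// valid only while the TaskRegion it was created in is alive.
class TaskString {
public:
    TaskString(TaskRegion& region, const std::string& text)
        : text_(&region.store(text)) {}
    const std::string& str() const { return *text_; }
private:
    const std::string* text_;  // points into the region, not owned
};

// Level "data lasting until a document is closed":
// owns its characters, so it may outlive any task.
class DocString {
public:
    // The only way out of the task level is this explicit, copying conversion.
    explicit DocString(const TaskString& s) : owned_(s.str()) {}
    const std::string& str() const { return owned_; }
private:
    std::string owned_;
};

// A value computed during a task escapes safely to the document level.
inline DocString computeAndKeep() {
    TaskRegion region;                       // lives for one task only
    TaskString temp(region, "=SUM(A1:A3)");  // region-backed temporary
    return DocString(temp);                  // copied before the region dies
}
```

Because there is no conversion from DocString back to TaskString and no implicit conversion in the other direction, a string cannot silently outlive its region through these types; the cost is one copy at each level boundary.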
[dev] Re: refactoring OUString
On 2011/06/08 0:22, Niklas Nebel wrote:

> Of course we should try to make more use of multiple threads. This isn't a new idea either, see http://wiki.services.openoffice.org/wiki/Calc/Performance/misc. Christian did some experiments with parallel loading a while ago (http://blogs.oracle.com/GullFOSS/entry/xml_performance_and_now_for). The results for Impress weren't spectacular, but Calc or Writer may be different.

Yep! I am a multithread, data-driven programming lover, too. :-)

On 07.06.2011 13:15, tora - Takamichi Akiyama wrote:

>> 2) Slicing cheese and throwing them out at once
>> For the internal tasks such as "Save as" and "Export to" we might get a big advantage. Such a task starts from the framework, calls thousands of methods, and finally leaves only a single value meaning SUCCESS or FAILURE. No String instance involved during the task needs to be persistent.

On 2011/06/08 0:22, Niklas Nebel wrote:

> Right now, that isn't entirely true. For example, saving might need to calculate a formula, and the calculated result is then kept in the cell, in a string that continues to be referenced after saving. There might be similar cases elsewhere. These would probably have to be moved into a separate step before saving. Sounds a bit fragile, but then it could actually save a significant amount of time.

That is why I would like to encourage programmers to take care of the life time of data.

For instance, in the user scenario below, there might be
 (1) data lasting until the soffice.bin quits.
 (2) data lasting until a document is closed.
 (3) data lasting until a current thread ends.
 (4) data lasting until a certain task finishes.
 (5) data lasting until a current function call returns.

 1. File - New - Spreadsheet
 2. work on it and save it.
 3. File - Close.

In step 1, construct an instance of a memory allocator for (2).
In step 2, use it to allocate memory chunks lasting as long as the document is open.
In step 3, destroy the allocator to completely free the allocated memory.

Lessons we might have learned:
 We can implement and utilize some purpose-oriented memory allocators alongside the general, expensive one: malloc() and free().
 Programmers may wisely choose which memory allocator is appropriate for the data in question.

> On the other hand, now might be a perfect time to discuss "crazy ideas", without mundane details getting in the way.

Aha! Here is another "crazy idea" :-)
https://bitbucket.org/tora/ooo-idea-zstring/src

memory_allocator_for_zstring.cxx
 shows an idea for a reusable, caching memory allocation mechanism for a new String class.
 The key concept here is not to actually "free" the memory being freed, but to cache it for later use.
 Reuse the most recently freed memory first so that the Translation Lookaside Buffer (TLB) achieves a higher hit ratio.
 In contrast, if the oldest freed memory is used first, the entire system performance might suffer because the relevant entry is very likely absent from the TLB and, moreover, the relevant memory page might have been swapped out to a disk device.

vec.hxx
 implements a C++ template for a cheaply expandable vector.

test_vec.cxx
 demonstrates usage of vec.hxx

Best regards,
Tora
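The caching idea described for memory_allocator_for_zstring.cxx (keep freed blocks and hand out the most recently freed one first) can be sketched in a few lines. This is not the code from the linked repository: BlockCache and its members are hypothetical names, the sketch assumes a single fixed block size, and it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical fixed-size block cache: released blocks are kept on a stack
// and reused in last-in, first-out order, so the block handed out next is
// the one most likely still present in the caches and the TLB.
class BlockCache {
public:
    explicit BlockCache(std::size_t blockSize) : blockSize_(blockSize) {}
    BlockCache(const BlockCache&) = delete;
    BlockCache& operator=(const BlockCache&) = delete;

    ~BlockCache() {
        // Only now is the cached memory really returned to the system.
        for (void* p : free_) std::free(p);
    }

    void* allocate() {
        if (!free_.empty()) {
            void* p = free_.back();  // most recently released block first
            free_.pop_back();
            return p;
        }
        return std::malloc(blockSize_);  // cache empty: fall back to malloc()
    }

    void release(void* p) {
        if (p) free_.push_back(p);  // cache the block instead of calling free()
    }

    std::size_t cached() const { return free_.size(); }

private:
    std::size_t blockSize_;
    std::vector<void*> free_;  // used as a stack (LIFO)
};
```

A real implementation would probably bound the cache size and keep one stack per size class; both are left out here to keep the reuse order visible.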
[dev] Re: refactoring OUString
On 07.06.2011 17:22, Niklas Nebel wrote:

> On 07.06.2011 13:15, tora - Takamichi Akiyama wrote:
>> As many already know, malloc() is too general and too expensive. Moreover, free() is much more expensive than malloc().
>> e.g. a source code of malloc() in glibc:
>> http://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c
>
> We use our own implementation, rtl_allocateMemory (see sal/rtl/source/alloc*). But of course the point remains valid: Both allocation and deallocation take time.
>
>> Even though current OpenOffice.org runs as a multi-thread process, it runs as if it is a single thread. So, we could have several options to implement its underlying memory allocation mechanism for the specific purposes of OpenOffice.org.
>
> If there was only a single thread, we could get rid of quite some locking overhead. But in fact, with clipboard, UNO acceptor thread and such stuff, we have just enough multithreading going on to cause the overhead, without the benefit of actually doing work in parallel.

Properly using a read-only string class (at least in code that might be accessed in multiple threads) could also prevent locking overhead.

Regards,
Mathias

--
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Oracle: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.
[dev] Re: refactoring OUString
On 07.06.2011 13:15, tora - Takamichi Akiyama wrote:

> As many already know, malloc() is too general and too expensive. Moreover, free() is much more expensive than malloc().
> e.g. a source code of malloc() in glibc:
> http://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c

We use our own implementation, rtl_allocateMemory (see sal/rtl/source/alloc*). But of course the point remains valid: Both allocation and deallocation take time.

> Even though current OpenOffice.org runs as a multi-thread process, it runs as if it is a single thread. So, we could have several options to implement its underlying memory allocation mechanism for the specific purposes of OpenOffice.org.

If there was only a single thread, we could get rid of quite some locking overhead. But in fact, with clipboard, UNO acceptor thread and such stuff, we have just enough multithreading going on to cause the overhead, without the benefit of actually doing work in parallel.

Of course we should try to make more use of multiple threads. This isn't a new idea either, see http://wiki.services.openoffice.org/wiki/Calc/Performance/misc. Christian did some experiments with parallel loading a while ago (http://blogs.oracle.com/GullFOSS/entry/xml_performance_and_now_for). The results for Impress weren't spectacular, but Calc or Writer may be different.

> 2) Slicing cheese and throwing them out at once
> For the internal tasks such as "Save as" and "Export to" we might get a big advantage. Such a task starts from the framework, calls thousands of methods, and finally leaves only a single value meaning SUCCESS or FAILURE. No String instance involved during the task needs to be persistent.

Right now, that isn't entirely true. For example, saving might need to calculate a formula, and the calculated result is then kept in the cell, in a string that continues to be referenced after saving. There might be similar cases elsewhere. These would probably have to be moved into a separate step before saving. Sounds a bit fragile, but then it could actually save a significant amount of time.

> I think the above are just a few of the potential, brilliant ideas. Let's discuss this kind of topic later, once the surrounding situation is settled.

On the other hand, now might be a perfect time to discuss "crazy ideas", without mundane details getting in the way.

Niklas
[dev] Re: refactoring OUString
On 06.06.2011 19:43, tora - Takamichi Akiyama wrote:

>> And also, please cover the underlying memory allocation mechanism which would be another key factor for the performance improvement.

On 2011/06/07 3:04, Niklas Nebel wrote:

> There's an old suggestion to treat small strings differently, see http://wiki.services.openoffice.org/wiki/Uno/Binary/Analysis/String_Performance.

Thank you for the information!

In addition to it, I am wondering if these ideas might help a lot.

As many already know, malloc() is too general and too expensive. Moreover, free() is much more expensive than malloc().
e.g. a source code of malloc() in glibc:
http://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c

Not only the large number of machine instructions, but also its wasteful memory usage affects the system-wide performance. In reality, malloc(1) consumes 32 bytes on a CentOS 5.4 64-bit kernel:
http://tora-japan.com/wiki/Boundaries_of_memory_allocation_with_malloc%28%29

Even though current OpenOffice.org runs as a multi-thread process, it runs as if it is a single thread. So, we could have several options to implement its underlying memory allocation mechanism for the specific purposes of OpenOffice.org.

1) Memory allocation mechanism used in a kernel

For temporary use, utilize a memory allocation mechanism similar to the one normally used in a kernel. Use a single bit to hold the status of a memory chunk, e.g. 0 means vacant; 1 denotes occupied. The size of a memory chunk could be 128, 256, 512, 1024, ...

2) Slicing cheese and throwing them out at once

For the internal tasks such as "Save as" and "Export to" we might get a big advantage. Such a task starts from the framework, calls thousands of methods, and finally leaves only a single value meaning SUCCESS or FAILURE. No String instance involved during the task needs to be persistent.

    #define ALLOCATION_SIZE ( 1024 * 1024 ) // 1MB
    #define ALIGNMENT 4

    void* SCATTOAO::xmalloc( size_t nSize )
    {
        // Round the request up to a multiple of ALIGNMENT
        nSize = ( ( nSize - 1 ) / ALIGNMENT + 1 ) * ALIGNMENT;
        if ( m_nRest < nSize )
        {
            // Not enough cheese left: obtain a new block of whole pages
            nAllocationSize = ( ( nSize - 1 ) / ALLOCATION_SIZE + 1 ) * ALLOCATION_SIZE;
            p = memory_page_allocation( nAllocationSize, PRIVATE|ANONYMOUS );
            m_vector.append( Entry( p, nAllocationSize ) );
            m_pNose = p;
            m_nRest = nAllocationSize;
        }
        ret = m_pNose;
        m_pNose += nSize; // Slice a block of cheese
        m_nRest -= nSize;
        return (void *) ret;
    }

    void SCATTOAO::xfree( void* )
    {
        // do nothing at all
    }

    SCATTOAO::~SCATTOAO()
    {
        for ( iterator m_vector ) // Throw them at once
            memory_page_deallocation( it->m_pAddress, it->m_nSize );
    }

The instance of the allocator class SCATTOAO is a thread-specific object and it is used only by its own thread. Therefore, no mutex lock is required.

I think the above are just a few of the potential, brilliant ideas. Let's discuss this kind of topic later, once the surrounding situation is settled.

Best regards,
Tora
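For readers who want to try the "slicing cheese" idea, here is a compilable sketch of the same scheme. It rests on several assumptions: Arena and its members are hypothetical names, chunks come from std::malloc() rather than from the page-level allocation the pseudocode above refers to, and alignment follows std::max_align_t instead of the fixed value 4. As in the original, one instance is meant to be used by a single thread, so no locking is included.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical arena ("bump") allocator: memory is sliced off the front of a
// large chunk, individual frees do nothing, and all chunks are released
// together in the destructor.
class Arena {
public:
    explicit Arena(std::size_t chunkSize = 1024 * 1024) : chunkSize_(chunkSize) {}
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    ~Arena() {
        for (void* chunk : chunks_) std::free(chunk);  // throw them out at once
    }

    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        if (size == 0) size = 1;
        size = (size + align - 1) / align * align;  // round up to the alignment
        if (rest_ < size) {
            // Not enough left in the current chunk: get a new one that is a
            // multiple of chunkSize_ and large enough for this request.
            const std::size_t bytes = (size + chunkSize_ - 1) / chunkSize_ * chunkSize_;
            void* p = std::malloc(bytes);
            if (!p) return nullptr;
            chunks_.push_back(p);
            nose_ = static_cast<char*>(p);
            rest_ = bytes;
        }
        void* ret = nose_;
        nose_ += size;  // slice a block of cheese
        rest_ -= size;
        return ret;
    }

    void deallocate(void*) {
        // Intentionally empty: memory is reclaimed only when the arena dies.
    }

    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::size_t chunkSize_;
    std::vector<void*> chunks_;
    char* nose_ = nullptr;
    std::size_t rest_ = 0;
};
```

The unused tail of a chunk is simply abandoned when a new chunk is started, which trades some memory for a very short allocation path.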
[dev] Re: refactoring OUString
On 06.06.2011 17:27, Michael Stahl wrote:

> On 06.06.11 16:35, tora - Takamichi Akiyama wrote:
>> Has anyone tried refactoring OUString?
>>
>> - It converts iso-8859-1 letters ranging 0x00-0x7f into UCS2 even when it is not necessary.
>> - It requires malloc(), realloc(), and free() or their equivalents.
>> - It hampers debugging efforts because of sal_Unicode buffer[1].
>> - It mixes different purposes: passing/returning parameters and long-lasting data.
>> - and more...
>
> hi Tora,
>
> refactoring OUString has to be done carefully because it is a central part of the URE API/ABI and those must be compatible.

I would put that "must be" up for discussion.

> a number of people here have come to the conclusion that it would be an improvement to use ::rtl::OString with UTF8 encoding as the standard string type, but unfortunately this would be an enormous effort to change, and it would mean breaking the backward compatibility of the C++ UNO binding, so it was never likely to actually happen.

Still a desirable project.

Regards,
Christian
[dev] Re: refactoring OUString
On 06.06.2011 19:43, tora - Takamichi Akiyama wrote:

> And also, please cover the underlying memory allocation mechanism which would be another key factor for the performance improvement.

There's an old suggestion to treat small strings differently, see http://wiki.services.openoffice.org/wiki/Uno/Binary/Analysis/String_Performance.

Niklas
[dev] Re: refactoring OUString
On 06.06.2011 18:57, tora - Takamichi Akiyama wrote:

>> This is just an idea. How about adding a new class besides OUString?

On 2011/06/07 2:16, Mathias Bauer wrote:

> We already have enough string classes. :-)

Yes, we have! :-)

> Besides that, you are right, rtl::OUString is stupid. We planned to discuss its replacement in the context of a future OOo 4.0 release, allowing for some incompatibility here. If done properly, the changes would require only recompilation of in-process C++ code.

Sounds nice!

And also, please cover the underlying memory allocation mechanism which would be another key factor for the performance improvement.

One more thing. This might be controversial. But, IMHO, it would be better if a programmer takes care of the life duration of a string instance. Is it for temporary use, or persistent use? I would like to say the new String class might offer certain ways to take care of its life duration.

> But as you know, we are faced with a completely new situation for the OOo future. So we should postpone discussing this topic until the dust has settled.

Yep, we should postpone this exciting topic!

Thank you for your time, Michael, Mathias.

Best regards,
Tora
[dev] Re: refactoring OUString
On 06.06.2011 18:57, tora - Takamichi Akiyama wrote:

> On 2011/06/07 0:27, Michael Stahl wrote:
>> refactoring OUString has to be done carefully because it is a central part of the URE API/ABI and those must be compatible.
>>
>> a number of people here have come to the conclusion that it would be an improvement to use ::rtl::OString with UTF8 encoding as the standard string type, but unfortunately this would be an enormous effort to change, and it would mean breaking the backward compatibility of the C++ UNO binding, so it was never likely to actually happen.
>>
>> so far we haven't even got rid of the tools strings... sigh.
>
> I see.
>
> This is just an idea. How about adding a new class besides OUString?

We already have enough string classes. :-)

Besides that, you are right, rtl::OUString is stupid. We planned to discuss its replacement in the context of a future OOo 4.0 release, allowing for some incompatibility here. If done properly, the changes would require only recompilation of in-process C++ code.

But as you know, we are faced with a completely new situation for the OOo future. So we should postpone discussing this topic until the dust has settled.

Regards,
Mathias

--
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Oracle: http://blogs.sun.com/GullFOSS
Please don't reply to "nospamfor...@gmx.de".
I use it for the OOo lists and only rarely read other mails sent to it.
[dev] Re: refactoring OUString
On 2011/06/07 0:27, Michael Stahl wrote:

> refactoring OUString has to be done carefully because it is a central part of the URE API/ABI and those must be compatible.
>
> a number of people here have come to the conclusion that it would be an improvement to use ::rtl::OString with UTF8 encoding as the standard string type, but unfortunately this would be an enormous effort to change, and it would mean breaking the backward compatibility of the C++ UNO binding, so it was never likely to actually happen.
>
> so far we haven't even got rid of the tools strings... sigh.

I see.

This is just an idea. How about adding a new class besides OUString?

    class ZString
    {
        sal_Char*           buffer;
        sal_Int32           length;
        sal_uInt16          type;
        rtl_TextEncoding    encoding;
        oslInterlockedCount refCount;
    };

 - Gradually shift to the new ZString, where applicable, in place of OUString and OString.
 - "type" might be an ID number denoting "const char*", "char *", "const sal_Unicode *", ...
 - "encoding" is an encoding id defined in "rtl/textenc.h"
 - refCount, assignment, copy constructor, ... would be done in the same manner.
 - No encoding conversion will be done until the conversion is really demanded.
 - Use arrays as a memory pool for the fixed-sized structure ZString.
 - ...

e.g.

1. String literal that is treated as it is

    ZString a( "xyz" );

 buffer directly points to "xyz"; neither memory allocation nor data copying is involved until encoding conversion is demanded.
 length is left uninitialized in this case, but will be measured and cached upon being requested.
 type denotes "const char*, zero terminated"
 encoding might be ASCII_US or UTF-8, which might depend on the OS and compiler.

    (debugger) print a.buffer
    ... prints "xyz"

2. Receiving a result string from a callee in storage allocated by alloca() instead of malloc()

    ZString temp( 100, RTL_ALLOCA );
    func( temp );

    void func( ZString& x )
    {
        x = "abc";
    }

 In the destructor of temp above,
 if the reference count is 1, nothing special would be done and the allocated memory in the stack area will be automatically freed upon returning to the upper frame.
 if the reference count is more than 1, then memory allocation and data copying will be involved.

Best,
Tora

On 06.06.11 16:35, tora - Takamichi Akiyama wrote:

> Has anyone tried refactoring OUString?
>
> - It converts iso-8859-1 letters ranging 0x00-0x7f into UCS2 even when it is not necessary.
> - It requires malloc(), realloc(), and free() or their equivalents.
> - It hampers debugging efforts because of sal_Unicode buffer[1].
> - It mixes different purposes: passing/returning parameters and long-lasting data.
> - and more...
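The first ZString example (a literal that is referenced rather than copied, with its length measured only on demand) can be illustrated with a much smaller class. The sketch below is an assumption-laden reduction, not the proposed ZString: LiteralString is a hypothetical name, it leaves out the type, encoding and reference-count fields, the caller must guarantee that the pointed-to text outlives the object, and the lazily cached length is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper around zero-terminated text that it does not own.
class LiteralString {
public:
    // No allocation and no copy: only the pointer is stored.
    explicit LiteralString(const char* zeroTerminated)
        : buffer_(zeroTerminated) {}

    // Plain char pointer, so a debugger can print the text directly.
    const char* buffer() const { return buffer_; }

    // The length is measured on first request and cached afterwards.
    std::size_t length() const {
        if (!measured_) {
            length_ = std::strlen(buffer_);
            measured_ = true;
        }
        return length_;
    }

    bool measured() const { return measured_; }

private:
    const char* buffer_;
    mutable std::size_t length_ = 0;
    mutable bool measured_ = false;
};
```

Constructing LiteralString a("xyz") therefore touches no heap memory; the cost of strlen() is paid only if some caller actually asks for the length.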
[dev] Re: refactoring OUString
On 06.06.11 16:35, tora - Takamichi Akiyama wrote:

> Has anyone tried refactoring OUString?
>
> - It converts iso-8859-1 letters ranging 0x00-0x7f into UCS2 even when it is not necessary.
> - It requires malloc(), realloc(), and free() or their equivalents.
> - It hampers debugging efforts because of sal_Unicode buffer[1].
> - It mixes different purposes: passing/returning parameters and long-lasting data.
> - and more...

hi Tora,

refactoring OUString has to be done carefully because it is a central part of the URE API/ABI and those must be compatible.

a number of people here have come to the conclusion that it would be an improvement to use ::rtl::OString with UTF8 encoding as the standard string type, but unfortunately this would be an enormous effort to change, and it would mean breaking the backward compatibility of the C++ UNO binding, so it was never likely to actually happen.

so far we haven't even got rid of the tools strings... sigh.

regards,
michael

--
"One of [the Middle Ages'] characteristics was that 'reasoning by analogy' was rampant; another characteristic was almost total intellectual stagnation, and we now see why the two go together. [...] by developing a keen ear for unwarranted analogies, one can detect a lot of medieval thinking today." -- Edsger W. Dijkstra