Re: [sc-dev] Our plan to improve Calc functionality related to phonetic text

Leonard Mada Sat, 18 Aug 2007 02:20:33 -0700

Hi,

I don't understand all the details of the discussion. What I would like to point out is a somewhat similar situation in R and the S+ language, and describe the way it was brilliantly solved (for R, see http://cran.R-project.org).

In the S+ language one has many types of objects (there are some atomic variables like integers, floating points, ... and there are complex objects; and all these objects have S3 and S4 methods attached).

This is also true of spreadsheets. You have simple objects (like numbers, strings), and there should also be more complex objects.


In R (and S+), these complex objects have methods embedded in them:
 - e.g. the *print.'specific_object'* displays the object on the screen
 - specific coercion methods: when one wants to convert the value of the object to some *other object type* (like a number or a string, or a vector or a data.frame, whatever the developer builds in)

I strongly believe that this is the way Calc should go. One has some basic data-types:

 - scalar numbers and ordinary strings
 - complex objects: currency, other (typed) units, date, complex strings (like the ruby mentioned here - which I misinterpreted in my first post -), url, graphics, many-many more
   -- each of these objects should have specific methods implemented:
        --- display method: what is displayed on screen
        --- mathematical operator methods: how do mathematical operations work
        --- coercion methods: are specific transforms allowed (e.g. from date-to-number) and the logic accomplishing it
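
To make the idea concrete, here is a minimal C++17 sketch of such objects. All class and function names (CellValue, NumberValue, CurrencyValue, TextValue, add) are hypothetical and are not taken from the Calc code base; the point is only that display and coercion live in the object, and that arithmetic works on the coerced content rather than on the displayed string.

```cpp
#include <cassert>
#include <iomanip>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical base class: every cell value knows how to display itself
// and whether it can be coerced to a plain number.
class CellValue {
public:
    virtual ~CellValue() = default;
    virtual std::string display() const = 0;             // what is shown on screen
    virtual std::optional<double> toNumber() const = 0;  // empty = coercion not allowed
};

class NumberValue : public CellValue {
    double value_;
public:
    explicit NumberValue(double v) : value_(v) {}
    std::string display() const override {
        std::ostringstream os;
        os << value_;
        return os.str();
    }
    std::optional<double> toNumber() const override { return value_; }
};

class CurrencyValue : public CellValue {
    double amount_;
    std::string code_;
public:
    CurrencyValue(double amount, std::string code)
        : amount_(amount), code_(std::move(code)) {}
    // The display adds formatting; the stored content stays a number.
    std::string display() const override {
        std::ostringstream os;
        os << std::fixed << std::setprecision(2) << amount_ << ' ' << code_;
        return os.str();
    }
    std::optional<double> toNumber() const override { return amount_; }
};

class TextValue : public CellValue {
    std::string text_;
public:
    explicit TextValue(std::string text) : text_(std::move(text)) {}
    std::string display() const override { return text_; }
    std::optional<double> toNumber() const override { return std::nullopt; }
};

// A mathematical operation asks each operand for its numeric content and
// fails cleanly when one of them refuses the coercion.
std::optional<double> add(const CellValue& a, const CellValue& b) {
    const auto x = a.toNumber();
    const auto y = b.toNumber();
    if (!x || !y) return std::nullopt;
    return *x + *y;
}
```

With this layout, adding a CurrencyValue and a NumberValue works no matter how the currency is formatted on screen, while adding a TextValue yields an empty result instead of a silently wrong number.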


This would have great advantages:

 - all object classes become lighter, less memory consumption, application becomes much faster (efficient classes)
 - BIG ISSUE: the actual data is separated from the display
   Currently, the design of spreadsheets *badly mixes up content and display*. I therefore strongly advocate the splitting of content from display! And this is exactly what this design would do.
 - NO problems with functions that expect number-vs-string and when a number is formatted as string, the function breaks (admixing again content with display)
 - consistent and transparent handling of these situations (there are specific embedded methods that do all the coercions)
 - only the 'developer' of the object knows exactly how this object should handle requests for other data types (e.g. date -> string, ... based probably on some other options/variables)
 - easy adding new object types: e.g. chemical formula (from a recent discussion on the gnumeric-list): one can easily embed a 'mass'-property in a 'chemical_formula'-object and access it then using e.g.:

   =get(A1,"mass"), where A1 is a cell that contains a chemical formula
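
A rough C++17 sketch of what could sit behind such a get() call, assuming a hypothetical ChemicalFormula class with a named-property table. Nothing here exists in Calc or Gnumeric; the mass is supplied by the caller rather than computed from the formula, to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical cell object: the displayed text is the formula itself,
// while extra data is reachable through named properties.
class ChemicalFormula {
    std::string formula_;
    std::map<std::string, double> properties_;
public:
    // molarMass is passed in here; a real object would derive it
    // from the formula.
    ChemicalFormula(std::string formula, double molarMass)
        : formula_(std::move(formula)) {
        properties_["mass"] = molarMass;
    }

    const std::string& display() const { return formula_; }

    // Counterpart of =get(A1,"mass"): look up a property by name and
    // return an empty optional for unknown names.
    std::optional<double> get(const std::string& name) const {
        const auto it = properties_.find(name);
        if (it == properties_.end()) return std::nullopt;
        return it->second;
    }
};
```

The cell still displays "H2O", while get("mass") returns the number stored with the object, so a formula depending on the mass never has to parse the displayed text.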

Therefore I advocate:
 - splitting the huge classes in smaller classes;
 - separating the content (objects) from the display ('print'-method)
 - implement everything as lightweight  objects (classes)

This is something that Calc should really learn from R (S+ language) where everything is an object (having various methods). (UNIX has a similar concept, everything is a file!).

This issue makes it, too, into my TOP 5 of major design flaws of existing spreadsheet programs. I sincerely hope that OOo Calc will evolve in this said direction and that some refactoring of the current code will take place.


Kind regards,

Leonard


Kohei Yoshida wrote:

Hi Eike & Takashi,

On Fri, 2007-08-17 at 21:39 +0200, Eike Rathke wrote:

Hi Takashi,

On Saturday, 2007-08-18 01:05:18 +0900, Takashi Nakamoto wrote:

1) Store that in rtl::OUString.

At the beginning of our project, this idea was the first choice, but
now we think this implementation might be very difficult and are looking
for better ideas.

I'd consider implementing ruby at OUString an abuse of OUString..
besides that it would be a completely incompatible change, it would
increase the size of an OUString instance, which in 99% use cases is
just waste.


That's certainly true if we add another data member to OUString to store
the ruby text.  Also, changing OUString is a risky business since it's
used everywhere, not just in one application, hence should not be done
lightly.

But, logically, it would make sense to store the ruby text together with
the base text, because the ruby text is conceptually a property of the
base text.  Putting them together would also eliminate the
synchronization problem because the ruby and the base texts would never
be separate.

One idea I came across in the past was to embed the ruby text
information at the end of the sal_Unicode array, and borrow some bits
from the refCount variable to flag the presence of ruby text, then
OUString would know to look for it.  It could be done transparently to
existing code that doesn't need to use ruby texts, and the size of
OUString would remain the same because there is no additional data
member required.
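
A simplified C++ sketch of that bit-borrowing idea follows. StringData is a hypothetical stand-in and does not reproduce the real OUString internals; the choice of the top bit as the flag and the helper names are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a reference-counted string buffer.
// The top bit of the counter word is borrowed as a "has ruby" flag,
// so no extra data member is needed.
constexpr std::uint32_t kRubyFlag     = 0x80000000u;
constexpr std::uint32_t kRefCountMask = 0x7FFFFFFFu;

struct StringData {
    std::uint32_t refCountAndFlags;  // low 31 bits: ref count, top bit: ruby flag
    std::uint32_t length;            // length of the base text only
    std::u16string buffer;           // base text, followed by ruby text if flagged
};

StringData makeString(const std::u16string& base,
                      const std::u16string& ruby = std::u16string()) {
    StringData d;
    d.refCountAndFlags = 1;
    d.length = static_cast<std::uint32_t>(base.size());
    d.buffer = base;
    if (!ruby.empty()) {
        d.buffer += ruby;                 // ruby text lives past the base text
        d.refCountAndFlags |= kRubyFlag;  // tell readers to look for it
    }
    return d;
}

// Code unaware of ruby only ever reads the first `length` code units.
std::u16string baseText(const StringData& d) {
    return d.buffer.substr(0, d.length);
}

bool hasRuby(const StringData& d) {
    return (d.refCountAndFlags & kRubyFlag) != 0;
}

std::u16string rubyText(const StringData& d) {
    return hasRuby(d) ? d.buffer.substr(d.length) : std::u16string();
}

std::uint32_t refCount(const StringData& d) {
    return d.refCountAndFlags & kRefCountMask;
}

// Incrementing the combined word leaves the flag intact as long as the
// count stays below 2^31 - 1.
void acquire(StringData& d) { ++d.refCountAndFlags; }
```

The cost is visible even in this toy version: every piece of code touching the counter has to mask out the flag bit, which is part of why the change would be risky in a class that is used everywhere.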

That said, we probably shouldn't pursue this idea just for the ruby text
implementation in Calc, at least without giving it more careful
consideration.

2) Store that in ScStringCell.  This may be easier but string cells are
used a lot, and most likely only a fraction of them use phonetic guides
under normal use cases.  So, adding another data member to ScStringCell
may not be desirable.

Now "Storing phonetic text in ScStringCell" is the first choice
because it seems to be easier. I can understand your concerns but I
wonder what concrete problems would be raised by this approach? I
can't evaluate the bad effect of this idea correctly.


Eike already answered this correctly.  Basically, if you add another
data member to a class, it would increase the size of that class by that
new data member size regardless of whether the data is used or not.  The
size increase may be small per object, but if there are thousands of
ScStringCell instances, then the increase may become substantial.  And
that increase may not be justified if only a fraction of the cell
instances need to use that data member.

That said, it may be a workable approach if the memory increase turns
out to be a non-issue under average use cases. (who knows?)  But I'd
still favor subclassing it and leave ScStringCell alone.
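
The trade-off can be illustrated with a small C++ sketch. The three structs are hypothetical and only mimic the shape of the options discussed; they ignore the real ScStringCell layout and any heap memory owned by the strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the cell classes under discussion.
struct PlainStringCell { std::string text; };                     // today's cell
struct FatStringCell   { std::string text; std::string ruby; };   // option 2
struct RubyStringCell : PlainStringCell { std::string ruby; };    // option 4

// Option 2: every string cell pays for the extra member, used or not.
constexpr std::size_t overheadFat(std::size_t totalCells) {
    return totalCells * (sizeof(FatStringCell) - sizeof(PlainStringCell));
}

// Option 4: only the cells that actually carry ruby text pay for it.
constexpr std::size_t overheadSubclass(std::size_t rubyCells) {
    return rubyCells * (sizeof(RubyStringCell) - sizeof(PlainStringCell));
}
```

For a sheet with 100000 string cells of which 1000 carry ruby text, overheadFat(100000) grows with the total cell count while overheadSubclass(1000) grows only with the cells that need the data. The absolute numbers depend on the platform and on how the real classes are laid out.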

Memory consumption. Adding yet another member variable to ScStringCell
that again is only needed in not even 1% of use cases is waste.


[...]

4) Or maybe subclass ScStringCell to create ScRubyStringCell, and use an
instance of that class to store the ruby text information when needed ?
Just a wild idea, but could this work (maybe) ?

I think this is not a good idea. It would make String type cell have
4 different classes, ScStringCell, ScRubyStringCell, ScEditCell,
ScRubyEditCell.

If ruby was implemented at EditEngine we would end up with just 3:
ScStringCell, ScRubyStringCell, ScEditCell.

A switching routine that chooses one class from the four
classes for creating an instance of a cell would be complex.

I don't think so. You need to distinguish anyway, and subclassing may
actually ease things because switching can be done internally using
factory patterns and RTTI.
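
As a hedged illustration of that point, the following C++14 sketch hides the class choice inside a factory and uses RTTI (dynamic_cast) to read the ruby text back. StringCell, RubyStringCell, makeStringCell and rubyOf are hypothetical names and do not mirror the actual Calc cell classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical plain string cell.
class StringCell {
    std::string text_;
public:
    explicit StringCell(std::string text) : text_(std::move(text)) {}
    virtual ~StringCell() = default;  // polymorphic, so dynamic_cast works
    const std::string& text() const { return text_; }
};

// Hypothetical subclass that additionally carries ruby text.
class RubyStringCell : public StringCell {
    std::string ruby_;
public:
    RubyStringCell(std::string text, std::string ruby)
        : StringCell(std::move(text)), ruby_(std::move(ruby)) {}
    const std::string& ruby() const { return ruby_; }
};

// Factory: callers never pick the class themselves.
std::unique_ptr<StringCell> makeStringCell(std::string text,
                                           std::string ruby = std::string()) {
    if (ruby.empty())
        return std::make_unique<StringCell>(std::move(text));
    return std::make_unique<RubyStringCell>(std::move(text), std::move(ruby));
}

// RTTI: code that cares about ruby asks the object what it is.
std::string rubyOf(const StringCell& cell) {
    if (const auto* r = dynamic_cast<const RubyStringCell*>(&cell))
        return r->ruby();
    return std::string();
}
```

Code that does not care about ruby text keeps working with StringCell references, and the switching logic stays in one place.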


So, overall, introducing a new class named ScRubyStringCell as a child
class of ScStringCell, and making changes to the EditEngine class (in
svx) seems to be the most attractive choice.  But I wonder what the
level of difficulty is for this approach.

Kohei

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

