After reading this, I looked at the VNode class definition for the first 
time.  Ouch!  I think this must be an example of serious technical debt.  
No doubt it seemed reasonable or even necessary but as I look at it without 
knowing the history or even much about how it's used, it's not how I try to 
go about things.  Like Vitalije, I would try hard to be able to create a 
basic stand-alone VNode and then insert it onto whatever application- or- 
outline-wide structures need it.

One example is this line in __init__():

    self.expandedPositions: list[Position] = []  # Positions that should be 
expanded.

In my limited understanding, positions have nothing to do with VNodes 
except for saying that "this position contains that VNode".  It's the tree 
that needs to know about which positions are expanded, not the VNode.  If 
you would say "But when I clone a position, I need to know its children and 
their expansion state", I would respond "ask the tree - it's the 
authority".  So positions should not be part of a VNode.

I have never written a tree anywhere near as complicated as Leo's.  But I 
have learned that to get good speed when you have to access objects that 
are collected in a list-like structure, it's necessary to have pointers to 
them in a structure that has much better access time than a list.  
Typically I cache them in a dict. It's hard enough to keep such a cache 
current.  When the list's contents can form part of its entries, it sounds 
like a nightmare.

Otoh, two points.  It can be hard to keep various lists/dicts. etc in sync 
when changes take place.  And with non-VNode structures embedded into a 
VNode, refactoring will likely be really hard.

 Of course, I don't know anything much about this part of Leo's code so I 
may be misunderstanding in a big way!
On Friday, July 14, 2023 at 6:27:57 AM UTC-4 vitalije wrote:

> I haven't post here for a while, but from time to time, I'm still reading 
> this forum. After reading the recent Edward's post about "big problems with 
> undo/redo tree operations", I've just had to write something about this 
> topic.
>
> Personally, I've never had too much confidence in using Leo's undo 
> before/after machinery. I've felt it has a deep defect buried inside which 
> makes things fragile and easy to get wrong. The defect I am talking here 
> about is the fact that all v-nodes instances constantly live and are 
> present in the `c.fileCommands.gnxDict`. I don't remember all the details, 
> but I do remember that this fact alone has caused some very difficult to 
> solve bugs throughout the years. One that comes to my mind right now was 
> the issue 1348 <https://github.com/leo-editor/leo-editor/issues/1348> 
> reported by Segundo Bob on the September 2019, and few months later another 
> one issue 1437 <https://github.com/leo-editor/leo-editor/issues/1437>.
>
> These two bugs were so much hard to solve that for a while I was very 
> convinced that their cause is somewhere deep in the operating system, 
> outside of Leo and even outside of the Python itself.
>
> After I gave up in searching for these two bug's solution in March next 
> year, I've found the solution by trying to solve something else and the 
> solution was just one simple line clearing the gnxDict.
>
> Instead of using Leo's before and after undo methods, I used to create my 
> own undo data for the operations that I wrote. (Search for nodes in the 
> LeoPy outline that have 'vitalije' in their gnx) for some examples.
>
> Here is just one example.
> def deletePositionsInList(self, aList):
>     """
>     Delete all vnodes corresponding to the positions in aList.
>     Set c.p if the old position no longer exists.
>     See "Theory of operation of c.deletePositionsInList" in LeoDocs.leo.
>     """
>     # New implementation by Vitalije 2020-03-17 17:29
>     c = self
>     # Ensure all positions are valid.
>     aList = [p for p in aList if c.positionExists(p)]
>     if not aList:
>         return []
>
>     def p2link(p):
>         parent_v = p.stack[-1][0] if p.stack else c.hiddenRootNode
>         return p._childIndex, parent_v
>
>     links_to_be_cut = sorted( set(map(p2link, aList))
>                             , key=lambda x: -x[0]
>                             )
>     undodata = []
>     for i, v in links_to_be_cut:
>         ch = v.children.pop(i)
>         ch.parents.remove(v)
>         undodata.append((v.gnx, i, ch.gnx))
>     if not c.positionExists(c.p):
>         c.selectPosition(c.rootPosition())
>     return undodata
>
> It collects undo data as a list of tuples (parent_gnx, index, child_gnx), 
> in other words tuple (str, int, str) which in Python is immutable and the 
> whole list can be also easily turned into the tuple and made immutable too. 
> This immutability of the undo data greatly simplifies things and leaves no 
> place for bugs to sneak in. Keeping undo data in the form of some mutable 
> data structures whose elements can also be mutated leaves the door widely 
> open and puts the big invitation sign at the entrance for all the bugs to 
> come and stay.
>
> With all of this, I just wished to illustrate how dangerous and tricky 
> things can get when we have mutable state buried so deep down in the 
> foundations of the Leo's code. Through the years I tried (without much 
> success so far) to warn about this issue and even tried several other 
> approaches in modeling outline data.
>
> I've built several data models and each of them provided support for all 
> tree operations that are supported by Leo. Each operation was totally 
> undoable and handling undo/redo was part of the data model itself.
>
> My last (so far) and most advanced data model is written in Rust as a 
> native python extension module. I haven't the time to actually do something 
> useful with it yet. I remember offering it to the author of the Leo vscode 
> plugin when he first announced the idea of writing it, but he dismissed the 
> idea and decided to use Leo bridge instead and not to rewrite Leo's core. 
> However, later he has done just that rewrite Leo's core in javascript (or 
> maybe he is still doing it).
>
> Rust library can be just as easily packaged in the native module for Node 
> as it was easy to turn it into a Python native module.
>
> Not only this library runs much faster than Leo, but it has much simpler 
> undo/redo logic. I haven't really tested it, but I wouldn't be too much 
> surprised if the same code that I wrote in rust if rewritten in plain 
> python, would run at the same speed as Leo or maybe even a bit faster than 
> that.
>
> All the operations (which modify the outline) return a string representing 
> undo data. The undo/redo operations accept string as an argument and 
> perform the same modification as the original operation in both directions. 
> In fact all the operations are implemented in a similar way. In the first 
> phase they do not modify anything and instead they just calculate the undo 
> data. Once they have calculated the undo data, they simply perform redo 
> operation which actually modifies the outline.
>
> For most of the operation this undo data string is quite small.
>
> But the best part of this library which I am most proud of is the fact 
> that it allows stable outline positions. In the rust code they are called 
> the labels and they are just plain integers. 
>
> In all of my previous attempts in creating data models for Leo outlines, 
> I've tried to achieve this, and I had a stable positions but they could not 
> survive undo/redo operation, but on each redo the model generated new 
> different positions. This in turn has complicated undo/redo logic. Finally 
> in my last attempt, I've found a way for generating positions (or labels) 
> in such a way that the same outline will always generate same labels. That 
> way tree modifications always produce the exact same memory representation 
> of the outline. This means that the undo/redo data is stable too. One can 
> send it over the net and apply it to the outline of the same shape and 
> produce exactly the same result. They can be written in the log file 
> allowing virtually unlimited undo/redo history.
>
> The time of my last three commits to my mini_leo library are Jul 2019, Jul 
> 2021 and Jul 2022. It seems that in July I am usually getting  interested 
> in Leo data models. The library needs documentation and some cleaning of 
> the building methods previously used. I haven't yet found a good way to 
> make library automatically built on every commit. Probably github support 
> for automatic rust builds is improved now comparing to the time I last 
> tried to do it.
>
> *In what way is my data model different than the Leo's model?*
>
> Leo's basic data structure is VNode and it is by definition tied to many 
> different parts of Leo. In tests one can't create VNode unless full Leo's 
> commander is provided. Just creating an instance causes some modifications 
> in the Leo's core data. If you detach VNode from the outline and you wish 
> to keep the node around in the form of undo data, the outline itself is not 
> the same because it's cache in the fileCommands.gnxDict is different and it 
> will cause some hard to predict effects on the gnx generation functions. 
>
> On the other hand, in my data model node is just a simple object with the 
> few simple values (strings, booleans, and integers). It can be instantiated 
> without causing any side effect on the rest of the program. It's easy to 
> move this simple values around, to keep them in the logs, to send them, to 
> recreate them if necessary. This makes testing easy and allows simplicity 
> throughout the library.
>
> The plain immutable data types are really great for modeling and they can 
> represent quite a lot of complex data. I use them whenever I can. I always 
> start solving the problems by wandering how to represent all the necessary 
> data using just a bunch of strings, integers, booleans, tuples, lists and 
> dictionaries. Only when the problem is already solved, I sometimes write a 
> class or two mimicking some of the used plain data structures, but quite 
> often I am satisfied with the plain data especially if it is possible to 
> hide all these plain data arrangements inside the single function that is 
> published to the outside world. This gives me a certainty that I can later 
> make whatever change I need and nobody will notice anything as long as my 
> function still returns the same result.
>
> My 2 cents
> Vitalije
>

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to leo-editor+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/leo-editor/4357ed6c-7a62-4218-99e6-6bc7b952e733n%40googlegroups.com.

Reply via email to