Re: The importers are about to get much simpler

2022-08-16 Thread Edward K. Ream
On Tuesday, August 16, 2022 at 7:25:43 AM UTC-5 Edward K. Ream wrote:

> Today is an important milestone in the history of Leo's importers. The code
> probably can't get much simpler.
>

All the importers are now significantly smaller. The cython importer has 
shrunk from 581 lines to 57. Most other importers are about half their 
former size.

Edward

-- 
You received this message because you are subscribed to the Google Groups 
"leo-editor" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to leo-editor+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/leo-editor/2bfc47f6-77fa-4c0d-b5a6-f6144a9dc78fn%40googlegroups.com.


Re: The importers are about to get much simpler

2022-08-16 Thread Edward K. Ream
On Monday, August 15, 2022 at 6:43:06 AM UTC-5 Edward K. Ream wrote:

> Vitalije's importer code has primed the psychological pump. The effect on 
> the importers will be a spectacular collapse in complexity.
>

Indeed yes. The work is complete. See PR #2741, which has been merged
into the main work (ekr-new-importers branch).

Today is an important milestone in the history of Leo's importers. The code
probably can't get much simpler.

The overall project, PR #2729, is nearing
completion. A few coverage and functional tests remain.

Edward

P.S. *i.scan_all_lines*, and its helper *i.scan_one_line*, replace *all* 
the horrendous scan tables and related horrible code.

i.scan_all_lines just calls i.scan_one_line for each input line.

i.scan_one_line recognizes strings and comments. As a "trap door" for
overrides, i.scan_one_line calls i.update_level to handle the details of
updating the logical level. The XML and HTML importers use i.update_level to
handle tags. So i.scan_one_line is simple and general.
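
As a rough illustration of the division of labor described above, here is a minimal Python sketch. It is not Leo's actual code: the class name SketchImporter, the State tuple, the attribute names, and the signature of update_level are all assumptions made for this example. It also lets an unterminated string carry over to the next line, which a real importer might not want.

```python
from typing import List, NamedTuple, Optional, Tuple

class State(NamedTuple):
    context: str  # '' in plain code, else the delimiter that opened a string/comment
    level: int    # logical nesting level at the end of the line

class SketchImporter:
    # Hypothetical customizing values (C-like defaults).
    string_delims: Tuple[str, ...] = ('"', "'")
    line_comment: Optional[str] = '//'
    block_comment: Optional[Tuple[str, str]] = ('/*', '*/')
    level_up: str = '{'
    level_down: str = '}'

    def update_level(self, line: str, i: int, level: int) -> Tuple[int, int]:
        """The "trap door": handle the code character at line[i].
        Return (new index, new level). Subclasses may override this."""
        ch = line[i]
        if ch in self.level_up:
            level += 1
        elif ch in self.level_down:
            level -= 1
        return i + 1, level

    def scan_one_line(self, line: str, state: State) -> State:
        context, level = state
        i = 0
        while i < len(line):
            if context:
                # Inside a string or block comment: look for its closing delimiter.
                if context in self.string_delims and line[i] == '\\':
                    i += 2  # Skip an escaped character inside a string.
                    continue
                in_block = self.block_comment and context == self.block_comment[0]
                end = self.block_comment[1] if in_block else context
                if line.startswith(end, i):
                    context = ''
                    i += len(end)
                else:
                    i += 1
            elif self.line_comment and line.startswith(self.line_comment, i):
                break  # The rest of the line is a comment.
            elif self.block_comment and line.startswith(self.block_comment[0], i):
                context = self.block_comment[0]
                i += len(context)
            elif line[i] in self.string_delims:
                context = line[i]
                i += 1
            else:
                i, level = self.update_level(line, i, level)
        return State(context, level)

    def scan_all_lines(self, lines: List[str]) -> List[State]:
        """Just call scan_one_line for each input line, threading the state."""
        state = State('', 0)
        states = []
        for line in lines:
            state = self.scan_one_line(line, state)
            states.append(state)
        return states

if __name__ == '__main__':
    source = ['void f() {  // open {', '  s = "}";', '}']
    for line, st in zip(source, SketchImporter().scan_all_lines(source)):
        print(st, repr(line))
```

In this sketch, brackets inside strings and comments never reach update_level, so a tag-oriented importer would only need to override that one method.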

i.scan_one_line isn't equipped to handle PHP heredocs. If this ever
becomes an issue, the PHP importer could just override i.scan_one_line.
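
For what it's worth, such an override might look roughly like the following Python sketch. Everything here is hypothetical: BaseImporter is a stand-in that only counts curly brackets, and the State tuple, the class names, and the way the heredoc identifier is stored in the context field are assumptions for illustration, not Leo's API.

```python
import re
from typing import NamedTuple

class State(NamedTuple):
    context: str  # '' in plain code, or '<<<IDENT' while inside a heredoc
    level: int

class BaseImporter:
    """Stand-in for the general scanner: it only counts curly brackets."""
    def scan_one_line(self, line: str, state: State) -> State:
        level = state.level + line.count('{') - line.count('}')
        return State(state.context, level)

class PhpSketchImporter(BaseImporter):
    # Matches a heredoc/nowdoc opener such as <<<EOT or <<<'EOT' at end of line.
    heredoc_start = re.compile(r'''<<<\s*["']?(\w+)["']?\s*$''')

    def scan_one_line(self, line: str, state: State) -> State:
        if state.context.startswith('<<<'):
            # Inside a heredoc: only a line starting with the identifier ends it.
            ident = state.context[3:]
            if re.match(rf'\s*{re.escape(ident)}\b', line):
                return State('', state.level)
            return state  # Brackets in the heredoc body do not change the level.
        m = self.heredoc_start.search(line.rstrip())
        if m:
            # Scan the code before the opener normally, then enter the heredoc.
            before = super().scan_one_line(line[:m.start()], state)
            return State('<<<' + m.group(1), before.level)
        return super().scan_one_line(line, state)

if __name__ == '__main__':
    importer, state = PhpSketchImporter(), State('', 0)
    for line in ['function f() {', '  $s = <<<EOT', '  } text {{', '  EOT;', '}']:
        state = importer.scan_one_line(line, state)
        print(state, repr(line))
```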

EKR



Re: The importers are about to get much simpler

2022-08-15 Thread Edward K. Ream
On Mon, Aug 15, 2022 at 7:09 AM tbp1...@gmail.com wrote:

> It might be interesting to know if the framework could support brainf*ck
> (but I won't be the one to try to actually write the importer...).
>

I'll leave this as an exercise for the reader :-)

Edward



Re: The importers are about to get much simpler

2022-08-15 Thread tbp1...@gmail.com
It might be interesting to know if the framework could support brainf*ck
(but I won't be the one to try to actually write the importer...).

On Monday, August 15, 2022 at 7:43:06 AM UTC-4 Edward K. Ream wrote:

> Vitalije's importer code has primed the psychological pump. The effect on 
> the importers will be a spectacular collapse in complexity.
>
> *Aha 1*: A new method, *i.scan_all_lines*, can calculate all scan states 
> in a single pass.
>
> There is no need to store "helper" values in scan states! Instead, 
> i.scan_all_lines will maintain a few simple state-related vars.
>
> As a result, a single *NewScanState* class will suffice. Eventually, it 
> will become a named tuple with just two fields: context and level. Btw, 
> only i.gen_lines uses scan state classes, so the NewScanState class will 
> "disappear" from the view of all importers.
>
> *Aha 2*: The present scanning dictionaries contain two kinds of data: 
> level-related and token-related. Conceptually, those dictionaries should 
> *specify only* tokenizing-related data. This Aha is moot because...
>
> *Aha 3*: None of the scanning dictionaries are needed!!!
>
> This last Aha arose while writing the first draft of i.scan_all_lines. At 
> present, i.scan_dict returns data telling how to:
>
> - compute the new state,
> - compute the new level,
> - increment the index into the line.
>
> But i.scan_all_lines can calculate all this without help! For any 
> language, i.scan_all_lines only needs to know the following *customizing 
> values*:
>
> - The (list of) characters that begin and end strings.
> - The characters that begin and end all forms of comments.
> - The (optional) characters (typically curly brackets) that increment and 
> decrement logical level.
>
> Subclasses of the Importer class will define these customizing values as 
> needed.
>
> *Summary*
>
> A new method, i.scan_all_lines, will compute all (new) scan states in a 
> prepass. 
>
> New scan states will eventually be named tuples containing only context 
> and level fields. All custom scan state classes will disappear.
>
> As needed, importers will define constant customizing values for 
> i.scan_all_lines.
>
> Importers will no longer need ctors. All overriding data will be constants.
>
> The new code will be *slightly* faster and *much* simpler than the old.
>
> Edward
>
> P.S. I'm not going to apologize for the old code. It got us to our present 
> happy state. 
>
> I could not have braved the coming changes without the new unit tests!
>
> EKR
>



The importers are about to get much simpler

2022-08-15 Thread Edward K. Ream
Vitalije's importer code has primed the psychological pump. The effect on 
the importers will be a spectacular collapse in complexity.

*Aha 1*: A new method, *i.scan_all_lines*, can calculate all scan states in 
a single pass.

There is no need to store "helper" values in scan states! Instead, 
i.scan_all_lines will maintain a few simple state-related vars.

As a result, a single *NewScanState* class will suffice. Eventually, it 
will become a named tuple with just two fields: context and level. Btw, 
only i.gen_lines uses scan state classes, so the NewScanState class will 
"disappear" from the view of all importers.
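
To make the two-field idea concrete, here is a minimal Python sketch of such a named tuple. The name ScanState and the field types are assumptions for illustration only; the eventual class in Leo may well look different.

```python
from typing import NamedTuple

class ScanState(NamedTuple):
    """Hypothetical scan state with just two fields."""
    context: str  # '' in plain code, else the delimiter of the open string/comment
    level: int    # logical nesting level at the end of the line

# Named tuples are immutable, so each line produces a new state.
state = ScanState(context='', level=0)
next_state = state._replace(level=state.level + 1)
print(state, next_state)
```

Because the tuple carries no helper values, any bookkeeping needed during the scan would live in local variables of the scanning method.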

*Aha 2*: The present scanning dictionaries contain two kinds of data: 
level-related and token-related. Conceptually, those dictionaries should 
*specify only* tokenizing-related data. This Aha is moot because...

*Aha 3*: None of the scanning dictionaries are needed!!!

This last Aha arose while writing the first draft of i.scan_all_lines. At 
present, i.scan_dict returns data telling how to:

- compute the new state,
- compute the new level,
- increment the index into the line.

But i.scan_all_lines can calculate all this without help! For any language, 
i.scan_all_lines only needs to know the following *customizing values*:

- The (list of) characters that begin and end strings.
- The characters that begin and end all forms of comments.
- The (optional) characters (typically curly brackets) that increment and 
decrement logical level.

Subclasses of the Importer class will define these customizing values as 
needed.
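
A subclass might then look something like the following Python sketch. The attribute names (string_delims, line_comment, block_comment, level_up, level_down) and the class names are invented for this example and are not taken from Leo's code; the point is only that each importer overrides constants and needs no ctor.

```python
from typing import List, Optional, Tuple

class Importer:
    """Hypothetical base class holding default customizing values."""
    string_delims: List[str] = ['"', "'"]                # begin/end strings
    line_comment: Optional[str] = None                   # e.g. '#' or '//'
    block_comment: Optional[Tuple[str, str]] = None      # e.g. ('/*', '*/')
    level_up: Optional[str] = '{'                        # increments logical level
    level_down: Optional[str] = '}'                      # decrements logical level

class CImporter(Importer):
    # Only the values that differ from the defaults are overridden.
    line_comment = '//'
    block_comment = ('/*', '*/')

class PythonLikeImporter(Importer):
    line_comment = '#'
    # Indentation-based languages have no level characters.
    level_up = None
    level_down = None

for cls in (CImporter, PythonLikeImporter):
    importer = cls()  # No ctor is needed: all overriding data are constants.
    print(cls.__name__, importer.line_comment, importer.level_up)
```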

*Summary*

A new method, i.scan_all_lines, will compute all (new) scan states in a 
prepass. 

New scan states will eventually be named tuples containing only context and 
level fields. All custom scan state classes will disappear.

As needed, importers will define constant customizing values for 
i.scan_all_lines.

Importers will no longer need ctors. All overriding data will be constants.

The new code will be *slightly* faster and *much* simpler than the old.

Edward

P.S. I'm not going to apologize for the old code. It got us to our present 
happy state. 

I could not have braved the coming changes without the new unit tests!

EKR
