Le 07/11/2017 à 23:00, Steffen Märcker a écrit :
I am not familiar with the literature on scanners. May I ask you about
some details on the "for all characters" algorithms you are referring to?
The two main ones available, from the typical Aho/Ullman textbook, are:
- NFA to DFA conversion (i.e. build a NFA with your regular expressions,
then convert it to a DFA)
- Direct regular expression to DFA construction
Both of them have loop of the type:
for (each input symbol a) {
...
Building a (or connecting to) a BDD library would be fun, indeed. But
within that time frame it seems not realistic. Anyway, after finishing
my thesis, I'd like to come back to that idea.
It would certainly be interesting. Please contact us again when you'll
have time :)
Regards,
Thierry
Ciao, Steffen
Am 7. November 2017 16:33:03 MEZ schrieb Andrew Glynn <aglyn...@gmail.com>:
A possible way to accomplish it would be to use an object graph with
an incremental query engine, such as EMF/CDO with Viatra or
something similar. You could then put different character sets in
connected objects and query only as far as you need to.
Andrew Glynn
Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for
Windows 10
*From: *Thierry Goubier <mailto:thierry.goub...@gmail.com>
*Sent: *Tuesday, November 7, 2017 7:17 AM
*To: *Any question about pharo is welcome
<mailto:pharo-users@lists.pharo.org>
*Subject: *Re: [Pharo-users] Binary Decision Diagram Package in
Smalltalk
Hi Andrew, Steffen,
2017-11-07 13:10 GMT+01:00 Prof. Andrew P. Black <bl...@cs.pdx.edu
<mailto:bl...@cs.pdx.edu>>:
> On 28 Oct 2017, at 17:37 , Steffen Märcker <merk...@web.de
<mailto:merk...@web.de>> wrote:
>
> Does that mean the sets/bdd would be constructed mainly at
comile time? Anyway, Andrew, feel free to contact me, I might
help you with this.
>
Thanks for the offer, Steffen! The problem is that I need to
use SmaCC for my current project, and really do not have a month
to take off and re-design the way that it builds its scanner.
I’ve talked to Thierry Goubier about, and he doesn’t have time
either! It would be a fun project, though, and it ought to be
fairly separate from other parts of SmaCC. I’ve spent a fair
bit of time thinking about how to do it, but don’t think that I
will be able to actually focus on it.
Yes, this is the essence of the issue. There are a few alternatives
about it, but none we have the time to pursue.
An alternative approach, which Thierry has suggested, is to make
SmaCC work on the UTF-8 representation of the Unicode. Then we
could represent character sets as prefix trees. But the core
problem would still exist: you can’t run an algorithm that
repeatedly executes
for all characters in the alphabet do:
when there are 2^21 characters in the alphabet!
The main issue is that `for all characters`... All the literature on
scanner building uses 'for all characters do'.
Thierry
Andrew