Re: [Chicken-users] Parsing Simple Markup

2014-09-23 Thread Yves Cloutier

Wow...that is just perfect...suddenly I'm in love with Scheme :)


___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users

Re: [Chicken-users] Parsing Simple Markup

2014-09-22 Thread Jörg F. Wittenberger


Did I already recommend this?  Sorry if that's a duplication.

One more example of SXML together with SRFI 110 sweet expressions 
(indent sensitive LISP syntax).  Those two blend well together. I'm 
using them embedded in XML (XSLT) here:


http://ball.askemos.org/Aa176138e655369f8c01c3044ced70cfc
(Be sure to view this as source, not in the browser!)

Remarks: a) this is served from Chicken b) it's a rather simple but 
complete payment system, docs coming up here:

http://ball.askemos.org/A0cd6168e9408c9c095f700d7c6ec3224/?_v=search_id=1856_go=2
Wilma, Fred and Bamm-Bamm are each running the script above.

Best
/Jörg



Re: [Chicken-users] Parsing Simple Markup

2014-09-22 Thread Andy Bennett

Hi,

 Actually due to the possible presence of nested commands, it should
 probably be something more generic, since in the last example:
 
 (bold (smallcap (size 2 text)))
 
 what the procedure 'bold' would be taking in is not a string text, but
 rather an expression...so this is where I guess things would need to be
 recursive.

The evaluation rules will evaluate things in the correct order. So
(size 2 text) will be evaluated first, then (smallcap ...) and then
(bold ...). The order in which 2 and text are evaluated is deliberately
unspecified.
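
To make that concrete, here is a minimal sketch in which each command is an ordinary procedure that returns a string rather than printing. The groff escapes chosen for bold and size, and the [sc] placeholder markers for smallcap, are assumptions for illustration and not part of Yves's TML.

```scheme
;; Each command returns a string, so nesting composes naturally:
;; the innermost call is evaluated first and its result is passed outward.
(define (bold body)
  (string-append "\\fB" body "\\fP"))      ; bold on, then back to previous font

(define (size n body)
  (string-append "\\s+" (number->string n) body "\\s0")) ; relative size, then reset

(define (smallcap body)
  ;; Hypothetical placeholder markers; real small caps would need more work.
  (string-append "[sc]" body "[/sc]"))

(display (bold (smallcap (size 2 "text"))))   ; prints \fB[sc]\s+2text\s0[/sc]\fP
(newline)
```

Because these are plain procedures, no explicit eval is needed when the forms appear in your source; eval only matters when the document is read in as data.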




Regards,
@ndy

-- 
andy...@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF



Re: [Chicken-users] Parsing Simple Markup

2014-09-21 Thread Peter Bex

On Sat, Sep 20, 2014 at 11:19:08AM -0400, Yves Cloutier wrote:
 Hello,
 
 I am  a new user to Scheme in general and to Chicken in particular, nice to
 meet you all.

Hello Yves, and welcome to the CHICKEN community!

 I came to scheme looking for an alternative to Perl for doing a personal
 project which involves parsing an input file, identifying html-like
 commands and converting those to Groff code.

That should be pretty doable.  We already have several eggs for parsing
various markup languages, you may want to take a look at their
implementations for inspiration:

- html-parser
- htmlprag
- lowdown
- mistie
- svnwiki-sxml

 Scheme is a totally different paradigm than I'm used to, so while I wait
 for my books to arrive I will need some hand-holding...hope that's ok.

No problem!  We always help out newbies.  If you have specific questions,
you might also like to try our IRC channel.  There's usually someone
around to answer your question.

 1) Is the Chicken Scheme manual available for purchase?  Online docs are
 great but I like to have a hardcopy so that I can read offline.

I'm afraid not.  You're the first person to ask for a hardcopy.  Of course
you can always print it...  The manual is in svnwiki syntax, which can be
translated to sxml/html or markdown.  It's also human-readable so you
could print out the sources.  There's a copy of the manual with every
tarball, which gets installed as HTML in your system's doc directory,
so it's always available when you're offline.

 2) The best way to learn is to get your hands dirty so I was looking at
 doing everything from scratch, but then I saw input-parse (
 http://wiki.call-cc.org/eggref/4/input-parse) which seems pretty much like
 what I need.  But i can't seem to find this in the Eggs.  It says that page
 does not exist yet.

Which page are you talking about?  http://wiki.call-cc.org/eggref/4/input-parse
looks fine to me.

 In Perl I am able to do most of this with regular expressions, but I'm
 hitting my head against the wall when it comes to multiple formatting
 commands within a group <...,...,...>

There's a famous quote by Jamie Zawinski about regular expressions, which
seems like it applies in this case:  "Some people, when confronted with
a problem, think “I know, I'll use regular expressions.” Now they have
two problems."

Having said that, I think the SRE notation for regular expressions makes
them a lot more readable.  However, parsing complex languages using
regular expressions is a bad idea...

 Also, to note: I am NOT a programmer or developer - I am a hobbyist and
 doing this for fun!

...since you're not a programmer, you may not be familiar with formal
language theory.  The idea there is that there are several classes of
languages (or grammars), and only so-called regular grammars can
strictly be parsed with regular expressions.  A regular grammar is
basically one which requires no extra information to parse it, aside
from the current rule in the parser.  It also means that no backtracking
is needed when parsing it.  Irregex (like Perl) can do backtracking,
which muddles the waters quite a bit, and I think this is one of the
reasons people get confused about the abilities of regexes.

A good rule of thumb to remember is: if your syntax allows you to nest
things, regular expressions alone cannot parse it.  For instance, in
HTML you can arbitrarily nest markup instructions like <b><i>..</i></b>,
but also <div><div>...</div></div>.  This is why people will tell you
that HTML/XML cannot be parsed with regular expressions.  If you try
anyway, you set yourself up for failure.  Many security issues have
historically been due to poor parsing choices.  If you're interested
in this stuff, see also http://www.langsec.org, which is a group of
people who are using a language-theoretical approach to fighting
insecurity.

You may be able to do partial parsing steps of a complex grammar
using regular expressions combined with some code to drive it.
This is the typical PHP/Perl approach of parsing languages, with the
reference implementations of Markdown and Textile being prime examples.
However, this quickly becomes intractable, and inevitably leads to the
aforementioned security issues.  Instead, I'd advise you to use one of
the parsing eggs, or roll your own recursive descent parser.  If
performance is not much of a consideration, that's pretty easy to do in
Scheme, and you don't need any dependencies.
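
As a rough illustration of the "roll your own" option, here is a minimal recursive descent sketch in plain Scheme. The grammar is an assumption simplified from the TML examples in this thread (a tag head like <bold>, a body, and a single > to close); the procedure names are made up for this example, and command values such as "indent 5" are not split out.

```scheme
;; Grammar assumed for this sketch:
;;   nodes   ::= (text | element)*
;;   element ::= "<" name ">" nodes ">"
;; Each procedure takes an index into the string and returns
;; (result . next-index).

(define (parse-name str i)
  ;; Read characters up to the ">" that ends the tag head.
  (let loop ((j i))
    (cond ((>= j (string-length str)) (error "unterminated tag name"))
          ((char=? (string-ref str j) #\>)
           (cons (string->symbol (substring str i j)) (+ j 1)))
          (else (loop (+ j 1))))))

(define (parse-text str i)
  ;; Read plain text up to the next "<", ">" or end of input.
  (let loop ((j i))
    (if (or (>= j (string-length str))
            (memv (string-ref str j) '(#\< #\>)))
        (cons (substring str i j) j)
        (loop (+ j 1)))))

(define (parse-nodes str i)
  ;; Collect nodes until a closing ">" or end of input.
  (let loop ((i i) (acc '()))
    (cond ((or (>= i (string-length str))
               (char=? (string-ref str i) #\>))
           (cons (reverse acc) i))
          ((char=? (string-ref str i) #\<)
           (let ((r (parse-element str i)))
             (loop (cdr r) (cons (car r) acc))))
          (else
           (let ((r (parse-text str i)))
             (loop (cdr r) (cons (car r) acc)))))))

(define (parse-element str i)
  ;; str[i] is "<": parse the name, the nested body, then the closing ">".
  (let* ((name (parse-name str (+ i 1)))
         (body (parse-nodes str (cdr name)))
         (end  (cdr body)))
    (if (and (< end (string-length str))
             (char=? (string-ref str end) #\>))
        (cons (cons (car name) (car body)) (+ end 1))
        (error "missing closing >"))))

(define (parse-markup str)
  (car (parse-nodes str 0)))

;; (parse-markup "a <bold>b <size>c>> d")
;; => ("a " (bold "b " (size "c")) " d")
```

The nesting is handled by parse-element calling parse-nodes, which in turn calls parse-element again; that mutual recursion is exactly the part a regular expression cannot express.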

 My idea was that I could read a line of text from a file at a time.  My
 understanding is that the input would be read into an s-expression (which
 I understand to basically be a list).

That sounds problematic, because it will limit your ability to have
modifiers that span multiple lines.  Of course, it's still possible with
additional bookkeeping, but you may find it easier to just parse from a
character stream, handling newline symbols in the grammar instead of
being fundamental to the way your syntax must be parsed.
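
A small sketch of what reading from a character stream can look like, using only peek-char and read-char on a port. The token names open and close are invented for this example; the point is just that a newline is an ordinary character, so a span may cross line boundaries.

```scheme
;; Split a character stream into tokens: the symbols open and close for
;; "<" and ">", and strings for everything in between.
(define (read-text port)
  ;; Accumulate characters until "<", ">" or end of input.
  (let loop ((acc '()))
    (let ((c (peek-char port)))
      (if (or (eof-object? c) (char=? c #\<) (char=? c #\>))
          (list->string (reverse acc))
          (loop (cons (read-char port) acc))))))

(define (tokenize port)
  (let loop ((tokens '()))
    (let ((c (peek-char port)))
      (cond ((eof-object? c) (reverse tokens))
            ((char=? c #\<) (read-char port) (loop (cons 'open tokens)))
            ((char=? c #\>) (read-char port) (loop (cons 'close tokens)))
            (else (loop (cons (read-text port) tokens)))))))

;; (tokenize (open-input-string "<bold>two\nlines>"))
;; => (open "bold" close "two\nlines" close)
```

For a real file, something like (call-with-input-file "doc.tml" tokenize) should work the same way; the file name here is hypothetical.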

 This is my first attempt at functional

Re: [Chicken-users] Parsing Simple Markup

2014-09-21 Thread Andy Bennett

Hi,

 I am  a new user to Scheme in general and to Chicken in particular, nice
 to meet you all.

Welcome!


 A few examples of what I am trying to parse:
 
 1. Tags that identify structural elements of a document:
 [chapter] Chapter Title
 [heading1] Heading Title
 [list]
 ...
 [end]
 
 [quote]
 ...
 [end]
 
 2. Tags that identify formatting of text:
 <bold>text>  ;single formatting command with no value
 <indent 5>text> ; formatting command with a value
 <dropcap>Once upon a time>
 <bold, smallcap, size +2>text> ;a command group which has multiple
 formatting commands enclosed within <>
 
 A command group can be singular:
 
 <...>
 
 or have multiple commands separated by commas:
 
 <...,...,...>,
 
 the closing > signalling the end of the command group.

This is not entirely dissimilar to Markdown so I'd echo Peter's advice
to check out lowdown, the CHICKEN Markdown implementation, and comparse,
the parser library lowdown is implemented in.

I'll also point you to the eMail address parsing egg:
http://api.call-cc.org/doc/email-address which is another example of a
parser written with comparse. It's interesting because, unlike lowdown,
it implements a parser for just a small number of things: eMail
addresses and lists of eMail addresses.

comparse is a parser combinator library. This means that you specify
parts of your grammar / language and a procedure which can parse that
thing is returned. You then combine these parsers to produce other
parsers that, for example, can parse X then Y, X or Y, X then Y
then Y, etc. It takes a couple of hours to wrap your head around it but
it's very powerful. The email-address parser is built up starting from
sets of characters and resulting in two procedures: one that parses an
eMail address and one that parses a sequence of eMail addresses.
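
To show the idea without depending on any egg, here is a hand-rolled sketch of parser combinators in plain Scheme. This is not comparse's API; the names p-char, p-or, p-then and p-many are made up for this illustration, and a parser here is simply a procedure from a list of characters to (value . remaining-chars) or #f.

```scheme
;; A parser is a procedure: list-of-chars -> (value . remaining-chars) or #f.
(define (p-char c)                       ; match exactly one character
  (lambda (in)
    (and (pair? in) (char=? (car in) c)
         (cons c (cdr in)))))

(define (p-or a b)                       ; "X or Y"
  (lambda (in) (or (a in) (b in))))

(define (p-then a b)                     ; "X then Y", value is a two-element list
  (lambda (in)
    (let ((ra (a in)))
      (and ra
           (let ((rb (b (cdr ra))))
             (and rb (cons (list (car ra) (car rb)) (cdr rb))))))))

(define (p-many a)                       ; zero or more repetitions of a
  (lambda (in)
    (let loop ((in in) (acc '()))
      (let ((r (a in)))
        (if r
            (loop (cdr r) (cons (car r) acc))
            (cons (reverse acc) in))))))

;; Combine small parsers into bigger ones.
(define x-or-y (p-or (p-char #\X) (p-char #\Y)))
(define xy-then-z (p-then (p-many x-or-y) (p-char #\Z)))

;; (xy-then-z (string->list "XYXZ!"))  => (((#\X #\Y #\X) #\Z) #\!)
;; (xy-then-z (string->list "XY"))     => #f
```

comparse gives you the same kind of building blocks, already written and with better handling of input and results.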


 The idea is to make typesetting with Groff very simple and intuitive for
 any user - not just programmers and hackers.  The markup we are working
 on is called Typesetting Markup Language (TML).  So it would convert
 html-like commands and generate a Groff document from it.

comparse allows you to take your results and give them as arguments to other
procedures. In the eMail address egg I use this to populate an internal
data type that represents an eMail address. You could use an
intermediate data type like this or you could try to write a number of
different procedures which immediately output the parsed thing in the
required format.


 Right now I am trying to do a prototype which generates Groff in the
 backend, but the idea is to have a general purpose markup that could
 also be used to generate LaTeX/ConTeXt, HTML, XML etc.

...it's probably best to generate an intermediate format then. The
lowdown egg generates SXML which can easily be rendered down to HTML.

SXML is an s-expression representation of the tree structure of XML.

See here for an illustration of SXML:
http://www.more-magic.net/posts/lispy-dsl-sxml.html
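
As a rough sketch of why an s-expression tree is a convenient intermediate format, the procedure below walks such a tree and renders it as HTML. It is a simplified assumption of what a serializer does: tree->html is a made-up name, attributes are ignored and text is not escaped, so it is not a replacement for a real SXML serializer.

```scheme
;; Render a tree of the form (tag child ...) where each child is
;; either a string or another tree.
(define (tree->html node)
  (cond ((string? node) node)
        ((pair? node)
         (let ((tag (symbol->string (car node))))
           (string-append
            "<" tag ">"
            (apply string-append (map tree->html (cdr node)))
            "</" tag ">")))
        (else (error "unexpected node" node))))

;; (tree->html '(div (span "Hello, " (strong "dear") " friends.")))
;; => "<div><span>Hello, <strong>dear</strong> friends.</span></div>"
```

A second procedure with the same shape could emit Groff or LaTeX from the same tree, which is the appeal of parsing into an intermediate format first.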


 In Perl I am able to do most of this with regular expressions, but I'm
 hitting my head against the wall when it comes to multiple formatting
 commands within a group <...,...,...>

In comparse something like <X,Y,Z> would be something like:

(off the top of my head, without testing anything)

; fIorz's separated-by parser :
(define (separated-by sep-parser field-parser)
  (sequence* ((head field-parser)
              (tail (zero-or-more
                      (preceded-by
                        sep-parser
                        field-parser))))
    (result (cons head tail))))

(define the-parser
  (sequence-of (char "<")
               (separated-by
                 ","
                 (maybe ; support null elements
                   (any-of "X" "Y" "Z")))
               (char ">")))


The email-address egg additionally has the delimited-by parser to support
white space around the commas. Above I've used the maybe parser to
show how you'd support <X,,Y,Z> as well as <X,Y,Z>.
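
For comparison, the same comma-separated structure can be handled by hand in plain Scheme. This is only a sketch with made-up helper names (trim, split-commands); it works on the text between < and > and, like the maybe parser above, it keeps empty fields.

```scheme
;; Remove leading and trailing spaces from a string.
(define (trim str)
  (let loop ((s 0) (e (string-length str)))
    (cond ((and (< s e) (char=? (string-ref str s) #\space))
           (loop (+ s 1) e))
          ((and (< s e) (char=? (string-ref str (- e 1)) #\space))
           (loop s (- e 1)))
          (else (substring str s e)))))

;; Split the inside of a command group at commas, trimming each field.
(define (split-commands str)
  (let loop ((i 0) (start 0) (acc '()))
    (cond ((= i (string-length str))
           (reverse (cons (trim (substring str start i)) acc)))
          ((char=? (string-ref str i) #\,)
           (loop (+ i 1) (+ i 1)
                 (cons (trim (substring str start i)) acc)))
          (else (loop (+ i 1) start acc)))))

;; (split-commands "bold, smallcap, size +2") => ("bold" "smallcap" "size +2")
;; (split-commands "X,,Y")                    => ("X" "" "Y")
```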



 Also, to note: I am NOT a programmer or developer - I am a hobbyist and
 doing this for fun!

It looks like you're on the right tracks.


 My idea was that I could read a line of text from a file at a time.  My
 understanding is that the input would be read into an s-expression
 (which I understand to basically be a list).  Then could car the first
 item of the list and match it against my tags or formatting commands
 (which would be defined as something like below)
 
 (define chapter "[chapter]")
 (define list:digit "[list:digit]")
 (define list:alpha "[list:alpha]")
 (define end-list "[end]")
 (define close-command-group ">")
 (define command-group-begin "<")
 (define command-group-end ">")
 (define bold "bold")
 (define smallcap "smallcap")
 (define dropcap "dropcap")

Don't worry about reading the input: let comparse do that for you.

Other than that, it looks like the rules you have defined there aren't a
million miles from the way comparse would let you specify things. The
additional complexity is that comparse returns a procedure that you
apply to the string or port you want

Re: [Chicken-users] Parsing Simple Markup

2014-09-21 Thread Yves Cloutier

Peter/Andy/Richard,

Thanks so much for your replies.

Richard: Thanks for showing me how to install input-parse from the command
line...I had no idea!  Also thanks for the link showing how things are
done, using Python as a comparison.  Got me reading a file in 30 seconds!

Andy/Peter: Thank you for suggesting I look at comparse and lowdown.  I will
certainly do that.

Andy, you put a link to Peter's page for SXML...strangely enough I had
already visited this page while trying to learn more about Scheme/Lisp and
how I would approach my project.

Actually, the markup notation I have devised would translate even better
to S-expressions.  Using the example from Peter's page:

<div>
  <span>Hello, <strong>dear</strong> friends.</span>
  <span>This is a &lt;simple&gt; example.</span>
</div>

Converting this HTML fragment to an S-expression is straightforward:

'(div
   (span "Hello, " (strong "dear") " friends.")
   (span "This is a <simple> example."))

It's a bit more cumbersome to type because you have to break up the strings
for the strong element

If I were to write this in TML, it would look something like this:

<div>
   <span>Hello, <strong>dear> friends>
   <span>this is a &lt;simple&gt; example.>

which to me looks exactly like:

'(div
   (span "Hello, " (strong "dear") " friends.")
   (span "This is a <simple> example."))

In TML markup, the symbol > denotes the closing of a tag or tag group, as
opposed to XML/HTML where a corresponding </tag> must exist for each
opening tag.  So whenever a > is encountered, that is a cue to close the
current tag (</tag>) or group of tags (</tag1></tag2></tag3>).

Well, it looks like Scheme will be a very good choice for this project...

Cheers!







Re: [Chicken-users] Parsing Simple Markup

2014-09-21 Thread Arthur Maciel

Dear Yves, with SXML you could write transformation rules as Peter has
shown in www.more-magic.net/docs/scheme/sxslt.pdf.

I'm not experienced with SXML, but AFAIK they would generate a similar
effect as the procedures in your example below.
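
A hand-rolled sketch of the idea behind such transformation rules, in plain Scheme. This only imitates the shape of a rule table and is not the API of any SXML library; the names rules and transform are made up here, and the groff font escapes are an assumption for the output side.

```scheme
;; rules: association list from tag symbol to a procedure that receives
;; the already-transformed children (strings) and returns a string.
(define rules
  (list (cons 'bold
              (lambda (kids)
                (string-append "\\fB" (apply string-append kids) "\\fP")))
        (cons 'italic
              (lambda (kids)
                (string-append "\\fI" (apply string-append kids) "\\fP")))))

(define (transform node rules)
  (cond ((string? node) node)
        ((pair? node)
         (let ((rule (assq (car node) rules))
               (kids (map (lambda (k) (transform k rules)) (cdr node))))
           (if rule
               ((cdr rule) kids)
               (apply string-append kids))))   ; unknown tag: keep the text only
        (else (error "unexpected node" node))))

;; (transform '(para "a " (bold "b") " c") rules)  => "a \\fBb\\fP c"
```

One caveat with this particular output: groff's \fP only remembers one previous font, so nested font changes such as bold inside italic would need a small font stack rather than the simple rules shown here.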

Best wishes,
Arthur

2014-09-21 17:06 GMT-03:00 Yves Cloutier yves.clout...@gmail.com:

 Hello Oleg,

 Thank you for your recommendations too.  I actually just came back from
the local library where I picked up The Scheme Programming Language.

 You know, reading through your reply, it was the last part that made me
think about something.

 If I can convert my input to the format:

 (bold text)
 (indent 5 text)
 (bold (smallcap (size 2 text)))

 Could I not define each of these as functions (or procedures), and then
just call an (eval '  ) procedure to do my output?

 For example (keeping in mind I'm only just getting familiar with Scheme
syntax!):

 (define (bold text)
   (print "opening tag for the command 'bold'")
   (print text)                       ; the string 'text'
   (print "closing tag for the command 'bold'"))

 (define (indent indent-value text)
   (print "opening tag for the command 'indent' with value of "
          indent-value)
   (print text)                       ; the string 'text'
   (print "closing tag for the command 'indent'"))

 Actually due to the possible presence of nested commands, it should
probably be something more generic, since in the last example:

 (bold (smallcap (size 2 text)))

 what the procedure 'bold' would be taking in is not a string text, but
rather an expression...so this is where I guess things would need to be
recursive.

 Once my document has been converted into one big s-expression, and
procedures defined accordingly, then I could just (eval ) it..couldn't I?

 (eval '(bold text)
       (indent 5 text)
       (bold (smallcap (size 2 text))))

 Or something along those lines?

 If this is the case...brilliant!


Re: [Chicken-users] Parsing Simple Markup

2014-09-20 Thread Richard

On Sat, 20 Sep 2014 11:19:08 -0400
Yves Cloutier yves.clout...@gmail.com wrote:

 Hello,
 
 I am  a new user to Scheme in general and to Chicken in particular,
 nice to meet you all.
 
 I came to scheme looking for an alternative to Perl for doing a
 personal project which involves parsing an input file, identifying
 html-like commands and converting those to Groff code.
 
 I was doing well up to a certain point but things started getting
 messy and thought perhaps there is a language out there better suited
 for this - which led me to scheme.
 
 Scheme is a totally different paradigm than I'm used to, so while I
 wait for my books to arrive I will need some hand-holding...hope
 that's ok.
 
 1) Is the Chicken Scheme manual available for purchase?  Online docs
 are great but I like to have a hardcopy so that I can read offline.
 
 2) The best way to learn is to get your hands dirty so I was looking
 at doing everything from scratch, but then I saw input-parse (
 http://wiki.call-cc.org/eggref/4/input-parse) which seems pretty much
 like what I need.  But i can't seem to find this in the Eggs.  It
 says that page does not exist yet.
 
 For the most part, a lot of what I want to do is search and replace,
 except for special cases where additional processing would be
 required to extract command:value pairs.
 
 A few examples of what I am trying to parse:
 
 1. Tags that identify structural elements of a document:
 [chapter] Chapter Title
 [heading1] Heading Title
 [list]
 ...
 [end]
 
 [quote]
 ...
 [end]
 
 2. Tags that identify formatting of text:
 <bold>text>  ;single formatting command with no value
 <indent 5>text> ; formatting command with a value
 <dropcap>Once upon a time>
 <bold, smallcap, size +2>text> ;a command group which has multiple
 formatting commands enclosed within <>
 
 A command group can be singular:
 
 <...>
 
 or have multiple commands separated by commas:
 
 <...,...,...>,
 
 the closing > signalling the end of the command group.
 
 The idea is to make typesetting with Groff very simple and intuitive
 for any user - not just programmers and hackers.  The markup we are
 working on is called Typesetting Markup Language (TML).  So it would
 convert html-like commands and generate a Groff document from it.
 
 Right now I am trying to do a prototype which generates Groff in the
 backend, but the idea is to have a general purpose markup that could
 also be used to generate LaTeX/ConTeXt, HTML, XML etc.
 
 In Perl I am able to do most of this with regular expressions, but I'm
 hitting my head against the wall when it comes to multiple formatting
 commands within a group <...,...,...>
 
 Also, to note: I am NOT a programmer or developer - I am a hobbyist
 and doing this for fun!
 
 So there is my introduction!  If any of you have any words of wisdom
 on where to begin I would love to hear from you.
 
 I literally started playing with Scheme last night while i wait for
 my book order (come on amazon...send me my books!)
 
 My idea was that I could read a line of text from a file at a time.
 My understanding is that the input would be read into an
 s-expression (which I understand to basically be a list).  Then
 could car the first item of the list and match it against my tags
 or formatting commands (which would be defined as something like
 below)
 
 (define chapter "[chapter]")
 (define list:digit "[list:digit]")
 (define list:alpha "[list:alpha]")
 (define end-list "[end]")
 (define close-command-group ">")
 (define command-group-begin "<")
 (define command-group-end ">")
 (define bold "bold")
 (define smallcap "smallcap")
 (define dropcap "dropcap")
 
 And then do something based on what token that is encountered.
 
 This is my first attempt at functional programming so I realize I may
 not be approaching this in the best way.
 
 Regards, and looking forward to playing with Scheme!
 
 yves

Hello Yves,

Welcome to Chicken,

I can give you a more in-depth answer tomorrow when I have more time. In
the meantime: input-parse is working. I do not understand what you mean
by not being able to find it in the Eggs. You install it by typing at
the command line:

chicken-install -s input-parse

(use -s if you need root privileges).

Then, to use it, include (use input-parse) at the top of your source
code.

You say you have been using regexps before but got stuck, may I point
you to: http://wiki.call-cc.org/man/4/Unit%20irregex
IMO the extended SRE Syntax is a lot saner than that of Perl. Maybe
this is of some help.
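
As a small taste of the SRE syntax, here is a sketch that picks apart a single flat command such as <indent 5>. It assumes CHICKEN 4's irregex unit as documented at the link above; command-rx and parse-command are names made up for this example, and it deliberately does not try to handle nesting.

```scheme
(use irregex)   ; CHICKEN 4; on CHICKEN 5 this would be (import (chicken irregex))

;; SRE for one command: "<", a name, optionally spaces and a value, then ">".
(define command-rx
  '(: "<"
      (submatch (+ alpha))                    ; 1: command name
      (? (+ space) (submatch (+ (~ #\>))))    ; 2: optional value
      ">"))

;; Return (name . value), where value is #f if none was given,
;; or #f if the string contains no command at all.
(define (parse-command str)
  (let ((m (irregex-search command-rx str)))
    (and m
         (cons (irregex-match-substring m 1)
               (irregex-match-substring m 2)))))

;; (parse-command "<indent 5>") => ("indent" . "5")
;; (parse-command "<bold>")     => ("bold" . #f)
```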

This is an intro written for Python programmers but you might find it
useful nonetheless.

http://wiki.call-cc.org/chicken-for-python-programmers

Good luck,
Richard


Re: [Chicken-users] Parsing Simple Markup

2014-09-20 Thread Oleg Kolosov

On 09/20/14 19:19, Yves Cloutier wrote:
 Hello,
 
 I am  a new user to Scheme in general and to Chicken in particular, nice
 to meet you all.

Welcome!

 
 Scheme is a totally different paradigm than I'm used to, so while I wait
 for my books to arrive I will need some hand-holding...hope that's ok.

I was in a similar situation a few months ago: with experience only in
classic languages, Scheme looked completely foreign. But it's
actually very simple once you get the basic concepts.

For learning I personally recommend The Scheme
Programming Language (http://www.scheme.com/tspl4/) - it contains very
nice exercises. The book is somewhat tied to Chez Scheme but many
extensions are available in Chicken as well.

Also, not often recommended but my favourite is: An Introduction to
Scheme and its Implementation
(ftp://ftp.cs.utexas.edu/pub/garbage/cs345/schintro-v14/schintro_toc.html)
- although it's unfinished there are some gems scattered around,
especially useful if you are familiar with the C language.

 1) Is the Chicken Scheme manual available for purchase?  Online docs are
 great but I like to have a hardcopy so that I can read offline.

There is http://wiki.call-cc.org/eggref/4/chicken-doc - you can install
it for offline use. It will look like http://api.call-cc.org/doc/chicken.

 For the most part, a lot of what I want to do is search and replace,
 except for special cases where additional processing would be required
 to extract command:value pairs.
 
 The idea is to make typesetting with Groff very simple and intuitive for
 any user - not just programmers and hackers.  The markup we are working
 on is called Typesetting Markup Language (TML).  So it would convert
 html-like commands and generate a Groff document from it.

See also http://en.wikipedia.org/wiki/SXML,
http://wiki.call-cc.org/man/4/Unit%20irregex and
http://wiki.call-cc.org/eggref/4/fmt for ideas.

 In Perl I am able to do most of this with regular expressions, but I'm
 hitting my head against the wall when it comes to multiple formatting
 commands within a group <...,...,...>
 My idea was that I could read a line of text from a file at a time.  My
 understanding is that the input would be read into an s-expression
 
 And then do something based on what token that is encountered.

You can try to first convert this to simple s-expressions like:

(bold text)
(indent 5 text)
(bold (smallcap (size 2 text)))

and then use http://wiki.call-cc.org/eggref/4/matchable egg to generate
output. See
http://ceaude.twoticketsplease.de/articles/an-introduction-to-lispy-pattern-matching.html
for an introduction.
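
A minimal sketch of what that could look like, assuming the matchable egg on CHICKEN 4. The render procedure and the groff output chosen for each command are illustrative assumptions, not a finished TML backend.

```scheme
(use matchable)   ; CHICKEN 4 egg; on CHICKEN 5 this would be (import matchable)

;; Recursively turn an s-expression like (bold (size 2 "text"))
;; into a string of groff markup.
(define (render expr)
  (match expr
    ((? string? s) s)                                   ; plain text
    (('bold body)
     (string-append "\\fB" (render body) "\\fP"))
    (('size n body)
     (string-append "\\s+" (number->string n) (render body) "\\s0"))
    (('indent n body)
     (string-append ".in +" (number->string n) "\n"
                    (render body) "\n"
                    ".in -" (number->string n) "\n"))
    (_ (error "unknown form" expr))))

;; (render '(bold (size 2 "text")))  => "\\fB\\s+2text\\s0\\fP"
```

Each clause handles one command and calls render on its body, so arbitrarily nested commands fall out of the recursion.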

I've written a very simple recursive s-exp parser using matchable some
time ago. I will clean it up and post the link here in a few days for
reference.

-- 
Regards, Oleg

