Re: Pygments-based syntax highlighting preprocessor

2024-07-31 Thread Robin Haberkorn
Hello Bento!

On Wed, 31 Jul 2024 15:19:53 -0300
Bento Borges Schirmer  wrote:

> I peeked around your repository. I noticed the hyperlinks to CSNOBOL4
> are broken. what about SNOBOL4? is it cool? I understand it acts as a
> filter?

I updated the links. It seems to live here nowadays:
http://www.regressive.org/snobol4/csnobol4/

SNOBOL4 is a scripting language. A very old one. But it's still
interesting and worthwhile to learn because of its unique built-in
backtracking-based pattern matching language. Its descendant - Icon - is
also worth checking out.
On the downside, SNOBOL4 will torture you with its exclusive reliance on
Gotos for control flow. A rarity nowadays, but this language is
entirely "unstructured"!

> and is this BNF built into SNOBOL4?

No. I built an EBNF parser in SNOBOL4 that spits out GNU Pic code. That
was back when I wrote my bachelor's thesis.
SNOBOL4 is good at prototyping small DSLs and compilers.

> is it worth learning? or do you think other
> languages/libraries/DSLs better solve the problems it was designed
> for?

If you have a fetish for old and obscure languages like I do, it's
certainly worthwhile to learn. ;-)

> how do you even learn it, like is there some
> tutorial, manual or exercises about it?

There is lots of material on SNOBOL4 on the aforementioned website:
http://www.regressive.org/snobol4/
The most important book is probably the "Green Book":
http://www.math.bas.bg/bantchev/place/snobol/gpp-2ed.pdf

> is it standardized, or are
> there dozens of mutually exclusive dialects?

Pretty much standardized. There are only two dialects left that you
could possibly run on a modern PC: CSNOBOL4 and SPITBOL.

> is it compiled or interpreted?

CSNOBOL4 is interpreted. But SPITBOL is actually a compiler - the first
and oldest compiler for a loosely typed high-level scripting language
AFAIK. (Mind you, this was before JIT-compilation...)

> can I call it inside a C program as a subroutine?

Not that I know of.

> can it call C functions?

http://www.regressive.org/snobol4/csnobol4/curr/doc/snobol4ffi.3.html

> 
> Also, please do share tricks, difficulties you faced, methodologies
> and habits when writing your thesis in ms groff! I'm to follow similar
> path some time.

Mhh, better ask something more concrete.
Use `pdfmom --roff` even with `-ms` (`-mspdf`) and gropdf (pdfmom does
that by default).
Other than that, be prepared to write quite a few custom macros.
Be sure to check out Pic and Grap - they are so much fun to play around
with. Both my bachelor's and master's theses contained only code-generated
graphics (not counting screenshots).

Best regards,
Robin



Pygments-based syntax highlighting preprocessor

2024-07-31 Thread Robin Haberkorn
Dear groffers,

as one of the things coming out of my master's thesis (written completely
in ms with Groff), here is a small preprocessor for syntax highlighting
code blocks based on Pygments:

https://github.com/rhaberkorn/groff-tools#highlight-python

btw. I have also had Scintilla/Lexilla syntax highlighting support for
Troff lying around for years, which could improve dozens of text editors.
But I don't know when I will find the time and motivation to polish it
up.
Once you dive into Troff as a language, you realize that it's actually
unparseable and therefore impossible to properly syntax highlight. Even
after restricting the language in sane ways that will work in 99.9% of
the cases, it's still hard to highlight *correctly*. Basically you will
have to reimplement a large part of the Groff parser to do it right, as
you need intimate knowledge of the syntax of certain requests. And then,
there ideally needs to be support for the other Troff variants out
there as well. That's why I never contributed it to the Lexilla project.

Best regards,
Robin



Re: Re: PDF outline not capturing Cyrillic text

2024-02-06 Thread Robin Haberkorn
On Tue, Feb 06, 2024 at 01:39:51PM +, Deri wrote:
> Hi Robin,
>
> The current gropdf (in the master branch) does support UTF-16BE for pdf
> outlines (see attached pdf), but Branden has not released the other parts to
> make it work! If you can compile and install the current git, then applying
> the attached patch should give you what you want.
>
> To apply the patch, cd into the git groff directory and
> "patch -p1 < path-to-patch-file", and then run make and install as usual.
> 
> I would be very interested in how you get on, and whether it gives you what 
> you need. Note that I am assuming you are feeding groff a file in UTF-8 and 
> the -k flag. I can see some hyphenation happening, but I don't know if it is 
> correct.
> 
> Cheers 
> 
> Deri

Hello Deri!

This patch works. All the outline titles are correct and .pdfinfo /Title,
/Author etc. also work with Cyrillic.
That's very cool.
But it only works when using UTF-8 as the input encoding (-Kutf-8).
As reported earlier in the corresponding Savannah ticket, even hyphenation
works with UTF-8 input and I see no difference in the hyphenation result
compared to KOI-8 input. I have no idea how you did this.
Still, when using UTF-8 input, there are problems (missing letters) with
link texts autogenerated by .pdfhref L.
With KOI-8 input, all the outlines are incomprehensible, i.e. they consist of
крокозябры (mojibake), as it would be called in Russian. ;-)
Apparently gropdf does not know that it has to convert from KOI-8 instead of
UTF-8.

So I am still going to disable the outlines for the time being and go with
KOI-8.
They are more of a nice-to-have than a necessity anyway.
I need Russian support as I am writing my master's thesis in Russian.
At the end of the day, this will be printed, so I can live without
PDF outlines.

Best regards,
Robin

PS: And to comment on some of the heated discussions on this list:
It's great that you and Branden spend so much time on improving Groff.
I think you do a great job. Regressions are sometimes unavoidable,
especially when taking over a large code base from somebody else.



[bug #65232] Russian hyphenation is not working

2024-02-05 Thread Robin Haberkorn
Follow-up Comment #5, bug#65232 (group groff):

[comment #4 comment #4:]
>
> [comment #3 comment #3:]
> > After switching from pdfroff (-Tps) to pdfmom (-Tpdf), hyphenation
> > suddenly works fine.
> 
> Glad to hear it.
>  
I forgot to mention, I also had to install a new version of the
LiberationSerif fonts, as the previous ones I was using apparently weren't
fully compatible with gropdf. There were for instance some space characters
that were not displayed correctly.

> > Moreover, it will even work with UTF8 input (-Kutf-8), even though that
> > causes other glitches.
> 
> What glitches are you seeing?
> 
With -Kutf-8, link texts generated by .pdfhref were sometimes missing
seemingly random characters.

> The input is converted from UTF-8 to KOI8-R.  The hyphenation patterns are
> defined in terms of KOI8-R code points.  The formatter (GNU _troff_) decides
> where the hyphens should go and performs the breaks.  The formatter converts
> the input characters into internal data structures called "nodes" that do not
> use an externally visible encoding.  Then, when generating device-independent
> output, each glyph node is converted to a device-independent special
> character command _if_ the output device supports its code point.  (If it
> doesn't, you get a warning like "special character 'u0413' not defined".)
> 
Are you telling me that pdfmom is actually internally converting my text to
KOI8-R after noticing I did -mru?
This is obviously not the case, as I tried to print some Cyrillic using .tm and
it comes out as Unicode escapes, as would be expected after the sources are run
through preconv.



___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #65232] Russian hyphenation is not working

2024-02-03 Thread Robin Haberkorn
Follow-up Comment #3, bug#65232 (group groff):

After switching from pdfroff (-Tps) to pdfmom (-Tpdf), hyphenation suddenly
works fine.

Moreover, it will even work with UTF8 input (-Kutf-8), even though that causes
other glitches. I have no idea why it can hyphenate Unicode escapes.

`pdfmom --roff -mspdf` generally works much better than pdfroff, including TOC
recollation which can finally be done without manually psselect-ing thanks to
.pdfswitchtopage.

pdfroff should perhaps be marked as deprecated or pdfmom should outright
replace it.

From my perspective, you can close this ticket.






Re: PDF outline not capturing Cyrillic text

2024-02-03 Thread Robin Haberkorn
Regarding Cyrillic characters in PDF outlines, I think I got a few
insights today.

It turns out that the pdfmarks in the PostScript code are "text strings"
according to the PDF specs, that is, either PDFDocEncoding or
UTF-16BE with a leading byte-order mark (cf. PDF Reference 1.7).
PDFDocEncoding is basically Latin-1, it seems.
This explains why the current code in MOM works with western European
languages.
Now, in order to include Cyrillic, you will have to reencode whatever
encoding Groff uses and passes to the postprocessor - which will
subsequently end up in the PostScript code - to UTF-16BE.
Everything needs to be hex-encoded and enclosed in angle
brackets (<...>).

In the hackiest case, this could be done by a script on the
PostScript code generated by `pdfroff --emit-ps`. As a proof of concept,
here's an incomplete but somewhat working version in SciTECO:

sciteco -e "16,0ED @EB/document.ps/ <@S|/Title (|; -D @I|/ D> 
@EW//"

This assumes that the Groff encoding is KOI8-R, which I chose as an
intermediate format in order to enable Russian hyphenation
(but that does not work unfortunately).
It should be rewritten as a Python or Perl script using some
iconv wrapper, or ideally pdfroff itself could do it.
The script could even interpret Groff Unicode escapes generated by preconv
and convert them back to plain Unicode before writing out everything in UTF-16.

I will probably just use such a hack for my purposes.
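
The re-encoding step described above could look roughly like this in
Python. It is only a sketch and not what pdfroff or gropdf actually do:
the function name `pdf_text_string` is made up for illustration, and it
assumes the title arrives as KOI8-R bytes.

```python
import codecs


def pdf_text_string(raw: bytes, source_encoding: str = "koi8_r") -> str:
    """Return a PDF hex string (<FEFF...>) for bytes in source_encoding."""
    text = raw.decode(source_encoding)
    # Unicode PDF text strings are UTF-16BE, prefixed with the BOM FE FF.
    payload = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return "<" + payload.hex().upper() + ">"


if __name__ == "__main__":
    # Hypothetical outline title, given in the intermediate KOI8-R encoding.
    print(pdf_text_string("Введение".encode("koi8_r")))
```

Locating the `/Title (...)` strings in the PostScript file and swapping in
the result is left out here, since that depends on how the pdfmarks are
emitted.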

What's the status of pdfroff anyway? I read that it is more or less
deprecated and we should all use `groff -Tpdf` instead.
In fact, pdfmom should work with ms as well; it uses
gropdf and should perform the necessary multipass processing
for pdfhref forward-references to work.
Will try this next!

Best regards,
Robin



[bug #65232] Russian hyphenation is not working

2024-01-31 Thread Robin Haberkorn
Follow-up Comment #2, bug#65232 (group groff):

Hello Branden!

I am not quite sure what additional info you need. I attached a test case. You
can reproduce it. No matter what font size or hyphenation mode, I cannot get
it to hyphenate.

Hyphenation *does* work when formatting for -Tutf8. The same is true for the
Махновщина-text from the mailing list post. Furthermore, I do not
understand why the Махновщина-text given in UTF8 can be hyphenated
correctly at all. I thought that hyphenation will only work in KOI8-R.






[bug #65232] Russian hyphenation is not working

2024-01-30 Thread Robin Haberkorn
URL:
  <https://savannah.gnu.org/bugs/?65232>

         Summary: Russian hyphenation is not working
           Group: GNU roff
       Submitter: rhaberkorn
       Submitted: Wed 31 Jan 2024 02:48:42
        Category: Macro - others/general
        Severity: 3 - Normal
      Item Group: Incorrect behaviour
          Status: None
         Privacy: Public
     Assigned to: None
     Open/Closed: Open
 Discussion Lock: Any
 Planned Release: None


___

Follow-up Comments:


---
Date: Wed 31 Jan 2024 02:48:42   By: Robin Haberkorn
I cannot get Russian hyphenation to work on a HEAD build of Groff. As far as I
understand, it should be enough to pass -mru. It should even enable hyphenation
mode 8 by default.

I also tried setting HY and .hy manually, without any success.

My source file is UTF-8 and is converted to KOI8 using iconv, but I also
included the preconverted KOI8 file in case you don't have a working iconv.
btw. that's a very useful hack, as it preserves miscellaneous code points as
Unicode character escapes.

You have to install LiberationSerif, for instance using install-fonts.sh.

The command line to build the example used is:


iconv -f UTF-8 -t KOI8-R --unicode-subst='\[u%04X]' hyphen-utf8.ms |
    groff -Tpdf -ms -mru >hyphen-koi8.pdf
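
For readers without an iconv that supports --unicode-subst, the same
substitution can be sketched in Python with a custom codec error handler.
This is an illustration under assumptions, not part of the test case: the
names `groff_escape` and `to_koi8r` are invented here, while
`codecs.register_error` is the standard library hook for such handlers.

```python
import codecs


def groff_escape(error):
    """Replace characters the target codec cannot encode with \\[uXXXX]."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    bad = error.object[error.start:error.end]
    escapes = "".join("\\[u%04X]" % ord(ch) for ch in bad)
    # Resume encoding right after the unencodable run.
    return escapes, error.end


codecs.register_error("groff_escape", groff_escape)


def to_koi8r(text: str) -> bytes:
    """Encode text as KOI8-R, keeping everything else as groff escapes."""
    return text.encode("koi8_r", errors="groff_escape")


if __name__ == "__main__":
    # U+0301 (combining acute) has no KOI8-R code point and gets escaped.
    print(to_koi8r("саморазруше\u0301ние"))
```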








___
File Attachments:


---
Name: hyphen.tar.gz  Size: 111 KiB
<http://savannah.gnu.org/bugs/download.php?file_id=55647>






ms/mm page margins too large

2023-06-23 Thread Robin Haberkorn
Hello!

I was trying to set up an ms document with small page margins (see attachment),
but I can't seem to get below a certain limit (especially at the bottom).
In one-column mode, the bottom margin is a bit smaller, but still larger than
1cm. I also cannot explain the spacing introduced by the .2C macro (between
BEFORE and AFTER). Build with

pdfroff -ms margins.ms >margins-ms.pdf

Does anybody understand what's going on?

I've tried something similar with mm (margins.mm), trying to set 1cm margins on
all sides. This doesn't work either when built with:
pdfroff -mm -P-pa4 -rW=19c -rO=1c -rL=27.7c margins.mm >margins-mm.pdf

Here, though, we can at least set the page length to something oversized like
31c to get close to what we want. This is also strange, since A4 has a height
of 29.7cm.

For my particular task, I will probably just switch from ms to mm. But it would
be nice to understand why they behave this way.

Yours sincerely,
Robin

margins.mm:

\# pdfroff -mm -P-pa4 -rW=19c -rO=1c -rL=27.7c margins.mm >margins.pdf
.PGNH
.SP 1c
BEFORE
.2C
AFTER
.
.nr i 0 1
.while \n+i<=100 \{\
.  br
.  nop \ni
.\}

margins.ms:

\# pdfroff -ms margins.ms >margins.pdf
.nr PO 1c
.nr LL 21c-\n(PO-1c
.nr HM 1c
.\" FIXME: The actual bottom border is much larger.
.nr FM 1c
.nr PI 0
.\" Disable headers and footers
.ds CH
.ds CF
.
BEFORE
.2C
AFTER
.
.nr i 0 1
.while \n+i<=100 \{\
.  br
.  nop \ni
.\}

margins-ms.pdf
Description: Adobe PDF document


margins-mm.pdf
Description: Adobe PDF document


Re: PDF outline not capturing Cyrillic text

2023-06-23 Thread Robin Haberkorn
Hello Peter,

I am also now stumbling across Cyrillic-related issues with pdfmark. I am using
ms for the time being. The bug also affects autogenerating link texts given via
`.pdfhref L`.
In the most simple case, preconv will turn your Cyrillic characters into escapes
which are apparently not further interpreted by pdfmark (or anything that 
follows).
I see text like "[u0421][u043F]..." in my outline.

I believe that this is why you have .pdfmomclean in MOM. Do I understand
correctly that this is supposed to turn the escapes back into Latin-1?
This is presumably mainly the work of .asciify, which would be misnamed anyway.
It does not work with Cyrillic at all, which isn't surprising.
That's also why you don't get "mojibake garbage" in the outline. None of the
Cyrillic characters end up in intermediate output.
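
To illustrate what "interpreting" such escapes would mean, here is a small
Python sketch that turns `\[uXXXX]` escapes of the kind preconv emits back
into characters. It is not how grops or gropdf work; the function name
`decode_groff_escapes` is invented, and only the plain and underscore-joined
composite forms are handled.

```python
import re

# Matches \[u0421] as well as composites such as \[u0435_0301].
_ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6}(?:_[0-9A-Fa-f]{4,6})*)\]")


def decode_groff_escapes(text: str) -> str:
    """Replace groff \\[uXXXX] escapes with the characters they name."""
    def expand(match):
        code_points = match.group(1).split("_")
        return "".join(chr(int(cp, 16)) for cp in code_points)

    return _ESCAPE.sub(expand, text)


if __name__ == "__main__":
    print(decode_groff_escapes(r"\[u0421]\[u043F] and plain text"))
```

Anything that is not such an escape passes through unchanged, so the
function can be run over a whole outline title.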

It also explains why I previously had no problems with German Unicode characters
(that was using MOM) - they can be converted back into Latin-1.

Manually editing the ps:exec lines in the intermediate output and inserting
Unicode characters there, does not produce the desired results, which is also
not surprising.

So it seems that the main problem really lies in grops and/or gropdf which
should ideally work with the Unicode escapes produced by preconv.
I am not sure if we would still need .pdfmomclean. But whatever useful stuff it
currently does, it should probably be in pdfmark.tmac (and/or pdf.tmac?) 
instead.

Best regards,
Robin



Re: Questions concerning hyphenation patterns for non-Latin languages, e.g. Russian

2023-04-26 Thread Robin Haberkorn

25.04.23 19:51, G. Branden Robinson wrote:

 While I'm pontificating I'll opine that I'm not a huge fan of C++ as
 a language, but I have found with groff that, given discipline, and
 by maintaining a clear view of its roots in C (_also_ not my
 favorite language--but one alienating, enemy-making rant at a time),
 and not picking up every f***ing new feature that gets shoved into
 the language as soon as (or before) it's standardized, it _can_ be
 managed.  But I also think that the C++ templating facility was, in
 implementation, one of the worst features ever developed for any
 programming language.


I largely agree with that. The only acceptable C++ is the one close to C,
especially if you do indeed interface with C APIs. But even then it remains
broken by design, with its classes in headers forcing you to expose every type
belonging to your class to everybody. What's the benefit of C++, especially
when refraining from namespaces? Deeply nested class hierarchies? You really
shouldn't have those anyway. IMHO you can get much clearer and better isolated
code (smaller headers anyway) with properly written, idiomatic plain C code.
It's the lesser of two evils. The preprocessor is one of those things I am also
not happy with, although I found that C++ often pushes you into metaprogramming
for only marginally improved type safety compared to plain C solutions that
avoid the preprocessor. As a side effect you get overblown binaries that will
blow your cache hierarchies. On the other hand, the C preprocessor could be
made much more useful for metaprogramming with a few simple extensions...
Not long ago I migrated SciTECO from "C-like" C++ to plain C, and I am not
looking back!




Re: neatroff for Russian. (Was: Questions concerning hyphenation patterns for non-Latin languages, e.g. Russian)

2023-04-26 Thread Robin Haberkorn

Hello!

I can confirm that Neatroff (and Heirloom Troff) work well for typesetting
Russian texts, including hyphenation.
BUT, I found them unsuitable for complex scientific texts, as their ms macros
are buggy and tbl is somewhat limited. Regarding Neatroff, I found that its
hyperlinking capabilities are extremely limited.


For future texts I therefore wanted to return to Groff (where we also have the 
excellent MOM macros). Not being able to hyphenate UTF-8 Cyrillic text is a 
major limitation for me. I might get away with converting it to KOI8 first, but 
could I still mix in Unicode characters this way (as they are considered special 
characters by Groff)?


Perhaps I will have a look at the hyphenation code and try to fix it. Hacking 
the typesetter is always a perfect distraction from the work you are supposed to 
do instead. ;-)


Yours sincerely,
Robin

26.04.23 14:10, Ralph Corderoy wrote:

Hi Oliver,

Are you aware there are other troff implementations than GNU's groff?
Neatroff is one.  Ali Gholami Rudi wrote it because he wanted better
Unicode support for foreign languages, including right-to-left text.
He seems very much of your mould in needs.

A good summary of its features is http://litcave.rudi.ir/neatroff.pdf
I see UTF-8 hyphenation files mentioned.
There's also whole-paragraph formatting and lots of other delights.
Rudi's http://litcave.rudi.ir has a Typesetting section past the initial
list of recent changes to his software.

Feel free to continue discussing neatroff here along with general troff
questions.





Re: [groff] Accented Cyrillic characters

2018-08-02 Thread Robin Haberkorn
Hello Ralph!

I see! Groff seems to combine composites into single code points if possible,
probably in order to better support terminals and/or software that cannot
combine them themselves. Makes sense.
But for the remaining glyphs, it should IMHO a) make sure that accent glyphs
have zero width and b) not drop them from composite Unicode escapes. Why is
there even something like composite support, where you can even specify Unicode
code points, if they are always reduced to a single code point in the end?
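
The difference between the Latin and the Cyrillic case can be reproduced
outside of Groff with Unicode's own composition rules. The following Python
sketch only demonstrates NFC normalization from the standard `unicodedata`
module; it is an analogy for what the devutf8 font file lists, not a
description of Groff's implementation, and the helper name `compose` is
made up.

```python
import unicodedata


def compose(base: str, *marks: str) -> str:
    """Return the NFC form of base followed by the given combining marks."""
    return unicodedata.normalize("NFC", base + "".join(marks))


ACUTE = "\u0301"  # combining acute accent

if __name__ == "__main__":
    latin = compose("e", ACUTE)          # Latin e + acute
    cyrillic = compose("\u0435", ACUTE)  # Cyrillic е + acute
    for label, text in (("latin", latin), ("cyrillic", cyrillic)):
        print(label, [hex(ord(ch)) for ch in text])
```

The Latin pair collapses to the single code point U+00E9, while the Cyrillic
pair stays at two code points because Unicode defines no precomposed
character for it.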

I tried adding a line like
u0301 0 0 0xCC81
to the R font for devutf8.
But it doesn't work. How does grotty interpret the code? They are obviously not
simply UTF-8 bytes.
(Sorry, I'm not that motivated to seriously debug this in the Groff sources.
Just hoped that somebody would already know what's going on here.)

Best regards,
Robin

02.08.2018 17:26, Ralph Corderoy wrote:
> Hello Robin!
>
>> Currently, I'm just adding a standalone UTF composite accent character
>> (U+0301) after every vowel I want to show stress on since Unicode does
>> not seem to define separate codepoints for all of the Cyrillic
>> accented vowels.
>
> That's the recommendation in
> https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode
>
>> the terminal emulator (at least URXVT) will combine the accent and the
>> vowel into a single glyph.
>
> xterm(1) does too.  libvte-based terminals seem to place it on the line
> above!?
>
>> This approach of adding accents causes problems with tbl, though. The
>> combination of the two characters into a single glyph screws up tbl's
>> (and/or Groff's) assumptions. For instance, in a table like:
>> | саморазруше́ние |
>> | foo bar |
>> the bars won't properly line up.
>
> It boils down to persuading `\w', used by tbl(1), that the U+0301 takes
> no space.
>
> $ groff -Tutf8 >/dev/null
> .nr w \w'A'
> .tm \nw
> 24
> .nr w \w'\[u0435]'
> .tm \nw
> 24
> .nr w \w'\[u0435]\[u0301]'
> .tm \nw
> 48
> $
>
> Tricks like overstrike with `\o' and moving left with \h affect the \w
> but don't give the desired output because grotty(1) also processes them.
>
>> For instance, \[u0435_0301] should theoretically also format as an
>> accented Cyrillic e.  But what happens instead is that the accent is
>> dropped during formatting.  Curiously, this works when using Latin
>> characters. For instance, \[e u0301], \[e aa], \[e '] will result in a
>> properly accented Latin e.
>
> I think those are mapped onto their Unicode rune, and as you start by
> saying, there isn't one for U+0435 combined with U+0301.
>
> $ cd /usr/share/groff/1.22.3/font/devutf8
> $ grep 0435 R
> u0435_0300  24  0   0x0450
> u0435_0308  24  0   0x0451
> u0435_0306  24  0   0x04D7
> $ grep '0045.*0301' R
> u0045_0301  24  0   0x00C9
> u0045_0304_0301 24  0   0x1E16
> u0045_0302_0301 24  0   0x1EBE
> $
>
> I look forward to solutions and workarounds from the others here.  :-)
>



[groff] Accented Cyrillic characters

2018-08-01 Thread Robin Haberkorn
Hello!

I'm working on a small Russian offline dictionary that formats the entries of
words into Troff/Man pages, so you can view them in the terminal.

There is a small problem when trying to format accented Cyrillic characters.
Accents are commonly used in Russian to highlight word stress by placing them on
the stressed syllable's first vowel.
Currently, I'm just adding a standalone Unicode combining accent character
(U+0301) after every vowel I want to show stress on, since Unicode does not seem
to define separate code points for all of the accented Cyrillic vowels.
AFAIK, the accent is not really interpreted by Groff; to Groff, it looks like
a standalone glyph. But the terminal emulator (at least URXVT) will combine the
accent and the vowel into a single glyph.
For instance саморазруше\[u0301]ние will effectively render as саморазруше́ние.

This approach of adding accents causes problems with tbl, though. The
combination of the two characters into a single glyph screws up tbl's (and/or
Groff's) assumptions. For instance, in a table like:
| саморазруше́ние |
| foo bar |
the bars won't properly line up.
It will probably cause other more subtle formatting issues as well, but that's
where I personally caught it.
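
The misalignment can be illustrated without Groff: any width computation
that counts U+0301 as a full cell comes out one column too wide. Below is a
minimal Python sketch of a width function that treats combining marks as
zero-width. The names `display_width` and `pad_cell` are invented for this
example, and wide East Asian characters are deliberately ignored.

```python
import unicodedata


def display_width(text: str) -> int:
    """Count terminal cells, treating combining marks as zero-width."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def pad_cell(text: str, width: int) -> str:
    """Render one boxed table cell padded to the given display width."""
    return "| " + text + " " * (width - display_width(text)) + " |"


if __name__ == "__main__":
    word = "саморазруше\u0301ние"
    for cell in (word, "foo bar"):
        print(pad_cell(cell, 16))
```

With this width function both rows end at the same column, which is the
behaviour one would want from tbl.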

I tried to use the Groff Unicode composite syntax, so it becomes clear to Groff
that the accented character is a single glyph. For instance,
\[u0435_0301] should theoretically also format as an accented Cyrillic e.
But what happens instead is that the accent is dropped during formatting.
Curiously, this works when using Latin characters. For instance, \[e u0301],
\[e aa], \[e '] will result in a properly accented Latin e.

Why is that so? Did I catch a grotty bug here?
Do you know any workaround I could employ?

Best regards,
Robin



[Groff] gpresent (presentation macros) forked

2017-02-22 Thread Robin Haberkorn
Hi,

just wanted to inform you that I forked the gpresent macros by Bob Diertens:

https://github.com/rhaberkorn/gpresent

I did this out of sheer necessity, as I needed to prepare a few
presentations. This fork contains a hard-to-find patch, so it runs with
recent versions of Groff and adds the TITLEFORMAT and SUBTITLEFORMAT
macros. I might add other features if I need to prepare more slides.
gropdf support would be nice I guess...

Best regards,
Robin

PS: Might be of interest for the curious: SciTECO recently introduced
its own Groff postprocessor/driver (grosciteco) which can render into a
SciTECO buffer. This is used for SciTECO's integrated help system which
actually uses Groff as its markup language.





[Groff] [mom][patch] multipage boxed tables

2013-09-13 Thread Robin Haberkorn
Dear groffers,

I ported the ms package's tbl macros, specifically the multipage boxed
table support, to Mom.
It appears to work fine so far, but I'm neither a groff nor a -ms guru yet.
There are possibly some bugs that could be easily avoided by someone
who is more familiar with the ms macros.
That's why I'm writing you this - instead of mailing it to Peter directly.

For instance, why does the tbl@print-header macro change the
environment before printing the diversion? Shouldn't everything
already be formatted in an environment-independent way when the
diversion is created? Or do I have to expect transparent throughput
(\! and the like) generated by tbl? What exactly is in a diversion -
after all it's not a macro?
Also have a look at the other comments I added.

See attached patch. It's diffed against CVS HEAD.

Best regards,
Robin


multipage_boxed_tbl.patch
Description: Binary data


[Groff] [mom][patch] FLOAT bug fixes

2013-09-12 Thread Robin Haberkorn
Dear groffers,

I have found and fixed two severe mom bugs in the FLOAT macro. See
attachment - it's a patch against CVS HEAD.

1) if a FLOAT FORCE block fits on the current page, the register
#FORCE is not removed/reset. If it does not fit on the page, a NEWPAGE
is emitted and #FORCE is removed.
This results in the FLOAT immediately following a forced FLOAT
(fitting on its page) being treated as a forced FLOAT as well, even if
it isn't a forced FLOAT. I think Peter will understand...

2) forced FLOATs that do NOT fit on the current page are formatted
strangely. At least if they contain `pic`-processed code, the picture
is formatted at the very top of the new page with everything else (like
captions from the same FLOAT and running text) being laid over the
picture.
Apparently the diversion (FLOAT*DIV) cannot be reproduced properly
after breaking to the new page. I could not find the root cause of
this, but perhaps it has something to do with \n[dn] being reset after
the NEWPAGE call. Restoring \n[dn] after NEWPAGE doesn't help though.
I worked around this. Instead of emitting the FLOAT*DIV in the FLOAT
macro call for forced FLOATs that do not fit on their page, I let them
be added to the deferred float list (FLOAT*DIV:\n[defer]), just like
ordinary deferred FLOATs, and then issue a NEWPAGE.
The forced FLOAT will then be printed properly by the HEADER trap.

I'm in a hurry, so no test case this time. If you do need one, just say so.

Best regards,
Robin


float_fixes.patch
Description: Binary data


[Groff] [mom][bug][patch] NUMBER_LINES and tbl

2013-08-27 Thread Robin Haberkorn
Hi everyone,

found and fixed another bug in the mom macros.
Her handling of NUMBER_LINES if tbl is used was buggy.
If you used NUMBER_LINES to turn on line numbering, turned it off and
then used tbl tables, the table was numbered. Naturally if you resumed
line numbering after the table, line numbers were wrong.

My system:
Linux 3.2.0-52-generic #78-Ubuntu SMP Fri Jul 26 16:21:44 UTC 2013
x86_64 x86_64 x86_64 GNU/Linux

GNU grops (groff) version 1.22.2
GNU troff (groff) version 1.22.2

I was using mom v2.0-a1 as well as the CVS HEAD of om.tmac.

The reason was that tbl apparently evaluates \n[ln] to determine
whether line numbering is active or not - but I do not fully
understand the bulk of troff requests tbl spits out.
Nevertheless, mom's NUMBER_LINES OFF stopped line numbering but did
not reset register ln (tbl expects 0).
`ln' however is important for restarting line numbering with
NUMBER_LINES RESUME via .nm +0.

The attached tarball contains a test case (linenumber_tbl_bug.mom),
what it looked like (linenumber_tbl_bug.pdf), my patch against CVS HEAD
(linenumber_tbl_bug.patch) and how it looks fixed
(linenumber_tbl_bug_fixed.pdf).

cheers,
Robin

btw. I have a whole list of mom bugs, so expect patches and more bug
reports soon.


linenumber_tbl_bug.tar.gz
Description: GNU Zip compressed data


Re: [Groff] refer, mom and inline references

2013-08-13 Thread Robin Haberkorn
Hello Peter,

2013/8/11 Peter Schaffter pe...@schaffter.ca:
 ...
 At the top of your file,

   .R1
 label (A.n|Q) ', ' D.y
 bracket-label  ( )\c 
 join-authors , and  ,  , and 
 reverse A1
 sort A1Q1T1B1E1
 database path to database
   .R2

 With this setup, references entered without a preceding .REF are
 inserted into running text.  For example,

   end of sentence
   .[
   keywords
   .]
   \. A new sentence.

Works like a charm, thanks!

 will produce end of sentence (author's last name, date). A new
 sentence.

 To add a page or page-range number

   end of sentence
   .[
   [ keywords
   .], p. 168)\c
   \. A new sentence.

 will produce end of sentence (author's last name, date, p. 168). A
 new sentence.

Now I see why there are bracket flags :-)


 is all that's required.  There's no need for an additional refer
 block.

 Hope this helps.

Yes, that works very well.
I wonder why I thought I'd need custom .]- and .][ macros since refer
inserts the labels just fine on its own.
Turns out, the \c at the EOL before my citation blocks somehow
suppressed the entire label automatically inserted by refer.
Strange groff world...

Best regards,
Robin



[Groff] refer, mom and inline references

2013-08-09 Thread Robin Haberkorn
Hello,

I know a lot has been written on this mailing list about refer and its
mom integration.
However, I still don't quite get it.
Currently I'm writing my bachelor's thesis in groff -mom. It was
relatively easy to imitate the LaTeX template they were providing,
including Computer Modern fonts and so on, but I'm still struggling
with refer.
I would like to have inline references in an abbreviated form (like
[AUTHORYEAR]) and a bibliography with all references automatically
collected.

As I understand mom's refer support and as I was able to verify it,
mom provides the following possibilities:
 * footnote references
 * endnote references
 * inline references
 * bibliographies

When I use endnote references, mom will merely insert a small
superscript number into the running text.
When I use inline references my references are put inline with no
apparent way of customizing the inline citation style.
When I use a .BIBLIOGRAPHY and refer's accumulate option, my
references are neatly accumulated into the bibliography but no longer
occur inline, neither with .REF, nor with .REF[. I guess this has
something to do with refer no longer emitting the start/end reference
macros and string definitions for mom to format.

Any pointers on how to achieve what I want?

Would I be able to at least get automatically formatted inline
references in the [AUTHORYEAR] style using refer's label feature with
a *manually* populated bibliography?
I guess I would have to write custom .]- and .][ macros and bypass
mom's .REF macros altogether.

Best regards,
Robin



[Groff] quick and dirty preprocessors: htbl.tes

2013-08-09 Thread Robin Haberkorn
Hi,

one thing I like about groff/troff is the ability to quickly write
preprocessors extending troff.
I wrote one as part of SciTECO -- in the SciTECO language:
https://github.com/rhaberkorn/sciteco/blob/master/doc/htbl.tes
(Note: contains control codes not properly displayed by Github)
It's a drop-in replacement for tbl, that emits HTML tables for
processing with grohtml.
The code is horrible (it was hacked interactively and then turned into
a batch script), and it only supports a subset of tbl, that is exactly
what I used in the SciTECO manpage tables.
It can be seen in action here:
http://rhaberkorn.github.io/sciteco/manuals/sciteco.7.html

I've also written a small preprocessor in csnobol4 that pipes
.HIGHLIGHT blocks in your groff -mom code through GNU
source-highlight, effectively giving you syntax highlighting for
inline code blocks, and a UML preprocessor that pipes .UML blocks into
PlantUML and inserts .PDF_IMAGE references in their place.
Is anyone interested in these?
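
The general shape of such a block-rewriting preprocessor can be sketched in
a few lines of Python. This is not the csnobol4 code: the end marker
`.HIGHLIGHT_END`, the function name `preprocess` and the `transform`
callback are all assumptions made for illustration, and the call to the
external highlighter is left to the callback.

```python
import sys


def preprocess(lines, transform, start=".HIGHLIGHT", end=".HIGHLIGHT_END"):
    """Yield lines unchanged, except for blocks between start and end,
    which are replaced by transform(arguments, block_lines)."""
    block = None
    args = ""
    for line in lines:
        stripped = line.rstrip("\n")
        if block is None:
            if stripped.split(" ", 1)[0] == start:
                block = []
                args = stripped[len(start):].strip()
            else:
                yield line
        elif stripped == end:
            yield from transform(args, block)
            block = None
        else:
            block.append(line)
    if block is not None:
        raise ValueError("unterminated %s block" % start)


def passthrough(args, block):
    """Placeholder transform: a real one would call the highlighter."""
    return ['.\\" highlighted as %s\n' % args] + block


if __name__ == "__main__":
    sys.stdout.writelines(preprocess(sys.stdin, passthrough))
```

Used as a filter, it would sit in front of groff in a pipeline just like
tbl or pic.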

Best regards,
Robin



Re: [Groff] refer, mom and inline references

2013-08-09 Thread Robin Haberkorn
2013/8/9 jjbrioist jean.brio...@numericable.fr:
 On Friday, 9 August 2013 at 17:35 +0200, Robin Haberkorn wrote:
 Hello Robin,

 I was facing the same issue as you, although I am using -ms for the time.
 I use the -S option to get inline references.

 Assume you have the following entry in your
 xxx.bib file :

 %A John Robert
 %T Measuring steel bar tension using X-rays
 %D 1993
 %K X-rays dynamometer steel
 %L Steel Res. 35
 %R Steel Res.
 %N 35
 %I Springer
 %C Cleveland


 and your text reads something like

 .BD
 Lately my life
  .[[
 Robert
  .]]
 has become more complicated.
 .R
 .DE
 .LP

 (take care of the double brackets). Compile this with

 refer -pxxx.bib -e -S thesis.ms > aux.ms
 groff -ms aux.ms > thesis.ps


Hello Jean,

I wonder how this could work. Perhaps -ms implementations of ]- and ][
do the label insertion appropriately.
The manual refer preprocessing however is definitely not necessary
since refer -S is just a short cut to defining a label string (you
could embed that in .R1/.R2 sections).

I have now found an ugly solution that works with -mom.
I use something like this near the top of my document:

.R1
# this extracts the last name of the first author and the publication date
label A.nD.y
database biblio.ref
.R2
\# Save Mom's version of ]- and ][
.rn ]- MOM-REFER-BEGIN
\# Provide my own versions: the brackets are inserted automatically, so
\# you don't need .[[ and .]] or [] in the flags field
\# \f(SC merely sets a small-caps font
\# [F contains the label as generated by refer
.ds ]- [\f(SC\\*([F\fP]\c
.rn ][ MOM-REFER-END
.ds ][

References can then be inserted without mom's .REF macros:
.[
whatever
.]
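
To make the `label A.nD.y` expression above more concrete, here is a rough
Python approximation of what it computes for a single database record. It
is a simplification, not refer's algorithm: taking the last word of the
first %A field as the last name and the first four-digit number of %D as
the year are assumptions, and the function names are invented.

```python
import re


def parse_record(record: str) -> dict:
    """Collect the %X fields of one refer record into lists by letter."""
    fields = {}
    for line in record.strip().splitlines():
        line = line.strip()
        if line.startswith("%") and len(line) > 2:
            fields.setdefault(line[1], []).append(line[2:].strip())
    return fields


def author_year_label(record: str) -> str:
    """Approximate the label built by 'label A.nD.y'."""
    fields = parse_record(record)
    last_name = fields["A"][0].split()[-1]      # rough stand-in for A.n
    year = re.search(r"\d{4}", fields["D"][0])  # rough stand-in for D.y
    return last_name + (year.group(0) if year else "")


if __name__ == "__main__":
    record = """
    %A John Robert
    %T Measuring steel bar tension using X-rays
    %D 1993
    """
    print(author_year_label(record))
```

With the custom `]-` string shown above, such a label would then appear in
brackets and small caps in the running text.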

And to generate the bibliography I do:

\# restore mom's ]- and ][
.rn MOM-REFER-BEGIN ]-
.rn MOM-REFER-END ][
.
.BIBLIOGRAPHY
.BIBLIOGRAPHY_TYPE LIST
.R1
no-label-in-text
no-label-in-reference
sort A1Q1T1B1E1
reverse A1
bibliography biblio.ref
.R2
.BIBLIOGRAPHY OFF

So while for ordinary inline references I did the formatting with
custom macros, in the bibliography I let mom do it.
The entire database (biblio.ref) is inserted by refer, which is fine
as long as I do not have a real database (with lots of entries) but
one that contains only the publications I have ever referred to in the
main body of the text.

regards,
Robin