Re: Tears in my eyes, joy in my heart (was: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text))

2024-02-07 Thread Oliver Corff via

Hi Deri,

just as Peter said: this news just unmade my day.

My contributions to groff are tiny; I am just an avid user who follows
groff development with great interest but with little competence of my
own for substantial contributions. All the more I appreciate the work you
have been doing with groff and PDF, notably with regard to embedded fonts
etc., something which is essential for the type of documents I process
with groff. So this is a good occasion to thank you for everything.

Sometimes, it takes a long time before a notion of discomfort with a
given situation erupts in the form of a seemingly irreversible choice.

I hope that this notion won't become terminal and that the joy of doing
creative and constructive work will not fail to return to you.

My best wishes to you for the New Year! Tomorrow will be New Year in
Mongolia, where I happen to be at the moment.

Cheers,

Oliver.


On 07/02/2024 19:10, Deri wrote:

[...]

Re: Tears in my eyes, joy in my heart (was: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text))

2024-02-07 Thread Peter Schaffter
On Wed, Feb 07, 2024, Deri wrote:
> It is with a heavy heart that I announce I shall be leaving the groff mailing 
> list, I'm finding it too much work.

Deri --

This news just unmade my day.  The value of gropdf and your
contributions to mom have been immeasurable.  I will miss your voice
on the list.

-- 
Peter Schaffter
https://www.schaffter.ca



Re: Tears in my eyes, joy in my heart (was: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text))

2024-02-07 Thread Dave Kemper
Deri,

I'm very sad to see your departure.  This list and the groff project
will sorely miss your knowledge, expertise, and coding skills.  I hope
you find a way to return soon.



Tears in my eyes, joy in my heart (was: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text))

2024-02-07 Thread Deri
Hi Branden,

It is with a heavy heart that I announce I shall be leaving the groff mailing 
list; I'm finding it too much work. Yesterday, you managed over 2900 words in 
approx 80 minutes whilst also doing a code review, and probably 3 other things 
as well! I am so jealous; my paltry sub-10 words a minute (on a good day) just 
can't cope, particularly as you sometimes have difficulty getting the points I 
am trying to make and reply with points which are not relevant. An example was 
your "unease" with adding an extra field to afmtodit output: you pointed me to 
some documentation rather than a swift perusal of the code in afmtodit, where 
you can see that it ALWAYS outputs 5 tab characters and never outputs "--", so 
no comments. And talking of "unease", you wrote, in reply to my request for help 
with merging: "Sure.  Once we're _both_ happy with it!  :D", and this was 
eight months ago, so unease really does mean rejection. In November it 
became:-

> > 1.  Changing the format of font description files to add yet
> > another field, mapping character names to Unicode code points.
> > In the rest of groff, this is not necessary because we have
> > glyphuni.cpp.
> >
> >
> > https://git.savannah.gnu.org/cgit/groff.git/tree/src/libs/libgroff/glyphuni.cpp
> >
> > I'd like to honor the DRY principle here.  What's a good way
> > to achieve that?

Given that afmtodit does not use glyphuni.cpp (and can't) the DRY principle 
here means to let afmtodit plant the needed data in the font files for gropdf 
to use, but you didn't seem to see how irrelevant your comment was.

Anyway, enough of this useless banter. This is a joyful moment, I'm freeing up 
so much time to pursue other projects that will be equally rewarding as 
writing gropdf has been, like:-

Detection of bias in UK news channels

In the UK there is a legal obligation for "Due impartiality and due accuracy" 
(https://www.ofcom.org.uk/tv-radio-and-on-demand/broadcast-codes/broadcast-code/section-five-due-impartiality-accuracy).
For the past 4 years, I have been converting the DVB-T subtitles for news 
channels into text using an OCR program I wrote. It's about time I fooled 
around with the data using NLP to see if it is possible to detect bias within 
the data; at a minimum I can extract statistics on the political persuasion of 
guests, but I've got a feeling I might be able to go further.

GB News, a right wing channel, keeps getting fined. I'd love to be able to 
write something which automatically emailed a complaint to Ofcom if it caught 
them breaking the rules, without having to watch the channel all day. :-)

My autobiography

Well, I've got the title - "A life more ordinary"!!

If I ever get the gropdf itch in the future, this is my todo list:-

A) Underlining text.

Peter asked ages ago if I could do this, because he has a method for 
PostScript, from Tadziu. It is half done.

B) Watermarking

Given a PDF, scale it to full page size and place it under the groff output 
(a watermark) or above it (a stamp). I have worked out the last wrinkle. 
Normally, if you rotate the page with -P-l, any pdfpic will be rotated as 
well, so that the picture orientation stays with the text orientation, but the 
watermark orientation is controlled by the page orientation.

C) Ttf/otf in pdfs

This is a lot of work, but I was starting to get a handle on it. Incidentally, 
if I ever do get this done, the Tibetan ligatures issue will be solved. The 
reason it seems to be OK everywhere except in groff is that the 
"rules" for the ligature placement/resizing are in sub-tables within the TTF 
font file, but in the fontforge conversion to a pfa file most of this 
information is discarded, because Type 1 fonts have no concept of vertical 
adjustments. All that gets through is the horizontal adjustment, which 
ensures the glyphs print over each other, but without the correct vertical 
adjustment/sizing. Still a lot of research to do.

I've just seen your last email with a lot of nice things, but sometimes you 
confuse "code review" with "design review". If someone wants to know how to 
get to the doctor it is not helpful to say "Well I would not start from here". 
I have told you right from the beginning that all I needed was a way to pass 
anything to gropdf, and so I coded on the expectation I could receive anything 
and dealt with it appropriately. This is all working code. Later you expressed 
a preference for a method where you would clean the data within troff so I 
would not need to, but I already had working code and so far any alternative 
is vapourware, and the only pseudo code I've seen (a for loop with a flag to 
indicate whether the next item is a node or a character), with the expectation 
that nodes will be discarded, would not cut the mustard because I believe 
special characters (i.e. \[u] or \[em]) are actually held as nodes within 
troff so would be discarded as not a character. So the criticism is of my 
design, hardly what I call a code review and 

Re: gropdf-ng merge status (was: PDF outline not capturing Cyrillic text)

2024-02-07 Thread G. Branden Robinson
[self-follow-up]

Hi Deri,

One more thing occurred to me, because your last paragraph was sticking
in my mind and I think I figured out why.

At 2024-02-06T19:30:58-0600, G. Branden Robinson wrote:
> > I am quite sure there will be "bugs" in my code, it is fairly
> > complex, but subjecting it to a "code review" without even running
> > it to see if it does what it says on the box, is not helpful.
> 
> I think you've pretty badly mistaken my perspective.  One of the
> reasons I stick my long nose into your code in this way is because I
> don't worry that you won't produce correct results.  You have an
> established record of delivering solutions that work as advertised.

That you put code review into scare quotes gave me a sort of belated
pause.  It finally dawned on me that you might be regarding my
undertaking of such on your contributions as a form of insult.

It emphatically is not!

Some computer science luminary--unfortunately I cannot remember who at
the moment--made the observation that programming languages chiefly
exist so that human beings can communicate to each other about
programming.  (Maybe someone reading recollects who I mean.)  If PLs
were intended _solely_ for consumption by machines, we'd stick with
machine language...or maybe assembly.

At the places I have worked, and at sites like GitHub and GitLab where
people manage things like pull requests and merge requests, it is not
only common for people other than the code author to undertake code
review before attempting to run it themselves, it is expected that they
won't!

Part of this is due to the cultural expectation that the author of code
will have tested it.  But another aspect is that humans are actually
pretty bad at inferring (perfect) correctness from inspection of source
code.  We are indeed likely to assume that it does what is on the box.
What code review is good for--and I think I said this recently on this
list, but maybe it was someplace else--is for programmers to share
expertise and problem-solving techniques with each other, and also to
reinforce the team mentality that sustains successful software projects
above the very small scale.

So I would ask that you please try to adopt that perspective when a
person perceptibly studies your code, or anyone else's.  Not all code
is worthy of study.  The famous Lions book presenting the Sixth Edition
Unix kernel was not an insult to Thompson and Ritchie, but a high form
of flattery...and today that book stands as a monument in the field of
operating systems research as an exposition of a successful,
high-quality system.

At the same time, everybody had gripes about the Unix kernel and some
aspects of how it was written, and even designed.  This is how we learn,
individually and collectively.

So, if I pay your code some scrutiny, it is not out of hauteur, but
respect.  I look at your code because I want to work with you.

I appreciate what you've contributed to groff and am pleased by how
well-received your efforts continue to be.

Best regards,
Branden




Re: PDF outline not capturing Cyrillic text

2024-02-07 Thread Deri
On Wednesday, 7 February 2024 01:07:37 GMT Robin Haberkorn wrote:
> Still, when using UTF-8 input, there are problems (missing letters) with
> link texts autogenerated by .pdfhref L.

[...]

> 
> Best regards,
> Robin
> 
> PS: And to comment on some of the heated discussions on this list:
> It's great that you and Branden spend so much time on improving Groff.
> I think, you do a great job. Regressions are sometimes unavoidable,
> especially when taking over a large code base from somebody else.

Hi Robin,

Many thanks for the kind words, although there will be some sad news later. :-(

I wonder if you could send me a small example of .pdfhref L missing letters 
and the command you are using? I don't need the whole thesis; I would not 
understand it.

Cheers 

Deri