Bug#820549: anarchism: Typo in section H1

2017-02-06 Thread ju
Hi,

>
> I'd like to, but the way anarchism 15 is being uploaded right now makes
> me really feel uncomfortable:

I think what is really missing is a change to the Debian package README
to clarify some points.
Please have a look at the README [0] just pushed to "upstream".
We thought it would be a good idea to keep two things separate: the
script that downloads the authors' original automatically (it could even
run in real time on a box), and the original HTML extracted by the
script plus the conversions to other formats. The repository with the
HTML would be the Debian upstream, and the repository with the scraper
should also be packaged for Debian.

>
> - there's tons of trailing whitespaces (at least in debian/changelog and
>   some html files).
>   If I open my editor on a file, I end up changing tons of lines. Shall
>   I use sed to change a typo, or shall you fix your whitespaces first?
>

The trailing whitespace comes from the original HTML.
Personally, I would not modify the original HTML, but we could include a
patch in the Debian package to do so.
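
As a rough sketch of what such a cleanup step could do (the function name
is hypothetical; it is not part of the scraper or of the package), trailing
spaces and tabs can be stripped line by line without touching anything else:

```python
def strip_trailing_whitespace(text):
    """Return text with trailing spaces and tabs removed from every line.

    Hypothetical helper for illustration only. Line breaks are preserved,
    so a diff against the original shows only the whitespace changes.
    """
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


if __name__ == "__main__":
    sample = "<p>Some text</p>   \n<p>More text</p>\t\n"
    print(repr(strip_trailing_whitespace(sample)))
```

Whether this runs as a Debian patch or as a separate commit is a packaging
decision; the sketch only shows the transformation itself.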

> - the .html is not anymore readable by a human, it's just a bunch of
>   parsed content.

Hmm, do you mean the HTML code or what is rendered in a browser?
Usually people do not read HTML code; I guess that is why the Debian
package generated txt. For a more readable format we have Markdown,
which is why the scraper generates it.
The HTML displayed in a browser should be as readable as on the original
website. If it is not, then maybe there is a bug in the scraper.

>   Do I really have to create a patch of (at least) 8K just to add a
>   letter?

This is a good question. The idea behind the separate scraper and source
repositories was that people could fix the Markdown directly in that
repository, but then there should probably be another Markdown directory
or branch, so that fixes are not overwritten by the conversion from the
HTML but can still be merged and used for the Debian package.

>
> - there's now multiple directories having the same content, instead of
>   just one html/ from where to get all other file formats.
>   How am I supposed to fix this typo? Do I have to fix it in all 3
>   places?

The scraper generates the markdown and txt directories. With Markdown
available, txt should probably not exist; it comes from the previous
Debian script.

>
>   Even if I wanted to fix the problem at the root, the script you used
>   to generate it's not in the package itself, and is *big*.

Yeah, sorry, it has not been packaged yet, but the idea is to package it
independently of the HTML.

>   I'm not even sure anymore where I am contributing now.
>
>   Before anarchism 15 was released I tried to help a bit myself, and
>   ended up with the attached relevant debian/ files.
>   My solution has clearly more little errors than yours, but at least
>   anybody can reproduce it.

Can you not reproduce the HTML, Markdown and txt with the scraper? You
should be able to; that was the main idea.
The "source" HTML used to generate the older Debian versions was never
in any repository, nor was there a script to download it from the
original website.

>   Maybe you can find something helpful from
>   there to help me fix this issue.
>

OK, I will look at those files. It would be great if you could look at
the new README.
A process like the following seems good to me:
1. the scraper obtains the HTML from the original website
2. the scraper pushes the HTML to another repository, together with the
generated Markdown
3. the Debian package uses the generated "source" as upstream.

Probably what is missing, as said above, is a separate Markdown
directory for any content change that does not reflect the original,
clearly marked as the place to fix typos and content. At least that
would make it easy to see differences from the original and from new
versions of the original.
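
To illustrate how differences from the original could be shown, here is a
minimal sketch using Python's standard difflib. The directory names
"markdown/" and "markdown-fixed/", the file name and the function name are
assumptions made for the example, not the actual repository layout:

```python
import difflib


def markdown_diff(original, edited, name="sectionH1.md"):
    """Return a unified diff between the converted and the corrected text.

    The "markdown/" and "markdown-fixed/" prefixes are hypothetical labels
    for the generated directory and the hand-corrected one.
    """
    return "".join(difflib.unified_diff(
        original.splitlines(True),
        edited.splitlines(True),
        fromfile="markdown/" + name,
        tofile="markdown-fixed/" + name,
    ))


if __name__ == "__main__":
    before = "These critiques contain may important ideas\n"
    after = "These critiques contain many important ideas\n"
    print(markdown_diff(before, after))
```

With a layout like this, a typo fix stays a one-line diff against the
generated Markdown instead of a patch against the HTML.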

Thanks for your comments,
ju.

[0] https://0xacab.org/ju/afaq/blob/master/README.md



Bug#820549: anarchism: Typo in section H1

2017-02-06 Thread Michele Orrù
ju  writes:

> Or maybe they wanted to say <>?

No, that's not possible as the full sentence is:

« These critiques contain may important ideas and so are worth
summarising. »


>> I just thought it was a typo, and that it would have been nice to
>> report
>> it. I also downloaded the version proposed in #773529, and I can
>> confirm this
>> bug still applies.
>
> Now that #773529 is closed and the bug still applies, maybe a PR to
> the repository where the sources are now would be better?
>

I'd like to, but the way anarchism 15 is being uploaded right now makes
me really feel uncomfortable:

- there's tons of trailing whitespaces (at least in debian/changelog and
  some html files).
  If I open my editor on a file, I end up changing tons of lines. Shall
  I use sed to change a typo, or shall you fix your whitespaces first?

- the .html is not anymore readable by a human, it's just a bunch of
  parsed content.
  Do I really have to create a patch of (at least) 8K just to add a
  letter?  

- there's now multiple directories having the same content, instead of
  just one html/ from where to get all other file formats.
  How am I supposed to fix this typo? Do I have to fix it in all 3
  places?

  Even if I wanted to fix the problem at the root, the script you used
  to generate it's not in the package itself, and is *big*.
  I'm not even sure anymore where I am contributing now.
  
  Before anarchism 15 was released I tried to help a bit myself, and
  ended up with the attached relevant debian/ files.
  My solution has clearly more little errors than yours, but at least
  anybody can reproduce it. Maybe you can find something helpful from
  there to help me fix this issue.

Hasta la pasta,



rules
Description: Binary data


control
Description: Binary data
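#!/usr/bin/env python2

import sys

from bs4 import BeautifulSoup as BS


# NOTE: the markup of this template was stripped by the list archive.
# A minimal skeleton is assumed here, providing only the <head> and
# <body> elements that clean_html() appends to.
html = BS(
'''
<html>
  <head>
  </head>
  <body>
  </body>
</html>
''', 'lxml')

def clean_html(inf, outf):
    # Parse the downloaded page and keep only its title and main content.
    dirty = BS(inf, 'lxml', from_encoding='utf8')
    # Exactly one content <div> is expected; the unpacking fails otherwise.
    content, = dirty.find_all('div', attrs={'class': 'content clear-block'})
    html.head.append(dirty.title)
    html.body.append(content)
    outf.write(html.encode('utf8'))

if __name__ == '__main__':
    if len(sys.argv) == 2:
        # One argument: read the named file, write to stdout.
        with open(sys.argv[1], 'r') as inf:
            clean_html(inf, sys.stdout)

    elif len(sys.argv) == 1:
        # No argument: act as a filter from stdin to stdout.
        inf, outf = sys.stdin, sys.stdout
        clean_html(inf, outf)

    else:
        sys.exit(1)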

  
-- 
µ.


Bug#820549: anarchism: Typo in section H1

2017-01-25 Thread ju
X-Debbugs-Cc: j...@riseup.net

Package: anarchism
Version: 15.0-1
Followup-For: Bug #820549

Hi,

> In section H.1 of the faq, I read «These critiques contain may
> important ideas
> and so are worth summarising», where "may" is probably a "many"
> misspelled.

Or maybe they wanted to say <>?

> I just thought it was a typo, and that it would have been nice to
> report
> it. I also downloaded the version proposed in #773529, and I can
> confirm this
> bug still applies.

Now that #773529 is closed and the bug still applies, maybe a PR to the
repository where the sources are now would be better?

Cheers,
ju.



Bug#820549: anarchism: Typo in section H1

2016-04-09 Thread Michele Orrù
Package: anarchism
Version: 14.0-4
Severity: normal
Tags: upstream

Dear Maintainer,

In section H.1 of the faq, I read «These critiques contain may important ideas
and so are worth summarising», where "may" is probably a "many" misspelled.

I just thought it was a typo, and that it would have been nice to report
it. I also downloaded the version proposed in #773529, and I can confirm this
bug still applies.

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (900, 'unstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.4.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/bash
Init: systemd (via /run/systemd/system)

anarchism depends on no packages.

anarchism recommends no packages.

Versions of packages anarchism suggests:
ii  chromium [www-browser]              49.0.2623.87-1
ii  firefox-esr [www-browser]           45.0.1esr-1
ii  google-chrome-stable [www-browser]  49.0.2623.108-1
ii  w3m [www-browser]                   0.5.3-27
ii  yelp                                3.16.1-1

-- no debconf information