Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-29 Thread Fabio Valentini
On Wed, Sep 29, 2021 at 3:41 PM Miro Hrončok  wrote:
>
> On 29. 09. 21 14:48, Fabio Valentini wrote:
> > On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok  wrote:
> >>
> >> On 25. 09. 21 11:12, Fabio Valentini wrote:
> >>> So, if I understand correctly, the problem is that right now there's
> >>> no *existing* tool that reliably detects if a given "executable" (has
> >>> mode +x) is an actual executable "script" with a valid shebang?
> >>
> >> - We need to detect "scripts" that are executable but have no shebang.
> >> - We need to detect "scripts" that are executable and have a shebang to
> >> mangle.
> >> - We might want to detect binary files that are executable but shouldn't be
> >> (such as images), but this was not the original purpose of the BRP 
> >> script.
> >
> > If I gave you a program /usr/bin/isexec that determines if a file is a
> > valid executable, i.e.
> > - ELF binary with ELF header / magic number,
> > - PE binary with MZ magic number,
> > - script with shebang line (whether in need of mangling or not),
> > would that help?
> >
> > (I.e. something like this POC: https://github.com/ironthree/isexec ?)
>
> I am not sure I want to throw a one-man-maintained Rust program into the
> mixture. This could open a can of worms, e.g.:
>
> - bus factor = 1
> - RHEL maintainer willing to maintain this in RHEL 10 = 0
> - no "full" architecture support
> - (possibly?) larger dependency chain just to build this
>
> But even ignoring this, we still need to detect "scripts without shebangs".

I was not suggesting that we actually take this and use it. I was just
trying to demonstrate that solving different subsets of the problem
should be easy to do.
But to me the problem you're trying to solve in this thread is very
fuzzy and not well-defined, so I'm not sure if a single existing tool
will just be able to solve it.
So why not split the problem into smaller parts, and use the best
available tool to solve *those individual tasks*, instead of looking
for a "one-size-fits-all hammer"?

(PS: "script without shebangs" would be rejected as "invalid
executable" too by my POC, because the file contents would not match
any heuristic. As would executable PNG files, etc.)
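
A minimal sketch of the three heuristics listed above (ELF magic, MZ magic, shebang), assuming GNU coreutils. The function name classify_exec is hypothetical and this is not the code of the linked POC; it only inspects the first four bytes and does not validate the interpreter path.

```shell
# Hypothetical helper: classify a file by its leading bytes.
# Prints one of: elf, pe, script, unknown.
classify_exec() {
    local path=$1 magic
    # Hex-encode the first four bytes so NUL bytes cannot confuse the shell.
    magic=$(head -c 4 "$path" | od -An -tx1 | tr -d ' \n')
    case $magic in
        7f454c46) echo elf ;;      # \x7f E L F
        4d5a*)    echo pe ;;       # "MZ" (DOS/PE header)
        2321*)    echo script ;;   # "#!" (interpreter path not checked)
        *)        echo unknown ;;  # e.g. PNG images, scripts without shebangs
    esac
}
```

Note that a check this simple would also report a Rust source file starting with #![attr...] as a script, so a real tool would need to look further than two bytes.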

Fabio
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure


Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-29 Thread Miro Hrončok

On 29. 09. 21 14:48, Fabio Valentini wrote:

On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok  wrote:


On 25. 09. 21 11:12, Fabio Valentini wrote:

So, if I understand correctly, the problem is that right now there's
no *existing* tool that reliably detects if a given "executable" (has
mode +x) is an actual executable "script" with a valid shebang?


- We need to detect "scripts" that are executable but have no shebang.
- We need to detect "scripts" that are executable and have a shebang to mangle.
- We might want to detect binary files that are executable but shouldn't be
(such as images), but this was not the original purpose of the BRP script.


If I gave you a program /usr/bin/isexec that determines if a file is a
valid executable, i.e.
- ELF binary with ELF header / magic number,
- PE binary with MZ magic number,
- script with shebang line (whether in need of mangling or not),
would that help?

(I.e. something like this POC: https://github.com/ironthree/isexec ?)


I am not sure I want to throw a one-man-maintained Rust program into the
mixture. This could open a can of worms, e.g.:


- bus factor = 1
- RHEL maintainer willing to maintain this in RHEL 10 = 0
- no "full" architecture support
- (possibly?) larger dependency chain just to build this

But even ignoring this, we still need to detect "scripts without shebangs".

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-29 Thread Fabio Valentini
On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok  wrote:
>
> On 25. 09. 21 11:12, Fabio Valentini wrote:
> > So, if I understand correctly, the problem is that right now there's
> > no *existing* tool that reliably detects if a given "executable" (has
> > mode +x) is an actual executable "script" with a valid shebang?
>
> - We need to detect "scripts" that are executable but have no shebang.
> - We need to detect "scripts" that are executable and have a shebang to
> mangle.
> - We might want to detect binary files that are executable but shouldn't be
> (such as images), but this was not the original purpose of the BRP script.

If I gave you a program /usr/bin/isexec that determines if a file is a
valid executable, i.e.
- ELF binary with ELF header / magic number,
- PE binary with MZ magic number,
- script with shebang line (whether in need of mangling or not),
would that help?

(I.e. something like this POC: https://github.com/ironthree/isexec ?)

Fabio


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-24 Thread Otto Urpelainen

Steve Grubb wrote on 23.9.2021 at 18.15:

On Wednesday, September 22, 2021 5:34:17 PM EDT Miro Hrončok wrote:

From all the scans that we've done on fullish installs in the past, there are
only 2 others that you might run across: application/x-elc (lisp) and
application/x-java-applet.

Maybe you just build in logic to work around these 3 types?
application/javascript is really the only one I can think of that is common.


Yeah, maybe we should just do that. However, that would not clean up the
executable PNGs.


They should be easy to identify: they start with 'image'. There aren't many
types on a typical system. This is what I see in /usr on a system with 5000
packages installed:

application/gzip
application/javascript
application/json
application/octet-stream
application/vnd.ms-fontobject
application/x-bad-elf
application/x-executable
application/x-kdelnk
application/x-sharedlib
application/zip
audio/ogg
font/sfnt
image/gif
image/jpeg
image/png
image/vnd.microsoft.icon
text/html
text/plain
text/x-awk
text/x-c
text/x-gawk
text/x-lua
text/x-luatex
text/x-perl
text/x-python
text/x-ruby
text/x-shellscript
text/x-systemtap
text/x-tcl

You might just make a map since the list is not all that big. The biggest
issue is when you have things like text/plain or application/octet-stream.
That means we don't know what it is.


What about keeping the "detect mime type" approach, then dividing the 
results into three categories?


1. Can be executable, if so, must have a shebang, which is mangled:
text/* is already there, add application/javascript and possibly others
as needed.
2. Cannot be executable, remove the executable bit if found: image/*
would take care of the executable PNGs, many more like application/json
can be added as needed.
3. The rest: do nothing with these.

Maybe that would be good enough, even if the mime type detection 
uncertainty sets a limit on how precise it can be?
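
The three categories could be sketched as a simple lookup on the MIME type string. This is only an illustration: the function name mime_category and the category labels are hypothetical, and the type lists contain just the examples mentioned above.

```shell
# Hypothetical mapping from a MIME type string to one of the three categories.
mime_category() {
    case $1 in
        # 1. May be executable; if so, needs a shebang (which gets mangled).
        text/*|application/javascript) echo needs-shebang ;;
        # 2. Must not be executable; the executable bit would be removed.
        image/*|application/json)      echo never-executable ;;
        # 3. Everything else is left alone.
        *)                             echo leave-alone ;;
    esac
}
```

It would be called with the detected type, for example mime_category "$(file --brief --mime-type "$f")".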


Keeping the mime type detection approach, but using less data (the first 
8 bytes approach) does not sound good. If 'file' really works better 
that way, then there is something wrong with it.


As for the application/javascript type, there is an IETF proposal that, 
among other things, tries to deprecate that and de-deprecate 
text/javascript [1]. So, perhaps some day category 1 could be reasonably 
equated with text/* again.


[1]: https://datatracker.ietf.org/doc/draft-ietf-dispatch-javascript-mjs

Otto


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Josh Stone
On 9/23/21 2:11 AM, Miro Hrončok wrote:
> On 23. 09. 21 1:40, Josh Stone wrote:
>> On 9/22/21 4:21 AM, Miro Hrončok wrote:
>>> Hello,
>>>
>>> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy
>>> Script that does the following:
>>>
>>>   1) Gets all executable files in the buildroot
>>>   2) Gets all "text" files from those
>>>   3a) Mangles shebangs that are "wrong"
>>>   (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>>>   3b) Removes executable bits from "text" files without shebangs
>>
>> While we're at it, can we teach the script to ignore Rust attributes?
>> They're written like #![attr...], and when that's on the first line some
>> editors try to be helpful and make the file executable. That's
>> considered an error with the current script since the "shebang" doesn't
>> start with '/', but it would be best IMO to have it remove the
>> executable bit.
> 
> I believe that currently the script would error:
> 
> ERROR: $f has shebang which doesn't start with '/' (#![attr...])
> 
> Have you ever seen that in a Fedora package?

That's the error I meant, and yes I have seen that in real builds. I
have a line in the rust.spec %prep to "chmod -x *.rs", but I've also
seen this pop up in individual rust-* crate packaging.
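
A rough sketch of the behaviour requested above: treat a first line starting with "#![" as a Rust inner attribute rather than a shebang, and drop the executable bit. Both function names are hypothetical and this is not part of the existing BRP script.

```shell
# Hypothetical check: does the first line look like a Rust inner attribute?
starts_with_rust_attr() {
    local first
    # Accept a first line even when the file lacks a trailing newline.
    IFS= read -r first < "$1" || [ -n "$first" ] || return 1
    case $first in
        '#!['*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical action: remove the executable bits instead of raising an error.
strip_exec_if_rust_attr() {
    if starts_with_rust_attr "$1"; then
        chmod a-x "$1"
    fi
}
```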


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Steve Grubb
On Wednesday, September 22, 2021 5:34:17 PM EDT Miro Hrončok wrote:
> > From all the scans that we've done on fullish installs in the past,
> > there are only 2 others that you might run across: application/x-elc
> > (lisp) and application/x-java-applet.
> > 
> > Maybe you just build in logic to work around these 3 types?
> > application/javascript is really the only one I can think of that is
> > common.
> 
> Yeah, maybe we should just do that. However, that would not clean up the
> executable PNGs.

They should be easy to identify: they start with 'image'. There aren't many
types on a typical system. This is what I see in /usr on a system with 5000
packages installed:

application/gzip
application/javascript
application/json
application/octet-stream
application/vnd.ms-fontobject
application/x-bad-elf
application/x-executable
application/x-kdelnk
application/x-sharedlib
application/zip
audio/ogg
font/sfnt
image/gif
image/jpeg
image/png
image/vnd.microsoft.icon
text/html
text/plain
text/x-awk
text/x-c
text/x-gawk
text/x-lua
text/x-luatex
text/x-perl
text/x-python
text/x-ruby
text/x-shellscript
text/x-systemtap
text/x-tcl

You might just make a map since the list is not all that big. The biggest
issue is when you have things like text/plain or application/octet-stream.
That means we don't know what it is.

-Steve



Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Richard W.M. Jones
On Thu, Sep 23, 2021 at 11:13:37AM +0200, Miro Hrončok wrote:
> On 23. 09. 21 9:45, Richard W.M. Jones wrote:
> >After years of experience I wouldn't use "file" for anything I needed
> >to work reliably.
> >
> >If the test is really ELF or not ELF, how about detecting the ELF
> >header magic directly?
> 
> As I say in the rest of the email, I know how to detect elves. I
> just don't know if that's enough.

Oh I see - I misread your second email as eu-elfclassify pulling in
too many dependencies when in fact that was a different package.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Miro Hrončok

On 23. 09. 21 9:45, Richard W.M. Jones wrote:

After years of experience I wouldn't use "file" for anything I needed
to work reliably.

If the test is really ELF or not ELF, how about detecting the ELF
header magic directly?


As I say in the rest of the email, I know how to detect elves. I just don't
know if that's enough.


--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Miro Hrončok

On 23. 09. 21 1:40, Josh Stone wrote:

On 9/22/21 4:21 AM, Miro Hrončok wrote:

Hello,

for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy Script
that does the following:

   1) Gets all executable files in the buildroot
   2) Gets all "text" files from those
   3a) Mangles shebangs that are "wrong"
   (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
   3b) Removes executable bits from "text" files without shebangs
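
Steps 3a and 3b can be sketched for a single file as follows. This is a simplified illustration, not the actual BRP script: the function name mangle_or_strip is hypothetical, it assumes the file was already found to be an executable "text" file (steps 1 and 2), it relies on GNU sed for in-place editing, and it only handles the plain "/usr/bin/env NAME" form.

```shell
# Hypothetical per-file step for an executable file already classified as "text".
mangle_or_strip() {
    local path=$1 first
    # Read the first line; treat an empty or unreadable file as having no shebang.
    IFS= read -r first < "$path" || [ -n "$first" ] || first=
    case $first in
        '#!/usr/bin/env '*)
            # 3a) Rewrite "#!/usr/bin/env node" to "#!/usr/bin/node" (line 1 only).
            sed -i '1s@^#!/usr/bin/env \{1,\}@#!/usr/bin/@' "$path" ;;
        '#!'*)
            : ;;  # A shebang that is not an env shebang is left untouched here.
        *)
            # 3b) No shebang at all: drop the executable bits.
            chmod a-x "$path" ;;
    esac
}
```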


While we're at it, can we teach the script to ignore Rust attributes?
They're written like #![attr...], and when that's on the first line some
editors try to be helpful and make the file executable. That's
considered an error with the current script since the "shebang" doesn't
start with '/', but it would be best IMO to have it remove the
executable bit.


I believe that currently the script would error:

ERROR: $f has shebang which doesn't start with '/' (#![attr...])

Have you ever seen that in a Fedora package?

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Richard W.M. Jones
After years of experience I wouldn't use "file" for anything I needed
to work reliably.

If the test is really ELF or not ELF, how about detecting the ELF
header magic directly?

$ if [[ $(dd if="/bin/ls" status=none bs=4 count=1) == $'\x7fELF' ]]; then echo elf; else echo not-elf; fi
elf

$ if [[ $(dd if="/etc/passwd" status=none bs=4 count=1) == $'\x7fELF' ]]; then echo elf; else echo not-elf; fi
not-elf

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-23 Thread Petr Pisar
On Wed, Sep 22, 2021 at 12:47:07PM -0400, Steve Grubb wrote:
> I find the file utility to be almost reliable. It changes how it identifies
> ELF files every couple releases. So, to stabilize this, fapolicyd-cli uses
> its own logic to determine what kind of ELF file it finds. I also regularly
> find text/plain files where it cannot identify the language and files that
> are application/octet-stream which are also misidentified.
> 
File's libmagic will always misdetect some files.

I'd like to see rpmbuild prefer the user.mime_type extended attribute over
the libmagic guess. That way packagers could override the MIME type directly
from a spec file:

%install
setfattr -n 'user.mime_type' -v 'text/x-perl' %{buildroot}%{_bindir}/GET

If rpmbuild carried that attribute into the RPM archive, rpm would set the
attribute when installing that package and it would become available to other
tools like fapolicyd-cli.

Technically we could patch libmagic to do that, but I feel that libmagic
upstream wouldn't like that. Maybe a place for an intermediate wrapper.
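
Such an intermediate wrapper could look roughly like this. It is a sketch under assumptions: the function name mime_type_of is hypothetical, it needs getfattr from the attr package, and neither rpmbuild nor libmagic does this today.

```shell
# Hypothetical wrapper: prefer the user.mime_type extended attribute and
# fall back to the libmagic guess from file(1).
mime_type_of() {
    local path=$1 mime
    # getfattr exits non-zero when the attribute is missing or unsupported.
    if mime=$(getfattr --only-values -n user.mime_type "$path" 2>/dev/null) \
        && [ -n "$mime" ]; then
        printf '%s\n' "$mime"
    else
        file --brief --mime-type "$path"
    fi
}
```

With the setfattr line from the spec snippet applied, mime_type_of would print text/x-perl for that file regardless of what libmagic detects.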

-- Petr




Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Josh Stone
On 9/22/21 4:21 AM, Miro Hrončok wrote:
> Hello,
> 
> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy Script
> that does the following:
> 
>   1) Gets all executable files in the buildroot
>   2) Gets all "text" files from those
>   3a) Mangles shebangs that are "wrong"
>   (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>   3b) Removes executable bits from "text" files without shebangs

While we're at it, can we teach the script to ignore Rust attributes?
They're written like #![attr...], and when that's on the first line some
editors try to be helpful and make the file executable. That's
considered an error with the current script since the "shebang" doesn't
start with '/', but it would be best IMO to have it remove the
executable bit.


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Miro Hrončok

On 22. 09. 21 22:48, Steve Grubb wrote:

On Wednesday, September 22, 2021 4:26:49 PM EDT Miro Hrončok wrote:

By chance do you have a pointer to one of those javascript files that is
misidentified? (Or any other for that matter). I'd like to see what's
going on and get a fix in place.


yarnpkg package, %prepped

$ file --mime-type yarn-1.22.10/bin/yarn.js
yarn-1.22.10/bin/yarn.js: application/javascript


application/javascript is, unfortunately, correct. This one is governed by
RFC 4329 which makes it official.

From all the scans that we've done on fullish installs in the past, there are
only 2 others that you might run across: application/x-elc (lisp) and
application/x-java-applet.

Maybe you just build in logic to work around these 3 types?
application/javascript is really the only one I can think of that is common.


Yeah, maybe we should just do that. However, that would not clean up the
executable PNGs.


--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Steve Grubb
On Wednesday, September 22, 2021 4:26:49 PM EDT Miro Hrončok wrote:
> > By chance do you have a pointer to one of those javascript files that is
> > misidentified? (Or any other for that matter). I'd like to see what's
> > going on and get a fix in place.
> 
> yarnpkg package, %prepped
> 
> $ file --mime-type yarn-1.22.10/bin/yarn.js
> yarn-1.22.10/bin/yarn.js: application/javascript

application/javascript is, unfortunately, correct. This one is governed by 
RFC 4329 which makes it official.

From all the scans that we've done on fullish installs in the past, there are
only 2 others that you might run across: application/x-elc (lisp) and
application/x-java-applet.

Maybe you just build in logic to work around these 3 types?
application/javascript is really the only one I can think of that is common.

-Steve



Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Miro Hrončok

On 22. 09. 21 21:46, Steve Grubb wrote:

By chance do you have a pointer to one of those javascript files that is
misidentified? (Or any other for that matter). I'd like to see what's going on
and get a fix in place.


yarnpkg package, %prepped

$ file --mime-type yarn-1.22.10/bin/yarn.js
yarn-1.22.10/bin/yarn.js: application/javascript

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Steve Grubb
On Wednesday, September 22, 2021 1:46:11 PM EDT Miro Hrončok wrote:
> > 4) maybe fapolicyd-cli has better detection? Or at least, it's more
> > closely maintained. It also has its own ELF detection so that it's
> > stable from release to release.
> > release to release.
> 
> Not checked whether it has better detection or not, but I see it would pull
> the following packages into the default buildroot:

 

> python-pip-wheel
> python-setuptools-wheel
> python3-libs
> python3

I think this is a packaging mistake. That would remove a fair amount of 
what's pulled in.

> systemd-pam
> systemd-rpm-macros
> systemd

It needs libudev, but we get the whole thing.

> Installed size: 54 M
> 
> 
> I don't think this is acceptable. Even systemd and Python alone are not
> supposed to be there.

I'll ask about fixing the package. But I guess the other option is to identify 
problems with libmagic and contribute fixes. That's what we have to do for 
fapolicyd.

Btw, I just looked at F34 and rawhide to compare. Rawhide looks much, much
better, in that things that are executable really are supposed to be. However,
F34...not so much. There are a lot of PNG files that are executable. This
cleanup is appreciated.

By chance do you have a pointer to one of those javascript files that is 
misidentified? (Or any other for that matter). I'd like to see what's going on 
and get a fix in place.

Cheers,
-Steve



Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Miro Hrončok

On 22. 09. 21 18:47, Steve Grubb wrote:

4) maybe fapolicyd-cli has better detection? Or at least, it's more closely
maintained. It also has its own ELF detection so that it's stable from
release to release.


Not checked whether it has better detection or not, but I see it would pull the 
following packages into the default buildroot:


acl
cryptsetup-libs
dbus-broker
dbus-common
dbus
device-mapper-libs
device-mapper
expat
fapolicyd-dnf-plugin
fapolicyd
iptables-legacy-libs
json-c
kmod-libs
libargon2
libibverbs
libnl3
libpcap
libseccomp
lmdb-libs
mpdecimal
python-pip-wheel
python-setuptools-wheel
python3-libs
python3
systemd-pam
systemd-rpm-macros
systemd

Installed size: 54 M


I don't think this is acceptable. Even systemd and Python alone are not 
supposed to be there.


--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok


Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Steve Grubb
Hello,

On Wednesday, September 22, 2021 7:21:42 AM EDT Miro Hrončok wrote:
> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy
> Script that does the following:
> 
>   1) Gets all executable files in the buildroot
>   2) Gets all "text" files from those
>   3a) Mangles shebangs that are "wrong"
>   (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>   3b) Removes executable bits from "text" files without shebangs

This is interesting. I didn't know Fedora had such a policy. I have been 
doing studies of this myself because fapolicyd wants correctly identified 
scripts.

> The idea behind this is that all "text" files that are executable need a 
> shebang and if they don't have it, something is wrong. OTOH files that are
> "binary" don't need it.
> 
> I intentionally put the terms "text" and "binary" in quotation marks, as
> the definition is somewhat fuzzy. Up until now, the script did the
> detection by utilizing the file tool to get the MIME type. If the MIME
> type starts with text/, it considered the executable to be a text file.

I find the file utility to be almost reliable. It changes how it identifies ELF
files every couple releases. So, to stabilize this, fapolicyd-cli uses its
own logic to determine what kind of ELF file it finds. I also regularly find
text/plain files where it cannot identify the language and files that are
application/octet-stream which are also misidentified.

> However, a bug [1] has been discovered. Some obvious text files, such as 
> executable JavaScript scripts, are detected as application/ (e.g. 
> application/javascript), and hence are not considered "text".

This is another inconsistency with libmagic that we do battle with. It can
change on the next release. Another example of this is python
misidentification. In order to have any stability and correctness, fapolicyd
ships with its own libmagic override file. You might find
fapolicyd-cli --ftype a bit more stable. I also put new languages we discover 
in the override while we are waiting for the patch to be accepted upstream. 
And I think upstream has not accepted a couple patches for languages libmagic 
can't detect right.

> If a JavaScript executable script has the #!/usr/bin/env node shebang, the
> brp-mangle-shebangs script does not mangle it.
> 
> One possible solution [2] to this problem is to limit the number of bytes
> the MIME detection reads. My experiments showed that limiting the number
> of bytes to 8 always recognizes JavaScript (and other scripting languages)
> files as text/plain and binary files as application/octet-stream. As a
> side effect, it might make the BRP script faster. However, I am not sure
> if this approach is deterministic enough.
> 
> Another solution, suggested by Florian Weimer [3], is to not detect MIME
> type at all, but use eu-elfclassify instead. The idea is quite simple:
> If (and only if) the executable file is ELF [4], it does not require a
> shebang. Instead of some fragile idea about what files are text and what
> files are binary, this is quite deterministic. It allows mangling shebangs
> of executable ZIP files etc.
> 
> I've drafted the eu-elfclassify solution in a pull request [5]. However, we
> have discovered that several non-ELF binary formats in Fedora are
> possibly legitimately executable. E.g. .exe files (for mono or wine) or
> other formats registered with the kernel [6].
> 
> We are presented with 3 possible actions:
> 
> 1) Keep the script as it is, say the text/ MIME type limitation is how this
> BRP script was scoped. Affected packages would need to correct shebangs
> manually.
> 
> 2) Limit the MIME type detection to 8 bytes and hope it will not yield
> incorrect results.
> 
> 3) Use eu-elfclassify. Consider non-ELF executables without shebangs bogus
> and document this. Packages that are affected would need to opt-out.
> 
> What do you think?

4) Maybe fapolicyd-cli has better detection? Or at least, it's more closely 
maintained. It also has its own ELF detection so that it's stable from 
release to release.

-Steve

> [1] https://bugzilla.redhat.com/1998924
> [2] https://bugzilla.redhat.com/1998924#c3
> [3] https://bugzilla.redhat.com/1998924#c4
> [4] https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
> [5] https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/145
> [6] https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html
> -- 
> Miro Hrončok
> -- 
> Phone: +420777974800
> IRC: mhroncok

Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions

2021-09-22 Thread Miro Hrončok

Hello,

for many releases, Fedora has had the brp-mangle-shebangs BuildRoot Policy 
script that does the following:


 1) Gets all executable files in the buildroot
 2) Gets all "text" files from those
 3a) Mangles shebangs that are "wrong"
 (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
 3b) Removes executable bits from "text" files without shebangs
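
As an illustration of step 3a only, here is a minimal Python sketch of the rewrite. It is not the logic of the actual BRP script: the helper name mangle_shebang is hypothetical, and the sketch assumes that every interpreter reached through env lives in /usr/bin and that env is called without options such as -S.

```python
import re
from typing import Optional

# Hypothetical helper for illustration; the real BRP script is more involved.
# Matches "#!/usr/bin/env <interpreter> [args...]".
_ENV_SHEBANG = re.compile(r"^#!\s*/usr/bin/env\s+(\S+)(.*)$")


def mangle_shebang(first_line: str) -> Optional[str]:
    """Return the rewritten shebang, or None if the line is not a shebang.

    Assumption: the interpreter named after env is installed in /usr/bin.
    env options such as -S are not handled by this sketch.
    """
    line = first_line.rstrip("\n")
    if not line.startswith("#!"):
        return None  # step 3b territory: executable file without a shebang
    match = _ENV_SHEBANG.match(line)
    if match is None:
        return line  # already an absolute interpreter path, leave it alone
    interpreter, rest = match.group(1), match.group(2)
    return "#!/usr/bin/" + interpreter + rest


if __name__ == "__main__":
    print(mangle_shebang("#!/usr/bin/env node\n"))
```

A line that does not start with #! yields None, which is the case where step 3b would remove the executable bit instead.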

The idea behind this is that all "text" files that are executable need a 
shebang and if they don't have it, something is wrong. OTOH files that are 
"binary" don't need it.


I intentionally put the terms "text" and "binary" in quotation marks, as the 
definition is somewhat fuzzy. Up until now, the script did the detection by 
utilizing the file tool to get the MIME type. If the MIME type starts with 
text/, it considered the executable to be a text file.


However, a bug [1] has been discovered. Some obvious text files, such as 
executable JavaScript scripts, are detected as application/ (e.g. 
application/javascript), and hence are not considered "text". If a JavaScript 
executable script has the #!/usr/bin/env node shebang, the brp-mangle-shebangs 
script does not mangle it.
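
The gap can be reproduced with the prefix test alone. The sketch below is a simplified model of the check described above, not the script's code; the function name treated_as_text is made up for the example.

```python
def treated_as_text(mime_type: str) -> bool:
    """Model of the current rule: only text/* MIME types count as "text"."""
    return mime_type.startswith("text/")


if __name__ == "__main__":
    # MIME types as reported in this thread for an sh script and a JS script.
    for mime in ("text/x-shellscript", "application/javascript"):
        verdict = "mangled" if treated_as_text(mime) else "skipped"
        print(f"{mime}: {verdict}")
```

Any script whose MIME type falls outside text/ is skipped by this rule, regardless of what its first line contains.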


One possible solution [2] to this problem is to limit the number of bytes the 
MIME detection reads. My experiments showed that limiting the number of bytes 
to 8 always recognizes JavaScript (and other scripting languages) files as 
text/plain and binary files as application/octet-stream. As a side effect, it 
might make the BRP script faster. However, I am not sure if this approach is 
deterministic enough.
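
A sketch of how such a limited query could be issued from Python. It assumes that the installed file utility supports the --parameter option with the bytes limit; older versions may not. The helper names are hypothetical, and nothing here is taken from the proposed patch.

```python
import subprocess
from typing import List, Optional


def build_file_command(path: str, max_bytes: Optional[int] = None) -> List[str]:
    """Build a file(1) command line that prints only the MIME type.

    Assumption: the installed file(1) accepts --parameter bytes=N.
    """
    cmd = ["file", "--brief", "--mime-type"]
    if max_bytes is not None:
        cmd += ["--parameter", f"bytes={max_bytes}"]
    cmd.append(path)
    return cmd


def detect_mime(path: str, max_bytes: Optional[int] = None) -> str:
    """Run file(1) and return the MIME type it reports for path."""
    result = subprocess.run(
        build_file_command(path, max_bytes),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        # Compare the default detection with the 8-byte variant.
        print(name, detect_mime(name), detect_mime(name, max_bytes=8))
```

Running it over the executables of a buildroot and comparing the two columns would be one way to check how often the 8-byte limit changes the verdict.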


Another solution, suggested by Florian Weimer [3], is to not detect MIME type 
at all, but use eu-elfclassify instead. The idea is quite simple: If (and only 
if) the executable file is ELF [4], it does not require a shebang. Instead of 
some fragile idea about what files are text and what files are binary, this is 
quite deterministic. It allows mangling shebangs of executable ZIP files etc.
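
The rule can be sketched in a few lines of Python. This is a simplification: it only compares the 4-byte ELF magic number, whereas eu-elfclassify inspects the file far more thoroughly. The function names and the returned labels are invented for the example.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf_header(head: bytes) -> bool:
    """Check only the magic number; eu-elfclassify validates much more."""
    return head[:4] == ELF_MAGIC


def action_for(head: bytes) -> str:
    """Decide what to do with an executable file, given its first bytes.

    The labels are hypothetical:
      "skip"       - ELF, no shebang required
      "mangle"     - non-ELF with a shebang, candidate for mangling
      "no-shebang" - non-ELF without a shebang; handling is a policy decision
    """
    if is_elf_header(head):
        return "skip"
    if head.startswith(b"#!"):
        return "mangle"
    return "no-shebang"


def read_head(path: str, size: int = 64) -> bytes:
    """Read the first bytes of a file for classification."""
    with open(path, "rb") as handle:
        return handle.read(size)


if __name__ == "__main__":
    # An executable ZIP archive with a prepended shebang is still mangled.
    print(action_for(b"#!/usr/bin/env python3\nPK\x03\x04"))
```

Note that the MIME type never enters the decision, so JavaScript files and ZIP archives with a shebang are handled the same way as shell scripts.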


I've drafted the eu-elfclassify solution in a pull request [5]. However, we 
have discovered that several non-ELF binary formats in Fedora are possibly 
legitimately executable. E.g. .exe files (for mono or wine) or other formats 
registered with the kernel [6].
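
To make the complication concrete, here is a sketch of a magic-number table that goes beyond ELF. The table is illustrative and incomplete: the "MZ" prefix covers PE files such as .exe, and formats registered through binfmt_misc can also be matched by file extension, which a prefix table like this cannot express. All names are hypothetical.

```python
# Illustrative, incomplete table of leading magic bytes.
KNOWN_MAGIC = {
    "elf": b"\x7fELF",  # native binaries
    "pe": b"MZ",        # .exe files, e.g. for mono or wine
    "script": b"#!",    # anything with a shebang line
}


def classify_header(head: bytes) -> str:
    """Return the name of the first matching format, or "unknown"."""
    for name, magic in KNOWN_MAGIC.items():
        if head.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    samples = {
        "ELF": b"\x7fELF\x02\x01\x01\x00",
        "PE": b"MZ\x90\x00",
        "shell script": b"#!/bin/sh\n",
        "ZIP": b"PK\x03\x04",
    }
    for label, head in samples.items():
        print(label, "->", classify_header(head))
```

Every additional legitimately executable format would need its own entry, which is the maintenance cost an ELF-only rule avoids.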


We are presented with 3 possible actions:

1) Keep the script as it is, say the text/ MIME type limitation is how this BRP 
script was scoped. Affected packages would need to correct shebangs manually.


2) Limit the MIME type detection to 8 bytes and hope it will not yield 
incorrect results.


3) Use eu-elfclassify. Consider non-ELF executables without shebangs bogus and 
document this. Packages that are affected would need to opt-out.


What do you think?

[1] https://bugzilla.redhat.com/1998924
[2] https://bugzilla.redhat.com/1998924#c3
[3] https://bugzilla.redhat.com/1998924#c4
[4] https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[5] https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/145
[6] https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html
--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure