Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wed, Sep 29, 2021 at 3:41 PM Miro Hrončok wrote:
>
> On 29. 09. 21 14:48, Fabio Valentini wrote:
> > On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok wrote:
> >>
> >> On 25. 09. 21 11:12, Fabio Valentini wrote:
> >>> So, if I understand correctly, the problem is that right now there's
> >>> no *existing* tool that reliably detects if a given "executable" (has
> >>> mode +x) is an actual executable "script" with a valid shebang?
> >>
> >> - We need to detect "scripts" that are executable but have no shebang.
> >> - We need to detect "scripts" that are executable and have a shebang to
> >>   mangle.
> >> - We might want to detect binary files that are executable but shouldn't be
> >>   (such as images), but this was not the original purpose of the BRP
> >>   script.
> >
> > If I gave you a program /usr/bin/isexec that determines if a file is a
> > valid executable, i.e.
> > - ELF binary with ELF header / magic number,
> > - PE binary with MZ magic number,
> > - script with shebang line (whether in need of mangling or not),
> > would that help?
> >
> > (I.e. something like this POC: https://github.com/ironthree/isexec ?)
>
> I am not sure I want to throw a one-man-maintained Rust program into the
> mix. This could open a can of worms, e.g.:
>
> - bus factor = 1
> - RHEL maintainers willing to maintain this in RHEL 10 = 0
> - no "full" architecture support
> - (possibly?) larger dependency chain just to build this
>
> But even ignoring this, we still need to detect "scripts without shebangs".

I was not suggesting that we actually take this and use it. I was just
trying to demonstrate that solving different subsets of the problem
should be easy to do. But to me the problem you're trying to solve in
this thread is very fuzzy and not well-defined, so I'm not sure that any
single existing tool will be able to solve it.

So why not split the problem into smaller parts, and use the best
available tool to solve *those individual tasks*, instead of looking
for a "one-size-fits-all hammer"?

(PS: "scripts without shebangs" would be rejected as "invalid executable"
by my POC too, because the file contents would not match any heuristic.
As would executable PNG files, etc.)

Fabio
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
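The three heuristics listed for the proposed isexec tool (ELF magic, MZ magic, shebang) can be sketched in a few lines of bash. This is only an illustration of the idea, not the code of the linked POC; the function name classify_exec and its output labels are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: classify a file by its leading bytes only.
# Prints one of: elf, pe, script, unknown.
classify_exec() {
    local f=$1 magic
    # Dump the first 4 bytes as lowercase hex so NUL bytes are handled safely.
    magic=$(head -c 4 -- "$f" | od -An -tx1 | tr -d ' \n')
    case $magic in
        7f454c46) echo elf ;;      # \x7f E L F
        4d5a*)    echo pe ;;       # M Z
        2321*)    echo script ;;   # "#!" (does not check that a path follows)
        *)        echo unknown ;;
    esac
}
```

An executable PNG or a script without a shebang falls into "unknown" here, which matches the PS above. Note that the sketch only looks at the first two bytes for scripts, so a first line such as a Rust `#![...]` attribute would still count as "script".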
Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 29. 09. 21 14:48, Fabio Valentini wrote:
> On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok wrote:
>> On 25. 09. 21 11:12, Fabio Valentini wrote:
>>> So, if I understand correctly, the problem is that right now there's
>>> no *existing* tool that reliably detects if a given "executable" (has
>>> mode +x) is an actual executable "script" with a valid shebang?
>>
>> - We need to detect "scripts" that are executable but have no shebang.
>> - We need to detect "scripts" that are executable and have a shebang to
>>   mangle.
>> - We might want to detect binary files that are executable but shouldn't be
>>   (such as images), but this was not the original purpose of the BRP
>>   script.
>
> If I gave you a program /usr/bin/isexec that determines if a file is a
> valid executable, i.e.
> - ELF binary with ELF header / magic number,
> - PE binary with MZ magic number,
> - script with shebang line (whether in need of mangling or not),
> would that help?
>
> (I.e. something like this POC: https://github.com/ironthree/isexec ?)

I am not sure I want to throw a one-man-maintained Rust program into the
mix. This could open a can of worms, e.g.:

- bus factor = 1
- RHEL maintainers willing to maintain this in RHEL 10 = 0
- no "full" architecture support
- (possibly?) larger dependency chain just to build this

But even ignoring this, we still need to detect "scripts without shebangs".

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wed, Sep 29, 2021 at 1:53 PM Miro Hrončok wrote:
>
> On 25. 09. 21 11:12, Fabio Valentini wrote:
> > So, if I understand correctly, the problem is that right now there's
> > no *existing* tool that reliably detects if a given "executable" (has
> > mode +x) is an actual executable "script" with a valid shebang?
>
> - We need to detect "scripts" that are executable but have no shebang.
> - We need to detect "scripts" that are executable and have a shebang to
>   mangle.
> - We might want to detect binary files that are executable but shouldn't be
>   (such as images), but this was not the original purpose of the BRP script.

If I gave you a program /usr/bin/isexec that determines if a file is a
valid executable, i.e.
- ELF binary with ELF header / magic number,
- PE binary with MZ magic number,
- script with shebang line (whether in need of mangling or not),
would that help?

(I.e. something like this POC: https://github.com/ironthree/isexec ?)

Fabio
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
Steve Grubb wrote on 23.9.2021 at 18.15:
> On Wednesday, September 22, 2021 5:34:17 PM EDT Miro Hrončok wrote:
>>> From all the scans that we've done on fullish installs in the past,
>>> there's only 2 others that you might run across: application/x-elc
>>> (lisp) and application/x-java-applet.
>>>
>>> Maybe you just build in logic to work around these 3 types?
>>> application/javascript is really the only one I can think of that is
>>> common.
>>
>> Yeah, maybe we should just do that. However, that would not clean up the
>> executable PNGs.
>
> They should be easy to identify, they start with 'image'. There's not many
> types on a typical system. This is what I see in /usr on a system with 5000
> packages installed:
>
> application/gzip
> application/javascript
> application/json
> application/octet-stream
> application/vnd.ms-fontobject
> application/x-bad-elf
> application/x-executable
> application/x-kdelnk
> application/x-sharedlib
> application/zip
> audio/ogg
> font/sfnt
> image/gif
> image/jpeg
> image/png
> image/vnd.microsoft.icon
> text/html
> text/plain
> text/x-awk
> text/x-c
> text/x-gawk
> text/x-lua
> text/x-luatex
> text/x-perl
> text/x-python
> text/x-ruby
> text/x-shellscript
> text/x-systemtap
> text/x-tcl
>
> You might just make a map since the list is not all that big. The biggest
> issue is when you have things that are text/plain or
> application/octet-stream. That means we don't know what it is.

What about keeping the "detect mime type" approach, then dividing the
results into three categories?

1. Can be executable; if so, must have a shebang, which is mangled:
   text/* is already there, add application/javascript and possibly others
   as needed.

2. Cannot be executable; remove the executable bit if found: image/* would
   take care of the executable PNGs, many more like application/json can be
   added as needed.

3. The rest: do nothing with these.

Maybe that would be good enough, even if the uncertainty of MIME type
detection limits how precise it can be?

Keeping the MIME type detection approach but using less data (the first 8
bytes approach) does not sound good. If 'file' really works better that
way, then there is something wrong with it.

As for the application/javascript type, there is an IETF proposal that,
among other things, tries to deprecate that and de-deprecate
text/javascript [1]. So, perhaps some day category 1 could be reasonably
equated with text/* again.

[1]: https://datatracker.ietf.org/doc/draft-ietf-dispatch-javascript-mjs

Otto
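The three categories proposed above map naturally onto a shell `case` statement. The following is a minimal sketch; the function name mime_category and the category labels are hypothetical, and the type lists contain only the examples named in the message.

```shell
#!/bin/bash
# Hypothetical helper: map a MIME type to one of the three categories.
mime_category() {
    case $1 in
        # 1. May be executable, but then needs a (mangled) shebang.
        text/*|application/javascript) echo needs-shebang ;;
        # 2. Must never be executable: drop the executable bit.
        image/*|application/json)      echo never-executable ;;
        # 3. Everything else is left alone.
        *)                             echo ignore ;;
    esac
}
```

It could be fed from file(1), for example `mime_category "$(file --brief --mime-type "$f")"`. Unknown content reported as application/octet-stream lands in category 3, so nothing is done with it.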
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 9/23/21 2:11 AM, Miro Hrončok wrote:
> On 23. 09. 21 1:40, Josh Stone wrote:
>> On 9/22/21 4:21 AM, Miro Hrončok wrote:
>>> Hello,
>>>
>>> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy
>>> Script that does the following:
>>>
>>>    1) Gets all executable files in the buildroot
>>>    2) Gets all "text" files from those
>>>    3a) Mangles shebangs that are "wrong"
>>>        (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>>>    3b) Removes executable bits from "text" files without shebangs
>>
>> While we're at it, can we teach the script to ignore Rust attributes?
>> They're written like #![attr...], and when that's on the first line some
>> editors try to be helpful and make the file executable. That's
>> considered an error with the current script since the "shebang" doesn't
>> start with '/', but it would be best IMO to have it remove the
>> executable bit.
>
> I believe that currently the script would error:
>
>   ERROR: $f has shebang which doesn't start with '/' (#![attr...])
>
> Have you ever seen that in a Fedora package?

That's the error I meant, and yes, I have seen that in real builds. I have
a line in the rust.spec %prep to "chmod -x *.rs", but I've also seen this
pop up in individual rust-* crate packaging.
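Telling a Rust inner attribute apart from a real shebang only needs the first line of the file. A minimal sketch follows; the function name first_line_kind and its output labels are hypothetical, and this is not how the existing BRP script is written.

```shell
#!/bin/bash
# Hypothetical helper: inspect the first line of a file.
# Prints one of: rust-attr, shebang, none.
first_line_kind() {
    local line=
    # read fails on EOF without a trailing newline but still fills $line.
    IFS= read -r line < "$1" || true
    case $line in
        '#!['*) echo rust-attr ;;   # e.g. #![allow(dead_code)]
        '#!'*)  echo shebang ;;
        *)      echo none ;;
    esac
}
```

A policy script could then treat "rust-attr" like "none" and remove the executable bit instead of raising the error quoted above.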
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wednesday, September 22, 2021 5:34:17 PM EDT Miro Hrončok wrote:
> > From all the scans that we've done on fullish installs in the past,
> > there's only 2 others that you might run across: application/x-elc
> > (lisp) and application/x-java-applet.
> >
> > Maybe you just build in logic to work around these 3 types?
> > application/javascript is really the only one I can think of that is
> > common.
>
> Yeah, maybe we should just do that. However, that would not clean up the
> executable PNGs.

They should be easy to identify, they start with 'image'. There's not many
types on a typical system. This is what I see in /usr on a system with 5000
packages installed:

application/gzip
application/javascript
application/json
application/octet-stream
application/vnd.ms-fontobject
application/x-bad-elf
application/x-executable
application/x-kdelnk
application/x-sharedlib
application/zip
audio/ogg
font/sfnt
image/gif
image/jpeg
image/png
image/vnd.microsoft.icon
text/html
text/plain
text/x-awk
text/x-c
text/x-gawk
text/x-lua
text/x-luatex
text/x-perl
text/x-python
text/x-ruby
text/x-shellscript
text/x-systemtap
text/x-tcl

You might just make a map since the list is not all that big. The biggest
issue is when you have things that are text/plain or
application/octet-stream. That means we don't know what it is.

-Steve
Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Thu, Sep 23, 2021 at 11:13:37AM +0200, Miro Hrončok wrote:
> On 23. 09. 21 9:45, Richard W.M. Jones wrote:
> >After years of experience I wouldn't use "file" for anything I needed
> >to work reliably.
> >
> >If the test is really ELF or not ELF, how about detecting the ELF
> >header magic directly?
>
> As I say in the rest of the email, I know how to detect elves. I
> just don't know if that's enough.

Oh I see - I misread your second email as eu-elfclassify pulling in
too many dependencies when in fact that was a different package.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
Re: [Fedora-packaging] Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 23. 09. 21 9:45, Richard W.M. Jones wrote:
> After years of experience I wouldn't use "file" for anything I needed
> to work reliably.
>
> If the test is really ELF or not ELF, how about detecting the ELF
> header magic directly?

As I say in the rest of the email, I know how to detect elves. I just
don't know if that's enough.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 23. 09. 21 1:40, Josh Stone wrote:
> On 9/22/21 4:21 AM, Miro Hrončok wrote:
>> Hello,
>>
>> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy
>> Script that does the following:
>>
>>    1) Gets all executable files in the buildroot
>>    2) Gets all "text" files from those
>>    3a) Mangles shebangs that are "wrong"
>>        (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>>    3b) Removes executable bits from "text" files without shebangs
>
> While we're at it, can we teach the script to ignore Rust attributes?
> They're written like #![attr...], and when that's on the first line some
> editors try to be helpful and make the file executable. That's
> considered an error with the current script since the "shebang" doesn't
> start with '/', but it would be best IMO to have it remove the
> executable bit.

I believe that currently the script would error:

  ERROR: $f has shebang which doesn't start with '/' (#![attr...])

Have you ever seen that in a Fedora package?

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
After years of experience I wouldn't use "file" for anything I needed
to work reliably.

If the test is really ELF or not ELF, how about detecting the ELF
header magic directly?

  $ if [[ $(dd if="/bin/ls" status=none bs=4 count=1) == $'\x7fELF' ]]; then echo elf ; else echo not-elf; fi
  elf
  $ if [[ $(dd if="/etc/passwd" status=none bs=4 count=1) == $'\x7fELF' ]]; then echo elf ; else echo not-elf; fi
  not-elf

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wed, Sep 22, 2021 at 12:47:07PM -0400, Steve Grubb wrote:
> I find the file utility to be almost reliable. It changes how it identifies
> ELF files every couple releases. So, to stabilize this, fapolicyd-cli uses
> its own logic to determine what kind of ELF file it finds. I also regularly
> find text/plain files where it cannot identify the language and files that
> are application/octet-stream which are also misidentified.
>
File's libmagic will always misdetect some files. I'd like to see rpmbuild
prefer the user.mime_type extended attribute over the libmagic guess. That
way packagers could override the MIME type directly from a spec file:

    %install
    setfattr -n 'user.mime_type' -v 'text/x-perl' %{buildroot}%{_bindir}/GET

If rpmbuild carried that attribute into the RPM archive, rpm would set the
attribute when installing that package and it would become available to
other tools like fapolicyd-cli.

Technically we could patch libmagic to do that, but I feel that libmagic
upstream wouldn't like that. Maybe this is a place for an intermediate
wrapper.

-- Petr
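The "intermediate wrapper" idea could look roughly like the sketch below: ask for the user.mime_type extended attribute first and fall back to the libmagic guess from file(1). The function name mime_type_of is hypothetical, and the sketch assumes the getfattr tool from the attr package is available; nothing like this exists in rpmbuild today as far as this thread says.

```shell
#!/bin/bash
# Hypothetical wrapper: prefer the user.mime_type xattr, else ask file(1).
mime_type_of() {
    local f=$1 t
    # getfattr exits non-zero if the attribute is missing or unsupported.
    if t=$(getfattr --only-values -n user.mime_type -- "$f" 2>/dev/null) \
        && [ -n "$t" ]; then
        printf '%s\n' "$t"
    else
        file --brief --mime-type -- "$f"
    fi
}
```

With the setfattr line from the spec snippet above applied, `mime_type_of %{buildroot}%{_bindir}/GET` would report text/x-perl regardless of what libmagic detects.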
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 9/22/21 4:21 AM, Miro Hrončok wrote:
> Hello,
>
> for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy Script
> that does the following:
>
>    1) Gets all executable files in the buildroot
>    2) Gets all "text" files from those
>    3a) Mangles shebangs that are "wrong"
>        (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
>    3b) Removes executable bits from "text" files without shebangs

While we're at it, can we teach the script to ignore Rust attributes?
They're written like #![attr...], and when that's on the first line some
editors try to be helpful and make the file executable. That's
considered an error with the current script since the "shebang" doesn't
start with '/', but it would be best IMO to have it remove the
executable bit.
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 22. 09. 21 22:48, Steve Grubb wrote:
> On Wednesday, September 22, 2021 4:26:49 PM EDT Miro Hrončok wrote:
>>> By chance do you have a pointer to one of those javascript files that is
>>> misidentified? (Or any other for that matter). I'd like to see what's
>>> going on and get a fix in place.
>>
>> yarnpkg package, %prepped
>>
>> $ file --mime-type yarn-1.22.10/bin/yarn.js
>> yarn-1.22.10/bin/yarn.js: application/javascript
>
> application/javascript is, unfortunately, correct. This one is governed by
> RFC 4329 which makes it official.
>
> From all the scans that we've done on fullish installs in the past, there's
> only 2 others that you might run across: application/x-elc (lisp) and
> application/x-java-applet.
>
> Maybe you just build in logic to work around these 3 types?
> application/javascript is really the only one I can think of that is
> common.

Yeah, maybe we should just do that. However, that would not clean up the
executable PNGs.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wednesday, September 22, 2021 4:26:49 PM EDT Miro Hrončok wrote:
> > By chance do you have a pointer to one of those javascript files that is
> > misidentified? (Or any other for that matter). I'd like to see what's
> > going on and get a fix in place.
>
> yarnpkg package, %prepped
>
> $ file --mime-type yarn-1.22.10/bin/yarn.js
> yarn-1.22.10/bin/yarn.js: application/javascript

application/javascript is, unfortunately, correct. This one is governed by
RFC 4329 which makes it official.

From all the scans that we've done on fullish installs in the past, there's
only 2 others that you might run across: application/x-elc (lisp) and
application/x-java-applet.

Maybe you just build in logic to work around these 3 types?
application/javascript is really the only one I can think of that is
common.

-Steve
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 22. 09. 21 21:46, Steve Grubb wrote:
> By chance do you have a pointer to one of those javascript files that is
> misidentified? (Or any other for that matter). I'd like to see what's going
> on and get a fix in place.

yarnpkg package, %prepped

$ file --mime-type yarn-1.22.10/bin/yarn.js
yarn-1.22.10/bin/yarn.js: application/javascript

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On Wednesday, September 22, 2021 1:46:11 PM EDT Miro Hrončok wrote:
> > 4) maybe fapolicyd-cli has better detection? Or at least, it's more
> > closely maintained. It also has its own ELF detection so that it's
> > stable from release to release.
>
> Not checked whether it has better detection or not, but I see it would pull
> the following packages into the default buildroot:

> python-pip-wheel
> python-setuptools-wheel
> python3-libs
> python3

I think this is a packaging mistake. Fixing it would remove a fair amount
of what's pulled in.

> systemd-pam
> systemd-rpm-macros
> systemd

It needs libudev, but we get the whole thing.

> Installed size: 54 M
>
>
> I don't think this is acceptable. Even systemd and Python alone are not
> supposed to be there.

I'll ask about fixing the package. But I guess the other option is to
identify problems with libmagic and contribute fixes. That's what we have to
do for fapolicyd.

Btw, I just looked at F34 and rawhide to compare. Rawhide looks much, much
better, as in things that are executable really are supposed to be. However,
F34...not so much. There's a lot of png files that are executable. This
cleanup is appreciated.

By chance do you have a pointer to one of those javascript files that is
misidentified? (Or any other for that matter). I'd like to see what's going
on and get a fix in place.

Cheers,
-Steve
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
On 22. 09. 21 18:47, Steve Grubb wrote:
> 4) maybe fapolicyd-cli has better detection? Or at least, it's more closely
> maintained. It also has its own ELF detection so that it's stable from
> release to release.

Not checked whether it has better detection or not, but I see it would pull
the following packages into the default buildroot:

 acl
 cryptsetup-libs
 dbus-broker
 dbus-common
 dbus
 device-mapper-libs
 device-mapper
 expat
 fapolicyd-dnf-plugin
 fapolicyd
 iptables-legacy-libs
 json-c
 kmod-libs
 libargon2
 libibverbs
 libnl3
 libpcap
 libseccomp
 lmdb-libs
 mpdecimal
 python-pip-wheel
 python-setuptools-wheel
 python3-libs
 python3
 systemd-pam
 systemd-rpm-macros
 systemd

Installed size: 54 M

I don't think this is acceptable. Even systemd and Python alone are not
supposed to be there.

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
Re: Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
Hello, On Wednesday, September 22, 2021 7:21:42 AM EDT Miro Hrončok wrote: > for many releases, Fedora has the brp-mangle-sehbangs BuildRoot Policy > Script that does the following: > > 1) Gets all executable files in the buildroot > 2) Gets all "text" files from those > 3a) Mangles shebangs that are "wrong" > (e.g. #!/usr/bin/env node -> #!/usr/bin/node) > 3b) Removes executable bits from "text" files without shebangs This is interesting. I didn't know Fedora had such a policy. I have been doing studies of this myself because fapolicyd wants correctly identified scripts. > The idea behind this is that all "text" files that are executable need a > shebang and if they don't have it, something is wrong. OTOH files that are > "binary" don't need it. > > I intentionally put the terms "text" and "binary" in quotation marks, as > the definition is somewhat fuzzy. Up until now, the script did the > detection by utilizing the file tool to get the MIME type. If the MIME > type starts with text/, it considered the executable to be a text file. I find the file utility to be almost reliable. It changes how it identifies ELF files every couple releases. So, to stabilize this, fapolicyd-cli uses it's own logic to determine what kind of ELF file it finds. I also regularly find text/plain files where it cannot identify the language and files that are application/octet-stream which are also misidentified. > However, a bug [1] has been discovered. Some obvious text files, such as > executable JavaScript scripts, are detected as application/ (e.g. > application/javascript), and hence are not considered "text". This is another inconsistency with libmagic that we do battle with. It can change on th next release. Another example of this is python misidentification. In order to have any stability and correctness, fapolicyd ships with it's own libmagic override file. You might find fapolicyd-cli --ftype a bit more stable. 
I also put new languages we discover in the override while we are waiting for the patch to be accepted upstream. And I think upstream has not accepted a couple patches for languages libmagic can't detect right. > If a JavaScript executable script has the #!/usr/bin/env node shebang, the > brp-mangle-sehbangs script does not mangle it. > > One possible solution [2] to this problem is to limit the number of bytes > the MIME detection reads. My experiments showed that limiting the number > of bytes to 8 always recognizes JavaScript (and other scripting languages) > files as text/plain and binary files as application/octet-stream. As a > side effect, it might make the BRP script faster. However, I am not sure > if this approach is deterministic enough. > > Another solution, suggested by Florian Weimer [3], is to not detect MIME > type at all, but use eu-elfclassify instead. The idea is quite simple: > If (and only if) the executable file is ELF [4], it does not require a > shebang. Instead of some fragile idea about what files are text and what > files are binary, this is quite deterministic. It allows mangling shebangs > of executable ZIP files etc. > I've drafted the eu-elfclassify solution in a pull request [5]. However, we > have discovered that several non-elf binary formats in Fedora are > possibly legitimately executable. E.g. .exe files (for mono or wine) or > other formats registered with the kernel [6]. > > We are presented with 3 possible actions: > > 1) Keep the script as it is, say the text/ MIME type limitation is how this > BRP script was scoped. Affected packages would need to correct shebangs > manually. > 2) Limit the MIME type detection to 8 bytes and hope it will not yield > incorrect results. > > 3) Use eu-elfclassify. Consider non-ELF executables without shebangs bogus > and document this. Packages that are affected would need to opt-out. > What do you think? 4) maybe fapolicyd-cli has better detection? Or at least, its more closely maintained. 
It also has its own ELF detection, so that it's stable from release to
release.

-Steve

> [1] https://bugzilla.redhat.com/1998924
> [2] https://bugzilla.redhat.com/1998924#c3
> [3] https://bugzilla.redhat.com/1998924#c4
> [4] https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
> [5] https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/145
> [6] https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html
> --
> Miro Hrončok
> --
> Phone: +420777974800
> IRC: mhroncok
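For readers wondering what ELF detection that does not depend on libmagic could look like, here is a minimal sketch in Python. It is not fapolicyd's actual logic, which this thread does not show; the function name elf_kind and the type labels are made up for illustration. It relies only on the ELF header layout: the 4-byte magic, the byte-order flag at offset 5, and the 16-bit e_type field at offset 16.

```python
import struct

ELF_MAGIC = b"\x7fELF"

# e_type values from the ELF header; the labels are illustrative only.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared-or-pie", 4: "core"}


def elf_kind(header: bytes):
    """Return a coarse ELF type label, or None if this is not an ELF header.

    Hypothetical helper: expects at least the first 18 bytes of the file.
    """
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return None
    # EI_DATA (offset 5) gives the byte order: 1 = little, 2 = big endian.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        return None
    # e_type is a 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return ELF_TYPES.get(e_type, "unknown")
```

Because it reads fixed offsets rather than consulting a magic database, a check like this gives the same answer no matter which version of file is installed, which is the stability property discussed above.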
Mangling shebangs in text files: How to detect them, bug in the current implementation and possible solutions
Hello,

for many releases, Fedora has the brp-mangle-shebangs BuildRoot Policy
Script that does the following:

1) Gets all executable files in the buildroot
2) Gets all "text" files from those
3a) Mangles shebangs that are "wrong"
    (e.g. #!/usr/bin/env node -> #!/usr/bin/node)
3b) Removes executable bits from "text" files without shebangs

The idea behind this is that all "text" files that are executable need a
shebang and if they don't have it, something is wrong. OTOH files that are
"binary" don't need it.

I intentionally put the terms "text" and "binary" in quotation marks, as
the definition is somewhat fuzzy. Up until now, the script did the
detection by utilizing the file tool to get the MIME type. If the MIME
type starts with text/, it considered the executable to be a text file.

However, a bug [1] has been discovered. Some obvious text files, such as
executable JavaScript scripts, are detected as application/ (e.g.
application/javascript), and hence are not considered "text".

If a JavaScript executable script has the #!/usr/bin/env node shebang, the
brp-mangle-shebangs script does not mangle it.

One possible solution [2] to this problem is to limit the number of bytes
the MIME detection reads. My experiments showed that limiting the number
of bytes to 8 always recognizes JavaScript (and other scripting languages)
files as text/plain and binary files as application/octet-stream. As a
side effect, it might make the BRP script faster. However, I am not sure
if this approach is deterministic enough.

Another solution, suggested by Florian Weimer [3], is to not detect MIME
type at all, but use eu-elfclassify instead. The idea is quite simple:
If (and only if) the executable file is ELF [4], it does not require a
shebang. Instead of some fragile idea about what files are text and what
files are binary, this is quite deterministic. It allows mangling shebangs
of executable ZIP files etc.

I've drafted the eu-elfclassify solution in a pull request [5].
However, we have discovered that several non-ELF binary formats in Fedora
are possibly legitimately executable, e.g. .exe files (for mono or wine) or
other formats registered with the kernel [6].

We are presented with 3 possible actions:

1) Keep the script as it is, say the text/ MIME type limitation is how this
BRP script was scoped. Affected packages would need to correct shebangs
manually.

2) Limit the MIME type detection to 8 bytes and hope it will not yield
incorrect results.

3) Use eu-elfclassify. Consider non-ELF executables without shebangs bogus
and document this. Packages that are affected would need to opt out.

What do you think?

[1] https://bugzilla.redhat.com/1998924
[2] https://bugzilla.redhat.com/1998924#c3
[3] https://bugzilla.redhat.com/1998924#c4
[4] https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[5] https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/145
[6] https://www.kernel.org/doc/html/latest/admin-guide/binfmt-misc.html

--
Miro Hrončok
--
Phone: +420777974800
IRC: mhroncok
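To make option 3 concrete, here is a rough Python sketch of the decision it describes: ELF files are left alone, files with a shebang get it mangled, and everything else loses its executable bits. This is not the brp-mangle-shebangs script nor the code in the pull request; the names mangle_shebang and apply_policy are hypothetical, the ELF check is a bare magic-number comparison standing in for eu-elfclassify, and the only rewrite rule is the /usr/bin/env example from the message.

```python
import os


def mangle_shebang(first_line: str) -> str:
    """Rewrite '#!/usr/bin/env NAME [args]' to '#!/usr/bin/NAME [args]'.

    Simplified assumption: lines that pass options to env (e.g. 'env -S')
    and all other shebangs are returned unchanged.
    """
    if not first_line.startswith("#!"):
        return first_line
    parts = first_line[2:].split()
    if len(parts) >= 2 and parts[0] == "/usr/bin/env" and not parts[1].startswith("-"):
        return "#!/usr/bin/" + " ".join(parts[1:])
    return first_line


def apply_policy(path: str) -> str:
    """Apply the option-3 idea to one executable file and report the action."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:4] == b"\x7fELF":
        return "elf"  # ELF needs no shebang; leave it untouched
    if data[:2] == b"#!":
        line, sep, rest = data.partition(b"\n")
        # latin-1 round-trips arbitrary bytes, so the body is never altered.
        new_line = mangle_shebang(line.decode("latin-1")).encode("latin-1")
        if new_line == line:
            return "ok"
        with open(path, "wb") as f:
            f.write(new_line + sep + rest)
        return "mangled"
    # Non-ELF without a shebang: considered bogus, so drop the exec bits.
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~0o111)
    return "noexec"
```

Note that under this rule a .exe file or any other binfmt_misc format would fall into the last branch and lose its executable bits, which is exactly why affected packages would need a way to opt out.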