On 11/03/17 21:56, Marc Espie wrote:
> See, this is exactly what I'm afraid of.
> 
> run-time depends can be a bitch. The stuff still packages, and stays that
> way until somebody notices an issue... which often happens a few months
> after the commits, and sometimes even after release.

This risk should be minimal, since I grepped through all the in-tree
packages that contain a .php file for all the functions and classes that
moved out of main. Aside from that, the risk is about equal to what we
have today, because the admin already has to enable the RUN_DEPENDS
modules. Also note that the admin can still easily install the required
module when needed, so a missing module is nothing more than a nuisance.
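
A scan of this kind can be sketched roughly as follows. This is only an
illustration in Python, not the command that was actually run: the
MOVED_FUNCTIONS table is a tiny hand-picked subset, and modules_used() is
a hypothetical helper name. It is also deliberately naive, since it will
match function names inside strings and comments too.

```python
import re
from pathlib import Path

# Illustrative subset only: PHP function name -> extension that provides it.
# A real scan would need the full list of functions and classes per module.
MOVED_FUNCTIONS = {
    "bcadd": "bcmath",
    "ctype_digit": "ctype",
    "json_encode": "json",
}

# A known function name followed by "(", and not preceded by an identifier
# character, "$", "->" or "::" (so variables and method calls are skipped).
# PHP function names are case-insensitive, hence IGNORECASE.
CALL_RE = re.compile(
    r"(?<![\w>:$])(" + "|".join(map(re.escape, MOVED_FUNCTIONS)) + r")\s*\(",
    re.IGNORECASE,
)


def modules_used(tree):
    """Return {module: sorted relative paths of .php files that call into it}."""
    root = Path(tree)
    hits = {}
    for path in sorted(root.rglob("*.php")):
        text = path.read_text(errors="replace")
        for match in CALL_RE.finditer(text):
            module = MOVED_FUNCTIONS[match.group(1).lower()]
            hits.setdefault(module, set()).add(path.relative_to(root).as_posix())
    return {module: sorted(files) for module, files in hits.items()}
```

Running modules_used() on an unpacked package tree gives, per module, the
files that would need that module as a RUN_DEPENDS candidate. It cannot
see calls built dynamically at runtime, which is part of the problem
discussed here.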
> 
> Does php complain as soon as you try to start the offending package, or
> only when you activate some functionality that depends on the extra
> plugin ?
> 
For module dependencies (e.g. pecl) it's at startup time, during ini parsing
(but unfortunately only as a warning). For functions/classes required at
runtime the check is done lazily, unfortunately.
> 
> I'm seriously way more happy with things breaking straight away during
> compilation than with these kinds of ticking time bombs.

I agree 100%, but unfortunately that's just how PHP works.
> 
> Invariably leads to breakage being noticed at the most inconvenient
> times.
> 
Agree, although I've done all I can think of to prevent this with this
move.
> 
> In the past, we've always cut up php according to two criteria:
> - modules that actually use more libraries (as in: pulling in more
> external stuff)
> - modules that actually have security aspects.
> 
> Getting more fine-grained is not really that useful, and will only
> lead to further issues down the road.
> 
> We already went that way for other things like gstreamer plugins,
> invariably cutting back to FEWER coarse grained modules sooner or
> later.

PHP's use case can't be compared to that of gstreamer, because it's a
general purpose programming language.
> 
> I would rather this be done sooner rather than later.
> 
So what would be the desired requirements for this merge?
- The module doesn't have any lib-depends to prevent pulling in all
kinds of random packages.
- The module must be a requirement for another port.
- The module must not have any security implications.
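
Written out as a predicate, the ruleset above looks roughly like this.
It is a sketch in Python only to make the three conditions explicit; the
Module type and merge_into_main() are hypothetical names, and the
required_by_port values for curl and pcntl are assumptions made for the
example, not something checked against the tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    lib_depends: tuple = ()           # external libraries the module pulls in
    required_by_port: bool = False    # some other port needs it at runtime
    security_sensitive: bool = False  # module has security implications


def merge_into_main(mod):
    """All three requirements must hold for a module to go back into -main."""
    return (
        not mod.lib_depends
        and mod.required_by_port
        and not mod.security_sensitive
    )


# Example entries; the flags are illustrative (see the note above).
EXAMPLES = [
    Module("bcmath", required_by_port=True),
    Module("curl", lib_depends=("net/curl",), required_by_port=True),
    Module("pcntl", required_by_port=True, security_sensitive=True),
]

merged = [m.name for m in EXAMPLES if merge_into_main(m)]
```

With these entries only bcmath passes: curl fails on its external
library and pcntl fails on the security condition.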

If these are the requirements then the following packages would be
merged back into -main (or more precisely the different -SAPI packages):
- bcmath
- calendar
- ctype
- dom (libxml is a required extension, so no new dependencies)
- exif
- fileinfo? (no external dependency)
- ftp
- json
- mysqlnd
- mysql (depends on mysqlnd driver)
- mysqli (depends on mysqlnd driver)
- pdo
- pdo_mysql (depends on mysqlnd and pdo driver)
- phar? (Would leave sparc64 broken)
- simplexml (see dom)
- soap (see dom)
- sockets
- sysvsem? (I currently don't know if this has security implications)
- sysvshm? (I currently don't know if this has security implications)
- tokenizer
- wddx (see dom)
- xmlreader (see dom)
- xmlrpc (see dom)
- xmlwriter (see dom)
- zip

But that would still leave the following as invariables:
- bz2 (external library - archivers/bzip2)
- curl (external library - net/curl)
- fileinfo? (security of this feature has been subject of debate before)
- gd (external libraries - graphics/{jpeg,png})
- gettext (external library - devel/gettext)
- gmp (external library - devel/gmp)
- iconv (external library - converters/libiconv)
- imap (external library - mail/alpine,-c-client)
- intl (external library - textproc/icu4c)
- ldap (external library - databases/openldap)
- mbstring (external library - textproc/oniguruma)
- mcrypt (external libraries - security/libmcrypt, devel/libtool,-ltdl)
- pcntl (security - process control)
- pdo_pgsql (external library - databases/postgresql)
- pgsql (external library - databases/postgresql)
- phar? (Should allow to build on sparc64 if we ignore the module there)
- posix (security - does signals)
- pspell (external library - textproc/aspell/core)
- readline (libreadline, although that can be provided through base)
- snmp (external library - net/net-snmp)
- sqlite3 (external library - databases/sqlite3)
- sysvsem? (I currently don't know if this has security implications)
- sysvshm? (I currently don't know if this has security implications)
- xsl (external library - textproc/libxslt)

This list is about 50/50 and might seem completely unintuitive to the
end-user. What would people think about pdo_mysql being included over
pdo_pgsql or pdo_sqlite3 without knowledge of this discussion?
They could assume mysql/mariadb is the preferred database for the
OpenBSD project, or think that the OpenBSD devs forgot to include the
pdo_mysql package and start asking questions on the mailing lists.
Or why do I have all my xml tools and the kitchen sink, but not xsl?
The same goes for sysvsem/sysvshm vs sysvmsg.
And what about bz2 and curl? They are pretty widely used, while the sysv*
modules are a rarity (I was surprised even a single port uses them).

Even with the above ruleset both lists feel pretty random to me and
there is no genuine improvement over my proposed patch, except
that runtime breakage will probably go unnoticed for a little longer,
because of the bigger default set included.

And what if for some reason one of the ports decides to start using
e.g. hardcoded opcache or shmop calls? Must we move those packages
into main then, or should we do it pre-emptively?

The runtime checking of PHP will always leave a level of invariance.
Even if we compiled in everything and the kitchen sink, a package could
still be introduced that relies on some pecl extension, which the admin
would once again need to enable himself.

PHP is a language built on invariances, which is both one of its strengths
and one of its major weaknesses. The question is how we want to handle
those invariances. With the discussion above I see three possible options:
1) Include everything there is available
2) Cherrypick the modules
3) Split everything up in individual subpackages
Where:
1) Would incredibly bloat the package (odbc on every install?)
2) Has a high chance of end-user confusion and could insinuate
favouritism if not chosen carefully. Especially if people expect a
subdivision similar to other OSes.
3) Has a higher burden on the admin of the running code.

I still stand by my original statement that splitting things up where
possible is the best option, because it's the most consistent. Things
are available and it's up to the admin to determine what needs to be
enabled. We supply the required dependencies to the best of our knowledge,
but there is no guarantee that it is complete, nor can we give this
promise without doing a full source scan of every package that uses
PHP on every update.

My conclusion: Things will break with PHP, the only question is to what
extent. So let's just give the admin the flexibility to set up the
environment that's the best fit for their purposes. The admin remains the
one who needs to figure out how to fix things anyway when it breaks.
This would give the admin a clear signal: This is what is available,
this is what we reckon is required for this port to run: gl/hf.

Modules can be included in main if there's a strong motivation to do so:
I included the openssl module because code could fall back in some
situations to unencrypted connections without the admin knowing, which
could lead to security risks.
