On 23.04.2013 19:24, Guido van Rossum wrote:
> On Tue, Apr 23, 2013 at 9:04 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>> On 23.04.2013 17:47, Guido van Rossum wrote:
>>> On Tue, Apr 23, 2013 at 8:22 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>>>> Just as reminder: we have the general purpose
>>>> encode()/decode() functions in the codecs module:
>>>>
>>>> import codecs
>>>> r13 = codecs.encode('hello world', 'rot-13')
>>>>
>>>> These interface directly to the codec interfaces, without
>>>> enforcing type restrictions. The codec defines the supported
>>>> input and output types.
>>>
>>> As an implementation mechanism I see nothing wrong with this. I hope
>>> the codecs module lets you introspect the input and output types of a
>>> codec given by name?
>>
>> At the moment there is no standard interface to access supported
>> input and output types... but then: regular Python functions or
>> methods also don't provide such functionality, so no surprise
>> there ;-)
>
> Not quite the same though. Each function has its own unique behavior.
> But codecs support a standard interface, *except* that the input and
> output types sometimes vary.

The codec system itself

>> It's mostly a matter of specifying the supported type
>> combinations in the codec documentation.
>>
>> BTW: What would be a use case where you'd want to
>> programmatically access such information before calling
>> the codec ?
>
> As you know, in Python 3, most code working with bytes doesn't also
> work with strings, and vice versa (except for a few cases where we've
> gone out of our way to write polymorphic code -- but users rarely do
> so, and any time you use a string or bytes literal you basically limit
> yourself to that type).
>
> Suppose I write a command-line utility that reads a file, runs it
> through a codec, and writes the result to another file. Suppose the
> name of the codec is a command-line argument (as well as the
> filenames). I need to know whether to open the files in text or binary
> mode based on the name of the codec.

Ok, so you need to know which codecs your tool can support and which
of those need text input and which bytes input.

I've been thinking about this some more: I think that type
information alone is not flexible enough to cover such use cases.

In your use case you'd want to only permit use of a certain set of
codecs, not simply all of them, since some might not implement what
you actually want to achieve with the tool, e.g. a user might have
installed a codec set that adds support for reading and writing
image data, but your intended use was to only support text data.

So what we need is a way to allow the codecs to say e.g.
"I work on text", "I support encoding bytes and text",
"I encode to bytes", "I'm reversible", "I transform input data",
"I support bytes and text, and will create same type output",
"I work on image data", "I work on X509 certificates",
"I work on XML data", etc.

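To make the use case concrete, here is a rough sketch of what such a tool has to do today, assuming a current Python 3 where the 'rot-13' and 'hex' aliases are available. The table SUPPORTED_CODECS and the helpers open_modes() and convert() are made up for illustration; nothing like them exists in the codecs module, which is exactly the gap being discussed.

```python
import codecs

# Hypothetical, hand-maintained table: (read mode, write mode) per codec.
# The codecs module does not supply this; the tool author has to know it.
SUPPORTED_CODECS = {
    "utf-8": ("r", "wb"),    # str in, bytes out
    "rot-13": ("r", "w"),    # str in, str out
    "hex": ("rb", "wb"),     # bytes in, bytes out
}


def open_modes(codec_name):
    """Return (read_mode, write_mode) for a codec the tool explicitly supports."""
    try:
        return SUPPORTED_CODECS[codec_name]
    except KeyError:
        raise ValueError("unsupported codec: %r" % (codec_name,)) from None


def convert(src_path, dst_path, codec_name):
    """Read src_path, run the data through the codec, write it to dst_path."""
    read_mode, write_mode = open_modes(codec_name)
    with open(src_path, read_mode) as src:
        data = src.read()
    with open(dst_path, write_mode) as dst:
        dst.write(codecs.encode(data, codec_name))
```

The table doubles as a whitelist: a codec that is installed but not listed is rejected, which covers the "only permit a certain set of codecs" point, but every entry has to be maintained by hand.
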
In other words, we need a form of tagging system, with a set of
standard tags that each codec can publish and which also allows
non-standard tags (which can then at some point be made standard,
if there's agreement on them).

Given a codec name you could then ask the codec registry for the
codec tags and verify that the chosen codec handles text data,
needs bytes or text encoding input and creates bytes as encoding
output. If the registry returns codec tags that don't include the
"I work on text" tag, the tool could then raise an error.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 24 2013)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
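A minimal sketch of how the tag lookup described in this post could work, purely as an illustration. The tag spellings and the functions publish_tags(), get_tags() and require_tags() are all hypothetical; the post proposes the idea but no API, and the codec registry has no such interface. Only codecs.lookup() and the .name attribute of the CodecInfo it returns are real.

```python
import codecs

# Hypothetical tag spellings; the proposal does not fix any names.
TEXT = "works-on-text"
ENCODES_TO_BYTES = "encodes-to-bytes"
SAME_TYPE_OUTPUT = "same-type-output"
REVERSIBLE = "reversible"

# Hypothetical side table standing in for tags published by the codecs.
_codec_tags = {}


def publish_tags(codec_name, *tags):
    """Record tags under the codec's canonical name as reported by lookup()."""
    canonical = codecs.lookup(codec_name).name
    _codec_tags[canonical] = frozenset(tags)


def get_tags(codec_name):
    """Return the published tags for a codec, or an empty set if none."""
    canonical = codecs.lookup(codec_name).name
    return _codec_tags.get(canonical, frozenset())


def require_tags(codec_name, *required):
    """Raise ValueError if the codec lacks any of the required tags."""
    missing = set(required) - get_tags(codec_name)
    if missing:
        raise ValueError(
            "codec %r lacks tags: %s" % (codec_name, ", ".join(sorted(missing)))
        )


publish_tags("utf-8", TEXT, ENCODES_TO_BYTES, REVERSIBLE)
publish_tags("rot-13", TEXT, SAME_TYPE_OUTPUT, REVERSIBLE)
publish_tags("hex", SAME_TYPE_OUTPUT, REVERSIBLE)
```

With this in place, a tool that only wants text codecs could call require_tags(name, TEXT) before opening any files. Non-standard tags need no extra machinery here, since a tag is just a string.
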