[issue39170] Sqlite3 row_factory for attribute access: NamedRow

2019-12-30 Thread Clinton James


Change by Clinton James :


--
keywords: +patch
pull_requests: +17204
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17768

___
Python tracker 
<https://bugs.python.org/issue39170>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39170] Sqlite3 row_factory for attribute access: NamedRow

2019-12-30 Thread Clinton James


New submission from Clinton James :

Currently, sqlite3 returns rows by tuple or sqlite3.Row for dict-style, index 
access.  I constantly find myself wanting attribute access like namedtuple for 
rows.  I find attribute access cleaner
without the brackets and quoting field names.  However, unlike previous 
discussions (https://bugs.python.org/issue13299), I don't want to use the 
namedtuple object.  I appreciate the simple API and minimal memory consumption 
of sqlite3.Row and used it as my guide in creating sqlite3.NamedRow to allow 
access by index and attribute.

A pull request is ready

Why a new object instead of adding attribute access to the existing sqlite3.Row?
There is an existing member method `keys` and any table with the field "keys" 
would cause a hard to debug, easily avoidable, collision.

Features:

+ Optimized in C, so it will be faster than any python implementation.
+ Access columns by attribute for all valid names and by index for all names.
+ Iterate over fields by name/value pairs.
+ Works with standard functions `len` and `contains`.
+ Identical memory consumption to sqlite3.Row with two references: the data 
tuple and the cursor description.
+ Identical speed to sqlite3.Row if not faster.  Timing usually has it slightly 
faster for index by name or attribute, but it is almost identical.

Examples:

>>> import sqlite3
>>> c = sqlite3.Connection(":memory:").cursor()
>>> c.row_factory = sqlite3.NamedRow
>>> named_row = c.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS 
ord").fetchone()

>>> len(named_row)
3
>>> 'letter' in named_row
true
>>> named_row == named_row
true
>>> hash(named_row)
5512444875192833987

Index by number and range.
>>> named_row[0]
'A'
>>> named_row[1:]
('.-', 65)

Index by column name.
>>> named_row["ord"]
65

Access by attribute.
>>> named_row.morse
'.-'

Iterate row for name/value pairs.
>>> dict(named_row)
{'letter': 'A', 'morse': '.-', 'ord': 65}
>>> tuple(named_row)
(('letter', 'A'), ('morse', '.-'), ('ord', 65))


How sqlite3.NamedRow differs from sqlite3.Row
--

The class only has class dunder methods to allow any valid field name. When the 
field name would be an invalid attribute name, you have two options: either use 
the SQL `AS` in the select statement or index by name.
To get the field names use the iterator `[x[0] for x in row]` or do the same 
from the
`cursor.description`.

```python
titles = [x[0] for x in row]
titles = [x[0] for x in cursor.description]
titles = dict(row).keys()
```

Attribute and dict access are no longer case-insensitive.  There are three 
reasons for this.
1. Case-insensitive comparison only works well for ASCII characters.  In a 
Unicode world, case-insensitive edge cases create unnecessary errors. Looking 
at a several existing codebases,
   this feature of Row is almost never used and I believe is not needed in 
NamedRow.
2. Case-insensitivity is not allowed for attribute access.  This "feature" 
would treat attribute access differently from the rest of Python and "special 
cases aren't special enough to break the rules". Where `row.name`, `row.Name`, 
and `row.NAME` are all the same it gives off the faint code smell of something 
wrong.  When case-insensitively is needed and the query SELECT can not be 
modified, sqlite3.Row is still there.
3. Code is simpler and easier to maintain.
4. It is faster.

Timing Results
--

NamedRow is faster than sqlite3.Row for index-by-name access.
I have published a graph and the methodology of my testing.  In the worst-case 
scenario, it is just as fast as sqlite3.Row without any extra memory.  In most 
cases, it is faster.
For more information, see the post at 
http://jidn.com/2019/10/namedrow-better-python-sqlite3-row-factory/

--
components: Library (Lib)
messages: 359104
nosy: jidn
priority: normal
severity: normal
status: open
title: Sqlite3 row_factory for attribute access: NamedRow
type: enhancement
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue39170>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com