gej5sgm opened a new issue, #2418:
URL: https://github.com/apache/age/issues/2418
**Describe the bug**
The `ResultVisitor.visitStringValue()` method in `apache-age-python` uses
`.strip('"')` to remove surrounding quotes from agtype string tokens. Python's
`str.strip()` removes *all* matching characters from both ends, not just one.
This causes data corruption when a string property value starts or ends with an
escaped double quote (`\"`), because the `"` that is part of the actual data
gets stripped along with the delimiter quote. The same bug exists in
`visitPair()`.
For example, a property value `foo "bar"` is serialized by AGE as the agtype
token `"foo \"bar\""`. Calling `.strip('"')` produces `foo \"bar\` instead of
the correct `foo \"bar\"`.
The fix is to replace `.strip('"')` with `[1:-1]` to remove exactly the
first and last character.
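The difference can be reproduced without AGE or the driver. The snippet below is a minimal sketch that uses only built-in string operations on the example token from above; the variable names are illustrative.

```python
# The agtype token for the property value: foo "bar"
# (raw string, so each backslash is a literal character)
token = r'"foo \"bar\""'

stripped = token.strip('"')  # removes every leading/trailing '"'
sliced = token[1:-1]         # removes exactly one character per end

print(stripped)  # foo \"bar\
print(sliced)    # foo \"bar\"
```

`strip('"')` consumes both the closing delimiter and the quote that belongs to the escaped `\"`, leaving a dangling backslash.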
One open question is whether any code path passes a token that has _no_
leading and trailing double quotes. For such a token, `[1:-1]` would wrongly
delete its first and last characters, so this should be checked before the
fix is applied.
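If undelimited tokens turn out to be possible, one option is a guarded helper. This is only a sketch: `remove_delimiters` is a hypothetical name, not part of `apache-age-python`, and returning the token unchanged in the fallback case is an assumption about the desired behavior.

```python
def remove_delimiters(token: str) -> str:
    """Remove one pair of surrounding double quotes, if both are present."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    # Assumed fallback: leave tokens without delimiters untouched.
    return token


print(remove_delimiters(r'"foo \"bar\""'))  # foo \"bar\"
print(remove_delimiters('unquoted'))        # unquoted
```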
**How are you accessing AGE (Command line, driver, etc.)?**
- `apache-age-python` driver (via `psycopg` + AGE Python client)
**What data setup do we need to do?**
```pgsql
SELECT * FROM cypher('test_graph', $$
CREATE (a:TestNode {name: 'This value ends with a "quote"'})
$$) AS (a agtype);
```
**What is the necessary configuration info needed?**
- PostgreSQL with Apache AGE extension
- `apache-age-python` package (tested with latest PyPI version)
**What is the command that caused the error?**
```python
import age

# `conn` is an open psycopg connection to the database (setup not shown).
age.setUpAge(conn, "test_graph")
with conn.cursor() as cursor:
    cursor.execute("""
        SELECT * FROM cypher('test_graph', $$
            MATCH (a:TestNode) RETURN a.name
        $$) AS (name agtype);
    """)
    for row in cursor:
        result = age.parseAgeValue(row[0])
        print(repr(result))
        # Expected: 'This value ends with a "quote"'
        # Actual:   'This value ends with a "quote\\'
```
The root cause in `builder.py`:
```python
# Current (broken):
def visitStringValue(self, ctx: AgtypeParser.StringValueContext):
    return ctx.STRING().getText().strip('"')
```
**Expected behavior**
String property values containing double quotes should survive a round-trip
(write → read) without data loss. `visitStringValue()` and `visitPair()` should
remove exactly the first and last delimiter characters, not strip all matching
characters from both ends.
**Environment (please complete the following information):**
- AGE: 1.6.0
- `apache-age-python`: latest (PyPI)
- Python: 3.12
- PostgreSQL: 16
**Additional context**
The `visitPair()` method has the same issue when parsing map keys from
agtype objects:
```python
# Also broken:
def visitPair(self, ctx: AgtypeParser.PairContext):
    self.visitChildren(ctx)
    return (ctx.STRING().getText().strip('"'), ctx.agValue())
```
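A sketch of how both methods might look with the proposed `[1:-1]` change. To keep it runnable without ANTLR or the driver, `FakeToken`, `FakeContext`, and `SketchVisitor` are hypothetical stand-ins that mimic only the `STRING().getText()` and `agValue()` calls shown above; they are not the real `AgtypeParser` classes.

```python
class FakeToken:
    """Hypothetical stand-in for the terminal node returned by STRING()."""

    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


class FakeContext:
    """Hypothetical stand-in for a parser context object."""

    def __init__(self, text, value=None):
        self._token = FakeToken(text)
        self._value = value

    def STRING(self):
        return self._token

    def agValue(self):
        return self._value


class SketchVisitor:
    def visitChildren(self, ctx):
        return None  # no-op in this sketch

    def visitStringValue(self, ctx):
        # Drop exactly one delimiter character from each end.
        return ctx.STRING().getText()[1:-1]

    def visitPair(self, ctx):
        self.visitChildren(ctx)
        return (ctx.STRING().getText()[1:-1], ctx.agValue())


visitor = SketchVisitor()
value = visitor.visitStringValue(FakeContext(r'"ends with a \"quote\""'))
key, child = visitor.visitPair(FakeContext(r'"key \"x\""', 1))
print(value)  # ends with a \"quote\"
print(key)    # key \"x\"
```

The trailing escaped quote is preserved in both the value and the map key.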