empiredan commented on code in PR #2305:
URL:
https://github.com/apache/incubator-pegasus/pull/2305#discussion_r2664388009
##########
python-client/pypegasus/pgclient.py:
##########
@@ -611,24 +620,50 @@ def generate_key(cls, hash_key, sort_key):
@classmethod
def generate_next_bytes(cls, buff):
- pos = len(buff) - 1
+ """
+ Increment the last non-0xFF byte in the buffer.
+
+ If `buff` is a string, it is assumed to be encoded with 'latin-1' to ensure
+ a 1:1 mapping between characters and bytes. Unicode strings with characters
+ outside the 0-255 range will raise a UnicodeEncodeError.
+ """
+ is_str = isinstance(buff, str)
+ is_ba = isinstance(buff, bytearray)
+
+ if is_str:
+ arr = bytearray(buff.encode('latin-1'))
+ elif is_ba:
+ arr = buff
+ else:
+ arr = bytearray(buff)
+ pos = len(arr) - 1
found = False
while pos >= 0:
- if ord(buff[pos]) != 0xFF:
- buff[pos] += 1
+ if arr[pos] != 0xFF:
+ arr[pos] += 1
found = True
break
- if found:
- return buff
+ pos -= 1
+ if not found:
+ arr += b'\x00'
+ if is_str:
+ return arr.decode('latin-1')
+ elif is_ba:
+ return arr
else:
- return buff + chr(0)
+ return bytes(arr)
Review Comment:
Why is it converted to bytes here?
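For context on the question, here is a minimal standalone sketch of the logic in the hunk above. The function name `next_bytes` is hypothetical, not the PR's code. It suggests one possible reason for the final conversion: the `else` branch works on a mutable `bytearray` copy of the input, so `bytes(arr)` would hand back an immutable value of the same kind the caller passed in.

```python
def next_bytes(buff):
    """Hypothetical standalone version of the logic shown in the diff."""
    is_str = isinstance(buff, str)
    is_ba = isinstance(buff, bytearray)

    if is_str:
        # latin-1 maps code points 0-255 to single bytes one-to-one.
        arr = bytearray(buff.encode('latin-1'))
    elif is_ba:
        arr = buff  # caller's bytearray is modified in place
    else:
        arr = bytearray(buff)  # mutable copy of an immutable bytes input

    # Walk from the end and increment the last byte that is not 0xFF.
    for pos in range(len(arr) - 1, -1, -1):
        if arr[pos] != 0xFF:
            arr[pos] += 1
            break
    else:
        # Every byte was 0xFF (or the input was empty): append a zero byte.
        arr += b'\x00'

    if is_str:
        return arr.decode('latin-1')
    if is_ba:
        return arr
    return bytes(arr)  # give bytes callers an immutable bytes result


result = next_bytes(b'ab')
print(type(result).__name__, result)
```

Without the final `bytes(arr)`, a caller passing `bytes` would receive a `bytearray`, which compares equal but is mutable and unhashable.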
##########
python-client/pypegasus/base/ttypes.py:
##########
@@ -44,10 +44,19 @@ def write(self, oprot):
def validate(self):
return
+
+ def raw(self):
+ if self._is_str:
+ return self.data.decode('UTF-8')
+ else:
+ return self.data
def __init__(self, data=None):
if isinstance(data,str):
+ self._is_str = True
data = data.encode('UTF-8')
+ else:
+ self._is_str = False
self.data = data
Review Comment:
```suggestion
if isinstance(data,str):
self._is_str = True
self.data = data.encode('UTF-8')
else:
self._is_str = False
self.data = data
```
##########
python-client/pypegasus/pgclient.py:
##########
@@ -597,6 +601,11 @@ def generate_key(cls, hash_key, sort_key):
hash_key_len = len(hash_key)
sort_key_len = len(sort_key)
+ if hash_key_len >= 0xFFFF:
+ raise ValueError("hash_key length must be less than 65535")
+ if sort_key_len >= 0xFFFF:
+ raise ValueError("sort_key length must be less than 65535")
Review Comment:
There is currently no restriction on the length of the sort key. The limit
on the hash key exists because the hash key length is stored in only two bytes.
```suggestion
```
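To illustrate the point about the two-byte length field, here is a small sketch. It assumes the encoded key is a big-endian unsigned 16-bit hash key length followed by the hash key and then the sort key; the helper name `encode_key` and the exact layout are assumptions for illustration, not taken from the PR.

```python
import struct


def encode_key(hash_key: bytes, sort_key: bytes) -> bytes:
    """Hypothetical encoder: 2-byte length prefix + hash key + sort key."""
    # Same bound as the check in the diff: the length must fit the
    # two-byte prefix and stay below 0xFFFF.
    if len(hash_key) >= 0xFFFF:
        raise ValueError("hash_key length must be less than 65535")
    # '>H' is a big-endian unsigned short (2 bytes). Only the hash key
    # length is stored, so the sort key needs no length check here.
    return struct.pack('>H', len(hash_key)) + hash_key + sort_key


encoded = encode_key(b'user', b'x' * 100000)
print(encoded[:6], len(encoded))
```

Because only the hash key length is written into the prefix, a sort key longer than 65535 bytes still encodes without error under this layout.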
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]