empiredan commented on code in PR #2305:
URL: https://github.com/apache/incubator-pegasus/pull/2305#discussion_r2664388009


##########
python-client/pypegasus/pgclient.py:
##########
@@ -611,24 +620,50 @@ def generate_key(cls, hash_key, sort_key):
 
     @classmethod
     def generate_next_bytes(cls, buff):
-        pos = len(buff) - 1
+        """
+        Increment the last non-0xFF byte in the buffer.
+        
+        If `buff` is a string, it is assumed to be encoded with 'latin-1' to ensure
+        a 1:1 mapping between characters and bytes. Unicode strings with characters
+        outside the 0-255 range will raise a UnicodeEncodeError.
+        """
+        is_str = isinstance(buff, str)
+        is_ba = isinstance(buff, bytearray)
+
+        if is_str:
+            arr = bytearray(buff.encode('latin-1'))  
+        elif is_ba:
+            arr = buff  
+        else:
+            arr = bytearray(buff)
+        pos = len(arr) - 1
         found = False
         while pos >= 0:
-            if ord(buff[pos]) != 0xFF:
-                buff[pos] += 1
+            if arr[pos] != 0xFF:
+                arr[pos] += 1
                 found = True
                 break
-        if found:
-            return buff
+            pos -= 1  
+        if not found:
+            arr += b'\x00'
+        if is_str:
+            return arr.decode('latin-1')
+        elif is_ba:
+            return arr
         else:
-            return buff + chr(0)
+            return bytes(arr)

Review Comment:
   Why is it converted to bytes here?
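
For reference, a minimal standalone sketch of the behaviour in this hunk. `next_bytes` is a hypothetical name, not the client's API. Since `bytes` is immutable, the input has to be copied into a `bytearray` for the increment, and the final `bytes(arr)` presumably just hands the caller back the type it passed in. Unlike the hunk, this sketch also copies a `bytearray` argument instead of mutating it in place.

```python
def next_bytes(buff):
    """Hypothetical standalone version of generate_next_bytes (illustration only)."""
    is_str = isinstance(buff, str)
    is_ba = isinstance(buff, bytearray)
    # Always work on a mutable copy; latin-1 maps code points 0-255 to bytes 1:1.
    arr = bytearray(buff.encode('latin-1')) if is_str else bytearray(buff)
    pos = len(arr) - 1
    while pos >= 0:
        if arr[pos] != 0xFF:
            arr[pos] += 1  # increment the last non-0xFF byte
            break
        pos -= 1
    else:
        # No byte could be incremented (all 0xFF, or empty input): append 0x00.
        arr += b'\x00'
    # Return the same type the caller passed in.
    if is_str:
        return arr.decode('latin-1')
    return arr if is_ba else bytes(arr)
```

With this sketch, `next_bytes(b'ab')` yields `b'ac'` and `next_bytes(b'\xff\xff')` yields `b'\xff\xff\x00'`; a `str` or `bytearray` argument comes back as `str` or `bytearray` respectively.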



##########
python-client/pypegasus/base/ttypes.py:
##########
@@ -44,10 +44,19 @@ def write(self, oprot):
 
   def validate(self):
     return
+  
+  def raw(self):
+    if self._is_str:
+      return self.data.decode('UTF-8')
+    else:
+      return self.data
 
   def __init__(self, data=None):
     if isinstance(data,str):
+        self._is_str = True
         data = data.encode('UTF-8')
+    else:
+        self._is_str = False
     self.data = data

Review Comment:
   ```suggestion
       if isinstance(data,str):
           self._is_str = True
           self.data = data.encode('UTF-8')
       else:
           self._is_str = False
           self.data = data
   ```
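
To illustrate how the constructor and `raw()` fit together, here is a minimal sketch. `Blob` is a hypothetical stand-in for the class in `ttypes.py`, and the Thrift `read`/`write` methods are omitted.

```python
class Blob:
    """Hypothetical stand-in for the class in ttypes.py (illustration only)."""

    def __init__(self, data=None):
        if isinstance(data, str):
            # Remember that the caller passed text so raw() can decode it again.
            self._is_str = True
            self.data = data.encode('UTF-8')
        else:
            self._is_str = False
            self.data = data

    def raw(self):
        # Return the payload in the type it was originally given.
        return self.data.decode('UTF-8') if self._is_str else self.data
```

Under these assumptions, `Blob('abc').raw()` returns the `str` `'abc'` while `Blob(b'abc').raw()` returns the `bytes` unchanged; in both cases `data` holds bytes.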



##########
python-client/pypegasus/pgclient.py:
##########
@@ -597,6 +601,11 @@ def generate_key(cls, hash_key, sort_key):
         hash_key_len = len(hash_key)
         sort_key_len = len(sort_key)
 
+        if hash_key_len >= 0xFFFF:
+            raise ValueError("hash_key length must be less than 65535")
+        if sort_key_len >= 0xFFFF:
+            raise ValueError("sort_key length must be less than 65535")

Review Comment:
   There is currently no restriction on the length of the sort key. The limit on the hash key exists because the hash key length is stored in only two bytes.
   
   ```suggestion
   ```
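
A minimal sketch of why only the hash key needs a bound, assuming the key is laid out as a big-endian unsigned 16-bit hash key length followed by the hash key and then the sort key. `encode_key` is a hypothetical helper, not the client's `generate_key`, and the layout is an assumption made for illustration.

```python
import struct


def encode_key(hash_key: bytes, sort_key: bytes) -> bytes:
    """Hypothetical encoder: 2-byte length prefix, hash key, then sort key."""
    if len(hash_key) >= 0xFFFF:
        # Same bound as in the hunk above; the prefix field is only two bytes wide.
        raise ValueError("hash_key length must be less than 65535")
    # '>H' packs an unsigned 16-bit integer in big-endian order.
    # The sort key carries no length field, so nothing bounds its size here.
    return struct.pack('>H', len(hash_key)) + hash_key + sort_key
```

In this sketch, `encode_key(b'h', b's')` produces `b'\x00\x01hs'`, and a 70000-byte sort key encodes without error because its length is never written into the key.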



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

