NightOwl888 commented on code in PR #1154:
URL: https://github.com/apache/lucenenet/pull/1154#discussion_r2411930481


##########
src/Lucene.Net.Analysis.SmartCn/Hhmm/BigramDictionary.cs:
##########
@@ -286,37 +303,37 @@ public virtual void LoadFromFile(string dctFilePath)
                 int j = 0;
                 while (j < cnt)
                 {
-                    dctFile.Read(intBuffer, 0, intBuffer.Length);
-                    buffer[0] = 
ByteBuffer.Wrap(intBuffer).SetOrder(ByteOrder.LittleEndian)
-                        .GetInt32();// frequency
-                    dctFile.Read(intBuffer, 0, intBuffer.Length);
-                    buffer[1] = 
ByteBuffer.Wrap(intBuffer).SetOrder(ByteOrder.LittleEndian)
-                        .GetInt32();// length
-                    dctFile.Read(intBuffer, 0, intBuffer.Length);
-                    // buffer[2] = ByteBuffer.wrap(intBuffer).order(
-                    // ByteOrder.LITTLE_ENDIAN).getInt();// handle
+                    // LUCENENET: Use BinaryReader to decode little endian 
instead of ByteBuffer, since this is the default in .NET
+                    buffer[0] = reader.ReadInt32(); // frequency
+                    buffer[1] = reader.ReadInt32(); // length
+                    buffer[2] = reader.ReadInt32(); // Skip handle value 
(unused)
 
                     length = buffer[1];
-                    if (length > 0)
+                    if (length > 0 && dctFile.Position + length <= 
dctFile.Length)

Review Comment:
   Actually, this differs from upstream, which only did:
   
   ```c#
   if (length > 0)
   ```
   
   The additional length cap check seems to be to "gracefully" handle any 
over-length issues. The problem is, we don't want to be graceful here. We need 
this to scream loudly when the data is corrupt, since that is the only feedback 
users will get.
   
   So, let's revert that line back to match Lucene.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to