rahil-c commented on PR #18146:
URL: https://github.com/apache/hudi/pull/18146#issuecomment-3956245664

   @vinothchandar regarding this comment:
https://github.com/apache/hudi/pull/18146#discussion_r2850214370
   You are correct, thanks for catching this; it is a problem. I tried 
reproducing the issue you mentioned with the following test:
   
   ```
   void testMultipleVectorColumnsWithDifferentDimensions() {
       // Two vectors with different dimensions both get the default FIXED name "vector"
       // but a different fixedSize
       HoodieSchema.Vector v128 = HoodieSchema.createVector(128);
       HoodieSchema.Vector v256 = HoodieSchema.createVector(256);
   
       List<HoodieSchemaField> fields = Arrays.asList(
           HoodieSchemaField.of("id", HoodieSchema.create(HoodieSchemaType.INT)),
           HoodieSchemaField.of("embedding_small", v128),
           HoodieSchemaField.of("embedding_large", v256)
       );
   
       // This should work: a table with two vector columns of different dimensions
       // is a valid use case (e.g., title embedding vs content embedding)
       HoodieSchema record = HoodieSchema.createRecord("TestRecord", null, null, fields);
       assertNotNull(record);
   
       // Verify both fields survive a JSON round-trip (schema serialization/parsing)
       String json = record.toString();
       HoodieSchema parsed = HoodieSchema.parse(json);
       assertNotNull(parsed.getAvroSchema().getField("embedding_small"));
       assertNotNull(parsed.getAvroSchema().getField("embedding_large"));
       assertVector(
           HoodieSchema.fromAvroSchema(parsed.getAvroSchema().getField("embedding_small").schema()),
           128, HoodieSchema.Vector.VectorElementType.FLOAT);
       assertVector(
           HoodieSchema.fromAvroSchema(parsed.getAvroSchema().getField("embedding_large").schema()),
           256, HoodieSchema.Vector.VectorElementType.FLOAT);
   }
   ```
   
   However, I hit an exception. It seems that Avro enforces name uniqueness 
when it reaches `String json = record.toString();`. I believe this behavior 
applies to the Avro FIXED type because FIXED is a named type.
   ```
   org.apache.avro.SchemaParseException: Can't redefine: vector
   
        at org.apache.avro.Schema$Names.put(Schema.java:1604)
        at org.apache.avro.Schema$NamedSchema.writeNameRef(Schema.java:846)
        at org.apache.avro.Schema$FixedSchema.toJson(Schema.java:1316)
        at org.apache.avro.Schema$RecordSchema.fieldsToJson(Schema.java:1041)
        at org.apache.avro.Schema$RecordSchema.toJson(Schema.java:1025)
        at org.apache.avro.Schema.toString(Schema.java:435)
        at org.apache.avro.Schema.toString(Schema.java:407)
        at org.apache.avro.Schema.toString(Schema.java:398)
        at org.apache.hudi.common.schema.HoodieSchema.toString(HoodieSchema.java:1221)
        at org.apache.hudi.common.schema.TestHoodieSchema.testMultipleVectorColumnsWithDifferentDimensions(TestHoodieSchema.java:1114)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
   ```
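
To illustrate why the second vector trips the check, here is a minimal sketch of a name table that behaves like the one the stack trace points at (`Schema$Names.put` called from `writeNameRef`). This is a simplified model for discussion, not Avro's actual code: the class `FixedNameRegistry`, its method, and the use of `IllegalStateException` are hypothetical stand-ins. The assumption is that a name already seen with an identical definition is written as a reference, while the same name with a different size is rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the name table kept while a schema is serialized.
// It only tracks FIXED sizes by name; the real Avro logic compares whole schemas.
class FixedNameRegistry {
    private final Map<String, Integer> sizesByName = new HashMap<>();

    /**
     * Returns true if the name is already defined with the same size, so a
     * plain reference could be written. Returns false if this is the first
     * definition. Throws if the name is already bound to a different size.
     */
    boolean writeNameRef(String name, int size) {
        Integer known = sizesByName.get(name);
        if (known != null && known == size) {
            return true; // identical definition, reuse it
        }
        if (known != null) {
            // Same name, different definition: mirrors the "Can't redefine" failure.
            throw new IllegalStateException("Can't redefine: " + name);
        }
        sizesByName.put(name, size);
        return false;
    }

    public static void main(String[] args) {
        FixedNameRegistry names = new FixedNameRegistry();
        names.writeNameRef("vector", 128); // first definition is recorded
        names.writeNameRef("vector", 128); // identical definition is a reference
        try {
            names.writeNameRef("vector", 256); // same name, different size
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints "Can't redefine: vector"
        }
    }
}
```

Under this model, two vector columns with the same dimension would serialize without error, and only a mismatch in size under the shared default name "vector" fails, which matches what the test above hits.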
   
   

