Github user StephanEwen commented on the pull request:

    https://github.com/apache/incubator-flink/pull/147#issuecomment-59583563
  
    I think the `getMinimumLength()` method would fit better into the type 
information. That would be a cleaner separation of roles, because providing 
statistics about types is not the serializer's task. The `getLength()` method 
exists so that the runtime can handle fixed-length data types more 
efficiently.
    
    Because the method currently delegates mostly to `getLength()`, it will 
return `-1` for variable-length data types, which leads to meaningless size 
estimates.
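
    To make the problem concrete, here is a minimal sketch, not the actual 
code from the pull request. The class and method names are hypothetical; the 
only thing taken from the discussion is the convention that `getLength()` 
returns `-1` for variable-length types.

```java
// Hypothetical sketch; none of these names are Flink API.
final class SizeEstimateSketch {

    /** Mirrors the convention that -1 marks a variable-length type. */
    static final int VARIABLE_LENGTH = -1;

    /** Naive delegation: passes the serializer's length straight through. */
    static int minimumLengthViaDelegation(int serializerLength) {
        return serializerLength;
    }

    /** Unguarded estimate: records times per-record length. */
    static long estimateBytes(long numRecords, int perRecordLength) {
        return numRecords * perRecordLength;
    }

    /** Guarded variant: reports -1 (unknown) instead of a negative size. */
    static long estimateBytesGuarded(long numRecords, int perRecordLength) {
        return perRecordLength < 0 ? -1L : numRecords * perRecordLength;
    }
}
```

    With 1000 records and a delegated length of `-1`, the unguarded estimate 
comes out as a negative byte count, while the guarded variant at least reports 
the size as unknown.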
    
    For better size estimates, we might allow type information to sample a 
collection of elements to figure out the size.
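
    A rough sketch of what such sampling could look like, assuming a 
hypothetical `SampledSizeEstimator` helper and a hypothetical `Writer` 
callback standing in for whatever serialization hook the type information 
would use. It is only meant to illustrate the idea and is not a proposal for 
the actual interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical sketch; none of these names are Flink API.
final class SampledSizeEstimator {

    /** Stand-in for a serialization hook (an assumption for this sketch). */
    interface Writer<T> {
        void write(T value, DataOutputStream out) throws IOException;
    }

    /**
     * Serializes every element of the sample into an in-memory buffer and
     * returns the average number of bytes per element, or -1 if the sample
     * is empty and no estimate can be made.
     */
    static <T> double averageSerializedSize(List<T> sample, Writer<T> writer) {
        if (sample.isEmpty()) {
            return -1.0;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        try {
            for (T element : sample) {
                writer.write(element, out);
            }
            out.flush();
        } catch (IOException e) {
            // Writing to an in-memory buffer is not expected to fail.
            throw new UncheckedIOException(e);
        }
        return (double) buffer.size() / sample.size();
    }
}
```

    For example, calling `averageSerializedSize(sample, (v, o) -> o.writeUTF(v))` 
on a list of strings would average the `writeUTF` encoding sizes, giving a 
usable per-record figure where `getLength()` can only report `-1`.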


