lostluck commented on issue #23278:
URL: https://github.com/apache/beam/issues/23278#issuecomment-1364758660

   Thank you for your interest and patience!
   
   Specifically, this issue is about improving the validation that the SDK 
provides. In particular, it's not possible for Beam to encode arbitrary 
`interface{}` or `any` types as part of a PCollection. PCollections are 
required to have a static type at pipeline construction time. This avoids 
runtime type errors at pipeline execution time, and allows the execution layer 
to optimize how it decodes those types.
   
   So the error here is that the SDK isn't validating the types in emitter 
functions (like `func(T)` or `func(K, V)`) or iterator functions (like 
`func(*T) bool`, `func(*K, *V) bool`, or `func(K) func(*V) bool`). It should 
check that the `T`, `K`, or `V` in those is a known, registered type that Beam 
knows how to encode and decode.
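
   To make the missing check concrete, here is a minimal sketch of that kind 
of validation using only the standard `reflect` package. It is not the SDK's 
implementation: `validateSignature` and `containsPlainAny` are hypothetical 
names, and the real check would live in Beam's own packages and also consult 
type registration, which this sketch ignores.

```go
package main

import (
	"fmt"
	"reflect"
)

// plainAny is the reflect.Type of the unnamed empty interface (interface{}).
var plainAny = reflect.TypeOf((*interface{})(nil)).Elem()

// containsPlainAny reports whether t is a plain interface{}, or a pointer or
// function type that mentions one. The seen set guards against recursive
// named function types. (Hypothetical helper, not part of Beam.)
func containsPlainAny(t reflect.Type, seen map[reflect.Type]bool) bool {
	if seen[t] {
		return false
	}
	seen[t] = true
	switch t.Kind() {
	case reflect.Interface:
		return t == plainAny
	case reflect.Ptr:
		// Iterator arguments such as *T or *V.
		return containsPlainAny(t.Elem(), seen)
	case reflect.Func:
		// Emitters such as func(T), iterators such as func(*T) bool.
		for i := 0; i < t.NumIn(); i++ {
			if containsPlainAny(t.In(i), seen) {
				return true
			}
		}
		// Return values, to cover shapes like func(K) func(*V) bool.
		for i := 0; i < t.NumOut(); i++ {
			if containsPlainAny(t.Out(i), seen) {
				return true
			}
		}
	}
	return false
}

// validateSignature returns an error if any parameter of fn is, or wraps,
// a plain interface{}. (Hypothetical helper, not part of Beam.)
func validateSignature(fn interface{}) error {
	t := reflect.TypeOf(fn)
	if t == nil || t.Kind() != reflect.Func {
		return fmt.Errorf("expected a function, got %v", t)
	}
	seen := map[reflect.Type]bool{}
	for i := 0; i < t.NumIn(); i++ {
		if containsPlainAny(t.In(i), seen) {
			return fmt.Errorf("parameter %d of %v uses plain interface{}", i, t)
		}
	}
	return nil
}

func main() {
	// Concrete emitter type: accepted.
	fmt.Println(validateSignature(func(string, func(int)) {}))
	// Emitter of plain interface{}: rejected.
	fmt.Println(validateSignature(func(string, func(interface{})) {}))
	// Iterator over plain interface{}: rejected.
	fmt.Println(validateSignature(func(string, func(*interface{}) bool) {}))
}
```

   Note that this sketch only rejects the unnamed empty interface. A named 
interface type passes through untouched, so a real check would still need to 
decide what to do with those.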
   
   In short, the goal is to make pipeline construction *fail* when the type 
signature of the main DoFn method, ProcessElement, uses plain `interface{}` in 
those positions.
   
   This would involve updating the `funcx` and `typex` packages to perform this 
validation on the emitter and iterator types and failing them accordingly.
   
   The trick, however, is that universal types like `beam.T` are also `any`. 
They *are* allowed, but only if, during pipeline construction, they can be 
inferred to be a concrete type or are explicitly bound to one.
   
   If you have specific questions, let me know. Providing "details 
instructions" is, to me, tantamount to simply doing the work, so smaller, 
specific questions will get better responses.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
