Hi,

I was wondering if anyone has encountered or used Beam in the following
manner:

  1. During machine learning training, use Beam to create the event
     table. The flow may consist of some joins, aggregations, row-based
     transformations, etc.
  2. Once the model is created, deploy the model to some scoring service
     via PMML (or some other scoring service).
  3. Run the SAME transformations used in #1 on a separate engine, while
     guaranteeing that it transforms the data identically to the engine
     used in #1.

I think this is a pretty interesting use case, where Beam is used to
guarantee portability across engines and deployment modes (batch to true
streaming, not micro-batch). What's not clear to me is how batch joins would
translate during one-by-one scoring (probably as lookups), or how
aggregations would work, given that some kind of history would need to be
stored (and how much is kept would need to be configurable too).
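
To make the question concrete, here is a rough sketch in plain Python (not the Beam API) of what I have in mind. Every name in it (`batch_features`, `OnlineFeaturizer`, `WINDOW`, the `country` and `recent_total` fields) is made up for illustration, and it assumes events arrive in time order. The batch path does a join plus a trailing-window sum; the one-by-one path replaces the join with a lookup and the aggregation with a bounded per-key history.

```python
from collections import defaultdict, deque

WINDOW = 3  # hypothetical: how many past events per user are kept


def batch_features(events, users, window=WINDOW):
    """Batch path: join on user_id, then a trailing-window sum per user."""
    users_by_id = {u["user_id"]: u for u in users}  # build side of the join
    per_user = defaultdict(list)
    rows = []
    for e in events:  # events are assumed to be in time order
        amounts = per_user[e["user_id"]]
        amounts.append(e["amount"])
        rows.append({
            "user_id": e["user_id"],
            "country": users_by_id[e["user_id"]]["country"],
            "recent_total": sum(amounts[-window:]),
        })
    return rows


class OnlineFeaturizer:
    """One-by-one path: the join is a lookup, the aggregation is bounded state."""

    def __init__(self, user_lookup, window=WINDOW):
        self.user_lookup = user_lookup  # stands in for a key-value store
        self.history = defaultdict(lambda: deque(maxlen=window))

    def featurize(self, event):
        recent = self.history[event["user_id"]]
        recent.append(event["amount"])  # oldest entry drops out automatically
        return {
            "user_id": event["user_id"],
            "country": self.user_lookup[event["user_id"]]["country"],
            "recent_total": sum(recent),
        }


users = [{"user_id": "a", "country": "US"}, {"user_id": "b", "country": "DE"}]
events = [
    {"user_id": "a", "amount": 10},
    {"user_id": "b", "amount": 5},
    {"user_id": "a", "amount": 20},
    {"user_id": "a", "amount": 30},
    {"user_id": "a", "amount": 40},
]

batch_rows = batch_features(events, users)
online = OnlineFeaturizer({u["user_id"]: u for u in users})
online_rows = [online.featurize(e) for e in events]
```

In this toy case the two paths produce the same rows, but only because the logic was written twice and kept in sync by hand. What I am hoping Beam can provide is a single definition of the transformations that both engines execute.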

Thoughts?

Thanks,
Ron
