Re: Building a ML pipeline with no training

2022-07-20 Thread Sean Owen
The data transformation is all the same. Sure, linear regression is easy (https://spark.apache.org/docs/latest/ml-classification-regression.html#linear-regression). These are components that operate on DataFrames. You'll want to look at VectorAssembler to prepare data into an array column. There

Building a ML pipeline with no training

2022-07-20 Thread Edgar H
Morning everyone, The question may seem too broad, but I will try to synthesize as much as possible: I'm used to working with Spark SQL, DFs and such on a daily basis, easily grouping, getting extra counters and using functions or UDFs. However, I've come to a scenario where I need to make some predictions