Can the custom transformer that produces the labels also filter out the rows with null labels? A transformer doesn't have to be a 1:1 row mapping; it can return fewer rows than it receives.
On Thu, Jan 10, 2019, 7:53 AM Patrick McCarthy <pmccar...@dstillery.com.invalid> wrote:

> I'm trying to implement an algorithm on the MNIST digits that runs like so:
>
> - for every pair of digits (0,1), (0,2), (0,3)... assign a 0/1 label
>   to the digits and build a LogisticRegression Classifier -- 45 in total
> - Fit every classifier on the test set separately
> - Aggregate the results per record of the test set and compute a
>   prediction from the 45 predictions
>
> I tried implementing this with a Pipeline, composed of
>
> - stringIndexer
> - a custom transformer which accepts a lower-digit and upper-digit
>   argument, producing the 0/1 label
> - a custom transformer to assemble the indexed strings to VectorUDT
> - LogisticRegression
>
> fed by a list of paramMaps. It failed because the fit() method of logistic
> couldn't handle cases of null labels, i.e. a case where my 0/1 transformer
> found neither the lower nor the upper digit label. I fixed this by
> extending the LogisticRegression class and overriding the fit() method to
> include a filter for labels in (0,1) -- I didn't want to alter the
> transform method.
>
> Now, I'd like to tune these models using CrossValidator with an estimator
> of pipeline but when I run either fitMultiple on my paramMap or I loop over
> the paramMaps, I get arcane Scala errors.
>
> Is there a better way to build this procedure? Thanks!
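For reference, the pairwise scheme in the quoted message can be sketched in plain Python, independent of Spark. The quoted message does not say how the 45 predictions are combined; majority voting with ties going to the smallest digit is an assumption here, and the helper names `pair_label` and `vote` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# All unordered digit pairs (0,1), (0,2), ..., (8,9): 45 in total.
PAIRS = list(combinations(range(10), 2))


def pair_label(digit, lower, upper):
    """Return 0 for the lower digit, 1 for the upper digit, and None when
    the digit belongs to neither (the null-label case in the thread)."""
    if digit == lower:
        return 0
    if digit == upper:
        return 1
    return None


def vote(pair_predictions):
    """Combine one record's pairwise predictions into a single digit.

    `pair_predictions` maps (lower, upper) -> 0 or 1. Each classifier votes
    for the digit it predicted. Majority voting is an assumption, and ties
    go to the smallest digit.
    """
    votes = Counter(
        upper if pred == 1 else lower
        for (lower, upper), pred in pair_predictions.items()
    )
    best = max(votes.values())
    return min(d for d, n in votes.items() if n == best)
```

A record's 45 predictions would be collected into a dict keyed by pair and passed to `vote`.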