>>> takes a single input column and outputs multiple columns. It also has a
>>> different number of input rows than output rows due to the group by
>>> operation.
>>>
>>> Given that, how do I fit this into an MLlib pipeline, and if it doesn't
>>>
And here is an example of 2)
>>
>> Note: My question is in some way related to this question, but I don't
>> think it is answered here:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Why-can-t-a-Transformer-have-mu
> http://apache-spark-developers-list.1001551.n3.nabble.com/Why-can-t-a-Transformer-have-multiple-output-columns-td18689.html
>
> Thanks
> Adrian
>
http://apache-spark-developers-list.1001551.n3.nabble.com/Why-can-t-a-Transformer-have-multiple-output-columns-td18689.html

Thanks
Adrian
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-does-preprocessing-fit-into-Spark-MLlib-pi