How do I one-hot encode a column of arrays, e.g. ['TG', 'CA']?

FYI, here's my code for encoding normal categorical columns (the StringIndexer step).
How do I make it work for a column of arrays?


from pyspark.ml import Pipeline
from pyspark.ml.feature import StringIndexer

# Build one StringIndexer per categorical column; the Pipeline below fits them.
columns = ['ColA', 'ColB', 'ColC']
indexers = [StringIndexer(inputCol=column, outputCol=column + "_index")
            for column in columns]

pipeline = Pipeline(stages=indexers)
flight4 = pipeline.fit(flight3).transform(flight3)
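
To make the goal concrete, here is a plain-Python sketch (not Spark) of the kind of encoding I mean for an array column. It assumes each distinct value across all rows gets its own 0/1 slot, in sorted order; the function name `multi_hot` and the sample rows are made up for illustration.

```python
def multi_hot(rows):
    """Encode each list of labels as a 0/1 vector over the sorted vocabulary."""
    # Vocabulary: every distinct label seen in any row, in sorted order.
    vocab = sorted({label for row in rows for label in row})
    position = {label: i for i, label in enumerate(vocab)}
    encoded = []
    for row in rows:
        vector = [0] * len(vocab)
        for label in row:
            vector[position[label]] = 1  # mark this label as present
        encoded.append(vector)
    return vocab, encoded


# Hypothetical sample data: one array of codes per row.
rows = [['TG', 'CA'], ['CA'], ['US', 'TG']]
vocab, encoded = multi_hot(rows)
print(vocab)
for row, vector in zip(rows, encoded):
    print(row, vector)
```

So ['TG', 'CA'] would become a vector with a 1 in the CA and TG slots and 0 everywhere else. What I'm after is the Spark equivalent of this for a DataFrame column.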
