StringIndexer + VectorAssembler equivalent to HashingTF?

2015-08-07 Thread praveen S
Is StringIndexer + VectorAssembler equivalent to HashingTF when converting a document for analysis?

Re: StringIndexer + VectorAssembler equivalent to HashingTF?

2015-08-07 Thread Peter Rudenko
(SI1, SI2).setOutputCol(features) ->

features
00
11
01
22

HashingTF.setNumFeatures(2).setInputCol(COL1).setOutputCol(HT1)

bucket1  bucket2
a,a,b    c

HT1
3 //Hash collision
3
3
1

Thanks,
Peter Rudenko

On 2015-08-07 09:55, praveen S wrote:
Is StringIndexer + VectorAssembler equivalent
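
The contrast in the reply can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions: the column values and the helper names (string_index, assemble, hash_bucket, hashing_tf) are hypothetical, CRC32 stands in for whatever hash function Spark actually uses, and breaking frequency ties alphabetically is a choice made here for determinism. The point it tries to show is that indexing gives every distinct string its own index, while hashing into 2 buckets cannot keep 3 distinct strings apart.

```python
from collections import Counter
import zlib


def string_index(column):
    """Give each distinct string its own index, most frequent first.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    counts = Counter(column)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    mapping = {s: i for i, s in enumerate(ordered)}
    return [mapping[s] for s in column], mapping


def assemble(*columns):
    """Combine per-row values from several columns into one feature row."""
    return [list(row) for row in zip(*columns)]


def hash_bucket(term, num_features):
    """Map a term to a bucket. CRC32 is a stand-in, not Spark's hash."""
    return zlib.crc32(term.encode("utf-8")) % num_features


def hashing_tf(terms, num_features):
    """Count how many terms fall into each hash bucket."""
    vector = [0] * num_features
    for term in terms:
        vector[hash_bucket(term, num_features)] += 1
    return vector


# Hypothetical input columns, not taken from the thread.
col1 = ["a", "b", "a", "c"]
col2 = ["x", "y", "y", "z"]

# Indexing: each distinct string gets a unique index.
si1, map1 = string_index(col1)   # si1 == [0, 1, 0, 2]
si2, map2 = string_index(col2)   # si2 == [1, 0, 0, 2]
features = assemble(si1, si2)

# Hashing: 3 distinct strings into 2 buckets must collide somewhere.
buckets = {term: hash_bucket(term, 2) for term in sorted(set(col1))}
term_counts = hashing_tf(col1, 2)

print(features)
print(buckets)
print(term_counts)
```

With the indexer, the mapping can be inverted to recover the original string. With hashing, two strings that share a bucket are indistinguishable afterwards, which is the collision the reply points out.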