Hi, I use PySpark 1.5.0 on a YARN cluster with 19 nodes, each with 200 GB of memory and 4 cores (including the driver).
2016-06-16 15:42 GMT+02:00 pseudo oduesp <pseudo20...@gmail.com>:
> Hi,
> How can I quickly create dummy variables for a large set of columns with StringIndexer?
> I tested with 89 columns, each with at most 10 distinct values, and it took a lot of time.
> Thanks
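For reference, here is a minimal sketch of one way to index many columns in a single Pipeline, assuming a DataFrame whose categorical columns are strings. The helper names (indexed_name, build_indexer_pipeline, index_columns) and the "_idx" suffix are hypothetical and only for illustration. Each StringIndexer still makes its own pass over the data when fitted, so caching the input may help, but this is not a guaranteed speed-up.

```python
def indexed_name(col, suffix="_idx"):
    """Name of the indexed output column for a given input column."""
    return col + suffix


def build_indexer_pipeline(categorical_cols, suffix="_idx"):
    """Return one Pipeline holding a StringIndexer per categorical column."""
    # Imported here so the naming helper above works without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import StringIndexer

    stages = [
        StringIndexer(inputCol=c, outputCol=indexed_name(c, suffix))
        for c in categorical_cols
    ]
    return Pipeline(stages=stages)


def index_columns(df, categorical_cols, suffix="_idx"):
    """Fit all indexers on a cached DataFrame and return the indexed result."""
    # Assumption: the DataFrame fits in cluster memory; caching avoids
    # recomputing the source for every indexer's pass over the data.
    df = df.cache()
    model = build_indexer_pipeline(categorical_cols, suffix).fit(df)
    return model.transform(df)
```

Usage would look like index_columns(df, ["c1", "c2"]), where df and the column names are placeholders. StringIndexer only produces numeric indices; turning those into dummy vectors would need a further stage such as OneHotEncoder.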