Hi list,

*Scenario:* I am creating a DStream by reading an Avro object from a Kafka topic and then converting it into a DataFrame to perform some operations on the data. I call DataFrame.collect() and perform the intended operation on each Row of the Array[Row] it returns.
*Problem:* Calling DataFrame.collect() changes the schema of the underlying record: the column order changes, so I can no longer get the columns by index.

*Query:* Is this how DataFrame.collect() is supposed to behave, or am I doing something wrong here? If it is the former, is there any way I can preserve the schema while getting each Row?

Any pointers/suggestions would be really helpful.

Many thanks!

Tariq, Mohammad
about.me/mti <http://about.me/mti>