Hi guys,

This may sound like a vague, stupid question, but I have a problem to solve and I need help, so any which way I go is only up :-)

I have a bunch of R scripts (I am not an R expert), and we are currently evaluating how to translate them to SparkR DataFrame syntax. The goal is to take advantage of SparkR's parallelization.

As an example, we are using Corpus, tm_map, and DocumentTermMatrix from library("tm"). How do we translate these to SparkR syntax?

Any pointers would be helpful.

Thanks,
Sanjay