Response to the 1st approach: spark.read.text("/xyz/a/b/filename") returns a DataFrame, and calling .rdd on it gives you an RDD[Row]. So when you use map, your function receives a Row as its parameter (ip in your code), and you must use the Row methods to access its members. The error message says this clearly: "error : value split is not a member of org.apache.spark.sql.Row". Row has no split method, so you need to extract the String from the Row first (for example with ip.getString(0)) and call split on that.
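A minimal sketch of two ways around the Row issue. The path is the one from your message; the "," delimiter, the app name, and the local master are assumptions for illustration only, so adjust them to your data and cluster.

```scala
import org.apache.spark.sql.SparkSession

// Assumed local session for illustration; reuse your existing `spark` if you have one.
val spark = SparkSession.builder()
  .appName("row-split-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Option A: keep the DataFrame and pull the String out of each Row before splitting.
// spark.read.text yields a single string column, so index 0 is the line itself.
val fieldsFromRows = spark.read.text("/xyz/a/b/filename")
  .rdd
  .map(row => row.getString(0).split(","))   // "," is an assumed delimiter

// Option B: read the file as a Dataset[String], so map receives a String directly.
val fieldsFromStrings = spark.read.textFile("/xyz/a/b/filename")
  .map(line => line.split(","))              // "," is an assumed delimiter
```

Option B avoids Row entirely, which is usually simpler when the file is plain text.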
Response to the 2nd approach: Something looks fishy there. The if condition ip(0).isEmpty() should catch the case where the value is an empty string, so when it is not actually empty, ip(0).toInt shouldn't fail for that reason. But you also need to make sure ip(0) is not some other string that can't be converted to an Int (for example whitespace or non-numeric text).
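One way to guard against both cases at once is to wrap the conversion in scala.util.Try instead of relying on isEmpty alone. The sketch below uses a hypothetical helper named parseIntOption and made-up sample values; it is not taken from your code.

```scala
import scala.util.Try

// Hypothetical helper: returns None for empty, whitespace-only,
// or non-numeric input instead of throwing NumberFormatException.
def parseIntOption(s: String): Option[Int] =
  Try(s.trim.toInt).toOption

// Made-up sample values standing in for ip(0).
val samples = Seq("42", "", " 7 ", "abc")
val parsed = samples.map(parseIntOption)
// parsed == Seq(Some(42), None, Some(7), None)
```

Inside your map you could then write something like parseIntOption(ip(0)).getOrElse(0), choosing whatever default or filtering makes sense for your data.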