Hi, I am trying to extract the number of distinct users from a file using Spark SQL, but I am getting the following error:
ERROR Executor: Exception in task 1.0 in stage 8.0 (TID 15)
java.lang.ArrayIndexOutOfBoundsException: 1

I am following the code in examples/sql/RDDRelation.scala. My code is below; the error appears when the SQL statement is executed. I am new to Spark SQL and would like to know how I can fix this issue. Thanks for your help.

    val sql_cxt = new SQLContext(sc)
    import sql_cxt._

    // read the data using the schema and create a schema RDD
    val tusers = sc.textFile(inp_file)
                   .map(_.split("\t"))
                   .map(p => TUser(p(0), p(1).trim.toInt))

    // register the RDD as a table
    tusers.registerTempTable("tusers")

    // get the number of unique users
    val unique_count = sql_cxt.sql("SELECT COUNT (DISTINCT userid) FROM tusers").collect().head.getLong(0)
    println(unique_count)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-ArrayIndexOutofBoundsException-tp15639.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
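For what it's worth, an ArrayIndexOutOfBoundsException: 1 at this stage usually means some input line does not contain a tab, so _.split("\t") yields a one-element array and p(1) is out of bounds. A minimal defensive-parsing sketch (assuming the same TUser case class and inp_file as above; the filter condition is my guess at the failure mode, not confirmed against the actual data):

    // Guard against malformed lines before indexing into the split result.
    // Lines with fewer than two tab-separated fields are dropped here;
    // p(1) is only accessed after the length check succeeds.
    val tusers = sc.textFile(inp_file)
                   .map(_.split("\t"))
                   .filter(p => p.length >= 2 && p(1).trim.nonEmpty)
                   .map(p => TUser(p(0), p(1).trim.toInt))

Logging or counting the rejected lines instead of silently dropping them would confirm whether malformed input is actually the cause.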