Hi,

I am trying to extract the number of distinct users from a file using Spark
SQL, but I am getting the following error:


ERROR Executor: Exception in task 1.0 in stage 8.0 (TID 15)
java.lang.ArrayIndexOutOfBoundsException: 1


I am following the code in examples/sql/RDDRelation.scala; my code is as
follows. The error appears when the SQL statement executes. I am new to
Spark SQL and would like to know how I can fix this issue.

Thanks for your help.


     val sql_cxt = new SQLContext(sc)
     import sql_cxt._

     // read the data using the schema and create a schema RDD
     val tusers = sc.textFile(inp_file)
                           .map(_.split("\t"))
                           .map(p => TUser(p(0), p(1).trim.toInt))

     // register the RDD as a table
     tusers.registerTempTable("tusers")

     // get the number of unique users
     val unique_count = sql_cxt.sql("SELECT COUNT (DISTINCT userid) FROM tusers")
                               .collect().head.getLong(0)

     println(unique_count)
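
In case it helps with the diagnosis: my guess is that some lines in the input
file do not contain a tab, so split("\t") returns a single-element array and
p(1) throws the ArrayIndexOutOfBoundsException: 1 inside the executor task.
Below is a defensive variant of the parsing step I am considering (the TUser
definition shown is only my approximation of the real case class); is
filtering out malformed lines like this the right approach?

     // hypothetical case class; my real one has the same two fields
     case class TUser(userid: String, cnt: Int)

     // defensive parse: keep only lines that split into at least two fields
     // and whose second field is numeric, so p(1) is never out of bounds
     val tusers = sc.textFile(inp_file)
                    .map(_.split("\t"))
                    .filter(p => p.length >= 2 && p(1).trim.matches("\\d+"))
                    .map(p => TUser(p(0), p(1).trim.toInt))

With the filter in place the count should just skip malformed rows instead of
failing the whole stage.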


