The bug is likely in your data. Do you have lines in your input file that
do not contain the "\t" character? If so, .split will return only a single
element, and accessing p(1) inside the .map() will throw
java.lang.ArrayIndexOutOfBoundsException: 1.
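
One way to guard against that (a minimal sketch reusing the TUser case
class and inp_file names from your snippet) is to drop any row that does
not split into at least two fields before building the schema RDD:

     val tusers = sc.textFile(inp_file)
                    .map(_.split("\t"))
                    .filter(p => p.length >= 2)  // skip lines with no tab-separated second field
                    .map(p => TUser(p(0), p(1).trim.toInt))

Note that p(1).trim.toInt can still fail (with a NumberFormatException) if
the second field is not numeric, so you may want to validate that as well.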

On Thu, Oct 2, 2014 at 3:35 PM, SK <skrishna...@gmail.com> wrote:

> Hi,
>
> I am trying to extract the number of distinct users from a file using Spark
> SQL, but I am getting the following error:
>
>
> ERROR Executor: Exception in task 1.0 in stage 8.0 (TID 15)
> java.lang.ArrayIndexOutOfBoundsException: 1
>
>
> I am following the code in examples/sql/RDDRelation.scala. My code is as
> follows. The error appears when the SQL statement is executed. I am new
> to Spark SQL and would like to know how I can fix this issue.
>
> thanks for your help.
>
>
>      val sql_cxt = new SQLContext(sc)
>      import sql_cxt._
>
>      // read the data using the schema and create a schema RDD
>      val tusers = sc.textFile(inp_file)
>                            .map(_.split("\t"))
>                            .map(p => TUser(p(0), p(1).trim.toInt))
>
>      // register the RDD as a table
>      tusers.registerTempTable("tusers")
>
>      // get the number of unique users
>      val unique_count = sql_cxt.sql("SELECT COUNT(DISTINCT userid) FROM tusers").collect().head.getLong(0)
>
>      println(unique_count)
>
