val trainRDD = rawTrainData.map( rawRow => Row( rawRow.split(",")
.map(_.toInt) ) )
The above creates a Row with a single column whose value is the whole array.
To make each element of the array its own column, expand the array with varargs:
val trainRDD = rawTrainData.map( rawRow => Row( rawRow.split(",")
.map(_.toInt): _* ))
You could also use Row.fromSeq:
val trainRDD = rawTrainData.map(rawRow => Row.fromSeq(rawRow.split(",")
.map(_.toInt)))
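For completeness, here is a minimal end-to-end sketch. It assumes a Spark 1.x shell (`sc` and `sqlContext` predefined); the toy input data, column count, and column names (`c0`, `c1`, ...) are made up for illustration — for your data you would use 785 columns:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Toy stand-in for the real input: each line is a comma-separated
// list of integers (your real lines have 785 values each).
val rawTrainData = sc.parallelize(Seq("1,0,0,255", "0,12,7,9"))

// One Row per line, one column per integer.
val trainRDD = rawTrainData.map(rawRow => Row.fromSeq(rawRow.split(",").map(_.toInt)))

// Generate the schema programmatically instead of writing out
// every column by hand (set numCols = 785 for the real data).
val numCols = 4
val schema = StructType(
  (0 until numCols).map(i => StructField(s"c$i", IntegerType, nullable = false)))

val trainDF = sqlContext.createDataFrame(trainRDD, schema)
trainDF.printSchema()
```

The same idea — building the `StructType` in a loop — is also how you avoid writing `p(0), p(1), ..., p(784)` on the Row side.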
On Mon, May 11, 2015 at 9:51 PM, mikejf12 <[email protected]> wrote:
>
> I got there in the end by specifying my Row record like this ... but there
> must be a neater way of doing this ??
>
> val trainRDD = rawTrainData.map( rawRow => rawRow.split(","))
> .map( p => Row(
>
> p(0),p(1),p(2),p(3),p(4),p(5),p(6),p(7),p(8),p(9),
> p(10),p(11),p(12),p(13),p(14),p(15),p(16),p(17),p(18),p(19),
> .................
> p(770),p(771),p(772),p(773),p(774),p(775),p(776),p(777),p(778),p(779),
> p(780),p(781),p(782),p(783),p(784)
>
> i.e by specifying all 785 elements physically
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-ArrayIndexOutOfBoundsException-tp22854p22855.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.