Re: Error aliasing an array column.
Do you mind pastebinning the code snippet and the exception one more time? I couldn't see them in your original email.

Which Spark release are you using?

On Tue, Feb 9, 2016 at 11:55 AM, rakeshchalasani <vnit.rak...@gmail.com> wrote:
> Hi All:
>
> I am getting an "UnsupportedOperationException" when trying to alias an
> array column. The issue seems to be in the "CreateArray" expression's
> dataType, which checks the nullability of its children, while aliasing
> creates a PrettyAttribute that does not implement nullability.
>
> Below is an example to reproduce it.
> [snip]
Re: Error aliasing an array column.
How about changing the last line to:

scala> val df2 = df.select(functions.array(df("a"), df("b")).alias("arrayCol"))
df2: org.apache.spark.sql.DataFrame = [arrayCol: array<int>]

scala> df2.show()
+--------+
|arrayCol|
+--------+
|  [0, 1]|
|  [1, 2]|
|  [2, 3]|
|  [3, 4]|
|  [4, 5]|
|  [5, 6]|
|  [6, 7]|
|  [7, 8]|
|  [8, 9]|
| [9, 10]|
+--------+

FYI

On Tue, Feb 9, 2016 at 1:38 PM, Rakesh Chalasani <vnit.rak...@gmail.com> wrote:
> Sorry, didn't realize the mail didn't show the code. Using Spark release
> 1.6.0.
>
> Below is an example to reproduce it.
> [snip]
Re: Error aliasing an array column.
Sorry, didn't realize the mail didn't show the code. I am using Spark release 1.6.0.

Below is an example to reproduce it:

import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sparkContext)
import sqlContext.implicits._
import org.apache.spark.sql.functions

case class Test(a: Int, b: Int)
val data = sparkContext.parallelize(Array.range(0, 10).map(x => Test(x, x + 1)))
val df = data.toDF()
val arrayCol = functions.array(df("a"), df("b")).as("arrayCol")

This throws the following exception:

java.lang.UnsupportedOperationException
        at org.apache.spark.sql.catalyst.expressions.PrettyAttribute.nullable(namedExpressions.scala:289)
        at org.apache.spark.sql.catalyst.expressions.CreateArray$$anonfun$dataType$3.apply(complexTypeCreator.scala:40)
        at org.apache.spark.sql.catalyst.expressions.CreateArray$$anonfun$dataType$3.apply(complexTypeCreator.scala:40)
        at scala.collection.IndexedSeqOptimized$$anonfun$exists$1.apply(IndexedSeqOptimized.scala:40)
        at scala.collection.IndexedSeqOptimized$$anonfun$exists$1.apply(IndexedSeqOptimized.scala:40)
        at scala.collection.IndexedSeqOptimized$class.segmentLength(IndexedSeqOptimized.scala:189)
        at scala.collection.mutable.ArrayBuffer.segmentLength(ArrayBuffer.scala:47)
        at scala.collection.GenSeqLike$class.prefixLength(GenSeqLike.scala:92)
        at scala.collection.AbstractSeq.prefixLength(Seq.scala:40)
        at scala.collection.IndexedSeqOptimized$class.exists(IndexedSeqOptimized.scala:40)
        at scala.collection.mutable.ArrayBuffer.exists(ArrayBuffer.scala:47)
        at org.apache.spark.sql.catalyst.expressions.CreateArray.dataType(complexTypeCreator.scala:40)
        at org.apache.spark.sql.catalyst.expressions.Alias.dataType(namedExpressions.scala:136)
        at org.apache.spark.sql.catalyst.expressions.NamedExpression$class.typeSuffix(namedExpressions.scala:84)
        at org.apache.spark.sql.catalyst.expressions.Alias.typeSuffix(namedExpressions.scala:120)
        at org.apache.spark.sql.catalyst.expressions.Alias.toString(namedExpressions.scala:155)
        at org.apache.spark.sql.catalyst.expressions.Expression.prettyString(Expression.scala:207)
        at org.apache.spark.sql.Column.toString(Column.scala:138)
        at java.lang.String.valueOf(String.java:2994)
        at scala.runtime.ScalaRunTime$.stringOf(ScalaRunTime.scala:331)
        at scala.runtime.ScalaRunTime$.replStringOf(ScalaRunTime.scala:337)
        at .(:20)
        at .()
        at $print()
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)

On Tue, Feb 9, 2016 at 4:23 PM Ted Yu <yuzhih...@gmail.com> wrote:
> Do you mind pastebinning the code snippet and the exception one more
> time? I couldn't see them in your original email.
>
> Which Spark release are you using?
> [snip]
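The trace bottoms out in PrettyAttribute.nullable. The call chain can be sketched with stand-in classes — a toy model written for this thread, not Spark's actual implementations: Alias.toString renders a type suffix, CreateArray.dataType asks every child whether it is nullable, and the display-only PrettyAttribute never implemented that method, so the REPL blows up merely trying to print the aliased Column.

```scala
// Toy model of the failing call chain. Class names mirror Catalyst's,
// but these are illustrative stand-ins, not Spark's real classes.
trait Expression {
  def nullable: Boolean
  def dataType: String
}

// Display-only attribute: nullable left unimplemented, as in Spark 1.6.0.
case class PrettyAttribute(name: String) extends Expression {
  def nullable: Boolean = throw new UnsupportedOperationException
  def dataType: String = "int"
}

// dataType inspects every child's nullability to decide containsNull.
case class CreateArray(children: Seq[Expression]) extends Expression {
  def nullable: Boolean = false
  def dataType: String =
    s"array<${children.head.dataType}>, containsNull=${children.exists(_.nullable)}"
}

// toString renders "name: type", which forces the child's dataType.
case class Alias(child: Expression, name: String) {
  override def toString: String = s"$name: ${child.dataType}"
}

val arrayCol = Alias(CreateArray(Seq(PrettyAttribute("a"), PrettyAttribute("b"))), "arrayCol")
try {
  println(arrayCol) // same failure mode: dataType -> exists(_.nullable) -> throw
} catch {
  case _: UnsupportedOperationException => println("UnsupportedOperationException")
}
```

In the toy model, as in the report, nothing is wrong with building the expression; only rendering it to a string fails.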
Re: Error aliasing an array column.
That looks like a bug in toString for columns. Can you open a JIRA?

On Tue, Feb 9, 2016 at 1:38 PM, Rakesh Chalasani <vnit.rak...@gmail.com> wrote:
> Sorry, didn't realize the mail didn't show the code. Using Spark release
> 1.6.0.
>
> Below is an example to reproduce it.
> [snip]
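If the root cause is what the trace suggests, the narrow fix would be for the display-only attribute to answer nullable instead of throwing. A hedged sketch with stand-in classes (written for this thread — not Spark's actual code, and not necessarily how the JIRA was ultimately resolved):

```scala
// Stand-in sketch: give the display-only attribute a concrete nullability
// (here as a constructor parameter) so toString-driven type rendering works.
trait Expression {
  def nullable: Boolean
  def dataType: String
}

// The val parameter `nullable` implements the trait's abstract def.
case class PrettyAttribute(name: String, nullable: Boolean = true) extends Expression {
  def dataType: String = "int"
}

case class CreateArray(children: Seq[Expression]) extends Expression {
  def nullable: Boolean = false
  def dataType: String =
    s"array<${children.head.dataType}>, containsNull=${children.exists(_.nullable)}"
}

case class Alias(child: Expression, name: String) {
  override def toString: String = s"$name: ${child.dataType}"
}

// With nullable implemented, rendering the aliased column no longer throws.
println(Alias(CreateArray(Seq(PrettyAttribute("a", nullable = false),
                              PrettyAttribute("b", nullable = false))), "arrayCol"))
// prints: arrayCol: array<int>, containsNull=false
```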
Error aliasing an array column.
Hi All:

I am getting an "UnsupportedOperationException" when trying to alias an array column. The issue seems to be in the "CreateArray" expression's dataType, which checks the nullability of its children, while aliasing creates a PrettyAttribute that does not implement nullability.

Below is an example to reproduce it.

This throws the following exception:

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Error-aliasing-an-array-column-tp16288.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org