[ 
https://issues.apache.org/jira/browse/SPARK-16704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391620#comment-15391620
 ] 

Dongjoon Hyun commented on SPARK-16704:
---------------------------------------

Hi, [~jiunnjye].
It seems you are reporting this against Spark 1.6.0. Could you test it on 1.6.2 or 
2.0.0? It works for me on the current master branch, as shown in the following.
{code}
scala> import java.nio.charset.StandardCharsets
scala> Seq("12".getBytes(StandardCharsets.UTF_8)).toDF("a").write.parquet("/tmp/t1")
scala> Seq("34".getBytes(StandardCharsets.UTF_8)).toDF("b").write.parquet("/tmp/t2")
scala> val df1 = spark.read.parquet("/tmp/t1")
df1: org.apache.spark.sql.DataFrame = [a: binary]
scala> val df2 = spark.read.parquet("/tmp/t2")
df2: org.apache.spark.sql.DataFrame = [b: binary]
scala> df1.createOrReplaceTempView("binary1")
scala> df2.createOrReplaceTempView("binary2")
scala> sql("SELECT a FROM binary1 UNION SELECT b FROM binary2").show()
+-------+
|      a|
+-------+
|[33 34]|
|[31 32]|
+-------+
{code}

If this is not your scenario, please let me know. Some sample code that 
reproduces the problem would also be a great help.
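
The issue description mentions that "spark.sql.parquet.binaryAsString" is set to true, which the example above does not set. A variation with that property enabled might look like the sketch below. It assumes a Spark 2.0 spark-shell, where `spark` is the predefined SparkSession, and it reuses the /tmp/t1 and /tmp/t2 paths from the example above. It has not been checked against 1.6.0, so its result there is unknown.
{code}
// Sketch only: assumes spark-shell on Spark 2.0+, with `spark` predefined.
import java.nio.charset.StandardCharsets
import spark.implicits._

// Write two single-column binary tables (overwrite so the sketch can be re-run).
Seq("12".getBytes(StandardCharsets.UTF_8)).toDF("a").write.mode("overwrite").parquet("/tmp/t1")
Seq("34".getBytes(StandardCharsets.UTF_8)).toDF("b").write.mode("overwrite").parquet("/tmp/t2")

// Enable the property named in the issue description before reading.
spark.conf.set("spark.sql.parquet.binaryAsString", "true")

val df1 = spark.read.parquet("/tmp/t1")
val df2 = spark.read.parquet("/tmp/t2")

// Print the schemas to see which type each column is read as with the property set.
df1.printSchema()
df2.printSchema()

df1.createOrReplaceTempView("binary1")
df2.createOrReplaceTempView("binary2")
spark.sql("SELECT a FROM binary1 UNION SELECT b FROM binary2").show()
{code}
Comparing the two printed schemas is the useful part: a union whose two sides resolve to different column types would be one possible reason for an analysis error, though that is only a guess here.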

> Union does not work for column with array byte 
> -----------------------------------------------
>
>                 Key: SPARK-16704
>                 URL: https://issues.apache.org/jira/browse/SPARK-16704
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Ng Jiunn Jye
>
> When unioning two queries whose columns have the array-of-bytes data type, 
> the Spark query fails with an exception.
> Example:
> select binaryColumn from tableA
>  union
> select binaryColumn from tableB
> Note that the Spark property "spark.sql.parquet.binaryAsString" is set to true.
> org.apache.spark.sql.AnalysisException: unresolved operator 'Union;
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:203) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:105) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:104) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:104) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at scala.collection.immutable.List.foreach(List.scala:381) ~[org.scala-lang.scala-library-2.11.8.jar:na]
>         at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:104) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44) ~[iop-spark-client.spark-catalyst_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34) ~[iop-spark-client.spark-sql_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133) ~[iop-spark-client.spark-sql_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52) ~[iop-spark-client.spark-sql_2.11-1.6.0.jar:1.6.0]
>         at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817) ~[iop-spark-client.spark-sql_2.11-1.6.0.jar:1.6.0]


