Don Smith created SPARK-19692:
----------------------------------

             Summary: Comparison on BinaryType returns no results
                 Key: SPARK-19692
                 URL: https://issues.apache.org/jira/browse/SPARK-19692
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Don Smith

I believe there is an issue with comparisons on binary fields:

{code}
import java.net.InetAddress
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{BinaryType, StructField, StructType}

val sc = SparkSession.builder.appName("test").getOrCreate()
val schema = StructType(Seq(StructField("ip", BinaryType)))
val ips = Seq("1.1.1.1", "2.2.2.2", "200.10.6.7").map(s => InetAddress.getByName(s).getAddress)
val df = sc.createDataFrame(
  sc.sparkContext.parallelize(ips, 1).map { ip => Row(ip) },
  schema
)
val query = df
  .where(df("ip") >= InetAddress.getByName("200.10.0.0").getAddress)
  .where(df("ip") <= InetAddress.getByName("200.10.255.255").getAddress)
// explain prints the plans itself and returns Unit, so it is not passed to the logger
query.explain(true)
val results = query.collect()
// only 200.10.6.7 lies in the range, so exactly one row is expected
results.length mustEqual 1
{code}

This returns no results. I believe the problem is that the comparison treats the bytes as signed integers in the call to compareTo here in TypeUtils:

{code}
def compareBinary(x: Array[Byte], y: Array[Byte]): Int = {
  for (i <- 0 until x.length; if i < y.length) {
    val res = x(i).compareTo(y(i))
    if (res != 0) return res
  }
  x.length - y.length
}
{code}

With some hacky testing I was able to get the desired results with:

{{ val res = (x(i).toByte & 0xff) - (y(i).toByte & 0xff) }}

Thanks!
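
The signed-versus-unsigned behaviour described above can be reproduced without Spark. The following is only a minimal sketch: the object and method names (`UnsignedBinaryCompare`, `compareSigned`, `compareUnsigned`) are hypothetical and are not Spark APIs, the addresses are written as literal byte arrays instead of going through `InetAddress`, and it is not necessarily how the issue was or will be fixed upstream.

```scala
object UnsignedBinaryCompare {
  // Signed, byte-by-byte comparison (same ordering as the compareBinary
  // quoted in this report): bytes 0x80..0xff are negative on the JVM,
  // so they sort before 0x00..0x7f.
  def compareSigned(x: Array[Byte], y: Array[Byte]): Int = {
    val n = math.min(x.length, y.length)
    var i = 0
    while (i < n) {
      val res = java.lang.Byte.compare(x(i), y(i))
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }

  // Unsigned variant: mask each byte to 0..255 before subtracting,
  // as in the one-line change suggested in this report.
  def compareUnsigned(x: Array[Byte], y: Array[Byte]): Int = {
    val n = math.min(x.length, y.length)
    var i = 0
    while (i < n) {
      val res = (x(i) & 0xff) - (y(i) & 0xff)
      if (res != 0) return res
      i += 1
    }
    x.length - y.length
  }

  // Range check parameterised by the comparison function.
  def inRange(cmp: (Array[Byte], Array[Byte]) => Int,
              v: Array[Byte], lo: Array[Byte], hi: Array[Byte]): Boolean =
    cmp(v, lo) >= 0 && cmp(v, hi) <= 0

  def main(args: Array[String]): Unit = {
    val ip = Array(200, 10, 6, 7).map(_.toByte)     // 200.10.6.7
    val lo = Array(200, 10, 0, 0).map(_.toByte)     // 200.10.0.0
    val hi = Array(200, 10, 255, 255).map(_.toByte) // 200.10.255.255

    println(inRange(compareSigned, ip, lo, hi))   // false: 6 compares above 0xff (-1)
    println(inRange(compareUnsigned, ip, lo, hi)) // true
  }
}
```

With the signed comparison, the third octet `6` is greater than `0xff` (which is `-1` as a signed byte), so the `<=` bound rejects `200.10.6.7`. That matches the empty result set reported here. On Java 9 and later, `java.util.Arrays.compareUnsigned(byte[], byte[])` gives the same unsigned lexicographic ordering.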