Bruce Robbins created SPARK-26496:
-------------------------------------

Summary: Test "locality preferences of StateStoreAwareZippedRDD" frequently fails on High Sierra
Key: SPARK-26496
URL: https://issues.apache.org/jira/browse/SPARK-26496
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.4.0
Environment: Mac OS X High Sierra
Reporter: Bruce Robbins

This is a bit esoteric and minor, but makes it difficult to run SQL unit tests successfully on High Sierra.

StreamingInnerJoinSuite."locality preferences of StateStoreAwareZippedRDD" generates a directory name using {{Random.nextString(10)}}, and frequently that directory name is unacceptable to High Sierra.

For example:
{noformat}
scala> val prefix = Random.nextString(10); val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
prefix: String = 媈ᒢ탊渓뀟?녛ꃲ싢櫦
dir: java.io.File = /tmp/del_媈ᒢ탊渓뀟?녛ꃲ싢櫦-aff57fc6-ca38-4825-b4f3-473140edd4f6
res39: Boolean = true // this one was OK

scala> val prefix = Random.nextString(10); val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
prefix: String = 窽텘⒘駖ⵚ駢⡞Ρ닋
dir: java.io.File = /tmp/del_窽텘⒘駖ⵚ駢⡞Ρ닋-a3f99855-c429-47a0-a108-47bca6905745
res40: Boolean = false // nope, didn't like this one

scala> prefix.foreach(x => printf("%04x ", x.toInt))
7abd d158 2498 99d6 2d5a 99e2 285e 03a1 b2cb 0a4e

scala> prefix(9)
res46: Char = 

scala> val prefix = "\u7abd"
prefix: String = 窽

scala> val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
dir: java.io.File = /tmp/del_窽-d1c3af34-d34d-43fe-afed-ccef9a800ff4
res47: Boolean = true // it's OK with \u7abd

scala> val prefix = "\u0a4e"
prefix: String = 

scala> val dir = new File("/tmp", "del_" + prefix + "-" + UUID.randomUUID.toString); dir.mkdirs()
dir: java.io.File = /tmp/del_-3654a34c-6f74-4591-85af-a0f28b675a6f
res50: Boolean = false // doesn't like \u0a4e
{noformat}

I thought it might have something to do with my Java 8 version, but Python is equally affected:
{noformat}
>>> f = open(u"/tmp/del_\u7abd_file", "wb")
>>> f.write("hello\n")
# it's OK with \u7abd
>>> f2 = open(u"/tmp/del_\u0a4e_file", "wb")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 92] Illegal byte sequence: u'/tmp/del_\u0a4e_file'
# doesn't like \u0a4e
>>> f2 = open(u"/tmp/del_\ufa4e_file", "wb")
# a little change and it's happy again
>>>
{noformat}

Mac OS X Sierra is perfectly happy with these characters. This seems to be a limitation introduced by High Sierra.
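
The experiments above isolate a single character, \u0a4e, as the one High Sierra rejects. A minimal Python 3 sketch of how one might inspect the failing prefix and sidestep the problem follows. It rests on two assumptions that the report does not confirm: that the rejection is related to \u0a4e being an unassigned code point (Python's {{unicodedata}} module reports its general category as {{Cn}}, though the result depends on the Unicode tables bundled with the interpreter), and that an ASCII-only random prefix would be an acceptable replacement in the test. The helper names {{describe}}, {{has_unassigned}}, and {{ascii_dir_name}} are hypothetical and exist only for illustration.

```python
import random
import string
import unicodedata
import uuid


def describe(text):
    """Return (hex code point, Unicode general category) for each character."""
    return [("%04x" % ord(ch), unicodedata.category(ch)) for ch in text]


def has_unassigned(text):
    """True if any character is an unassigned code point (category 'Cn').

    Assumption: unassigned code points are what the filesystem objects to.
    The report only shows that \\u0a4e fails; it does not state the cause.
    """
    return any(unicodedata.category(ch) == "Cn" for ch in text)


def ascii_dir_name(length=10, rng=random):
    """Hypothetical workaround: a random prefix built only from ASCII
    letters and digits, followed by a UUID, in the same shape as the
    directory names in the report ("del_<prefix>-<uuid>")."""
    alphabet = string.ascii_letters + string.digits
    prefix = "".join(rng.choice(alphabet) for _ in range(length))
    return "del_%s-%s" % (prefix, uuid.uuid4())


# The failing prefix from the report, rebuilt from its printed code points.
failing = "\u7abd\ud158\u2498\u99d6\u2d5a\u99e2\u285e\u03a1\ub2cb\u0a4e"

if __name__ == "__main__":
    for code_point, category in describe(failing):
        print(code_point, category)
    print("contains unassigned code point:", has_unassigned(failing))
    print("ASCII-only alternative:", ascii_dir_name())
```

Running the script prints the category of each character in the failing prefix, so the odd one out can be spotted without trial-and-error {{mkdirs()}} calls. Restricting the prefix to ASCII trades away the wide character coverage of {{Random.nextString}}, which seems a reasonable price for a temporary directory name, but whether it suits this particular test is a judgment for the maintainers.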