[ https://issues.apache.org/jira/browse/SPARK-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065876#comment-16065876 ]
Hyukjin Kwon commented on SPARK-21182:
--------------------------------------

Looks like I can't reproduce this on Windows at the current master. With the reproducer below:

{code:title=Wordcount.scala|borderStyle=solid}
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()
val words = lines.as[String].flatMap(_.split(" "))
val wordCounts = words.groupBy("value").count().sort($"count".desc)
val query = wordCounts.writeStream
  .outputMode("complete")
  .format("console")
  .start()
query.awaitTermination()
{code}

{code:title=nc.py|borderStyle=solid}
import socket

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('0.0.0.0', 9999))
    s.listen(1)
    conn, _ = s.accept()
    while True:
        conn.sendall(raw_input() + "\n")
{code}

In cmd A:

{code}
C:\...\...>python nc.py
{code}

In cmd B:

{code}
C:\...\...>.\bin\spark-shell -i Wordcount.scala
{code}

In cmd A:

{code}
a
abab
abab
{code}

In cmd B:

{code}
-------------------------------------------
Batch: 0
-------------------------------------------
...
+-----+-----+
|value|count|
+-----+-----+
|     |    1|
+-----+-----+

-------------------------------------------
Batch: 1
-------------------------------------------
...
+-----+-----+
|value|count|
+-----+-----+
|    a|    1|
|     |    1|
+-----+-----+

-------------------------------------------
Batch: 2
-------------------------------------------
...
+-----+-----+
|value|count|
+-----+-----+
| abab|    1|
|    a|    1|
|     |    1|
+-----+-----+

-------------------------------------------
Batch: 3
-------------------------------------------
...
+-----+-----+
|value|count|
+-----+-----+
| abab|    2|
|    a|    1|
|     |    1|
+-----+-----+
{code}

> Structured streaming on Spark-shell on windows
> ----------------------------------------------
>
>                 Key: SPARK-21182
>                 URL: https://issues.apache.org/jira/browse/SPARK-21182
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.1.1
>        Environment: Windows 10
> spark-2.1.1-bin-hadoop2.7
>            Reporter: Vijay
>            Priority: Minor
>
> The structured streaming output operation is failing in the Windows shell.
> As per the error message, the path is being prefixed with the file separator as on Linux, causing the IllegalArgumentException.
> The following is the error message:
> scala> val query = wordCounts.writeStream
>          .outputMode("complete")
>          .format("console")
>          .start()
> java.lang.IllegalArgumentException: Pathname {color:red}*/*{color}C:/Users/Vijay/AppData/Local/Temp/temporary-081b482c-98a4-494e-8cfb-22d966c2da01/offsets from C:/Users/Vijay/AppData/Local/Temp/temporary-081b482c-98a4-494e-8cfb-22d966c2da01/offsets is not a valid DFS filename.
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
> 	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
> 	at org.apache.spark.sql.streaming.StreamingQueryManager.createQuery(StreamingQueryManager.scala:222)
> 	at org.apache.spark.sql.streaming.StreamingQueryManager.startQuery(StreamingQueryManager.scala:280)
> 	at org.apache.spark.sql.streaming.DataStreamWriter.start(DataStreamWriter.scala:268)
> ... 52 elided

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
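For context, the {{/C:/...}} pathname highlighted in red above is the classic Windows artifact: a drive-letter path that picked up a leading slash during URI-to-path conversion, which Hadoop's DFS pathname validation then rejects. A minimal Python sketch of the kind of normalization that recovers a usable local path (the function name is hypothetical for illustration, not Spark or Hadoop code):

```python
import re

def strip_leading_slash_before_drive(path):
    # "/C:/Users/..." -> "C:/Users/..."; other paths are left untouched.
    # A leading slash before a drive letter is a common artifact of
    # converting a file URI back to a local path on Windows.
    match = re.match(r"^/([A-Za-z]:/.*)$", path)
    return match.group(1) if match else path

print(strip_leading_slash_before_drive("/C:/Users/Vijay/AppData/Local/Temp/offsets"))
# -> C:/Users/Vijay/AppData/Local/Temp/offsets
```

A POSIX-style path such as {{/tmp/offsets}} does not match the drive-letter pattern and passes through unchanged, which is why a fix along these lines would be Windows-specific.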