Alex Krasnyansky created SPARK-16613:
----------------------------------------
Summary: A bug in RDD pipe operation
Key: SPARK-16613
URL: https://issues.apache.org/jira/browse/SPARK-16613
Project: Spark
Issue Type: Bug
Reporter: Alex Krasnyansky

Suppose we have the following Spark code:

{code}
object PipeExample {
  def main(args: Array[String]): Unit = {
    // sc is assumed to be an existing SparkContext
    val fstRdd = sc.parallelize(List("hi", "hello", "how", "are", "you"))
    val pipeRdd = fstRdd.pipe("/Users/finkel/spark-pipe-example/src/main/resources/len.sh")
    pipeRdd.collect.foreach(println)
  }
}
{code}

It uses a shell script to convert a string to its length:

{code}
#!/bin/sh
read input
len=${#input}
echo $len
{code}

So far so good, but when I run the code, it prints incorrect output. For example:

{code}
0
2
0
5
3
0
3
3
{code}

I expect to see

{code}
2
5
3
3
3
{code}

which is the correct output for the app. I think it's a bug. The output should contain only positive integers, with no zeros.
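
One possible explanation for the zeros, offered as an assumption rather than a confirmed diagnosis: `pipe` starts the script once per partition and writes that partition's elements to its stdin, one per line. With more partitions than elements, some partitions are empty, `read` gets no input, and the script still echoes a length of 0. Under that assumption, a minimal sketch of a script that prints one length per input line and nothing for empty input could look like this (the function name `line_lengths` is hypothetical):

```shell
#!/bin/sh
# Print the length of every line on stdin, one result per input line.
# Assumption: the script is started once per partition and receives that
# partition's elements on stdin, one element per line.
line_lengths() {
    # IFS= and -r keep surrounding whitespace and backslashes intact.
    # On empty input the loop body never runs, so nothing is printed.
    while IFS= read -r input; do
        echo "${#input}"
    done
}

# Demo: feed the five sample strings as a single "partition".
printf '%s\n' hi hello how are you | line_lengths
```

The demo line prints 2, 5, 3, 3, 3, one per line. In a script used with `pipe`, the last line would be just `line_lengths`, so that it reads from the script's own stdin.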