GitHub user srowen commented on the issue:

https://github.com/apache/spark/pull/14256

I believe SPARK-16613 is a different issue. You reported a `StackOverflowError`, and indeed I can't see why the existing `pipe` methods just call themselves. That was introduced in https://github.com/apache/spark/commit/279bd4aa5fddbabdb0383a3f6f0fc8d91780e092 and, unless I'm totally missing something, it's a small but bad error: they need to call the main `pipe` overload instead.

The cleanup of the `PipedRDD` constructors also lost the `tokenize` call, which these simpler `pipe` overloads still need to invoke. This is certainly my fault: while reviewing, I suggested some cleanup that ultimately led to losing this functionality. (I also don't really like using `StringTokenizer` instead of just splitting on whitespace, but that's probably not the thing to deal with now.)
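To make the failure mode concrete, here is a minimal standalone sketch in Java. It is not the Spark source: `PipeSketch`, `pipeBroken`, and the list-returning `pipe` stand-in are hypothetical names, and it assumes the simpler overload takes a single command string. It shows an overload that forwards to itself and overflows the stack, next to one that tokenizes the string and delegates to the main overload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for the overload pattern discussed above; not Spark code.
class PipeSketch {
    // Split a command string into words using StringTokenizer's default
    // delimiters (space, tab, newline, carriage return, form feed).
    static List<String> tokenize(String command) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(command);
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    // "Main" overload: here it just returns the argument list it was given,
    // standing in for whatever does the real work.
    static List<String> pipe(List<String> command) {
        return command;
    }

    // Broken convenience overload: it calls itself with the same argument,
    // so every call recurses until the stack overflows.
    static List<String> pipeBroken(String command) {
        return pipeBroken(command);
    }

    // Fixed convenience overload: tokenize, then delegate to the main overload.
    static List<String> pipe(String command) {
        return pipe(tokenize(command));
    }

    public static void main(String[] args) {
        System.out.println(pipe("grep -i spark")); // prints [grep, -i, spark]
        try {
            pipeBroken("grep -i spark");
        } catch (StackOverflowError e) {
            System.out.println("pipeBroken overflowed the stack");
        }
    }
}
```

The compiler accepts the self-call without complaint because it is a valid overload resolution, which is why a mistake like this can slip through review and only show up at runtime.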