Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-16 Thread Andrew Melo
…point to your application. With py4j you can call Java/Scala functions from the Python application. There's no need to use the pipe() function for that. — Shay

Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-16 Thread Sebastian Piu
…With py4j you can call Java/Scala functions from the Python application. There's no need to use the pipe() function for that. — Shay · From: Yuhao Zhang · Sent: Saturday, July 9, 2022…

Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-16 Thread Sean Owen
…call Java/Scala functions from the Python application. There's no need to use the pipe() function for that. — Shay · From: Yuhao Zhang · Sent: Saturday, July 9, 2022 4:13:42 AM · To: user@…

Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-16 Thread Yuhao Zhang
…use the pipe() function for that. — Shay · From: Yuhao Zhang · Sent: Saturday, July 9, 2022 4:13:42 AM · To: user@spark.apache.org · Subject: [EXTERNAL] RDD.pipe() for binary data · ATTENTION: This email originated from outside of GM. · Hi All, I'm currentl…

Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-10 Thread Shay Elbaz
To: user@spark.apache.org · Subject: [EXTERNAL] RDD.pipe() for binary data · Hi All, I'm currently working on a project involving transferring data between Spark 3.x (I use Scala) and a Python runtime. In Spark, data is stored in an RDD as floating-point…

RDD.pipe() for binary data

2022-07-08 Thread Yuhao Zhang
…operations specific to the Spark Scala APIs, so I need to use both runtimes. To achieve the data transfer I've been using the RDD.pipe() API: 1. converting the arrays to strings in Spark and calling RDD.pipe(script.py); 2. Python then receives the strings and parses them back into Python data structures…
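Since pipe() exchanges newline-delimited text, one common workaround for binary float arrays (rather than formatting and re-parsing decimal strings) is to pack each array with struct and base64-encode it, so the binary payload survives as a single newline-free line. A minimal sketch of both ends of that protocol, with illustrative function names not taken from the thread:

```python
import base64
import struct

def encode_floats(arr):
    """Pack a list of floats as little-endian float64 and base64-encode,
    so the binary payload fits on one newline-free text line."""
    raw = struct.pack(f"<{len(arr)}d", *arr)
    return base64.b64encode(raw).decode("ascii")

def decode_floats(line):
    """Inverse of encode_floats: base64-decode and unpack back to a list."""
    raw = base64.b64decode(line)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))

line = encode_floats([1.5, -2.25, 3.0])
assert "\n" not in line        # safe to send through a line-based pipe
print(decode_floats(line))     # round-trips exactly: [1.5, -2.25, 3.0]
```

The Scala side would do the mirror-image packing (e.g. via java.nio.ByteBuffer and java.util.Base64) before calling pipe(); base64 costs ~33% size overhead but avoids both precision loss and newline/encoding issues in the text channel.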

newbie system architecture problem, trouble using streaming and RDD.pipe()

2014-09-29 Thread Andy Davidson
Hello, I am trying to build a system that does a very simple calculation on a stream and displays the results in a graph, which I want to update every second or so. I think I have a fundamental misunderstanding about how streams and rdd.pipe() work. I want to do the data visualization…

looking for a definitive RDD.Pipe() example?

2014-08-11 Thread pjv0580
All, I have been searching the web for a few days looking for a definitive Spark/Spark Streaming RDD.pipe() example and cannot find one. Would it be possible to share with the group an example of both the Java/Scala side and the script (e.g. Python) side? Any help or response would…
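For the script side, the contract is simple: read one RDD element per line from stdin, and every line written to stdout becomes one element of the resulting RDD[String]. A minimal hypothetical worker script (the Scala/Java side would then call rdd.pipe("script.py"), after shipping the script to executors, e.g. with SparkContext.addFile):

```python
#!/usr/bin/env python3
# script.py -- hypothetical pipe() worker: doubles each number it receives.
import sys

def transform(line):
    # One stdin line = one RDD element; return the output line for it.
    return str(float(line.strip()) * 2)

if __name__ == "__main__":
    # Spark writes one element per line to our stdin; each line we print
    # becomes one element of the output RDD[String].
    for line in sys.stdin:
        print(transform(line))
```

Spark forks one such process per partition, so the script should be stateless across lines or accumulate only per-partition state.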

RDD.pipe(...)

2014-07-20 Thread jay vyas
According to the API docs for the pipe operator (def pipe(command: String): RDD[String], http://spark.apache.org/docs/1.0.0/api/scala/org/apache/spark/rdd/RDD.html): "Return an RDD created by piping elements to a forked external process." However, it's not clear to me: will the output RDD capture…

Re: RDD.pipe(...)

2014-07-20 Thread jay vyas
Never mind :) I found my answer in the docs for PipedRDD: /** * An RDD that pipes the contents of each parent partition through an external command * (printing them one per line) and returns the output as a collection of strings. */ private[spark] class PipedRDD[T: ClassTag]( … So, this is…
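The per-partition behavior that the PipedRDD scaladoc describes can be mimicked outside Spark with subprocess, which makes the semantics easy to see: elements go to the forked command's stdin one per line, and the command's stdout lines become the output elements. A rough sketch of that contract (not Spark's actual implementation):

```python
import subprocess
import sys

def pipe_partition(elements, command):
    """Mimic PipedRDD for one partition: write each element as a line to
    the forked command's stdin, return its stdout lines as strings."""
    proc = subprocess.run(
        command,
        input="\n".join(str(e) for e in elements) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()

# Tiny inline child process that upper-cases each line it receives.
child = [sys.executable, "-c",
         "import sys; [print(l.strip().upper()) for l in sys.stdin]"]
print(pipe_partition(["foo", "bar"], child))  # -> ['FOO', 'BAR']
```

So yes: the output RDD captures exactly what the external process writes to stdout, as strings, one element per line.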

Configure and run external process with RDD.pipe

2014-07-02 Thread Jaonary Rabarisoa
Hi all, I need to run a complex external process with lots of dependencies from Spark. The pipe and addFile functions seem to be my friends, but there are some issues I need to solve. Specifically, the processes I want to run are C++ executables that may depend on some libraries and…
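For the shared-library problem, pipe() accepts an environment map (in Scala, pipe(command: String, env: Map[String, String]); in PySpark, rdd.pipe(command, env=...)), so LD_LIBRARY_PATH can be pointed at the directory where the bundled libraries land on each executor. The mechanics can be sketched with subprocess (the /opt/myapp/lib path is purely illustrative):

```python
import os
import subprocess
import sys

def run_with_library_path(command, lib_dir):
    """Fork an external executable with LD_LIBRARY_PATH extended so it can
    find bundled shared libraries (e.g. ones shipped via SparkContext.addFile).
    rdd.pipe()'s env argument serves the same purpose inside Spark."""
    env = dict(os.environ)
    env["LD_LIBRARY_PATH"] = lib_dir + os.pathsep + env.get("LD_LIBRARY_PATH", "")
    return subprocess.run(command, env=env, capture_output=True, text=True)

# Demo child just echoes the variable back so we can see it took effect.
child = [sys.executable, "-c",
         "import os; print(os.environ['LD_LIBRARY_PATH'].split(os.pathsep)[0])"]
result = run_with_library_path(child, "/opt/myapp/lib")
print(result.stdout.strip())  # -> /opt/myapp/lib
```

Files shipped with addFile end up in the task's working directory on each executor, so a relative lib_dir (or SparkFiles.get on the PySpark side) is typically what gets prepended here.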