With py4j you can call Java/Scala functions from the Python application. There's no need to use the pipe() function for that (a rough Scala-side sketch of this route appears after the quoted message below).

Shay
--
From: Yuhao Zhang
Sent: Saturday, July 9, 2022 4:13:42 AM
To: user@spark.apache.org
Subject: [EXTERNAL] RDD.pipe() for binary data
ATTENTION: This email originated from outside of GM.
Hi All,
I'm currently working on a project that involves transferring data between Spark 3.x (I use Scala) and a Python runtime. In Spark, data is stored in an RDD as floating-point [...] operations specific to Spark Scala APIs, so I need to use both runtimes.
Now to achieve the data transfer I've been using the RDD.pipe() API, by 1. converting the arrays to strings in Spark and calling RDD.pipe(script.py), and 2. having Python receive the strings and cast them back into Python data structures.
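For reference, a minimal Scala-side sketch of the string-and-pipe approach described above (not the poster's actual code: the element type, the comma separator, the sample values and the python3 command line are assumptions; script.py is expected to read one comma-separated row per line on stdin and write one line per result to stdout):

// Assumes an existing SparkContext `sc` (e.g. in spark-shell).
// 1. Serialize each floating-point array to one line of text.
val vectors = sc.parallelize(Seq(Array(1.0, 2.5), Array(3.0, 4.5)))
val asText = vectors.map(_.mkString(","))

// 2. Pipe the lines into the external Python script; every line the script
//    writes to stdout becomes one String element of the resulting RDD.
val fromPython = asText.pipe("python3 script.py")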
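As for the py4j route suggested in the reply above, here is a rough Scala-side sketch, not taken from the thread: the entry-point class, its method and the Parquet path are made up, and it assumes the py4j jar that ships with Spark is on the classpath. The idea is to expose Scala/Spark functions through a GatewayServer so a separate Python process can call them directly instead of piping data as text.

import org.apache.spark.sql.SparkSession
import py4j.GatewayServer

// Hypothetical entry point whose methods a Python client can invoke via py4j.
class SparkEntryPoint(spark: SparkSession) {
  def rowCount(path: String): Long = spark.read.parquet(path).count()
}

object GatewayApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("py4j-gateway").getOrCreate()
    val server = new GatewayServer(new SparkEntryPoint(spark))
    // The Python side would connect with py4j.java_gateway.JavaGateway()
    // and call gateway.entry_point.rowCount("...").
    server.start()
  }
}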
Hello
I am trying to build a system that does a very simple calculation on a stream and displays the results in a graph that I want to update every second or so. I think I have a fundamental misunderstanding about how streams and rdd.pipe() work. I want to do the data visualization
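Not from the thread, but a rough sketch of where pipe() sits in a streaming job may help frame the question: pipe() works on the RDD of each micro-batch, so it is normally called inside foreachRDD, and any plotting happens in the external script or elsewhere. The socket source, one-second batch interval and plot.py script below are all assumptions.

import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumes an existing SparkContext `sc` and a hypothetical plot.py that reads
// one number per line on stdin and redraws a chart.
val ssc = new StreamingContext(sc, Seconds(1))
val values = ssc.socketTextStream("localhost", 9999).map(_.toDouble)

values.reduce(_ + _).foreachRDD { rdd =>
  // pipe() is lazy; an action such as collect() forces the script to run.
  rdd.pipe("python3 plot.py").collect()
}

ssc.start()
ssc.awaitTermination()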
All,
I have been searching the web for a few days looking for a definitive
Spark/Spark Streaming RDD.pipe() example and cannot find one. Would it be
possible to share with the group an example of both the Java/Scala side
as well as the script (e.g. Python) side? Any help or response would
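Not a definitive example, but a self-contained sketch of the contract on the Scala side; here a bash one-liner stands in for the Python script, and a real script.py would do exactly the same thing: read lines from stdin, write lines to stdout.

// Assumes an existing SparkContext `sc`. Each element of each partition is
// written to the command's stdin, one per line; each line the command prints
// to stdout becomes one element of the resulting RDD[String].
val nums = sc.parallelize(1 to 4, numSlices = 2)
val doubled = nums.pipe(Seq("bash", "-c", "while read x; do echo $((x * 2)); done"))
doubled.collect() // Array("2", "4", "6", "8")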
According to the API docs for the pipe operator,
def pipe(command: String): RDD[String]
Return an RDD created by piping elements to a forked external process.
http://spark.apache.org/docs/1.0.0/api/scala/org/apache/spark/rdd/RDD.html
However, it's not clear to me:
Will the outputted RDD capture
Nevermind :) I found my answer in the docs for the PipedRDD
/**
 * An RDD that pipes the contents of each parent partition through an external
 * command (printing them one per line) and returns the output as a collection
 * of strings.
 */
private[spark] class PipedRDD[T: ClassTag](
So, this is
Hi all,
I need to run a complex external process with a lot of dependencies from
Spark. The pipe and addFile functions seem to be my friends, but there
are just some issues that I need to solve.
Precisely, the processes I want to run are C++ executables that may depend on
some libraries and
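A rough sketch of the kind of call that fits this use case, with heavy caveats: the tool name, paths, flag and library directory are made up, and it assumes the executable and its shared libraries are already present on every worker (for example pre-installed, or shipped ahead of time with sc.addFile). The pipe() overload that takes an environment map lets you point the dynamic linker at the bundled libraries.

// Hypothetical input: one record per line, in whatever format the C++ tool expects.
val data = sc.textFile("hdfs:///input/records.txt")

// Run the native tool on every partition: elements go to its stdin one per
// line; the lines it writes to stdout come back as the resulting RDD[String].
val results = data.pipe(
  command = Seq("/opt/tools/mytool", "--stdin"),      // made-up executable path and flag
  env = Map("LD_LIBRARY_PATH" -> "/opt/tools/lib"))   // extra environment for the child process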