Hello Spark community,
We have a project where we want to use Spark as a computation engine to perform
calculations and return results via REST services.
Working with Spark, we have learned how to make it run faster and have
finally optimized our code to produce results in an acceptable time
Hello folks,
Recently I noticed unexpectedly heavy network traffic between the driver program
and a worker node.
While debugging, I figured out that it is caused by the following block of
code:
---- Java ----
DataFrame etpvRecords = context.sql(" SOME SQL query here");
Mapper m = new
program ?
> On Nov 4, 2015, at 12:34 PM, Romi Kuntsman <r...@totango.com> wrote:
>
> I noticed that toJavaRDD causes a computation on the DataFrame, so is it
> considered an action, even though logically it's a transformation?
>
> On Nov 4, 2015 6:51 PM, "Aliaksei Tsyvun
ons you have on the DF and RDD...
>
> On Nov 4, 2015 7:54 PM, "Aliaksei Tsyvunchyk" <atsyvunc...@exadel.com> wrote:
> Hello Romi,
>
> Do you mean that in my particular case I’m causing computation on the DataFrame
> or it is regu
Hello all community members,
I need the opinion of people who have used Spark before and can share their
experience to help me select a technical approach.
I have a project in the proof-of-concept phase, where we are evaluating the
possibility of using Spark for our use case.
Here is a brief task description.