You're getting InternalRow instances. They probably contain the data you want, but InternalRow's toString representation doesn't display the data the way Row's does, so printing it won't show what you expect.
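As a rough analogy (plain Python, not Spark APIs — the class and method names here are mine): an object can hold exactly the data you want while its default string form shows something unrelated, which is why you need the typed accessors rather than toString.

```python
# Illustrative analogy only, NOT Spark code: a row-like object whose
# data must be read through accessors, because its default string
# representation (like InternalRow's toString) doesn't render the data.

class InternalRowLike:
    """Toy stand-in: values live in internal storage and are read
    with an accessor, not via str()/repr()."""
    def __init__(self, values):
        self._values = list(values)   # internal storage
    def get(self, ordinal):
        return self._values[ordinal]  # accessor analogue
    # default __repr__ is inherited: shows object identity, not the data

row = InternalRowLike([1, "alice"])
print(row.get(0), row.get(1))  # the data is there: 1 alice
print(row)                     # e.g. <__main__.InternalRowLike object at 0x...>
```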
On Thu, Mar 21, 2019 at 3:28 PM Long, Andrew wrote:
> Hello Friends,
>
>
>
> I’m working on a performance improvement that reads additional parquet
---------- Forwarded message ---------
From: asma zgolli
Date: Thu, Mar 21, 2019 at 18:15
Subject: Cross Join
To:
Hello,
I need to cross my data, so I'm executing a cross join on two DataFrames.
C = A.crossJoin(B)
A has 50 records
B has 5 records
The result I'm getting with Spark 2.0 is a
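For reference, a cross join is a Cartesian product, so 50 rows in A times 5 rows in B should yield exactly 250 rows. A minimal sketch of the expected cardinality in plain Python (not Spark; the names A, B, and C mirror the example above):

```python
from itertools import product

# Stand-ins for the two DataFrames: 50 records in A, 5 in B.
A = [("a", i) for i in range(50)]
B = [("b", j) for j in range(5)]

# A cross join pairs every row of A with every row of B,
# analogous to C = A.crossJoin(B) in Spark.
C = list(product(A, B))

print(len(C))  # 250 == 50 * 5
```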
I'm trying to read a stream using my custom data source (v2, using Spark 2.3), and it fails *in the second iteration* with the following exception while pruning columns:

Query [id=xxx, runId=yyy] terminated with exception: assertion failed: Invalid batch: a#660,b#661L,c#662,d#663,... 26 more
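For context, this "Invalid batch" assertion typically fires when the rows the reader produces don't match the schema Spark expects after column pruning. A toy illustration in plain Python (the names here are mine, not Spark APIs) of the invariant a column-pruning reader has to keep: emitted records must contain exactly the pruned columns, in the pruned order.

```python
# Toy model, NOT Spark code: the column-pruning contract a DataSourceV2
# reader must honor. After Spark pushes down the required columns, every
# produced row must match that pruned schema exactly, otherwise the
# batch schema check fails with an "Invalid batch" assertion.

full_schema = ["a", "b", "c", "d"]

def prune(rows, required):
    """Project each row (a dict keyed by column name) down to the
    required columns, preserving the required order."""
    return [[row[col] for col in required] for row in rows]

rows = [{"a": 1, "b": 2, "c": 3, "d": 4},
        {"a": 5, "b": 6, "c": 7, "d": 8}]

required = ["a", "c"]          # columns Spark pushed down
batch = prune(rows, required)

# The invariant: every row's width equals the pruned schema's width.
assert all(len(r) == len(required) for r in batch)
print(batch)  # [[1, 3], [5, 7]]
```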
Hey,
We have a cluster of 10 nodes, each of which has 128 GB of memory. We are about
to run Spark and Alluxio on the cluster. We wonder how we should allocate the
memory between the Spark executor and the Alluxio worker on each machine. Are
there any recommendations? Thanks!
Best,
Andy Li
Are there specific questions you have? Might be easier to post them here
also.
On Wed, Mar 20, 2019 at 5:16 PM Andriy Redko wrote:
> Hello Dear Spark Community!
>
> The popularity of Apache Spark has made it a de-facto choice for many
> projects that need some sort of data processing