Hi,
It sounds similar to what we do in our application.
We don't serialize every row; instead, we first group the rows into the
wanted representation and then apply protobuf serialization using map and
a lambda.
I suggest not serializing the entire DataFrame into a single protobuf
message, since that would mean gathering all the data in one place, and
protobuf is not designed for very large messages.
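A minimal sketch of that group-then-serialize approach (Scala; the Event
case class and the toProto serializer are hypothetical placeholders, not
from this thread):

    import org.apache.spark.sql.{Dataset, SparkSession}

    case class Event(userId: String, value: Long)

    // Hypothetical: build one protobuf message for a whole group of rows
    // and return its serialized bytes.
    def toProto(userId: String, events: Seq[Event]): Array[Byte] = ???

    val spark = SparkSession.builder.appName("proto-groups").getOrCreate()
    import spark.implicits._

    val events = spark.read.parquet("/data/events").as[Event]

    // Group the rows into the wanted representation first, then serialize
    // each group into a single protobuf message.
    val serialized: Dataset[Array[Byte]] =
      events.groupByKey(_.userId)
            .mapGroups((userId, rows) => toProto(userId, rows.toSeq))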
Thanks, Jungtaek Lim.
I upgraded the cluster to 2.4.3 and it worked fine.
Thanks,
Alex
On Mon, Aug 19, 2019 at 10:01 PM Jungtaek Lim wrote:
> Hi Alex,
>
> You seem to have hit SPARK-26606 [1], which has been fixed in 2.4.1. Could
> you try it out with the latest version?
>
> Thanks,
> Jungtaek Lim
Hi,
I want to run the Spark History Server at the context path
localhost:18080/sparkhistory instead of at the root, localhost:18080.
The end goal is to access the Spark History Server through a domain name, i.e.,
domainname/sparkhistory.
Are there any hacks or Spark config options for this?
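One possible approach (untested here) is to tell the History Server UI that
it lives under a path prefix via spark.ui.proxyBase, and put a reverse proxy
such as nginx in front of it to map the path:

    # spark-env.sh: serve the History Server UI under /sparkhistory
    export SPARK_HISTORY_OPTS="-Dspark.ui.proxyBase=/sparkhistory"

    # nginx: map domainname/sparkhistory to the local History Server port
    location /sparkhistory/ {
        proxy_pass http://localhost:18080/;
    }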
--
Thanks,
Regards,
SandishKumar
It sounds like you want to aggregate your rows in some way. I actually just
wrote a blog post about that topic:
https://medium.com/@albamus/spark-aggregating-your-data-the-fast-way-e37b53314fad
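For example, a basic built-in aggregation (a generic sketch with hypothetical
input path and columns, not taken from the post) looks like:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    val spark = SparkSession.builder.getOrCreate()
    val df = spark.read.parquet("/data/events")

    // Built-in aggregations do partial aggregation before the shuffle,
    // which is usually much faster than row-by-row lambdas.
    val totals = df.groupBy("userId").agg(sum("value").as("total"))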
On Mon, Aug 19, 2019 at 4:24 PM Rishikesh Gawade wrote:
Hi All,
I have been trying to serialize a DataFrame in protobuf format. So far, I
have been able to serialize every row of the DataFrame by using the map
function, with the serialization logic inside the lambda function itself.
The resultant DataFrame consists of rows in serialized form.
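A minimal sketch of that per-row approach (Scala; Event and the rowToProto
serializer are illustrative placeholders):

    import org.apache.spark.sql.SparkSession

    case class Event(userId: String, value: Long)

    // Hypothetical: serialize a single row into protobuf bytes.
    def rowToProto(e: Event): Array[Byte] = ???

    val spark = SparkSession.builder.getOrCreate()
    import spark.implicits._

    // Each row is serialized independently inside the map lambda.
    val serializedRows = spark.read.parquet("/data/events").as[Event]
      .map(e => rowToProto(e))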
Hi Alex,
You seem to have hit SPARK-26606 [1], which has been fixed in 2.4.1. Could
you try it out with the latest version?
Thanks,
Jungtaek Lim (HeartSaVioR)
1. https://issues.apache.org/jira/browse/SPARK-26606
On Tue, Aug 20, 2019 at 3:43 AM Alex Landa wrote:
> Hi,
>
> We are using Spark Standalone
Hi,
We are using Spark Standalone 2.4.0 in production and publishing our Scala
app using cluster mode.
I saw that extra Java options passed to the driver don't actually get applied.
A submit example:

    spark-submit --deploy-mode cluster --master spark://:7077 \
      --driver-memory 512mb --conf
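For context, a complete command of that shape would typically look like the
following (the host, option values, class, and jar name here are
hypothetical):

    spark-submit --deploy-mode cluster --master spark://<master-host>:7077 \
      --driver-memory 512mb \
      --conf "spark.driver.extraJavaOptions=-DconfigFile=app.conf -XX:+UseG1GC" \
      --class com.example.Main app.jar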
Hello,
I'm using Hadoop 3.1.2 with YARN and Spark 2.4.2.
I'm trying to read a file compressed with the zstd command-line tool from the
spark shell. After a long fight to finally understand an issue with library
imports and other things, I no longer get an error when trying to read those
files. However, if I
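For reference, the basic read from spark-shell would look something like this
(assuming Hadoop's ZStandardCodec is available, i.e. native libhadoop was
built with zstd support, and the file carries the .zst extension the codec
recognizes):

    // spark-shell: `spark` is the prebuilt SparkSession
    val lines = spark.read.textFile("hdfs:///data/sample.txt.zst")
    lines.show(5)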