Hi,
There is definitely a parameter, when creating temporary security
credentials, to specify the number of minutes those credentials will be
active. There is an upper limit, of course, which is around 3 days if I
remember correctly, and the default, as you can see, is 30 minutes.
Can you let me
I've been working with DataStax's spark-cassandra-connector, and have noticed
that, when creating batches of DataFrame Rows to write to the database, write
throughput increases substantially and overall task completion time
decreases if the user sorts the DataFrame on the Cassandra partition key
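To illustrate why sorting could matter, here is a toy model in plain Python. It is not the connector's actual batching code. It assumes, purely for illustration, that a batch may only hold consecutive rows sharing one partition key, and the names `count_batches` and `max_batch_size` are hypothetical.

```python
from itertools import groupby


def count_batches(rows, key, max_batch_size):
    """Count batches when a batch may only hold consecutive rows with the same key."""
    batches = 0
    for _, run in groupby(rows, key=key):
        run_length = sum(1 for _ in run)
        # A run longer than max_batch_size is split into several batches.
        batches += -(-run_length // max_batch_size)  # ceiling division
    return batches


def partition_key(row):
    return row[0]


# Hypothetical rows of (partition_key, value), interleaved across 3 partitions.
rows = [(i % 3, i) for i in range(12)]

unsorted_batches = count_batches(rows, partition_key, max_batch_size=4)
sorted_batches = count_batches(sorted(rows, key=partition_key), partition_key, max_batch_size=4)

print(unsorted_batches, sorted_batches)
```

Under these assumptions the interleaved input produces one batch per row, while the sorted input produces one full batch per partition key, so far fewer write requests are issued.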
Hi, I'm working on the implementation of a semi-supervised algorithm in Spark
and I want it to implement the interfaces provided by MLlib, so that it can
use things like model selection.
My problem is that, as far as I can tell, the provided interfaces are meant
for supervised algorithms (for
Hi All,
A cluster of one Spark driver and multiple executors (5) is set up with Redis
for storing Spark-processed data, and S3 is used for checkpointing. I have a
couple of questions about this setup.
1) How can I analyze which part of the code executes on the Spark driver and
which part of the code executes on the
Hi all,
I'm relatively new to Spark, and something is bothering me about optimizing
a sort-merge join on Parquet data.
My work consists of computing statistics on purchases for a retail company.
For example, I have to calculate the mean purchase over a period, for a
segment of products and a segment of clients.
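To make the statistic concrete, here is a minimal sketch of it in plain Python rather than Spark. The record layout (`client_segment`, `product_segment`, `day`, `amount`), the sample data, and the function name `mean_purchase` are all assumptions for illustration, not the real schema.

```python
from datetime import date
from statistics import mean

# Hypothetical purchase records; the field names are assumptions.
purchases = [
    {"client_segment": "gold", "product_segment": "toys", "day": date(2017, 1, 5), "amount": 20.0},
    {"client_segment": "gold", "product_segment": "toys", "day": date(2017, 1, 20), "amount": 40.0},
    {"client_segment": "gold", "product_segment": "toys", "day": date(2017, 3, 1), "amount": 100.0},
    {"client_segment": "silver", "product_segment": "toys", "day": date(2017, 1, 10), "amount": 10.0},
    {"client_segment": "gold", "product_segment": "food", "day": date(2017, 1, 12), "amount": 5.0},
]


def mean_purchase(records, start, end, product_segment, client_segment):
    """Mean purchase amount within [start, end] for one product and one client segment."""
    amounts = [
        r["amount"]
        for r in records
        if start <= r["day"] <= end
        and r["product_segment"] == product_segment
        and r["client_segment"] == client_segment
    ]
    # Return None when no purchase matches, instead of raising on an empty list.
    return mean(amounts) if amounts else None


result = mean_purchase(purchases, date(2017, 1, 1), date(2017, 1, 31), "toys", "gold")
print(result)
```

In Spark the same logic would presumably be a filter on the period and segments followed by an average aggregation, with the joins to the segment tables happening before the filter.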