Hi list,
I have a Spark cluster with 3 nodes. I'm calling spark-shell with some
packages to connect to AWS S3 and Cassandra:
spark-shell \
  --packages org.apache.hadoop:hadoop-aws:2.7.3,com.amazonaws:aws-java-sdk:1.7.4,datastax:spark-cassandra-connector:2.0.6-s_2.11 \
  --conf spark.cassandra.co
Hi list,
I'm trying to make a custom build of Spark, but in the resulting Web UI
there are no images.
Some help, please.
Build from:
git checkout v2.2.1
./dev/make-distribution.sh --name custom-spark --pip --tgz -Psparkr \
  -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phive -Phive-thriftserver -Pmesos \
  -Pyarn -
"true"))
.load()
.select("kafka")
dfs.printSchema()
Is there any way to put this schema in JSON?
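In PySpark, `dfs.schema.json()` returns the inferred schema as a JSON string once the DataFrame has been loaded. As a rough standalone sketch of what such inference produces (the field names and sample records below are invented for illustration, not taken from the original data):

```python
import json

# Hypothetical sample of the "kafka" column values (JSON strings)
records = [
    '{"id": 1, "event": "click", "ts": 1516600000}',
    '{"id": 2, "event": "view",  "ts": 1516600060}',
]

def infer_schema(json_strings):
    """Very rough schema inference: map each field name to a Python
    type name, loosely mimicking what spark.read.json does."""
    fields = {}
    for s in json_strings:
        for name, value in json.loads(s).items():
            fields[name] = type(value).__name__
    return fields

schema = infer_schema(records)
print(json.dumps(schema))  # e.g. {"id": "int", "event": "str", "ts": "int"}
```

With Spark itself, `dfs.schema.json()` (or `dfs.schema.jsonValue()` for a dict) gives the real inferred schema directly, so the sketch above is only meant to show the shape of the output.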
Thanks in advance
On 22-01-2018 17:51, Sathish Kumaran Vairavelu wrote:
> You have to register the Cassandra table in Spark as a dataframe
>
>
> https://github.com/datastax/spa
Hi list,
I have a Cassandra table with two fields: id bigint, kafka text.
My goal is to read only the kafka field (which contains JSON) and infer its
schema.
I have this skeleton code (not working):
sc.stop
import org.apache.spark._
import com.datastax.spark._
import org.apache.spark.sql.functions.ge
Just run this test yourself and check the results. During the run, also
check a worker with top.
Python:
import random
def inside(p):
    x, y = random.random(), random.random()
    return x * x + y * y < 1

def estimate_pi(num_samples):
    count = sc.parallelize(xrange(0, num_samples)).filte
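For comparison, the same estimator can be run locally without Spark. This sketch replaces the `sc.parallelize(...)` pipeline with a plain Python loop (the completion of the truncated line is assumed to be the usual `.filter(inside).count()` pattern from the standard Monte Carlo pi example):

```python
import random

def inside(_):
    # Sample a random point in the unit square; True if it lands
    # inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y < 1

def estimate_pi(num_samples):
    # Local stand-in for sc.parallelize(range(num_samples)).filter(inside).count()
    count = sum(1 for i in range(num_samples) if inside(i))
    return 4.0 * count / num_samples

print(estimate_pi(1_000_000))  # close to 3.14159
```

Running both versions with the same sample count is a quick way to see whether the cluster is actually doing the work (check a worker with top during the Spark run, as above).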
Hi list,
I'm using Cassandra with only 2 fields (id, json).
I'm using Spark to query the json. So far I can load a json file and query
it, but not yet the json field coming from Cassandra as RDDs.
sc = spark.sparkContext
path = "/home/me/red50k.json"
redirectsDF = spark.read.json(path)
redirects
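As a minimal standard-library sketch of the json-file half of this pattern (the file contents and field names below are invented; like `spark.read.json`, it assumes one JSON object per line):

```python
import json
import tempfile

# Hypothetical data standing in for /home/me/red50k.json:
lines = [
    '{"id": 1, "target": "Main_Page"}',
    '{"id": 2, "target": "Apache_Spark"}',
]

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    f.write("\n".join(lines))
    path = f.name

# Load one JSON object per line, then "query" it: select the ids whose
# target mentions Spark, the way one might filter the DataFrame
# produced by spark.read.json(path).
with open(path) as f:
    redirects = [json.loads(line) for line in f]

matches = [r["id"] for r in redirects if "Spark" in r["target"]]
print(matches)  # [2]
```

For the Cassandra side, the same filtering would apply after the json column is parsed, but that part depends on registering the table through the connector first.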