Hi,
I am trying to deploy a Spark app in a Kubernetes Cluster. The cluster consists
of 2 machines - 1 master and 1 slave, each of them with the following config:
RHEL 7.2
Docker 17.03.1
K8S 1.7.
I am following the steps provided in
Any change in the Java code (to be specific, the generated bytecode) in
the functions you pass to Spark (i.e., the map function, the reduce function, as
well as their closure dependencies) counts as an "application code change", and
will break recovery from checkpoints.
On Sat, Oct 7, 2017 at 11:53 AM,
https://issues.apache.org/jira/browse/SPARK-8
On Sun, Oct 8, 2017 at 11:58 AM, kant kodali wrote:
> I have the following so far
>
> private StructType getSchema() {
> return new StructType()
> .add("name", StringType)
> .add("address",
Tried the following.
dataset.map(new MapFunction>>() {
@Override
public List
if you are willing to use the kryo encoder you can do your original Dataset<
List
Seq(1, 2, 3).toDS.map(x => if (x % 2 == 0) x else
x.toString)(org.apache.spark.sql.Encoders.kryo[Any]).map { (x: Any)
Hi Koert,
Thanks! If I have this Dataset>> what would be the
encoding? Is it Encoding.kryo(Seq.class)?
Also, shouldn't List be supported? Should I create a ticket for this?
On Mon, Oct 9, 2017 at 6:10 AM, Koert Kuipers wrote:
> it supports
Hi all!
I would love to use Spark with a somewhat more modern logging framework
than Log4j 1.2. I have Logback in mind, mostly because it integrates well
with central logging solutions such as the ELK stack. I've read up a bit on
getting Spark 2.0 (that's what I'm using currently) to work with
Have you raised it as an issue on the ES connector GitHub? In my past experience
(with the Hadoop connector, with Pig), they respond pretty quickly.
On Tue, Oct 10, 2017 at 12:36 AM, sixers wrote:
> ### Issue description
>
> We have an issue with data consistency when storing data in
### Issue description
We have an issue with data consistency when storing data in Elasticsearch
using Spark and the elasticsearch-spark connector. The job finishes successfully,
but when we compare the original data (stored in S3) with the data stored
in ES, some documents are not present in
it supports Dataset>> where X must also be a supported type.
Object is not a supported type.
On Mon, Oct 9, 2017 at 7:36 AM, kant kodali wrote:
> Hi All,
>
> I am wondering if spark supports Dataset>> ?
>
> when I do the following
Hi All,
I am wondering if Spark supports Dataset>> ?
When I do the following, it says no map function is available:
Dataset>> resultDs = ds.map(lambda,
Encoders.bean(List.class));
Thanks!
Do you get warning info such as:
`Failed to load implementation from:
com.github.fommil.netlib.NativeSystemBLAS`
`Failed to load implementation from:
com.github.fommil.netlib.NativeRefBLAS` ?
These two warnings are emitted by `com.github.fommil.netlib.BLAS`, which
catches the original exception
Hi Users,
Is there any way to avoid the creation of .crc files when writing an RDD with
the saveAsTextFile method?
My use case: I have mounted S3 on the local file system using S3FS and am
saving an RDD to the mount point. Looking at S3, I found one .crc file for
each part file, and even a _SUCCESS file.
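The .crc sidecars come from Hadoop's checksumming local filesystem and the
_SUCCESS marker from the output committer, so there is no saveAsTextFile
option for this. A minimal sketch of one workaround is to clean them up after
the write; the mount path here is a stand-in (a temp directory is used so the
sketch is self-contained):

```shell
# Hedged sketch: remove Hadoop's .crc sidecars and the _SUCCESS marker
# after the job completes. OUT stands in for the S3FS mount point
# (e.g. your mounted output directory); a temp dir is used for the demo.
OUT=$(mktemp -d)
touch "$OUT/part-00000" "$OUT/.part-00000.crc" "$OUT/_SUCCESS"

find "$OUT" -name '*.crc' -type f -delete   # drop checksum sidecars
rm -f "$OUT/_SUCCESS"                       # drop the job-success marker
```

If they work in your Hadoop version, two cleaner knobs may be worth verifying:
`mapreduce.fileoutputcommitter.marksuccessfuljobs=false` to suppress _SUCCESS,
and writing through `org.apache.hadoop.fs.RawLocalFileSystem` (via `fs.file.impl`)
to skip checksum files entirely.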
Hi,
I am getting the following warning when I run the PySpark job:
My code is:
mat = RowMatrix(tf_rdd_vec.cache()) # RDD is cached
svd = mat.computeSVD(num_topics, computeU=False)
I am using an Ubuntu 16.04 EC2 instance, and I have installed the following
libraries on my system:
sudo apt
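This warning usually just means netlib-java fell back to its pure-Java BLAS
because no native implementation was found; computeSVD still runs, only more
slowly. A minimal sketch of one way to provide natives on Ubuntu 16.04
(package names are assumptions, not taken from this thread; requires root):

```shell
# Hedged sketch: install a native BLAS/LAPACK plus the gfortran runtime
# that netlib-java's native loaders look for at runtime.
sudo apt-get update
sudo apt-get install -y libopenblas-base liblapack3 libgfortran3
```

Note that Spark's MLlib only attempts the native path when it was built with
netlib-lgpl support; otherwise adding the `com.github.fommil.netlib:all`
artifact to the classpath is the commonly suggested route.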
After doing the group, you can use mkString on the data frame. Following is an
example where all columns are concatenated with a space as the separator.
scala> call_cdf.map(row => row.mkString(" ")).show(false)