No Java installed? Or the process can't find it? Is JAVA_HOME not set?
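Two of those checks can be scripted from Python itself, since PySpark needs to locate a JVM at startup (a minimal sketch; the paths it prints will vary per machine):

```python
import os
import shutil

# Is a `java` binary on PATH, and is JAVA_HOME set? PySpark needs one of
# these to find a JVM; None for both usually explains the failure above.
java_path = shutil.which("java")
java_home = os.environ.get("JAVA_HOME")
print("java on PATH:", java_path)
print("JAVA_HOME   :", java_home)
```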
On Fri, 13 Nov 2020 at 23:24, Mich Talebzadeh wrote:
> Hi,
>
> This is basically a simple module
>
> from pyspark import SparkContext
> from pyspark.sql import SQLContext
> from pyspark.sql import HiveContext
> from pyspark.sql import SparkSession
> from pyspark.sql import Row
> from pyspark.sql.types import StringType, ArrayType
> from pyspark.sql.functions import
* When you say the refresh happens only for batch (non-streaming) sources, I am assuming that all kinds of sources, such as an RDBMS, a distributed data store, or a file system, count as batch sources. Please correct me if needed.
It depends on how you read the DataFrame. Any DataFrame that you get by doing
Thanks for the quick response.
This is a batch use case in the as-is world. We are redesigning it and intend
to use streaming. Good to know that Spark streaming will refresh the data for
every micro-batch.
Is this a streaming application or a batch application?
Normally, for batch applications, you want to keep the data consistent. If you
have a portfolio of mortgages that you are computing payments for and the
interest rate changes while you are computing them, you don't want to compute
half the payments at the old rate and half at the new one.
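A toy illustration of that consistency point, in plain Python rather than Spark (the rate and the balances are made up): snapshot the interest rate once, up front, so every mortgage in the portfolio is priced consistently even if the live rate changes mid-computation.

```python
def monthly_payment(principal, annual_rate, months=360):
    """Standard fixed-rate annuity payment formula."""
    r = annual_rate / 12
    return principal * r / (1 - (1 + r) ** -months)

live_rate = {"value": 0.05}          # mutable "reference data"
portfolio = [100_000, 250_000, 400_000]

rate_snapshot = live_rate["value"]   # read once, at the start of the job
live_rate["value"] = 0.07            # the rate changes while we are running

payments = [monthly_payment(p, rate_snapshot) for p in portfolio]
# Every payment uses the 5% snapshot; none are computed at the new 7% rate.
```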
Hi
In the financial systems world, if some data is updated very frequently, and
that data is to be used as reference data by a Spark job that runs for 6-7
hours, the Spark job will most likely read that data at the beginning, keep it
in memory as a DataFrame, and keep running with that snapshot for the remaining
duration.
Hi,
Is it recommended to use Spark on Kubernetes in production?
The Spark operator for Kubernetes still seems to be in beta.