Hi,
I am running Spark 2.0.0 and 2.1.1 on YARN in a Hadoop 2.7.3 cluster. Is
spark-env.sh sourced when starting the Spark AM container or the executor
container?
I saw this paragraph in
https://github.com/apache/spark/blob/master/docs/configuration.md:
Note: When running Spark on YARN in cluster
Hi,
No, this is not possible with the current Data Source API. However, there is a
new Data Source API (v2) on its way; maybe it will support this.
Alternatively, you could add a config option to calculate metadata after an
insert.
However, could you please explain more for which DB your
Hi,
I'm working on integrating Spark and a custom data source.
Most things work well with the Spark Data Source APIs (thanks to the
well-designed APIs).
But one thing I couldn't resolve is how to execute a custom metadata query
for `ANALYZE TABLE`.
The custom data source I'm currently working on has
I noticed some weird behavior with NingWSClient 2.4.3 when used with Spark.
Try this:
1. Spin up spark-shell with play-ws 2.4.3 on the driver classpath
2. Run this code
val config = new AsyncHttpClientConfigBean()
config.setAcceptAnyCertificate(true)
config.setFollowRedirect(true)
val
You can repartition your DataFrame into 1 partition and all the data will land
in one partition. However, doing this is perilous because you will end up
with all your data on one node, and if you have too much data you will run out
of memory. In fact, anytime you are thinking about putting
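For illustration, a minimal sketch of the advice above (assuming an existing SparkSession `spark` and DataFrame `df`; both names are hypothetical):

```scala
// coalesce(1) narrows to a single partition without a full shuffle;
// repartition(1) forces a shuffle but lets upstream stages stay parallel.
// Either way, all rows end up in one partition on one executor,
// so this only works when the whole dataset fits in that executor's memory.
val single = df.coalesce(1)
println(single.rdd.getNumPartitions) // 1
```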
I'm writing a sliding-window analytics program using the functions.window
function (
https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/functions.html#window(org.apache.spark.sql.Column,%20java.lang.String,%20java.lang.String)
)
The code looks like this:
Column slidingWindow =
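(For reference, a minimal sketch of a sliding-window aggregation with functions.window, assuming a DataFrame `events` with a timestamp column "ts" and a numeric column "value"; all three names are hypothetical:)

```scala
import org.apache.spark.sql.functions.{avg, col, window}

// 10-minute windows that slide every 5 minutes, so each event
// falls into two overlapping windows.
val slidingAvg = events
  .groupBy(window(col("ts"), "10 minutes", "5 minutes"))
  .agg(avg(col("value")))
```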
Hi Jeroen,
in case you are using Hive partitions, how many partitions do you have?
Also, is there any chance that you might post the code?
Regards,
Gourav Sengupta
On Tue, Jan 2, 2018 at 7:50 AM, Jeroen Miller
wrote:
> Hello Gourav,
>
> On 30 Dec 2017, at 20:20, Gourav