Re: Do spark-submit overwrite the Spark session created manually?

2018-12-31 Thread Neo Chien
Hi, According to the official spec quoted below, I think settings made through SparkSession.builder take the highest priority, which means the options passed to ‘spark-submit’ would be ignored. Please correct me if I am wrong, many thanks. Properties set directly on the SparkConf take highest

Re: Corrupt record handling in spark structured streaming and from_json function

2018-12-31 Thread Colin Williams
Dear Spark user community, I have received some insight regarding filtering separate dataframes in my Spark Structured Streaming job. However, I wish to write each of the dataframes mentioned in the Stack Overflow question to a separate location using a parquet writer. My initial

structure streaming dataframe/dataset join (Java)

2018-12-31 Thread Mann Du
Hello there, I am trying to calculate a simple difference between adjacent rows (ts = ts - 10) of a column for a dataset using a self-join. The SQL expression worked for static datasets (trackT) as: Dataset trackDiff = spark.sql(" select a.*, " + "a.posX - coalesce(b.posX, 0) as delX,

Do spark-submit overwrite the Spark session created manually?

2018-12-31 Thread email
Hi Community, When we submit a job using 'spark-submit' and pass options like the 'master url', what should the content of the main class be? For example, if I create the session myself: val spark = SparkSession.builder.master("local[*]").appName("Console")

Spark jdbc postgres numeric array

2018-12-31 Thread Alexey
Hi, I came across strange behavior when dealing with Postgres columns of type numeric[] using Spark 2.3.2 and PostgreSQL 10.4 and 9.6.9. Consider the following table definition: create table test1 ( v numeric[], d numeric ); insert into test1 values('{.222,.332}', 222.4555); When

Re: Postgres Read JDBC with COPY TO STDOUT

2018-12-31 Thread Nicolas Paris
The resulting library is on GitHub: https://github.com/EDS-APHP/spark-postgres While there is room for improvement, it can read and write Postgres data with the COPY statement, which allows reading and writing **very large** tables without problems. On Sat, Dec 29, 2018 at 01:06:00PM +0100,