Re: Data Contracts

2023-06-14 Thread Jean-Georges Perrin
Hi, While I was at PayPal, we open sourced a Data Contract template; it is here: https://github.com/paypal/data-contract-template. Companies like GX (Great Expectations) are interested in using it. Spark could read some elements from it pretty easily, like schema validation, some rules vali

Issues with Delta Lake on 3.0.0 preview + preview 2

2019-12-30 Thread Jean-Georges Perrin
Hi there, Trying to run a very simple app saving content of a dataframe to Delta Lake. Code works great on 2.4.4 but fails on 3.0.0 preview & preview 2. Tried on both Delta Lake 0.5.0 and 0.4.0. Code (I know, it’s amazing): df.write().format("delta") .mode("overwrite") .sav

Re: Issue with map Java lambda function with 3.0.0 preview and preview 2

2019-12-28 Thread Jean-Georges Perrin
e different in Spark 3; some of that has always > been the case with Java 8 in Spark 2. I think it might be related to > Scala 2.12; were you using Spark 2 with Scala 2.11 before? > > On Sat, Dec 28, 2019 at 11:38 AM Jean-Georges Perrin wrote: >> >> Hey guys, &
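
A minimal plain-Java sketch of why a Scala 2.12 build could break Java lambdas, assuming the failure is an overload ambiguity (the snippets here do not show the actual error). `JavaStyleFunction`, `ScalaStyleFunction`, and `map` below are hypothetical stand-ins, not Spark classes: when two overloads each accept a functional interface of the same shape, an untyped lambda matches both, and an explicit cast selects one.

```java
// Hypothetical stand-ins (not Spark classes): two unrelated functional
// interfaces with the same shape, one per overload.
interface JavaStyleFunction<T, R> {
    R call(T value);
}

interface ScalaStyleFunction<T, R> {
    R apply(T value);
}

class LambdaCastSketch {
    // Overload 1: accepts the Java-facing functional interface.
    static <T, R> String map(JavaStyleFunction<T, R> f, T input) {
        return "java-style:" + f.call(input);
    }

    // Overload 2: accepts the Scala-facing functional interface.
    static <T, R> String map(ScalaStyleFunction<T, R> f, T input) {
        return "scala-style:" + f.apply(input);
    }

    static String viaJavaOverload(Integer input) {
        // map(x -> x + 1, input) would be rejected by javac as ambiguous,
        // because the untyped lambda fits both overloads.
        // The cast gives the lambda a single target type.
        return map((JavaStyleFunction<Integer, Integer>) x -> x + 1, input);
    }

    static String viaScalaOverload(Integer input) {
        return map((ScalaStyleFunction<Integer, Integer>) x -> x + 1, input);
    }

    public static void main(String[] args) {
        System.out.println(viaJavaOverload(1));
        System.out.println(viaScalaOverload(1));
    }
}
```

If the same cause applies to the Spark code in this thread, the analogous workaround would be casting the lambdas to Spark's Java functional interfaces (`MapFunction`, `ReduceFunction`); that is an assumption, not something the quoted messages confirm.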

Re: Issue with map Java lambda function with 3.0.0 preview and preview 2

2019-12-28 Thread Jean-Georges Perrin
I forgot… it does the same thing with the reducer… int dartsInCircle = dotsDs.reduce((x, y) -> x + y); jg > On Dec 28, 2019, at 12:38 PM, Jean-Georges Perrin wrote: > > Hey guys, > > This code: > > Dataset incrementalDf = spark > .cr

Issue with map Java lambda function with 3.0.0 preview and preview 2

2019-12-28 Thread Jean-Georges Perrin
Hey guys, This code: Dataset incrementalDf = spark .createDataset(l, Encoders.INT()) .toDF(); Dataset dotsDs = incrementalDf .map(status -> { double x = Math.random() * 2 - 1; double y = Math.random() * 2 - 1; counter++; if (
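
For readers without a Spark setup, here is a hedged plain-Java sketch of what the map and reduce steps in this thread appear to compute. The variable name `dartsInCircle` suggests a Monte Carlo estimate of pi, but the snippet cuts off at `if (`, so the in-circle test below is an assumption. `DartsSketch`, `throwDart`, and `estimatePi` are hypothetical names, and `java.util.stream` stands in for the Spark `Dataset` API.

```java
import java.util.Random;
import java.util.stream.IntStream;

class DartsSketch {
    // Map step: throw one dart at the square [-1, 1] x [-1, 1] and
    // return 1 if it lands inside the unit circle, 0 otherwise.
    static int throwDart(Random random) {
        double x = random.nextDouble() * 2 - 1;
        double y = random.nextDouble() * 2 - 1;
        // Assumed condition: the original snippet is truncated at "if (".
        return (x * x + y * y <= 1) ? 1 : 0;
    }

    // Reduce step: sum the hits, then scale by the area ratio
    // (circle area / square area = pi / 4).
    static double estimatePi(int numberOfThrows, long seed) {
        Random random = new Random(seed);
        int dartsInCircle = IntStream.range(0, numberOfThrows)
                .map(i -> throwDart(random))
                .reduce(0, (x, y) -> x + y);
        return 4.0 * dartsInCircle / numberOfThrows;
    }

    public static void main(String[] args) {
        System.out.println("Estimate of pi: " + estimatePi(100_000, 42L));
    }
}
```

The fixed seed only makes runs repeatable; the original code uses `Math.random()`.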

Re: Thoughts on Spark 3 release, or a preview release

2019-09-11 Thread Jean Georges Perrin
As a user/non-committer, +1. I love the idea of an early 3.0.0 so we can test current dev against it. I know the final 3.x will probably need another round of testing when it gets out, but less for sure... I know I could check out and compile, but having a “packaged” preversion is great if it doe

Re: [DISCUSSION]JDK11 for Apache 2.x?

2019-08-27 Thread Jean Georges Perrin
Not a contributor, but a user perspective… As Spark 3.x will be an evolution, I am not completely shocked that it would imply a Java 11 requirement as well. Would be great to have both Java 8 and Java 11, but one needs to be able to say goodbye. Java 8 is great, still using it actively in produ

Spark in Action, 2e...

2019-07-10 Thread Jean-Georges Perrin
://linkedin.com/in/jgperrin <http://linkedin.com/in/jgperrin>) Thanks for building Spark! Jean-Georges Perrin j...@jgp.net

Re: DataSourceV2 sync, 17 April 2019

2019-04-27 Thread Jean Georges Perrin
” v2 API that will be available after the release of Spark v3. jg -- Jean Georges Perrin j...@jgp.net > On Apr 19, 2019, at 10:10, Ryan Blue wrote: > > Here are my notes from the last DSv2 sync. As always: > > If you’d like to attend the sync, send me an email and I’ll add you t

Re: [RESULT] [VOTE] Functional DataSourceV2 in Spark 3.0

2019-03-03 Thread Jean Georges Perrin
Hi, I am kind of new at the whole Apache process (not specifically Spark). Does that mean that DataSourceV2 is dead or stays experimental? Thanks for clarifying for a newbie. jg > On Mar 3, 2019, at 11:21, Ryan Blue wrote: > > This vote fails with the following counts: > > 3 +1 votes:

Re: Static functions

2019-02-15 Thread Jean Georges Perrin
Javadoc in https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html… or anywhere else (Scala doc or SQL functions). Jean Georges Perrin j...@jgp.net > On Feb 11, 2019, at 09:42, Jacek Laskowski wrote: > > Hi Jean, > > I thought the functions have alre

Static functions

2019-02-10 Thread Jean Georges Perrin
Hey guys, We have 381 static functions now (including the deprecated). I am trying to sort them out by group/tag them. So far, I have: Array, Conversion, Date, Math, Trigo (sub group of maths), Security, Streaming, String, Technical. Do you see more categories? Tags? Thanks! jg -- Jean Georges Perrin

Re: [VOTE] SPARK 2.4.0 (RC3)

2018-10-10 Thread Jean Georges Perrin
d=12315420&version=12342385 > > <https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315420&version=12342385> > > Bests, > Dongjoon. > > > On Wed, Oct 10, 2018 at 11:29 AM Jean Georges Perrin <mailto:j...@jgp.net>> wrote: > Hi,

Re: [VOTE] SPARK 2.4.0 (RC3)

2018-10-10 Thread Jean Georges Perrin
Hi, Sorry if it's a stupid question, but where can I find the release notes of 2.4.0? jg > On Oct 10, 2018, at 2:00 PM, Imran Rashid > wrote: > > Sorry I had messed up my testing earlier, so I only just discovered > https://issues.apache.org/jira/browse/SPAR