Re: Pyspark: Issue using sql in foreachBatch sink

2020-07-31 Thread Jungtaek Lim
Python doesn't let you omit the parentheses on a call that takes no parameters, whereas Scala does. Use `write()`, not `write`.

On Wed, Jul 29, 2020 at 9:09 AM muru wrote:
> In a pyspark SS job, trying to use sql instead of sql functions in
> foreachBatch sink throws AttributeError: 'JavaMember' object has no attribute
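For context, a minimal sketch of a foreachBatch sink that runs SQL against each micro-batch (this is not the original poster's code, which isn't shown in full; the rate source, view name, query, and output path are all illustrative):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("foreach-batch-sql").getOrCreate()

def process_batch(batch_df, batch_id):
    # Register the micro-batch so it can be queried with plain SQL
    # instead of the DataFrame/sql functions API.
    batch_df.createOrReplaceTempView("micro_batch")
    # Query through the session that owns this micro-batch DataFrame.
    result = batch_df.sql_ctx.sql(
        "SELECT value % 10 AS key, count(*) AS cnt "
        "FROM micro_batch GROUP BY value % 10"
    )
    # Python always needs the explicit () on calls that Scala lets you
    # abbreviate; leaving the parentheses off a JVM-backed (py4j) method is
    # the kind of mistake that raises the 'JavaMember' AttributeError above.
    result.write.mode("append").format("parquet").save("/tmp/foreach_batch_out")

query = (
    spark.readStream.format("rate").load()
    .writeStream
    .foreachBatch(process_batch)
    .start()
)
query.awaitTermination()
```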

Re: Tab delimited csv import and empty columns

2020-07-31 Thread Vladimir Ryzhov
Would `df.na.fill("")` do the trick?

On Fri, Jul 31, 2020 at 8:43 AM Sean Owen wrote:
> Try setting nullValue to anything besides the empty string. Because its
> default is the empty string, empty strings become null by default.
>
> On Fri, Jul 31, 2020 at 3:20 AM Stephen Coy wrote:
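A quick sketch of that suggestion on an illustrative in-memory DataFrame: `na.fill("")` turns nulls in string columns back into empty strings after the load.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustrative frame; the None stands in for an empty tab-delimited field
# that the CSV reader turned into null.
df = spark.createDataFrame([("a", None), ("b", "x")], ["k", "v"])

# Rewrite nulls in string columns as empty strings after the fact,
# instead of (or in addition to) changing the reader's nullValue.
df.na.fill("").show()
```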

Re: Tab delimited csv import and empty columns

2020-07-31 Thread Sean Owen
Try setting nullValue to anything besides the empty string. Because its default is the empty string, empty strings become null by default.

On Fri, Jul 31, 2020 at 3:20 AM Stephen Coy wrote:
> That does not work.
>
> This is Spark 3.0 by the way.
>
> I have been looking at the Spark unit tests
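A sketch of that workaround; the file path and the `\N` sentinel here are made up, and the point is only that nullValue should be a token that never occurs in the data:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Per the suggestion above: with the default nullValue (the empty string),
# empty fields are read back as null; pointing nullValue at a sentinel that
# never appears in the data is meant to keep them as empty strings.
df = (spark.read
      .option("sep", "\t")
      .option("header", "true")
      .option("nullValue", "\\N")    # hypothetical sentinel; any unused token
      .csv("/path/to/data.tsv"))     # hypothetical path
```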

Re: Tab delimited csv import and empty columns

2020-07-31 Thread Stephen Coy
That does not work. This is Spark 3.0, by the way. I have been looking at the Spark unit tests, and there do not seem to be any that load a CSV text file and verify that an empty string maps to an empty string, which I think is supposed to be the default behaviour because the “nullValue”
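One way to check the behaviour being debated, assuming a local Spark session and a hypothetical scratch path: write a tiny tab-delimited file with an empty field, re-read it, and see whether the field comes back as "" or as null.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Tiny tab-delimited file with an empty middle field on the data row.
path = "/tmp/empty_field_check.tsv"   # hypothetical scratch path
with open(path, "w") as f:
    f.write("a\tb\tc\n")
    f.write("1\t\t3\n")

df = (spark.read
      .option("sep", "\t")
      .option("header", "true")
      .csv(path))
df.show()   # inspect column b: does it show an empty string or null?
```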