Hello,

I have a PR, https://github.com/apache/spark/pull/45620, ready to go that will
extend the definition of whitespace (the characters that separate tokens) from
the small set of ASCII characters (space, tab, linefeed) to those defined in
Unicode.
While this is a small and safe change, it is one that we would have a hard
time reversing later.
It is also a change that, AFAIK, cannot be gated behind a config.
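
To illustrate the kind of difference involved, here is a minimal Python sketch. It is not taken from the PR and does not use Spark's parser: the helper names are hypothetical, and Python's str.split() is only an approximation of "whitespace as defined in Unicode", so the exact character set the PR accepts may differ.

```python
import re

# The ASCII separators named above: space, tab, linefeed.
ASCII_WS = re.compile(r"[ \t\n]+")


def split_ascii(text):
    """Hypothetical helper: split on space, tab, and linefeed only."""
    return [tok for tok in ASCII_WS.split(text) if tok]


def split_unicode(text):
    """Hypothetical helper: split on whatever Python treats as whitespace.

    Assumption: str.split() with no argument stands in for a Unicode-aware
    definition; it is not guaranteed to match the set used in the PR.
    """
    return text.split()


# NO-BREAK SPACE, EM SPACE, and IDEOGRAPHIC SPACE between the words.
query = "SELECT\u00a01\u2003AS\u3000x"

print(split_ascii(query))    # one unsplit token under the ASCII-only rule
print(split_unicode(query))  # four tokens under the Unicode-aware rule
```

Under the ASCII-only rule the whole string stays a single token, while the Unicode-aware rule separates it into SELECT, 1, AS, and x.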

What does the community think?

Cheers
Serge
SQL Architect at Databricks
