Re: [DISCUSS] Null-handling of primitive-type of untyped Scala UDF in Scala 2.12

2020-03-16 Thread wuyi
Thanks Sean and Takeshi. Option 1 seems really impossible. And I'm going to take Option 2 as an alternative choice.

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Stephen Coy
I don’t think I can recall any usages of type CHAR in any situation. Really, its only use (on any traditional SQL database) would be when you *want* a fixed-width character column that has been right-padded with spaces. On 17 Mar 2020, at 12:13 pm, Reynold Xin <r...@databricks.com>

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Reynold Xin
For sure. There's another reason I feel char is not that important and it's more important to be internally consistent (e.g. all data sources support it with the same behavior, vs. one data source doing one behavior and another doing the other). char was created at a time when CPUs were slow and

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Dongjoon Hyun
Thank you for sharing and confirming. We had better consider all heterogeneous customers in the world. And I also have experience with non-negligible cases on-prem. Bests, Dongjoon. On Mon, Mar 16, 2020 at 5:42 PM Reynold Xin wrote: > −User > > char barely showed up (honestly

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Reynold Xin
−User char barely showed up (honestly negligible). I was comparing select vs select. On Mon, Mar 16, 2020 at 5:40 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > wrote: > > Ur, are you comparing the number of SELECT statements with TRIM and CREATE > statements with `CHAR`? > > > I looked up our

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Dongjoon Hyun
Ur, are you comparing the number of SELECT statements with TRIM and CREATE statements with `CHAR`? > I looked up our usage logs (sorry I can't share this publicly) and trim has at least four orders of magnitude higher usage than char. We need to discuss more about what to do. This thread is what

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Reynold Xin
BTW I'm not opposing us sticking to the SQL standard (I'm in general for it). I was merely pointing out that if we deviate from the SQL standard in any way we are considered "wrong" or "incorrect". That argument itself is flawed when plenty of other popular database systems also deviate away from

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Reynold Xin
I looked up our usage logs (sorry I can't share this publicly) and trim has at least four orders of magnitude higher usage than char. On Mon, Mar 16, 2020 at 5:27 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > wrote: > > Thank you, Stephen and Reynold. > > > To Reynold. > > > The way I see

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Dongjoon Hyun
Thank you, Stephen and Reynold. To Reynold. I see the following a little differently. > CHAR is an undocumented data type without clearly defined semantics. Let me describe it from an Apache Spark user's point of view. Apache Spark started to claim `HiveContext` (and `hql/hiveql`

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Stephen Coy
Hi there, I’m kind of new around here, but I have had experience with all of the so-called “big iron” databases such as Oracle, IBM DB2 and Microsoft SQL Server as well as PostgreSQL. They all support the notion of “ANSI padding” for CHAR columns - which means that such columns are always
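
The right-padding behavior mentioned in this message can be illustrated with a minimal Python sketch. It is not Spark code and not taken from the thread: `pad_char` is a hypothetical helper, and rejecting over-long values with an error is an assumption made for the example (real databases differ on that point).

```python
def pad_char(value: str, width: int) -> str:
    """Return `value` right-padded with spaces to exactly `width` characters.

    Hypothetical helper mimicking how a fixed-width CHAR(width) column
    stores its values. Raising on over-long input is an assumption of
    this sketch; actual databases may truncate or error instead.
    """
    if len(value) > width:
        raise ValueError(f"value of length {len(value)} does not fit CHAR({width})")
    return value.ljust(width)


stored = pad_char("abc", 5)

# The stored value always has the declared width.
print(len(stored))

# A plain string comparison sees the padding, so it differs from "abc".
print(stored == "abc")

# Stripping the trailing spaces recovers the original value.
print(stored.rstrip(" ") == "abc")
```

Under these assumptions, the padded value only compares equal to the unpadded literal once trailing spaces are stripped, which is the practical difference between padding and not padding that this thread is debating.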

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Reynold Xin
I haven't spent enough time thinking about it to give a strong opinion, but this is of course very different from TRIM. TRIM is a publicly documented function with two arguments, and we silently swapped the two arguments. And trim has also been quite commonly used for a long time. CHAR is an

Re: FYI: The evolution on `CHAR` type behavior

2020-03-16 Thread Dongjoon Hyun
Hi, Reynold. (And +Michael Armbrust) If you think so, do you think it's okay that we change the return value silently? If so, I'm wondering why we reverted the `TRIM` functions. > Are we sure "not padding" is "incorrect"? Bests, Dongjoon. On Sun, Mar 15, 2020 at 11:15 PM Gourav Sengupta

Re-triggering failed GitHub workflows

2020-03-16 Thread Nicholas Chammas
Is there any way contributors can retrigger a failed GitHub workflow, like we do with Jenkins? There's supposed to be a "Re-run all checks" button, but I don't see it. Do we need INFRA to grant permissions for that, perhaps? Right now I'm doing it by adding empty commits: ``` git commit