@John, I agree with what you suggested. The behavior on Postgres makes more sense.
The issue is in Apache Calcite, the sql planner Drill that uses. I post this question on Calcite dev list. The fix would be in Calcite code. On Tue, May 24, 2016 at 8:01 AM, Zelaine Fong <zf...@maprtech.com> wrote: > I agree that this is a bug, as you've already noted via DRILL-4692. > Putting backticks around USER should treat it as an identifier, not a > reserved word or special function. Based on Jinfeng's findings, it looks > like Drill puts special function names in the same namespace as > identifiers, resulting in the current behavior. > > -- Zelaine > > On Tue, May 24, 2016 at 5:14 AM, John Omernik <j...@omernik.com> wrote: > >> I think if there is any change that one of the interpretations could be to >> allow `user` (backtick user backtick) that we should do that... Drill uses >> backticks as the quoted identifier (I gather Postgres uses double quotes as >> it's identifier) having user be the column name will same many >> organizations from errorless "wrong" results... I just worry about the down >> stream there... I'd probably feel less strongly about this if A. the word >> was less likely to be a column name (like current_drill_user vs user) and >> B. I wasn't a user with a background in security and see all the data >> sources that use "user" as a column. I just have this sinking feeling it >> will lead to bad things in data analysis for many users. >> >> I guess it comes down to the principle of least surprise, I believe that in >> this case, `user` providing the the column name user is less surprising >> then `user` providing the current drill user, and if we can make it align >> with out SQL systems (like how Postgres behaves with ITS quoted identifier) >> than I think we have a compelling case for changing how drill reacts here. >> >> Thoughts? >> >> (Once again, thanks for continued follow-up here, I love talking about >> issues like these) >> >> John >> >> >> >> On Mon, May 23, 2016 at 10:39 PM, Jinfeng Ni <jinfengn...@gmail.com> >> wrote: >> >> > mydb=# select "user" from t1; >> > user >> > ------ >> > ABC >> > >> > I should take back what I said. With quoted identifier, Postgres >> > behaved different from Drill. Both of the interpretations seem to be >> > reasonable, since the identifier could represent two different things. >> > >> > >> > On Mon, May 23, 2016 at 7:41 PM, Zelaine Fong <zf...@maprtech.com> >> wrote: >> > > Jinfeng, >> > > >> > > What does postgres return for the following query in your example? >> > > >> > > select "user" from t1; >> > > >> > > -- Zelaine >> > > >> > > On Mon, May 23, 2016 at 7:39 PM, John Omernik <j...@omernik.com> >> wrote: >> > > >> > >> Hmm, you are correct, I don't have to like it :) but there is both >> logic >> > >> and precedence here. Thanks for following up >> > >> >> > >> John >> > >> >> > >> On Monday, May 23, 2016, Jinfeng Ni <jinfengn...@gmail.com> wrote: >> > >> >> > >> > An quoted identifier is still an identifier (Drill uses back tick as >> > >> > quote). Per SQL standard, identifier CURRENT_USER / USER/ >> > >> > CURRENT_SESSION/etc are implicit function calls; no () is required. >> > >> > >> > >> > I checked Postgre, and seems it has the same behavior. >> > >> > >> > >> > mydb=# create table t1 (id int, "user" varchar(10)); >> > >> > >> > >> > mydb=# insert into t1 values(100, 'ABC'); >> > >> > INSERT 0 1 >> > >> > >> > >> > mydb=# select * from t1; >> > >> > id | user >> > >> > -----+------ >> > >> > 100 | ABC >> > >> > (1 row) >> > >> > >> > >> > mydb=# select user from t1; >> > >> > current_user >> > >> > -------------- >> > >> > postgres >> > >> > (1 row) >> > >> > >> > >> > mydb=# select t1.user from t1; >> > >> > user >> > >> > ------ >> > >> > ABC >> > >> > (1 row) >> > >> > >> > >> > >> > >> > >> > >> > On Mon, May 23, 2016 at 5:12 PM, John Omernik <j...@omernik.com >> > >> > <javascript:;>> wrote: >> > >> > > Can (should) things inside back ticks be callable? I guess this >> > makes a >> > >> > > very difficult situation from a usability standpoint because user >> > is a >> > >> > not >> > >> > > uncommon column name (think security logs, web logs, etc) yet in >> the >> > >> > > current setup there is lots of possibility for assumptions on >> > calling >> > >> > back >> > >> > > tick user back tick and without an error users may have wrong, but >> > >> > "error" >> > >> > > free results. >> > >> > > On May 23, 2016 4:54 PM, "Jinfeng Ni" <jinfengn...@gmail.com >> > >> > <javascript:;>> wrote: >> > >> > > >> > >> > >> The problem here is that identifier 'user' is not only a reserved >> > >> > >> word, but also represents a special function == current_user() >> > call. >> > >> > >> The identifier 'user', whether it's quoted or not, could mean >> > either >> > >> > >> column name or the function call. Without the table alias, it >> > could >> > >> > >> be ambiguous to sql parser. The table alias informs the parser >> that >> > >> > >> this identifier is not a function call, but a regular identifier, >> > thus >> > >> > >> removes the ambiguity. >> > >> > >> >> > >> > >> This is different from other cases you use quoted reserved word >> to >> > >> > >> represent a column name, since those reserved words do not >> > represent a >> > >> > >> special function, thus no ambiguity. >> > >> > >> >> > >> > >> select `update`, `insert` from dfs.tmp.`1.json`; >> > >> > >> +---------+---------+ >> > >> > >> | update | insert | >> > >> > >> +---------+---------+ >> > >> > >> | abc | 100 | >> > >> > >> +---------+---------+ >> > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> On Mon, May 23, 2016 at 10:44 AM, John Omernik <j...@omernik.com >> > >> > <javascript:;>> wrote: >> > >> > >> > Ya, as I am testing, this works, however, the users of the >> system >> > >> > expect >> > >> > >> to >> > >> > >> > be able to use `user` and while I can provide them instructions >> > to >> > >> > use a >> > >> > >> > table alias, I am very worried that they will forget and since >> it >> > >> > doesn't >> > >> > >> > error, but instead puts in a different string, this could lead >> to >> > >> bad >> > >> > >> > downstream results... >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > On Mon, May 23, 2016 at 12:41 PM, John Omernik < >> j...@omernik.com >> > >> > <javascript:;>> wrote: >> > >> > >> > >> > >> > >> >> I filed https://issues.apache.org/jira/browse/DRILL-4692 >> > >> > >> >> >> > >> > >> >> I see an alias would work as a tmp fix, but this should be >> > address >> > >> (I >> > >> > >> >> wonder if other words may have a problem too?) >> > >> > >> >> >> > >> > >> >> >> > >> > >> >> >> > >> > >> >> On Mon, May 23, 2016 at 12:38 PM, Andries Engelbrecht < >> > >> > >> >> aengelbre...@maprtech.com <javascript:;>> wrote: >> > >> > >> >> >> > >> > >> >>> Hmm interesting. >> > >> > >> >>> >> > >> > >> >>> As a workaround just use a table alias when referencing the >> > >> column. >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> Might be good to se if there is a JIRA for this, or file one >> if >> > >> not. >> > >> > >> >>> >> > >> > >> >>> --Andries >> > >> > >> >>> >> > >> > >> >>> > On May 23, 2016, at 10:28 AM, John Omernik < >> j...@omernik.com >> > >> > <javascript:;>> wrote: >> > >> > >> >>> > >> > >> > >> >>> > I have data with a field name user. >> > >> > >> >>> > >> > >> > >> >>> > When I select, with backticks, it doesn't show the field, >> but >> > >> > >> instead my >> > >> > >> >>> > current logged in user... >> > >> > >> >>> > >> > >> > >> >>> > >> > >> > >> >>> > select CONVERT_FROM(`user`, 'UTF8') as `user` from table >> > limit >> > >> 10; >> > >> > >> >>> > >> > >> > >> >>> > >> > >> > >> >>> > Shouldn't the backticks allow me to reference the field >> > >> properly? >> > >> > >> >>> > >> > >> > >> >>> > John >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >> >> > >> > >> >> > >> > >> > >> >> > >> >> > >> -- >> > >> Sent from my iThing >> > >> >> > >>