[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14055969#comment-14055969 ]

Navis commented on HIVE-494:

The test failures do not appear to be related to this patch.

Select columns by index instead of name
---
Key: HIVE-494
URL: https://issues.apache.org/jira/browse/HIVE-494
Project: Hive
Issue Type: Wish
Components: Clients, Query Processor
Reporter: Adam Kramer
Assignee: Navis
Priority: Minor
Labels: SQL
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-494.D1641.1.patch, HIVE-494.2.patch.txt, HIVE-494.3.patch.txt, HIVE-494.D12153.1.patch

SELECT mytable[0], mytable[2] FROM some_table_name mytable;

...should return the first and third columns, respectively, from mytable regardless of their column names. The need for names specifically is kind of silly when they just get translated into numbers anyway.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049363#comment-14049363 ]

Hive QA commented on HIVE-494:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653349/HIVE-494.3.patch.txt

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5675 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-652/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653349
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041800#comment-14041800 ]

Navis commented on HIVE-494:

[~xuefuz], [~appodictic] Is anyone still interested in this?
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042453#comment-14042453 ]

Xuefu Zhang commented on HIVE-494:

Yeah, I think this is useful, as long as it doesn't introduce ambiguity. The original proposal of the table[0] notation is ambiguous, but the current table.$0 is good. Also, we should make sure that Hive emits an error message when $0 cannot be uniquely identified, such as in a join, similar to the case where a column name cannot be uniquely identified.
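The error behavior requested above can be illustrated with a minimal Python sketch. It is not Hive's resolver: the function name `resolve_positional`, the `AmbiguousColumnError` type, the 0-based index, and the rule that any unqualified `$n` over more than one input is rejected are all assumptions made for illustration.

```python
class AmbiguousColumnError(Exception):
    """Raised when a positional reference could refer to several inputs."""


def resolve_positional(ref, inputs):
    """Resolve 'alias.$n' or '$n' against a dict of {alias: [column names]}.

    Returns (alias, column_name). Indexes are 0-based, as in table.$0.
    Hypothetical helper: the rules here are assumptions, not Hive behavior.
    """
    # Split an optional alias from the positional part: "a.$1" -> ("a", "$1").
    alias, _, pos = ref.rpartition(".")
    if not pos.startswith("$") or not pos[1:].isdigit():
        raise ValueError(f"not a positional reference: {ref!r}")
    index = int(pos[1:])

    if not alias:
        # An unqualified $n is only meaningful when there is a single input.
        if len(inputs) != 1:
            raise AmbiguousColumnError(
                f"{ref!r} is ambiguous; qualify it with one of {sorted(inputs)}"
            )
        alias = next(iter(inputs))
    elif alias not in inputs:
        raise ValueError(f"unknown table alias in {ref!r}")

    columns = inputs[alias]
    if index >= len(columns):
        raise ValueError(f"invalid column index in {ref!r}")
    return alias, columns[index]
```

With two joined inputs `a` and `b`, `resolve_positional("a.$1", ...)` resolves normally, while a bare `$0` raises `AmbiguousColumnError`, mirroring how an unqualified column name shared by both sides of a join is rejected.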
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042618#comment-14042618 ]

Hive QA commented on HIVE-494:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652150/HIVE-494.2.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5670 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/579/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/579/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-579/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652150
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042940#comment-14042940 ]

Navis commented on HIVE-494:

The failed tests are not related to this patch.
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736716#comment-13736716 ]

Hive QA commented on HIVE-494:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597403/HIVE-494.D12153.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2790 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/399/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/399/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735928#comment-13735928 ]

Xuefu Zhang commented on HIVE-494:

Pig supports this, though with $1, $2 syntax, which is useful and convenient in some sense. However, I couldn't find it in standard SQL.

One downside of supporting this is that column ordering starts to matter. If I do select a, b, c from T, the output is deterministic regardless of T's schema (as long as it has a, b, and c). On the other hand, if I do select $1, $2, $3 from T and the table's schema is later changed to (a, b, d, c), then my query will return a different data set. So projecting by numbers is not the same as names that "just get translated into numbers anyway."

Adding columns is quite common with Hadoop data. Of course, one can argue that columns should always be added at the end, but it doesn't always happen that way.
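The schema-change concern above can be made concrete with a small Python sketch. It models a table as a list of column names plus one row of values; the helper names and the sample values are hypothetical, and positions are treated as 1-based only to match the $1, $2, $3 example in the comment.

```python
def project_by_name(schema, row, names):
    """Pick values by column name, wherever each column currently sits."""
    return [row[schema.index(name)] for name in names]


def project_by_position(row, positions):
    """Pick values by 1-based position, ignoring column names entirely."""
    return [row[p - 1] for p in positions]


# Table T before and after a column d is inserted ahead of c.
old_schema, old_row = ["a", "b", "c"], [1, 2, 3]
new_schema, new_row = ["a", "b", "d", "c"], [1, 2, 99, 3]

# select a, b, c from T: same result under both schemas.
by_name_old = project_by_name(old_schema, old_row, ["a", "b", "c"])
by_name_new = project_by_name(new_schema, new_row, ["a", "b", "c"])

# select $1, $2, $3 from T: the third value silently becomes d.
by_pos_old = project_by_position(old_row, [1, 2, 3])
by_pos_new = project_by_position(new_row, [1, 2, 3])

print(by_name_old, by_name_new)  # [1, 2, 3] [1, 2, 3]
print(by_pos_old, by_pos_new)    # [1, 2, 3] [1, 2, 99]
```

The name-based projection is stable across the schema change, while the positional one returns the value of d in place of c without raising any error.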
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735932#comment-13735932 ]

Edward Capriolo commented on HIVE-494:

I think any user will realize that '$1' can change. In the end, I think Hive should be more dynamic, somewhat like Pig. Imagine something like this:

create table x stored by dynamichandler;
select $1 , $2 from x (inputformat=textinputformat, inpath=/x/y/z);

We are close to this now because Navis added the ability to specify per-query table properties.

What is or is not in the SQL spec should not be our metric. We can already do amazing things that SQL can't, so I want to keep innovating. As long as something does not produce an ambiguity in the language, I see no harm in it.
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735943#comment-13735943 ]

Edward Capriolo commented on HIVE-494:

I think we should also support negative numbers to query from the right end, like awk's $NF.
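One possible reading of the negative-index suggestion, sketched in Python: a negative index counts back from the last column, so -1 plays the role of awk's $NF. The function name `resolve_index`, the 0-based convention for non-negative indexes, and the choice to raise on out-of-range values are assumptions for illustration; the thread does not settle any of them.

```python
def resolve_index(index, num_columns):
    """Map a possibly negative column index to a 0-based position.

    Hypothetical helper: -1 is the last column, -2 the one before it, and
    so on. Out-of-range indexes raise IndexError rather than wrapping.
    """
    resolved = index + num_columns if index < 0 else index
    if not 0 <= resolved < num_columns:
        raise IndexError(
            f"column index {index} out of range for {num_columns} columns"
        )
    return resolved


columns = ["a", "b", "c", "d"]
last = columns[resolve_index(-1, len(columns))]   # "d"
first = columns[resolve_index(0, len(columns))]   # "a"
```

Rejecting out-of-range values instead of wrapping them keeps an invalid index a visible error, in line with the negative test case requested in the code review.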
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736158#comment-13736158 ]

Navis commented on HIVE-494:

A patch for this was written once, and I expect to find it somewhere in my local git (though it might be based on hive-0.9 or older). It seemed very convenient, especially for SQL generators. I'll look into this tomorrow.
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735746#comment-13735746 ]

Edward Capriolo commented on HIVE-494:

[~cwsteinbach] [~navis] I think we should commit this.
* it is impossible to name a column 1
* it is impossible to name a column alias 1

If ORDER BY supports this, I do not see why GROUP BY can't. Do we want to reconsider this? I kinda like the feature.
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262449#comment-13262449 ]

Carl Steinbach commented on HIVE-494:

bq. It does not seem to be standard SQL and could also be confusing, especially with joins/unions.

I'm a little worried about this too, mainly because it appears to extend SQL syntax and (speaking personally) I don't think I fully understand the long-term impact of a change like this. If it turns out that this syntax is broken, deprecating it is going to be painful for all of the people who start using it.

bq. Is it worth implementing?

I think we should pass on this unless it turns out to be part of standard SQL, or we can point to some other DB like MySQL that already implements it.

On a related note, HIVE-1947 covers implementing similar syntax for the ORDER BY clause, and it turns out that this *is* part of standard SQL. It's also interesting to note that ordinal column references in the WHERE clause aren't supported, since they would result in ambiguous statements like this:

{noformat}
SELECT a, b from src WHERE 1=1;
{noformat}
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262373#comment-13262373 ]

Phabricator commented on HIVE-494:

cwsteinbach has commented on the revision HIVE-494 [jira] Select columns by index instead of name.

INLINE COMMENTS

ql/src/test/queries/clientpositive/select_by_column_index.q:2
Is this syntax part of the SQL standard, or is it an extension specific to Hive?

ql/src/test/queries/clientpositive/select_by_column_index.q:5
Please add coverage for JOINs, ORDER BY, HAVING, and UNION clauses, and a negative testcase that shows what happens when an invalid index is referenced.

REVISION DETAIL
https://reviews.facebook.net/D1641
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262381#comment-13262381 ]

Navis commented on HIVE-494:

I had completely forgotten about this patch and am looking at it afresh.
1. It does not seem to be standard SQL and could also be confusing, especially with joins/unions.
2. The patch uses the RR of the input operator, which is clearly not correct.

Is it worth implementing?
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203357#comment-13203357 ]

Phabricator commented on HIVE-494:

navis has commented on the revision HIVE-494 [jira] Select columns by index instead of name.

A simple patch, but it was very useful to me when implementing a query generator for random forest.

REVISION DETAIL
https://reviews.facebook.net/D1641