[ https://issues.apache.org/jira/browse/IMPALA-7400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on IMPALA-7400 started by Alex Rodoni.
-------------------------------------------
> "SQL Statements to Remove or Adapt" is out of date
> --------------------------------------------------
>
>                 Key: IMPALA-7400
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7400
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>    Affects Versions: Impala 3.0
>            Reporter: Tim Armstrong
>            Assignee: Alex Rodoni
>            Priority: Major
>              Labels: docs
>
> "Impala has no DELETE statement." and "Impala has no UPDATE statement." are
> not totally true - Impala has those statements but only for Kudu tables.
> "For example, Impala does not support natural joins or anti-joins," - Impala
> does support Anti-joins via NOT IN/NOT EXISTS or even explicitly like:
> {code}
> select * from functional.alltypes a1 left anti join functional.alltypestiny
> a2 on a1.id = a2.id;
> {code}
> "Within queries, Impala requires query aliases for any subqueries:" - this is
> only true for subqueries used as inline views in the FROM clause. E.g. the
> following works:
> {code}
> select * from functional.alltypes where id = (select min(id) from
> functional.alltypes);
> {code}
> "Impala .. requires the CROSS JOIN operator for Cartesian products." -
> untrue, this works:
> {code}
> select * from functional.alltypes t1, functional.alltypes t2;
> {code}
> "Have you run the COMPUTE STATS statement on each table involved in join
> queries". This isn't specific to queries with joins, although may have more
> impact. We recommend that users run COMPUTE STATS on all tables.
> "A CREATE TABLE statement with no PARTITIONED BY clause stores all the data
> files in the same physical location," - unpartitioned tables with multiple
> files can have files residing in different locations (and there are already 3
> replicas per file by default, so the statement is a little misleading even if
> there's a single file).
> I think the latest statement about "Have you
> partitioned at the right granularity so that there is enough data in each
> partition to parallelize the work for each query?" is also misleading for the
> same reason.
> "The INSERT ... VALUES syntax is suitable for setting up toy tables with a
> few rows for functional testing, but because each such statement creates a
> separate tiny file in HDFS". This advice only applies to HDFS, this should
> work fine for Kudu tables although the INSERT statements are not particularly
> fast.
> "The number of expressions allowed in an Impala query might be smaller than
> for some other database systems, causing failures for very complicated
> queries" - this doesn't seem right - I don't know why the queries would fail.
> Also the codegen time isn't really specific to expressions or where clauses.
> There seems to be a point buried in there, but maybe it's just essentially
> that "Complex queries may have high codegen time"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
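
For reference, a few of the points raised in the issue can be sketched in Impala SQL. This is only an illustrative sketch, not text from the Impala documentation: the table kudu_demo and its columns are hypothetical names introduced here, while the functional.* tables are the ones already used in the issue's own examples.

```sql
-- Hypothetical Kudu table; UPDATE and DELETE are accepted only for Kudu tables.
create table kudu_demo (id bigint primary key, name string)
  partition by hash (id) partitions 2
  stored as kudu;

insert into kudu_demo values (1, 'a'), (2, 'b');
update kudu_demo set name = 'c' where id = 1;
delete from kudu_demo where id = 2;

-- Anti-join written with NOT EXISTS instead of LEFT ANTI JOIN.
select * from functional.alltypes a1
where not exists
  (select 1 from functional.alltypestiny a2 where a2.id = a1.id);

-- An alias is needed only when the subquery is an inline view in FROM.
select v.id from (select id from functional.alltypes) v;
```

The first group assumes a cluster with Kudu available. The NOT EXISTS query should return the same rows as the explicit left anti join shown in the issue.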