Lars Volker has posted comments on this change. Change subject: IMPALA-4166: Add SORT BY sql clause ......................................................................
Patch Set 17: (19 comments) Thank you for your review! Please see my inline comments and PS 17. http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java File fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java: PS15, Line 168: > typo: commas Done PS15, Line 169: ' > remove Done PS15, Line 170: f column names separated by commas, > nit: this function Done PS15, Line 172: during the an > You need to comment on the return value. Done PS15, Line 174: columns. > long line Done PS15, Line 178: if (!tblProperties.containsKey( > Whenever I see this, I keep wondering about HBase. Can you add a preconditi Done http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/analysis/AlterTableSortByColumnsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableSortByColumnsStmt.java: PS15, Line 44: public static final String TBL_PROP_SORT_BY_COLUMNS = "sort.by.columns"; > Maybe it makes more sense to put this in AlterTableSetTblProperiesStmt. Not After moving this back and forth, my reasoning was that this is the more natural way to change the underlying table property, and therefore I left it here. In a similar fashion, the property name for 'skip.header.line.count' is defined in HdfsTable.java. I also don't have a strong preference. Let's wait what the next reviewer says then. :) PS15, Line 93: TableDef.analyzeSortByColumns(sor > What else could it be? You can just do Preconditions.checkState(table insta Done http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: PS15, Line 768: if (!(table_ instanceof HdfsTable)) return; : > Why not if (!(table_ instanceof HdfsTable)) return; instead? Done PS15, Line 843: // We need to make a copy to make the sortByColumns_ list mutable. : // TODO: Remove this when removing the sortby() hint (IMPALA-5157). : sortByColumns_ = Lists.newArrayList(sortByColumns_); : } : for (int i = 0; i < columns.size(); ++i) { : if (columns.get(i).getName().equals(columnName)) { : sortByExprs_.add(resultExprs_.get(i)); : sortByColumns_.add(i); : foundColumn = true; : break; : } : } : i > Not sure I follow what is happening here. If you already have sortByColumns This code is required for the sortby() hint, and will be removed together with the hint. Should I try and explain this more in the comment above? http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/analysis/TableDef.java File fe/src/main/java/org/apache/impala/analysis/TableDef.java: PS15, Line 284: List<Str > I like using interfaces but is this ever called with anything other than a No. I think it was used with the result of a split() in an earlier patchset, but now is only used for Lists. PS15, Line 350: if (options_.sortByCols == null) r > if (options_.sortByCols == null) return; Done http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: PS15, Line 77: S > nit: extra space Done PS15, Line 297: if (sortByColsSql != null) { : sb.append(String.format("SORT BY (\n %s\n)\n", : Joiner.on(", \n ").join(sortByColsSql))); : } > haha, I can't visualize the output here. I re-used the code from L288. Do you have a suggestion how it can be made more clear? http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: PS15, Line 456: he/she adde > nit: added I think both ways are fine, adding the columns first, or calling this method first. Changed it to added though, since that seems to reflect the more common case better. http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: PS15, Line 1881: String oldColumns = msTbl.getParameters().get(sortByKey); : String alteredColumns = MetaStoreUtil.intersectCsvLis > I find it hard to parse this. Can you split it into two statements? Also, e Done PS15, Line 1915: ring sortByKey = AlterTableSortByColumnsStmt.TBL_PROP_SORT_BY_COLUMNS; : if (msTbl.getParameters().containsKey(sortByKey)) { > same here Done PS15, Line 2198: } : } > and here Done http://gerrit.cloudera.org:8080/#/c/6495/15/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java File fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java: PS15, Line 218: HashSet<String> rightColNames = Sets.newHashSet(); : for (TColumn c : rightCols) rightColNames.add(c.getColumnName().toLowerCase()); : List<String> outputList = Lists.newArrayList(); : for (String leftCol : leftCols) { : if (rightColNames.contains(leftCol.toLowerCase())) outputList.add(leftCol); : } > Can't you use Set.retainAll() here? Cool, I didn't know retainAll(). I looked at it, but it looks like I'd have to materialize both lists to make them lowercase. In the current version I get away with materializing the right side only. Do you have an idea how to do the transformation and intersection in one step? -- To view, visit http://gerrit.cloudera.org:8080/6495 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-HasComments: Yes