[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153303#comment-16153303 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user vvysotskyi closed the pull request at:

    https://github.com/apache/drill/pull/909

> Allow field names to include dots
> ---------------------------------
>
>                 Key: DRILL-4264
>                 URL: https://issues.apache.org/jira/browse/DRILL-4264
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Codegen
>            Reporter: Alex
>            Assignee: Volodymyr Vysotskyi
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.12.0
>
>
> If you have some json data like this...
> {code:javascript}
> {
>   "0.0.1":{
>     "version":"0.0.1",
>     "date_created":"2014-03-15"
>   },
>   "0.1.2":{
>     "version":"0.1.2",
>     "date_created":"2014-05-21"
>   }
> }
> {code}
> ... there is no way to select any of the rows since their identifiers contain
> dots and when trying to select them, Drill throws the following error:
> Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference
> "0.0.1"; a field reference identifier must not have the form of a qualified
> name
> This must be fixed since there are many json data files containing dots in
> some of the keys (e.g. when specifying version numbers etc)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
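To illustrate the ambiguity described in the issue, here is a minimal sketch of path parsing that treats a dot inside a backtick-quoted segment as part of the name rather than as a separator. The class `DottedPathSketch` and its `parse` method are hypothetical names made up for this illustration; this is not Drill's `SchemaPath` code, only an assumption about the general idea of keeping name segments separate from the dots that join them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Apache Drill.
public class DottedPathSketch {

  /** Splits a path on dots, except for dots inside backtick-quoted segments. */
  public static List<String> parse(String expr) {
    List<String> segments = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean quoted = false;
    for (int i = 0; i < expr.length(); i++) {
      char c = expr.charAt(i);
      if (c == '`') {
        quoted = !quoted;                 // toggle quoting; the backtick itself is dropped
      } else if (c == '.' && !quoted) {
        segments.add(current.toString()); // an unquoted dot ends the current segment
        current.setLength(0);
      } else {
        current.append(c);                // ordinary character, or a dot inside quotes
      }
    }
    if (quoted) {
      throw new IllegalArgumentException("Unterminated backtick in: " + expr);
    }
    segments.add(current.toString());
    return segments;
  }

  public static void main(String[] args) {
    System.out.println(parse("`0.0.1`.version")); // prints [0.0.1, version]
    System.out.println(parse("0.0.1.version"));   // prints [0, 0, 1, version]
  }
}
```

The second call shows why an unquoted `0.0.1` reads as a qualified name: without quoting, every dot is taken as a separator.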
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153102#comment-16153102 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/909

    Merged in d105950a7a9fb2ff3acd072ee65a51ef1fca120e. @vvysotskyi pls close the PR (for some reason github is not showing me the option to close).
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153098#comment-16153098 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user amansinha100 commented on the issue:

    https://github.com/apache/drill/pull/909

    Thanks @vvysotskyi for the PR and @paul-rogers for reviewing the proposal and code.
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147606#comment-16147606 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/909

    Thanks much for the great work!
    +1
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147018#comment-16147018 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r136016798

    --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
    @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) {
       }

       /**
    +   * Parses input string and returns {@code SchemaPath} instance.
    +   *
    +   * @param expr input string to be parsed
    +   * @return {@code SchemaPath} instance
    +   */
    +  public static SchemaPath parseFromString(String expr) {
    --- End diff --

    Done
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145835#comment-16145835 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r135872295

    --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
    @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) {
       }

       /**
    +   * Parses input string and returns {@code SchemaPath} instance.
    +   *
    +   * @param expr input string to be parsed
    +   * @return {@code SchemaPath} instance
    +   */
    +  public static SchemaPath parseFromString(String expr) {
    --- End diff --

    Thanks for the explanation. Can you copy it into the Javadoc for this method? Will help others in the future.
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145834#comment-16145834 ] ASF GitHub Bot commented on DRILL-4264: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r135868684 --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java --- @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC LogicalExpression swapArg = valueArg; valueArg = nameArg; nameArg = swapArg; -evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName); + evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName)); } - evaluator.success = nameArg.accept(evaluator, valueArg); + evaluator.setSuccess(nameArg.accept(evaluator, valueArg)); } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) { - evaluator.success = true; - evaluator.path = (SchemaPath) nameArg; + evaluator.setSuccess(true); + evaluator.setPath((SchemaPath) nameArg); } return evaluator; } - public CompareFunctionsProcessor(String functionName) { -this.success = false; -this.functionName = functionName; -this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName) -&& COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName); -this.isRowKeyPrefixComparison = false; -this.sortOrderAscending = true; - } - - public byte[] getValue() { -return value; - } - - public boolean isSuccess() { -return success; - } - - public SchemaPath getPath() { -return path; - } - - public String getFunctionName() { -return functionName; - } - - public boolean isRowKeyPrefixComparison() { - return isRowKeyPrefixComparison; - } - - public byte[] getRowKeyPrefixStartRow() { -return rowKeyPrefixStartRow; - } - - public byte[] getRowKeyPrefixStopRow() { - return rowKeyPrefixStopRow; - } - - public Filter 
getRowKeyPrefixFilter() { - return rowKeyPrefixFilter; - } - - public boolean isSortOrderAscending() { -return sortOrderAscending; - } - @Override - public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) { - return e.getInput().accept(this, valueArg); -} -return false; - } - - @Override - public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) { - - String encodingType = e.getEncodingType(); - int prefixLength= 0; - - // Handle scan pruning in the following scenario: - // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is - // querying for the first few bytes of the row-key(start-offset 1) - // Example WHERE clause: - // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17' - if (e.getInput() instanceof FunctionCall) { - -// We can prune scan range only for big-endian encoded data -if (encodingType.endsWith("_BE") == false) { - return false; -} - -FunctionCall call = (FunctionCall)e.getInput(); -String functionName = call.getName(); -if (!functionName.equalsIgnoreCase("byte_substr")) { - return false; -} - -LogicalExpression nameArg = call.args.get(0); -LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null; -LogicalExpression valueArg2 = call.args.size() >= 3 ? 
call.args.get(2) : null; - -if (((nameArg instanceof SchemaPath) == false) || - (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) || - (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) { - return false; -} - -boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY); -int offset = ((IntExpression)valueArg1).getInt(); - -if (!isRowKey || (offset != 1)) { - return false; -} - -this.path= (Schem
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145084#comment-16145084 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r135753493 --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java --- @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC LogicalExpression swapArg = valueArg; valueArg = nameArg; nameArg = swapArg; -evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName); + evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName)); } - evaluator.success = nameArg.accept(evaluator, valueArg); + evaluator.setSuccess(nameArg.accept(evaluator, valueArg)); } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) { - evaluator.success = true; - evaluator.path = (SchemaPath) nameArg; + evaluator.setSuccess(true); + evaluator.setPath((SchemaPath) nameArg); } return evaluator; } - public CompareFunctionsProcessor(String functionName) { -this.success = false; -this.functionName = functionName; -this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName) -&& COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName); -this.isRowKeyPrefixComparison = false; -this.sortOrderAscending = true; - } - - public byte[] getValue() { -return value; - } - - public boolean isSuccess() { -return success; - } - - public SchemaPath getPath() { -return path; - } - - public String getFunctionName() { -return functionName; - } - - public boolean isRowKeyPrefixComparison() { - return isRowKeyPrefixComparison; - } - - public byte[] getRowKeyPrefixStartRow() { -return rowKeyPrefixStartRow; - } - - public byte[] getRowKeyPrefixStopRow() { - return rowKeyPrefixStopRow; - } - - public Filter 
getRowKeyPrefixFilter() { - return rowKeyPrefixFilter; - } - - public boolean isSortOrderAscending() { -return sortOrderAscending; - } - @Override - public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) { - return e.getInput().accept(this, valueArg); -} -return false; - } - - @Override - public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) { - - String encodingType = e.getEncodingType(); - int prefixLength= 0; - - // Handle scan pruning in the following scenario: - // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is - // querying for the first few bytes of the row-key(start-offset 1) - // Example WHERE clause: - // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17' - if (e.getInput() instanceof FunctionCall) { - -// We can prune scan range only for big-endian encoded data -if (encodingType.endsWith("_BE") == false) { - return false; -} - -FunctionCall call = (FunctionCall)e.getInput(); -String functionName = call.getName(); -if (!functionName.equalsIgnoreCase("byte_substr")) { - return false; -} - -LogicalExpression nameArg = call.args.get(0); -LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null; -LogicalExpression valueArg2 = call.args.size() >= 3 ? 
call.args.get(2) : null; - -if (((nameArg instanceof SchemaPath) == false) || - (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) || - (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) { - return false; -} - -boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY); -int offset = ((IntExpression)valueArg1).getInt(); - -if (!isRowKey || (offset != 1)) { - return false; -} - -this.path= (Schema
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142299#comment-16142299 ] ASF GitHub Bot commented on DRILL-4264: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r135359090 --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/CompareFunctionsProcessor.java --- @@ -82,466 +45,37 @@ public static CompareFunctionsProcessor process(FunctionCall call, boolean nullC LogicalExpression swapArg = valueArg; valueArg = nameArg; nameArg = swapArg; -evaluator.functionName = COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName); + evaluator.setFunctionName(COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName)); } - evaluator.success = nameArg.accept(evaluator, valueArg); + evaluator.setSuccess(nameArg.accept(evaluator, valueArg)); } else if (nullComparatorSupported && call.args.get(0) instanceof SchemaPath) { - evaluator.success = true; - evaluator.path = (SchemaPath) nameArg; + evaluator.setSuccess(true); + evaluator.setPath((SchemaPath) nameArg); } return evaluator; } - public CompareFunctionsProcessor(String functionName) { -this.success = false; -this.functionName = functionName; -this.isEqualityFn = COMPARE_FUNCTIONS_TRANSPOSE_MAP.containsKey(functionName) -&& COMPARE_FUNCTIONS_TRANSPOSE_MAP.get(functionName).equals(functionName); -this.isRowKeyPrefixComparison = false; -this.sortOrderAscending = true; - } - - public byte[] getValue() { -return value; - } - - public boolean isSuccess() { -return success; - } - - public SchemaPath getPath() { -return path; - } - - public String getFunctionName() { -return functionName; - } - - public boolean isRowKeyPrefixComparison() { - return isRowKeyPrefixComparison; - } - - public byte[] getRowKeyPrefixStartRow() { -return rowKeyPrefixStartRow; - } - - public byte[] getRowKeyPrefixStopRow() { - return rowKeyPrefixStopRow; - } - - public Filter 
getRowKeyPrefixFilter() { - return rowKeyPrefixFilter; - } - - public boolean isSortOrderAscending() { -return sortOrderAscending; - } - @Override - public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getInput() instanceof CastExpression || e.getInput() instanceof SchemaPath) { - return e.getInput().accept(this, valueArg); -} -return false; - } - - @Override - public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException { -if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) { - - String encodingType = e.getEncodingType(); - int prefixLength= 0; - - // Handle scan pruning in the following scenario: - // The row-key is a composite key and the CONVERT_FROM() function has byte_substr() as input function which is - // querying for the first few bytes of the row-key(start-offset 1) - // Example WHERE clause: - // CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-17' - if (e.getInput() instanceof FunctionCall) { - -// We can prune scan range only for big-endian encoded data -if (encodingType.endsWith("_BE") == false) { - return false; -} - -FunctionCall call = (FunctionCall)e.getInput(); -String functionName = call.getName(); -if (!functionName.equalsIgnoreCase("byte_substr")) { - return false; -} - -LogicalExpression nameArg = call.args.get(0); -LogicalExpression valueArg1 = call.args.size() >= 2 ? call.args.get(1) : null; -LogicalExpression valueArg2 = call.args.size() >= 3 ? 
call.args.get(2) : null; - -if (((nameArg instanceof SchemaPath) == false) || - (valueArg1 == null) || ((valueArg1 instanceof IntExpression) == false) || - (valueArg2 == null) || ((valueArg2 instanceof IntExpression) == false)) { - return false; -} - -boolean isRowKey = ((SchemaPath)nameArg).getAsUnescapedPath().equals(DrillHBaseConstants.ROW_KEY); -int offset = ((IntExpression)valueArg1).getInt(); - -if (!isRowKey || (offset != 1)) { - return false; -} - -this.path= (Schem
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138696#comment-16138696 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134540459

    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/CompareFunctionsProcessor.java ---
    @@ -147,10 +147,10 @@ public Boolean visitCastExpression(CastExpression e, LogicalExpression valueArg)

       @Override
       public Boolean visitConvertExpression(ConvertExpression e, LogicalExpression valueArg) throws RuntimeException {
    -    if (e.getConvertFunction() == ConvertExpression.CONVERT_FROM) {
    +    if (ConvertExpression.CONVERT_FROM.equals(e.getConvertFunction())) {
    --- End diff --

    Since both of these classes are almost the same, I moved the common code to an abstract class.
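The one-line change in the diff above replaces a reference comparison with a value comparison. As a rough sketch of why that matters, assuming the constant is a `String` (the class `StringCompareSketch` and its local `CONVERT_FROM` are hypothetical stand-ins, not Drill code):

```java
// Hypothetical illustration only; not part of Apache Drill.
public class StringCompareSketch {

  // Stand-in constant, assumed here to be a String for the sake of the example.
  static final String CONVERT_FROM = "convert_from";

  /** Reference comparison: true only when both operands are the same object. */
  static boolean isConvertFromByIdentity(String functionName) {
    return functionName == CONVERT_FROM;
  }

  /** Value comparison with the constant first, so a null argument is safe. */
  static boolean isConvertFrom(String functionName) {
    return CONVERT_FROM.equals(functionName);
  }

  public static void main(String[] args) {
    // A string built at runtime (for example by a parser) is a distinct object.
    String parsed = new String("convert_from");
    System.out.println(isConvertFromByIdentity(parsed)); // prints false
    System.out.println(isConvertFrom(parsed));           // prints true
    System.out.println(isConvertFrom(null));             // prints false, no NullPointerException
  }
}
```

Putting the constant on the left-hand side of `equals` also avoids a null check on the argument.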
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138705#comment-16138705 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134683465 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java --- @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo public T addField(MaterializedField field, Class clazz) throws SchemaChangeException { // Check if the field exists. - ValueVector v = fieldVectorMap.get(field.getPath()); - if (v == null || v.getClass() != clazz) { + ValueVector vector = fieldVectorMap.get(field.getName()); + ValueVector childVector = vector; + // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code + if (vector == null || (vector.getClass() != clazz +&& (vector.getField().getType().getMinorType() != MinorType.MAP +|| field.getType().getMinorType() != MinorType.MAP))) { // Field does not exist--add it to the map and the output container. 
-v = TypeHelper.getNewVector(field, allocator, callBack); -if (!clazz.isAssignableFrom(v.getClass())) { +vector = TypeHelper.getNewVector(field, allocator, callBack); +childVector = vector; +// gets inner field if the map was created the first time +if (field.getType().getMinorType() == MinorType.MAP) { + childVector = getChildVectorByField(vector, field); +} else if (!clazz.isAssignableFrom(vector.getClass())) { throw new SchemaChangeException( String.format( "The class that was provided, %s, does not correspond to the " + "expected vector type of %s.", - clazz.getSimpleName(), v.getClass().getSimpleName())); + clazz.getSimpleName(), vector.getClass().getSimpleName())); } -final ValueVector old = fieldVectorMap.put(field.getPath(), v); +final ValueVector old = fieldVectorMap.put(field.getName(), vector); if (old != null) { old.clear(); container.remove(old); } -container.add(v); +container.add(vector); // Added new vectors to the container--mark that the schema has changed. schemaChanged = true; } + // otherwise, checks that field and existing vector have a map type + // and adds child fields from the field to the vector + else if (field.getType().getMinorType() == MinorType.MAP + && vector.getField().getType().getMinorType() == MinorType.MAP + && !field.getChildren().isEmpty()) { +// an incoming field contains only single child since it determines +// full name path of the field in the schema +childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren())); +schemaChanged = true; + } - return clazz.cast(v); + return clazz.cast(childVector); +} + +/** + * Finds and returns value vector which path corresponds to the specified field. + * If required vector is nested in the map, gets and returns this vector from the map. 
+ * + * @param valueVector vector that should be checked + * @param field field that corresponds to required vector + * @return value vector whose path corresponds to the specified field + * + * @throws SchemaChangeException if the field does not correspond to the found vector + */ +private ValueVector getChildVectorByField(ValueVector valueVector, + MaterializedField field) throws SchemaChangeException { + if (field.getChildren().isEmpty()) { +if (valueVector.getField().equals(field)) { + return valueVector; +} else { + throw new SchemaChangeException( +String.format( + "The field that was provided, %s, does not correspond to the " ++ "expected vector type of %s.", + field, valueVector.getClass().getSimpleName())); +} + } else { +// an incoming field contains only single child since it determines +// full name path of the field in the schema +MaterializedField childField = Iter
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138700#comment-16138700 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134778939

    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/expr/TestSchemaPathMaterialization.java ---
    @@ -93,4 +93,23 @@ public void testProjectionMultipleFiles() throws Exception {
             .go();
       }

    +  @Test //DRILL-4264
    +  public void testFieldNameWithDot() throws Exception {
    --- End diff --

    Added more tests
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138695#comment-16138695 ]

ASF GitHub Bot commented on DRILL-4264:
---------------------------------------

Github user vvysotskyi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/909#discussion_r134539810

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java ---
    @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo
         public T addField(MaterializedField field, Class clazz) throws SchemaChangeException {
           // Check if the field exists.
    -      ValueVector v = fieldVectorMap.get(field.getPath());
    -      if (v == null || v.getClass() != clazz) {
    +      ValueVector vector = fieldVectorMap.get(field.getName());
    +      ValueVector childVector = vector;
    +      // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code
    +      if (vector == null || (vector.getClass() != clazz
    +        && (vector.getField().getType().getMinorType() != MinorType.MAP
    +        || field.getType().getMinorType() != MinorType.MAP))) {
             // Field does not exist--add it to the map and the output container.
    -        v = TypeHelper.getNewVector(field, allocator, callBack);
    -        if (!clazz.isAssignableFrom(v.getClass())) {
    +        vector = TypeHelper.getNewVector(field, allocator, callBack);
    +        childVector = vector;
    +        // gets inner field if the map was created the first time
    +        if (field.getType().getMinorType() == MinorType.MAP) {
    +          childVector = getChildVectorByField(vector, field);
    +        } else if (!clazz.isAssignableFrom(vector.getClass())) {
               throw new SchemaChangeException(
                   String.format(
                       "The class that was provided, %s, does not correspond to the "
                       + "expected vector type of %s.",
    -                  clazz.getSimpleName(), v.getClass().getSimpleName()));
    +                  clazz.getSimpleName(), vector.getClass().getSimpleName()));
             }

    -        final ValueVector old = fieldVectorMap.put(field.getPath(), v);
    +        final ValueVector old = fieldVectorMap.put(field.getName(), vector);
             if (old != null) {
               old.clear();
               container.remove(old);
             }

    -        container.add(v);
    +        container.add(vector);
             // Added new vectors to the container--mark that the schema has changed.
             schemaChanged = true;
           }
    +      // otherwise, checks that field and existing vector have a map type
    --- End diff --

    I was suggesting that the work here may be performed on the nested fields through the map. I agree with you that it would be correct to deal with the desired field. So thanks for pointing this out, I reverted the changes in this method.
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138699#comment-16138699 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134788919 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/TupleAccessor.java --- @@ -48,9 +48,21 @@ MaterializedField column(int index); -MaterializedField column(String name); +/** + * Returns {@code MaterializedField} instance from schema using the name path specified in param. + * + * @param name full name path of the column in the schema + * @return {@code MaterializedField} instance + */ +MaterializedField column(String[] name); --- End diff -- Thanks, reverted my changes. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138701#comment-16138701 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134689741 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java --- @@ -362,16 +363,16 @@ protected boolean setupNewSchema() throws SchemaChangeException { final TransferPair tp = vvIn.makeTransferPair(vvOut); transfers.add(tp); } - } else if (value != null && value.intValue() > 1) { // subsequent wildcards should do a copy of incoming valuevectors + } else if (value != null && value > 1) { // subsequent wildcards should do a copy of incoming valuevectors int k = 0; for (final VectorWrapper wrapper : incoming) { final ValueVector vvIn = wrapper.getValueVector(); - final SchemaPath originalPath = SchemaPath.getSimplePath(vvIn.getField().getPath()); - if (k > result.outputNames.size()-1) { + final SchemaPath originalPath = SchemaPath.getSimplePath(vvIn.getField().getName()); + if (k > result.outputNames.size() - 1) { assert false; } final String name = result.outputNames.get(k++); // get the renamed column names - if (name == EMPTY_STRING) { + if (EMPTY_STRING.equals(name)) { --- End diff -- Thanks, replaced with `name.isEmpty()`. `EMPTY_STRING` is still used in other places, so I left it in the class. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this...
> {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
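For readers following the `name == EMPTY_STRING` discussion in the diff above, here is a small standalone sketch of why the comparison was changed. It is not Drill code: the class `EmptyNameCheck` and its methods are made up for illustration, and `EMPTY_STRING` here is only a stand-in for the constant named in the diff.

```java
final class EmptyNameCheck {
  // Stand-in for the EMPTY_STRING constant mentioned in the diff (assumption).
  static final String EMPTY_STRING = "";

  // Reference comparison: true only when both operands are the same object.
  static boolean byIdentity(String name) {
    return name == EMPTY_STRING;
  }

  // Value comparison; also safe when name is null.
  static boolean byEquals(String name) {
    return EMPTY_STRING.equals(name);
  }

  // Value check; throws NullPointerException when name is null.
  static boolean byIsEmpty(String name) {
    return name.isEmpty();
  }

  public static void main(String[] args) {
    // Same content as EMPTY_STRING, but a distinct object on the heap.
    String computed = new String("");
    System.out.println(byIdentity(computed));
    System.out.println(byEquals(computed));
    System.out.println(byIsEmpty(computed));
  }
}
```

The identity check only succeeds when the name happens to be the very same `String` object as the constant, so it silently misses empty names built at runtime. `isEmpty()` is the shortest value check, at the cost of requiring a non-null name.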
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138708#comment-16138708 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134787569 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java --- @@ -83,7 +85,13 @@ private void updateStructure(int index, PhysicalSchema children) { public boolean isMap() { return mapSchema != null; } public PhysicalSchema mapSchema() { return mapSchema; } public MaterializedField field() { return field; } -public String fullName() { return fullName; } + +/** + * Returns full name path of the column. --- End diff -- I reverted these changes. Also, I commented out the test where this code is used with the map fields. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138703#comment-16138703 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134782398 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestMaterializedField.java --- @@ -84,4 +89,22 @@ public void testClone() { } + @Test // DRILL-4264 + public void testSchemaPathToMaterializedFieldConverting() { --- End diff -- This test was designed to check the `SchemaPathUtil.getMaterializedFieldFromSchemaPath()` method. Since this method was removed, I removed this test. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138698#comment-16138698 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134764401 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/SchemaPathUtil.java --- @@ -0,0 +1,59 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one or more +* contributor license agreements. See the NOTICE file distributed with +* this work for additional information regarding copyright ownership. +* The ASF licenses this file to you under the Apache License, Version 2.0 +* (the "License"); you may not use this file except in compliance with +* the License. You may obtain a copy of the License at +* +* http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, software +* distributed under the License is distributed on an "AS IS" BASIS, +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +* See the License for the specific language governing permissions and +* limitations under the License. +*/ +package org.apache.drill.exec.vector.complex; + +import org.apache.drill.common.expression.PathSegment; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.Types; +import org.apache.drill.exec.record.MaterializedField; + +public class SchemaPathUtil { --- End diff -- Removed this class: all the code that used it now uses a simple name path, so it is no longer needed.
> Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138706#comment-16138706 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134790779 --- Diff: logical/src/main/java/org/apache/drill/common/expression/PathSegment.java --- @@ -17,11 +17,15 @@ */ package org.apache.drill.common.expression; -public abstract class PathSegment{ +public abstract class PathSegment { - PathSegment child; + private PathSegment child; --- End diff -- This field has a setter. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138704#comment-16138704 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134686740 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java --- @@ -359,30 +361,109 @@ public Mutator(OperatorExecContext oContext, BufferAllocator allocator, VectorCo public T addField(MaterializedField field, Class clazz) throws SchemaChangeException { // Check if the field exists. - ValueVector v = fieldVectorMap.get(field.getPath()); - if (v == null || v.getClass() != clazz) { + ValueVector vector = fieldVectorMap.get(field.getName()); + ValueVector childVector = vector; + // if a vector does not exist yet, creates value vector, or if it exists and has map type, omit this code + if (vector == null || (vector.getClass() != clazz +&& (vector.getField().getType().getMinorType() != MinorType.MAP +|| field.getType().getMinorType() != MinorType.MAP))) { // Field does not exist--add it to the map and the output container. 
-v = TypeHelper.getNewVector(field, allocator, callBack); -if (!clazz.isAssignableFrom(v.getClass())) { +vector = TypeHelper.getNewVector(field, allocator, callBack); +childVector = vector; +// gets inner field if the map was created the first time +if (field.getType().getMinorType() == MinorType.MAP) { + childVector = getChildVectorByField(vector, field); +} else if (!clazz.isAssignableFrom(vector.getClass())) { throw new SchemaChangeException( String.format( "The class that was provided, %s, does not correspond to the " + "expected vector type of %s.", - clazz.getSimpleName(), v.getClass().getSimpleName())); + clazz.getSimpleName(), vector.getClass().getSimpleName())); } -final ValueVector old = fieldVectorMap.put(field.getPath(), v); +final ValueVector old = fieldVectorMap.put(field.getName(), vector); if (old != null) { old.clear(); container.remove(old); } -container.add(v); +container.add(vector); // Added new vectors to the container--mark that the schema has changed. schemaChanged = true; } + // otherwise, checks that field and existing vector have a map type + // and adds child fields from the field to the vector + else if (field.getType().getMinorType() == MinorType.MAP + && vector.getField().getType().getMinorType() == MinorType.MAP + && !field.getChildren().isEmpty()) { +// an incoming field contains only single child since it determines +// full name path of the field in the schema +childVector = addNestedChildToMap((MapVector) vector, Iterables.getLast(field.getChildren())); +schemaChanged = true; + } - return clazz.cast(v); + return clazz.cast(childVector); +} + +/** + * Finds and returns value vector which path corresponds to the specified field. + * If required vector is nested in the map, gets and returns this vector from the map. 
+ * + * @param valueVector vector that should be checked + * @param field field that corresponds to required vector + * @return value vector whose path corresponds to the specified field + * + * @throws SchemaChangeException if the field does not correspond to the found vector + */ +private ValueVector getChildVectorByField(ValueVector valueVector, + MaterializedField field) throws SchemaChangeException { + if (field.getChildren().isEmpty()) { +if (valueVector.getField().equals(field)) { + return valueVector; +} else { + throw new SchemaChangeException( +String.format( + "The field that was provided, %s, does not correspond to the " ++ "expected vector type of %s.", + field, valueVector.getClass().getSimpleName())); +} + } else { +// an incoming field contains only single child since it determines +// full name path of the field in the schema +MaterializedField childField = Iter
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138707#comment-16138707 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134688368 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java --- @@ -293,7 +294,7 @@ private StreamingAggregator createAggregatorInternal() throws SchemaChangeExcept continue; } keyExprs[i] = expr; - final MaterializedField outputField = MaterializedField.create(ne.getRef().getAsUnescapedPath(), expr.getMajorType()); + final MaterializedField outputField = SchemaPathUtil.getMaterializedFieldFromSchemaPath(ne.getRef(), expr.getMajorType()); --- End diff -- Yes, it should. But the `MaterializedField` class is in the `vector` module, the `SchemaPath` class is in the `drill-logical` module, and the `vector` module does not depend on the `drill-logical` module. I replaced this code with `MaterializedField.create(ne.getRef().getLastSegment().getNameSegment().getPath(), expr.getMajorType());` since a simple name path is used here. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ...
there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
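The replacement call chain in the comment above reads the name of the last segment of the reference rather than flattening the whole path into one string. A rough sketch of that idea follows, assuming a path is a singly linked chain of name segments. `NameSegmentSketch`, `of()` and `last()` are hypothetical names for illustration only and are not Drill's `PathSegment` API.

```java
final class NameSegmentSketch {
  final String path;              // the name held by this segment
  final NameSegmentSketch child;  // next segment, or null at the end

  NameSegmentSketch(String path, NameSegmentSketch child) {
    this.path = path;
    this.child = child;
  }

  // Builds a chain from outermost to innermost name.
  // Returns null when no names are given.
  static NameSegmentSketch of(String... names) {
    NameSegmentSketch next = null;
    for (int i = names.length - 1; i >= 0; i--) {
      next = new NameSegmentSketch(names[i], next);
    }
    return next;
  }

  // Walks the chain and returns the final segment.
  NameSegmentSketch last() {
    NameSegmentSketch current = this;
    while (current.child != null) {
      current = current.child;
    }
    return current;
  }

  public static void main(String[] args) {
    // A dotted name stays intact because it is one segment, not two.
    NameSegmentSketch ref = NameSegmentSketch.of("versions", "0.0.1");
    System.out.println(ref.last().path);
  }
}
```

Because each segment keeps its own name, a dot inside a name never has to be escaped or re-parsed when only the innermost name is needed.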
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138697#comment-16138697 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134795741 --- Diff: logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java --- @@ -115,6 +112,33 @@ public static SchemaPath create(NamePart namePart) { } /** + * Parses input string and returns {@code SchemaPath} instance. + * + * @param expr input string to be parsed + * @return {@code SchemaPath} instance + */ + public static SchemaPath parseFromString(String expr) { --- End diff -- It parses a string using the same rules that are used for a field in a query. If the string contains a dot outside backticks (including the case where the string has no backticks at all), a `SchemaPath` is created whose `NameSegment` contains a child `NameSegment`, and so on. If the string contains [] then an `ArraySegment` is created. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g.
when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
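The parsing rule described in the comment above (a dot inside backticks belongs to the name, a dot outside backticks separates nested names) can be illustrated with a minimal sketch. This is not the `parseFromString` implementation, which is not shown in the thread. `FieldPathSketch` and `splitNames` are hypothetical names, and the sketch deliberately ignores `[]` array segments and any escaping rules.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldPathSketch {
  // Splits a field reference into name segments. A dot inside backticks is
  // part of the name; a dot outside backticks separates nested names.
  static List<String> splitNames(String expr) {
    List<String> segments = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean quoted = false;
    for (int i = 0; i < expr.length(); i++) {
      char c = expr.charAt(i);
      if (c == '`') {
        quoted = !quoted;            // toggle quoting, drop the backtick itself
      } else if (c == '.' && !quoted) {
        segments.add(current.toString());
        current.setLength(0);        // start collecting the next name
      } else {
        current.append(c);
      }
    }
    segments.add(current.toString());
    return segments;
  }

  public static void main(String[] args) {
    System.out.println(splitNames("`0.0.1`.version"));
    System.out.println(splitNames("a.b"));
    System.out.println(splitNames("`a.b`"));
  }
}
```

With the JSON from the issue description, the quoted reference yields the key `0.0.1` followed by the nested name `version`, which is the lookup the original error prevented.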
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138702#comment-16138702 ] ASF GitHub Bot commented on DRILL-4264: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/909#discussion_r134788014 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetSchema.java --- @@ -94,12 +102,20 @@ private void updateStructure(int index, PhysicalSchema children) { */ public static class NameSpace { -private final Map nameSpace = new HashMap<>(); +private final Map nameSpace = new HashMap<>(); private final List columns = new ArrayList<>(); -public int add(String key, T value) { +/** + * Adds column path with specified value to the columns list + * and returns the index of the column in the list. + * + * @param key full name path of the column in the schema + * @param value value to be added to the list + * @return index of the column in the list + */ +public int add(String[] key, T value) { int index = columns.size(); - nameSpace.put(key, index); + nameSpace.put(SchemaPath.getCompoundPath(key).toExpr(), index); --- End diff -- Thanks for the explanation, reverted my changes. > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ... 
there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4264) Allow field names to include dots
[ https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16130668#comment-16130668 ] ASF GitHub Bot commented on DRILL-4264: --- GitHub user vvysotskyi opened a pull request: https://github.com/apache/drill/pull/909 DRILL-4264: Allow field names to include dots 1. Removed the check that rejected field names containing dots. 2. Replaced uses of the `SchemaPath.getAsUnescapedPath()` method with `SchemaPath.getRootSegmentPath()` and `SchemaPathUtil.getMaterializedFieldFromSchemaPath()` where needed. 3. Replaced uses of the `MaterializedField.getPath()` and `MaterializedField.getLastName()` methods with the `MaterializedField.getName()` method and checked that the behaviour is correct. 4. Added tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vvysotskyi/drill DRILL-4264 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/909.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #909 commit 4ba59488a96fb79455b192ed960a728481ceaf93 Author: Volodymyr Vysotskyi Date: 2017-07-05T19:08:59Z DRILL-4264: Allow field names to include dots > Allow field names to include dots > - > > Key: DRILL-4264 > URL: https://issues.apache.org/jira/browse/DRILL-4264 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Codegen >Reporter: Alex >Assignee: Volodymyr Vysotskyi > Labels: doc-impacting > Fix For: 1.12.0 > > > If you have some json data like this... > {code:javascript} > { > "0.0.1":{ > "version":"0.0.1", > "date_created":"2014-03-15" > }, > "0.1.2":{ > "version":"0.1.2", > "date_created":"2014-05-21" > } > } > {code} > ...
there is no way to select any of the rows since their identifiers contain > dots and when trying to select them, Drill throws the following error: > Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference > "0.0.1"; a field reference identifier must not have the form of a qualified > name > This must be fixed since there are many json data files containing dots in > some of the keys (e.g. when specifying version numbers etc) -- This message was sent by Atlassian JIRA (v6.4.14#64029)