[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK
[ https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898909#comment-15898909 ]

ASF GitHub Bot commented on DRILL-5316:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104606833

--- Diff: contrib/native/client/src/clientlib/zookeeperClient.cpp ---
@@ -138,6 +138,11 @@ int ZookeeperClient::getAllDrillbits(const std::string& connectStr, std::vector<
 DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "\t Unshuffled Drillbit id: " << drillbits[i] << std::endl;)
 }
 }
+else{
--- End diff --

Agreed. Should be handled in caller (i.e. DrillClient). If the returned vector size is zero then we should check that in DrillClient and close client connection with error as `ERR_CONN_ZKNODBIT`. Something like below:

`return handleConnError(CONN_INVALID_INPUT, getMessage(ERR_CONN_ZKNODBIT, pathToDrill.c_str()));`

> C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK
> ---
>
> Key: DRILL-5316
> URL: https://issues.apache.org/jira/browse/DRILL-5316
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - C++
> Reporter: Rob Wu
> Priority: Critical
>
> When connecting to drillbit with Zookeeper, occasionally the C++ client would crash without any reason.
> A further look into the code revealed that during this call
> rc=zoo_get_children(p_zh.get(), m_path.c_str(), 0, &drillbitsVector);
> zoo_get_children returns ZOK (0) but drillbitsVector.count is 0.
> This causes drillbits to stay empty and thus
> causes err = zook.getEndPoint(drillbits[drillbits.size() -1], endpoint); to crash
> Size check should be done to prevent this from happening

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK
[ https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898916#comment-15898916 ]

ASF GitHub Bot commented on DRILL-5316:
---

Github user superbstreak commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104607726

--- Diff: contrib/native/client/src/clientlib/zookeeperClient.cpp ---
@@ -138,6 +138,11 @@ int ZookeeperClient::getAllDrillbits(const std::string& connectStr, std::vector<
 DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "\t Unshuffled Drillbit id: " << drillbits[i] << std::endl;)
 }
 }
+else{
--- End diff --

Thanks both. Yea make sense, I'll make the change.
[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK
[ https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899081#comment-15899081 ]

ASF GitHub Bot commented on DRILL-5316:
---

Github user superbstreak commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104627768

--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -86,6 +86,9 @@ connectionStatus_t DrillClientImpl::connect(const char* connStr, DrillUserProper
 std::vector<std::string> drillbits;
 int err = zook.getAllDrillbits(hostPortStr, drillbits);
 if(!err){
+if (drillbits.empty()){
+return handleConnError(CONN_INVALID_INPUT, getMessage(ERR_CONN_ZKNODBIT));
--- End diff --

double check CONN_INVALID_INPUT usage.
[jira] [Created] (DRILL-5317) Drill Configuration to access S3 buckets in Mumbai region
shivamurthy.dee...@gmail.com created DRILL-5317: --- Summary: Drill Configuration to access S3 buckets in Mumbai region Key: DRILL-5317 URL: https://issues.apache.org/jira/browse/DRILL-5317 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.8.0 Reporter: shivamurthy.dee...@gmail.com I am able to access and query S3 buckets in US standard region, but not able to access/query buckets in Mumbai region. Is there any specific configuration that needs to be enabled on Drill? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption
[ https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zelaine Fong updated DRILL-4335: Reviewer: Sudheesh Katkam Assigned Reviewer to [~sudheeshkatkam] > Apache Drill should support network encryption > -- > > Key: DRILL-4335 > URL: https://issues.apache.org/jira/browse/DRILL-4335 > Project: Apache Drill > Issue Type: New Feature >Reporter: Keys Botzum >Assignee: Sorabh Hamirwasia > Labels: security > > This is clearly related to Drill-291 but wanted to make explicit that this > needs to include network level encryption and not just authentication. This > is particularly important for the client connection to Drill which will often > be sending passwords in the clear until there is encryption. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5318) Create a sub-operator test framework
Paul Rogers created DRILL-5318:
---

Summary: Create a sub-operator test framework
Key: DRILL-5318
URL: https://issues.apache.org/jira/browse/DRILL-5318
Project: Apache Drill
Issue Type: Improvement
Components: Tools, Build & Test
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: Future

Drill provides two unit test frameworks for whole-server, SQL-based testing: the original {{BaseTestQuery}} and the newer {{ClusterFixture}}. Both use the {{TestBuilder}} mechanism to build system-level functional tests that run queries and check results.

Jason provided an operator-level test framework based, in part, on mocks:

As Drill operators become more complex, we have a crying need for true unit-level tests at a level below the whole system and below operators. That is, we need to test the individual pieces that, together, form the operator.

This umbrella ticket includes a number of tasks needed to create the sub-operator framework. Our intention is that, over time, as we find the need to revisit existing operators, or create new ones, we can employ the sub-operator test framework to exercise code at a finer granularity than was possible before this framework.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5319) Refactor OperatorContext and OptionsManager for unit testing
Paul Rogers created DRILL-5319: -- Summary: Refactor OperatorContext and OptionsManager for unit testing Key: DRILL-5319 URL: https://issues.apache.org/jira/browse/DRILL-5319 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.10.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: Future Roll-up task for two refactorings, see the sub-tasks for details. This ticket allows a single PR for the two different refactorings since the work heavily overlaps. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5320) Refactor OptionManager to allow better unit testing
Paul Rogers created DRILL-5320:
---

Summary: Refactor OptionManager to allow better unit testing
Key: DRILL-5320
URL: https://issues.apache.org/jira/browse/DRILL-5320
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: Future

The {{OptionManager}} interface serves two purposes:

* Create and modify options
* Access option values

The implementations of this interface are integrated with the rest of Drill, making it difficult to use the classes in isolation in unit testing. Further, since operators are given the full interface, the operator has the ability to modify options, and so each unit test should either verify that no modification is, in fact, done, or must track down modifications and test them.

For operator and sub-operator unit tests we need a simpler interface. As it turns out, most low-level uses of {{OptionManager}} are read-only. This allows a simple refactoring to enhance unit testability: create a new super-interface {{OptionSet}}, which provides only the read-only methods.

Then, refactor low-level classes (code generation, compilers, and so on) to use the restricted {{OptionSet}} interface.

Finally, for unit tests, create a trivial, map-based implementation that can be populated as needed for each specific test.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
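The read-only split proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{SketchOptionSet}}, {{SketchOptionManager}}, {{MapOptionSet}}, {{BatchSizer}} and the option names are hypothetical stand-ins invented for this example, not Drill's actual interfaces or options.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical read-only view (a stand-in for the proposed OptionSet; not Drill's real signatures).
interface SketchOptionSet {
    boolean getBoolean(String name);
    long getLong(String name);
}

// Hypothetical full interface: inherits the read-only methods and adds mutation.
interface SketchOptionManager extends SketchOptionSet {
    void setOption(String name, Object value);
}

// Trivial map-based implementation that a unit test populates directly.
final class MapOptionSet implements SketchOptionSet {
    private final Map<String, Object> values = new HashMap<>();

    // Test-setup helper; deliberately not part of the read-only interface.
    MapOptionSet with(String name, Object value) {
        values.put(name, value);
        return this;
    }

    @Override
    public boolean getBoolean(String name) {
        return (Boolean) require(name);
    }

    @Override
    public long getLong(String name) {
        return ((Number) require(name)).longValue();
    }

    private Object require(String name) {
        Object value = values.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Option not set: " + name);
        }
        return value;
    }
}

// Low-level code under test depends only on the read-only view,
// so it cannot modify options and needs no running server.
final class BatchSizer {
    static long batchSize(SketchOptionSet options) {
        return options.getBoolean("sketch.small_batches")
            ? 1024L
            : options.getLong("sketch.batch_size");
    }
}

class OptionSetSketch {
    public static void main(String[] args) {
        SketchOptionSet options = new MapOptionSet()
            .with("sketch.small_batches", false)
            .with("sketch.batch_size", 4096L);
        System.out.println(BatchSizer.batchSize(options)); // prints 4096
    }
}
```

Because the code under test accepts the narrower type, a test can confirm at compile time that it never calls a mutating method.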
[jira] [Created] (DRILL-5321) Refactor FragmentContext for unit testing
Paul Rogers created DRILL-5321: -- Summary: Refactor FragmentContext for unit testing Key: DRILL-5321 URL: https://issues.apache.org/jira/browse/DRILL-5321 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.10.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: Future Each operator has visibility to the {{FragmentContext}} class. {{FragmentContext}} provides access to all of Drill internals: the Drillbit context, the network interfaces, RPC messages and so on. Further, all the code generation mechanisms require a {{FragmentContext}} object. This structure creates a large barrier to unit testing. To test, say, a particular bit of generated code, we must have the entire Drillbit running so we can obtain a {{FragmentContext}}. Clearly, this is less than ideal. Upon inspection, it turns out that the {{FragmentContext}} is mostly needed, by many operators, to generate code. Of the many methods in {{FragmentContext}}, code generation uses only six. The solution is to create a new super-interface, {{CodeGenContext}}, which holds those six methods. The {{CodeGenContext}} can be easily re-implemented for unit testing. Then, modify all the code-generation classes that currently take {{FragmentContext}} to take {{CodeGenContext}} instead. Since {{FragmentContext}} derives from {{CodeGenContext}}, existing operator code "just works." -- This message was sent by Atlassian JIRA (v6.3.15#6346)
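As a rough illustration of the super-interface idea described above, the sketch below narrows a wide context to the few methods a code generator needs. All names here ({{SketchCodeGenContext}}, {{SketchFragmentContext}}, {{ExpressionCompiler}}, {{StubCodeGenContext}}, and the "StringFunctions.upper" string) are assumptions made up for the example; the real {{CodeGenContext}} methods are listed in the ticket's comment, not here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical narrow interface: only what code generation needs.
interface SketchCodeGenContext {
    String lookupFunction(String name);
    boolean debugEnabled();
}

// Hypothetical wide context. It derives from the narrow interface,
// so existing callers that hold the wide type keep working unchanged.
interface SketchFragmentContext extends SketchCodeGenContext {
    void sendRpc(String message); // stands in for services that need a running server
}

// Code-generation logic now asks only for the narrow interface.
final class ExpressionCompiler {
    static String compile(SketchCodeGenContext context, String functionName, String arg) {
        String impl = context.lookupFunction(functionName);
        if (impl == null) {
            throw new IllegalArgumentException("Unknown function: " + functionName);
        }
        String call = impl + "(" + arg + ")";
        return context.debugEnabled() ? "/* " + functionName + " */ " + call : call;
    }
}

// Test-only implementation: no server, no network, just a map.
final class StubCodeGenContext implements SketchCodeGenContext {
    private final Map<String, String> functions = new HashMap<>();
    private final boolean debug;

    StubCodeGenContext(boolean debug) {
        this.debug = debug;
    }

    StubCodeGenContext register(String name, String impl) {
        functions.put(name, impl);
        return this;
    }

    @Override
    public String lookupFunction(String name) {
        return functions.get(name);
    }

    @Override
    public boolean debugEnabled() {
        return debug;
    }
}

class CodeGenContextSketch {
    public static void main(String[] args) {
        SketchCodeGenContext context =
            new StubCodeGenContext(false).register("upper", "StringFunctions.upper");
        System.out.println(ExpressionCompiler.compile(context, "upper", "col"));
        // prints StringFunctions.upper(col)
    }
}
```

The design choice is the usual interface-segregation trade: the wide type loses nothing, while the generator gains a dependency small enough to stub by hand.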
[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing
[ https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5319: --- Summary: Refactor FragmentContext and OptionManager for unit testing (was: Refactor OperatorContext and OptionsManager for unit testing) > Refactor FragmentContext and OptionManager for unit testing > --- > > Key: DRILL-5319 > URL: https://issues.apache.org/jira/browse/DRILL-5319 > Project: Apache Drill > Issue Type: Sub-task > Components: Tools, Build & Test >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: Future > > > Roll-up task for two refactorings, see the sub-tasks for details. This ticket > allows a single PR for the two different refactorings since the work heavily > overlaps. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5320) Refactor OptionManager to allow better unit testing
[ https://issues.apache.org/jira/browse/DRILL-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899821#comment-15899821 ]

Paul Rogers commented on DRILL-5320:
---

Proposed new interface (comments omitted):

{code}
public interface OptionSet {
  OptionValue getOption(String name);
  boolean getOption(TypeValidators.BooleanValidator validator);
  double getOption(TypeValidators.DoubleValidator validator);
  long getOption(TypeValidators.LongValidator validator);
  String getOption(TypeValidators.StringValidator validator);
}
{code}
[jira] [Commented] (DRILL-5321) Refactor FragmentContext for unit testing
[ https://issues.apache.org/jira/browse/DRILL-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899824#comment-15899824 ]

Paul Rogers commented on DRILL-5321:
---

Proposed new interface (comments omitted):

{code}
public interface CodeGenContext {
  FunctionImplementationRegistry getFunctionRegistry();
  OptionSet getOptionSet();
  <T> T getImplementationClass(final ClassGenerator<T> cg)
      throws ClassTransformationException, IOException;
  <T> T getImplementationClass(final CodeGenerator<T> cg)
      throws ClassTransformationException, IOException;
  <T> List<T> getImplementationClass(final ClassGenerator<T> cg, final int instanceCount)
      throws ClassTransformationException, IOException;
  <T> List<T> getImplementationClass(final CodeGenerator<T> cg, final int instanceCount)
      throws ClassTransformationException, IOException;
}
{code}
[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing
[ https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5319: --- Description: Roll-up task for two refactorings, see the sub-tasks for details. This ticket allows a single PR for the two different refactorings since the work heavily overlaps. See DRILL-5320 and DRILL-5321 for details. (was: Roll-up task for two refactorings, see the sub-tasks for details. This ticket allows a single PR for the two different refactorings since the work heavily overlaps.) > Refactor FragmentContext and OptionManager for unit testing > --- > > Key: DRILL-5319 > URL: https://issues.apache.org/jira/browse/DRILL-5319 > Project: Apache Drill > Issue Type: Sub-task > Components: Tools, Build & Test >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: Future > > > Roll-up task for two refactorings, see the sub-tasks for details. This ticket > allows a single PR for the two different refactorings since the work heavily > overlaps. See DRILL-5320 and DRILL-5321 for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5322) Provide an OperatorFixture for sub-operator unit testing setup
Paul Rogers created DRILL-5322:
---

Summary: Provide an OperatorFixture for sub-operator unit testing setup
Key: DRILL-5322
URL: https://issues.apache.org/jira/browse/DRILL-5322
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: 1.11.0

We recently created various "fixture" classes to assist with system-level testing: {{LogFixture}}, {{ClusterFixture}} and {{ClientFixture}}. Each handles the tedious work of setting up the conditions to run certain kinds of tests.

In the same way, we need an {{OperatorFixture}} to set up the low-level bits and pieces needed for operator-level and sub-operator-level unit testing.

The {{DrillConfig}} is used by both the system-level and operator-level fixtures. So, pull the config-setup tasks out of (cluster) {{FixtureBuilder}} (should rename) and into a new {{ConfigBuilder}}. Leave the existing methods in {{FixtureBuilder}}, but modify them to be wrappers around the new config builder.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
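The "keep the old methods as wrappers" step above might look something like the sketch below. It assumes a plain map in place of {{DrillConfig}}; {{SketchConfigBuilder}}, {{SketchClusterFixtureBuilder}}, {{SketchOperatorFixtureBuilder}} and the property names are hypothetical and exist only for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shared builder: the single place where config properties are collected.
final class SketchConfigBuilder {
    private final Map<String, Object> props = new LinkedHashMap<>();

    SketchConfigBuilder put(String key, Object value) {
        props.put(key, value);
        return this;
    }

    // A read-only snapshot stands in for a real config object.
    Map<String, Object> build() {
        return Collections.unmodifiableMap(new LinkedHashMap<>(props));
    }
}

// Hypothetical cluster-level builder: the existing method stays,
// but now simply delegates to the shared config builder.
final class SketchClusterFixtureBuilder {
    private final SketchConfigBuilder config = new SketchConfigBuilder();
    private int drillbitCount = 1;

    SketchClusterFixtureBuilder configProperty(String key, Object value) {
        config.put(key, value); // thin wrapper; no config logic lives here any more
        return this;
    }

    SketchClusterFixtureBuilder clusterSize(int count) {
        drillbitCount = count;
        return this;
    }

    int drillbitCount() {
        return drillbitCount;
    }

    Map<String, Object> buildConfig() {
        return config.build();
    }
}

// Hypothetical operator-level builder reusing the same config builder.
final class SketchOperatorFixtureBuilder {
    private final SketchConfigBuilder config = new SketchConfigBuilder();

    SketchConfigBuilder configBuilder() {
        return config;
    }

    Map<String, Object> buildConfig() {
        return config.build();
    }
}

class ConfigBuilderSketch {
    public static void main(String[] args) {
        SketchClusterFixtureBuilder cluster = new SketchClusterFixtureBuilder()
            .configProperty("sketch.sort.spill", true)
            .clusterSize(2);
        SketchOperatorFixtureBuilder operator = new SketchOperatorFixtureBuilder();
        operator.configBuilder().put("sketch.sort.spill", true);
        // Both fixtures produce the same config from the same builder logic.
        System.out.println(cluster.buildConfig().equals(operator.buildConfig())); // prints true
    }
}
```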
[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption
[ https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899831#comment-15899831 ]

Laurent Goujon commented on DRILL-4335:
---

From the initial implementation, it looks like a lot of byte copying is involved. Is there any estimate or benchmark to quantify the impact on throughput? Wouldn't an SSL/TLS implementation be a more performant alternative here, because of its integration directly into Netty?
[jira] [Updated] (DRILL-5318) Create a sub-operator test framework
[ https://issues.apache.org/jira/browse/DRILL-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5318: --- Affects Version/s: (was: 1.10.0) 1.11.0
[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing
[ https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5319: --- Fix Version/s: (was: Future) 1.11.0 > Refactor FragmentContext and OptionManager for unit testing > --- > > Key: DRILL-5319 > URL: https://issues.apache.org/jira/browse/DRILL-5319 > Project: Apache Drill > Issue Type: Sub-task > Components: Tools, Build & Test >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.11.0 > > > Roll-up task for two refactorings, see the sub-tasks for details. This ticket > allows a single PR for the two different refactorings since the work heavily > overlaps. See DRILL-5320 and DRILL-5321 for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5320) Refactor OptionManager to allow better unit testing
[ https://issues.apache.org/jira/browse/DRILL-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5320: --- Affects Version/s: (was: 1.10.0) 1.11.0 Fix Version/s: (was: Future) 1.11.0
[jira] [Updated] (DRILL-5318) Create a sub-operator test framework
[ https://issues.apache.org/jira/browse/DRILL-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5318: --- Fix Version/s: (was: Future) 1.11.0
[jira] [Created] (DRILL-5323) Provide test tools to create, populate and compare row sets
Paul Rogers created DRILL-5323:
---

Summary: Provide test tools to create, populate and compare row sets
Key: DRILL-5323
URL: https://issues.apache.org/jira/browse/DRILL-5323
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: 1.11.0

Operators work with individual row sets. A row set is a collection of records stored as column vectors. (Drill uses various terms for this concept. A record batch is a row set with an operator implementation wrapped around it. A vector container is a row set, but with much functionality left as an exercise for the developer. And so on.)

To simplify tests, we need a {{TestRowSet}} concept that wraps a {{VectorContainer}} and provides easy ways to:

* Define a schema for the row set.
* Create a set of vectors that implement the schema.
* Populate the row set with test data via code.
* Add an SV2 to the row set.
* Pass the row set to operator components (such as generated code blocks.)
* Compare the results of the operation with an expected result set.
* Dispose of the underlying direct memory when work is done.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
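A few of the capabilities listed above (define a schema, populate from code, compare against an expected set) can be sketched in miniature as follows. This is a loose illustration only: {{SketchRowSet}} is a hypothetical class that stores columns in ordinary Java lists, so it has no value vectors, no SV2 and no direct memory to release, all of which the real {{TestRowSet}} would have to handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical, heap-only stand-in for a test row set.
final class SketchRowSet {
    private final List<String> columnNames;                        // the "schema"
    private final List<List<Object>> columns = new ArrayList<>();  // one list per column

    SketchRowSet(String... columnNames) {
        this.columnNames = Arrays.asList(columnNames.clone());
        for (int i = 0; i < columnNames.length; i++) {
            columns.add(new ArrayList<>());
        }
    }

    // Populate via code: one call per row, values stored column by column.
    SketchRowSet add(Object... row) {
        if (row.length != columns.size()) {
            throw new IllegalArgumentException(
                "Expected " + columns.size() + " values but got " + row.length);
        }
        for (int i = 0; i < row.length; i++) {
            columns.get(i).add(row[i]);
        }
        return this;
    }

    int rowCount() {
        return columns.isEmpty() ? 0 : columns.get(0).size();
    }

    // Returns null when the sets match, else a description of the first mismatch.
    String firstDifference(SketchRowSet expected) {
        if (!columnNames.equals(expected.columnNames)) {
            return "schema: expected " + expected.columnNames + " but was " + columnNames;
        }
        if (rowCount() != expected.rowCount()) {
            return "row count: expected " + expected.rowCount() + " but was " + rowCount();
        }
        for (int r = 0; r < rowCount(); r++) {
            for (int c = 0; c < columns.size(); c++) {
                Object actual = columns.get(c).get(r);
                Object wanted = expected.columns.get(c).get(r);
                if (!Objects.equals(actual, wanted)) {
                    return "row " + r + ", column " + columnNames.get(c)
                        + ": expected " + wanted + " but was " + actual;
                }
            }
        }
        return null;
    }
}

class RowSetSketch {
    public static void main(String[] args) {
        SketchRowSet expected = new SketchRowSet("id", "name").add(1, "fred").add(2, "wilma");
        SketchRowSet actual = new SketchRowSet("id", "name").add(1, "fred").add(2, "barney");
        System.out.println(actual.firstDifference(expected));
        // prints: row 1, column name: expected wilma but was barney
    }
}
```

Reporting the first mismatch, rather than a bare true/false, is a small choice that tends to make failing tests easier to read.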
[jira] [Updated] (DRILL-5321) Refactor FragmentContext for unit testing
[ https://issues.apache.org/jira/browse/DRILL-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5321: --- Affects Version/s: (was: 1.10.0) 1.11.0 Fix Version/s: (was: Future) 1.11.0
[jira] [Created] (DRILL-5324) Provide simplified column reader/writer for use in tests
Paul Rogers created DRILL-5324: -- Summary: Provide simplified column reader/writer for use in tests Key: DRILL-5324 URL: https://issues.apache.org/jira/browse/DRILL-5324 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.11.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.11.0 In support of DRILL-, we wish to provide a very easy way to work with row sets. See the comment section for examples of the target API. Drill provides over 100 different value vectors, any of which may be required to perform a specific unit test. Creating these vectors, populating them, and retrieving values, is very tedious. The work is so complex that it acts to discourage developers from writing such tests. To simplify the task, we wish to provide a simplified row set reader and writer. To do that, we need to generate the corresponding column reader and writer for each value vector. This ticket focuses on the column-level readers and writers, and the required code generation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
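One possible shape for such simplified column readers and writers is sketched below: a single accessor interface, with one small implementation per storage type. Everything here is an assumption for illustration. {{SketchColumnAccessor}}, {{IntColumn}} and {{VarCharColumn}} are hypothetical, they wrap plain Java arrays rather than Drill value vectors, and they are written by hand where the ticket proposes generating them.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical uniform accessor. Test code calls get/set by Java type;
// each column supports only the conversions that make sense for its storage.
interface SketchColumnAccessor {
    default void setInt(int row, int value) {
        throw new UnsupportedOperationException("setInt");
    }

    default int getInt(int row) {
        throw new UnsupportedOperationException("getInt");
    }

    default void setString(int row, String value) {
        throw new UnsupportedOperationException("setString");
    }

    default String getString(int row) {
        throw new UnsupportedOperationException("getString");
    }
}

// Stand-in for an integer vector: a plain int array.
final class IntColumn implements SketchColumnAccessor {
    private final int[] data;

    IntColumn(int capacity) {
        data = new int[capacity];
    }

    @Override
    public void setInt(int row, int value) {
        data[row] = value;
    }

    @Override
    public int getInt(int row) {
        return data[row];
    }
}

// Stand-in for a variable-width character vector: UTF-8 bytes per row.
final class VarCharColumn implements SketchColumnAccessor {
    private final byte[][] data;

    VarCharColumn(int capacity) {
        data = new byte[capacity][];
    }

    @Override
    public void setString(int row, String value) {
        data[row] = value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String getString(int row) {
        return new String(data[row], StandardCharsets.UTF_8);
    }
}

class ColumnAccessorSketch {
    public static void main(String[] args) {
        // The test sees only the uniform interface, indexed by row.
        SketchColumnAccessor id = new IntColumn(2);
        SketchColumnAccessor name = new VarCharColumn(2);
        id.setInt(0, 10);
        name.setString(0, "fred");
        System.out.println(id.getInt(0) + ", " + name.getString(0)); // prints 10, fred
    }
}
```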
[jira] [Updated] (DRILL-5324) Provide simplified column reader/writer for use in tests
[ https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5324: --- Description: In support of DRILL-5323, we wish to provide a very easy way to work with row sets. See the comment section for examples of the target API. Drill provides over 100 different value vectors, any of which may be required to perform a specific unit test. Creating these vectors, populating them, and retrieving values, is very tedious. The work is so complex that it acts to discourage developers from writing such tests. To simplify the task, we wish to provide a simplified row set reader and writer. To do that, we need to generate the corresponding column reader and writer for each value vector. This ticket focuses on the column-level readers and writers, and the required code generation. was: In support of DRILL-, we wish to provide a very easy way to work with row sets. See the comment section for examples of the target API. Drill provides over 100 different value vectors, any of which may be required to perform a specific unit test. Creating these vectors, populating them, and retrieving values, is very tedious. The work is so complex that it acts to discourage developers from writing such tests. To simplify the task, we wish to provide a simplified row set reader and writer. To do that, we need to generate the corresponding column reader and writer for each value vector. This ticket focuses on the column-level readers and writers, and the required code generation. > Provide simplified column reader/writer for use in tests > > > Key: DRILL-5324 > URL: https://issues.apache.org/jira/browse/DRILL-5324 > Project: Apache Drill > Issue Type: Sub-task > Components: Tools, Build & Test >Affects Versions: 1.11.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.11.0 > > > In support of DRILL-5323, we wish to provide a very easy way to work with row > sets. 
See the comment section for examples of the target API. > Drill provides over 100 different value vectors, any of which may be required > to perform a specific unit test. Creating these vectors, populating them, and > retrieving values, is very tedious. The work is so complex that it acts to > discourage developers from writing such tests. > To simplify the task, we wish to provide a simplified row set reader and > writer. To do that, we need to generate the corresponding column reader and > writer for each value vector. This ticket focuses on the column-level readers > and writers, and the required code generation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5324) Provide simplified column reader/writer for use in tests
[ https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5324: --- Description: In support of DRILL-5323, we wish to provide a very easy way to work with row sets. See the comment section for examples of the target API. Drill provides over 100 different value vectors, any of which may be required to perform a specific unit test. Creating these vectors, populating them, and retrieving values, is very tedious. The work is so complex that it acts to discourage developers from writing such tests. To simplify the task, we wish to provide a simplified row set reader and writer. To do that, we need to generate the corresponding column reader and writer for each value vector. This ticket focuses on the column-level readers and writers, and the required code generation. Drill already provides vector readers and writers derived from {{FieldReader}}. However, these readers do not provide a uniform get/set interface that is type independent on the application side. Instead, application code must be aware of the type of the vector, something we seek to avoid for test code. The reader and writer classes are designed to be used in many contexts, not just for testing. As a result, their implementation makes no assumptions about the broader row reader and writer, other than that a row index and the required value vector are both available. was: In support of DRILL-5323, we wish to provide a very easy way to work with row sets. See the comment section for examples of the target API. Drill provides over 100 different value vectors, any of which may be required to perform a specific unit test. Creating these vectors, populating them, and retrieving values, is very tedious. The work is so complex that it acts to discourage developers from writing such tests. To simplify the task, we wish to provide a simplified row set reader and writer. 
To do that, we need to generate the corresponding column reader and writer for each value vector. This ticket focuses on the column-level readers and writers, and the required code generation. > Provide simplified column reader/writer for use in tests > > > Key: DRILL-5324 > URL: https://issues.apache.org/jira/browse/DRILL-5324 > Project: Apache Drill > Issue Type: Sub-task > Components: Tools, Build & Test >Affects Versions: 1.11.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.11.0 > > > In support of DRILL-5323, we wish to provide a very easy way to work with row > sets. See the comment section for examples of the target API. > Drill provides over 100 different value vectors, any of which may be required > to perform a specific unit test. Creating these vectors, populating them, and > retrieving values, is very tedious. The work is so complex that it acts to > discourage developers from writing such tests. > To simplify the task, we wish to provide a simplified row set reader and > writer. To do that, we need to generate the corresponding column reader and > writer for each value vector. This ticket focuses on the column-level readers > and writers, and the required code generation. > Drill already provides vector readers and writers derived from > {{FieldReader}}. However, these readers do not provide a uniform get/set > interface that is type independent on the application side. Instead, > application code must be aware of the type of the vector, something we seek > to avoid for test code. > The reader and writer classes are designed to be used in many contexts, not > just for testing. As a result, their implementation makes no assumptions > about the broader row reader and writer, other than that a row index and the > required value vector are both available. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5324) Provide simplified column reader/writer for use in tests
[ https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873 ]

Paul Rogers commented on DRILL-5324:

The idea is to provide a uniform column access interface, similar to how JDBC works (but much simpler!). Assume a row reader defined elsewhere:

{code}
public interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
RowSetReader reader = ...;
assertEquals(10, reader.column(0).getInt());
assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

> Provide simplified column reader/writer for use in tests
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Tools, Build & Test
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required
> to perform a specific unit test. Creating these vectors, populating them, and
> retrieving values, is very tedious. The work is so complex that it acts to
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and
> writer. To do that, we need to generate the corresponding column reader and
> writer for each value vector. This ticket focuses on the column-level readers
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from
> {{FieldReader}}. However, these readers do not provide a uniform get/set
> interface that is type independent on the application side. Instead,
> application code must be aware of the type of the vector, something we seek
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not
> just for testing. As a result, their implementation makes no assumptions
> about the broader row reader and writer, other than that a row index and the
> required value vector are both available.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
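For illustration only, here is a minimal sketch of how a row set reader with the shape described in the comment above could be backed by plain Java arrays instead of value vectors. The two interfaces are trimmed-down copies of the proposed ones (only three accessors are kept). ArrayRowSetReader and its Object[] row layout are hypothetical names invented for this sketch and are not part of Drill. The sketch also assumes that next() must be called before the first column() call, which the comment does not spell out.

```java
import java.util.List;

// Trimmed-down copy of the proposed reader interface; only three accessors are kept.
interface ColumnReader {
  boolean isNull();
  int getInt();
  String getString();
}

// Same shape as the row reader assumed in the comment.
interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}

// Hypothetical in-memory row set: each row is an Object[], and null marks a NULL value.
final class ArrayRowSetReader implements RowSetReader {
  private final List<Object[]> rows;
  private int rowIndex = -1; // positioned before the first row

  ArrayRowSetReader(List<Object[]> rows) {
    this.rows = rows;
  }

  @Override
  public boolean next() {
    if (rowIndex + 1 >= rows.size()) {
      return false; // no more rows
    }
    rowIndex++;
    return true;
  }

  // Assumption: the caller has already advanced to a row with next().
  @Override
  public ColumnReader column(int colIndex) {
    final Object value = rows.get(rowIndex)[colIndex];
    return new ColumnReader() {
      @Override public boolean isNull() { return value == null; }
      @Override public int getInt() { return (Integer) value; }
      @Override public String getString() { return (String) value; }
    };
  }
}
```

Given rows {10, "foo"} and {20, null}, a test would call reader.next() once and could then use the same two assertEquals calls shown in the comment. After a second next(), reader.column(1).isNull() reports the NULL.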
[jira] [Comment Edited] (DRILL-5324) Provide simplified column reader/writer for use in tests
[ https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873 ]

Paul Rogers edited comment on DRILL-5324 at 3/7/17 6:13 PM:

The idea is to provide a uniform column access interface, similar to how JDBC works (but much simpler!). Assume a row reader defined elsewhere:

{code}
public interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
RowSetReader reader = ...;
assertEquals(10, reader.column(0).getInt());
assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

was (Author: paul-rogers):

The idea is to provide a uniform column access interface, similar to how JDBC works (but much simpler!). Assume a row reader defined elsewhere:

{code}
public interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
RowSetReader reader = ...;
assertEquals(10, reader.column(0).getInt());
assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

> Provide simplified column reader/writer for use in tests
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Tools, Build & Test
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required
> to perform a specific unit test. Creating these vectors, populating them, and
> retrieving values, is very tedious. The work is so complex that it acts to
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and
> writer. To do that, we need to generate the corresponding column reader and
> writer for each value vector. This ticket focuses on the column-level readers
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from
> {{FieldReader}}. However, these readers do not provide a uniform get/set
> interface that is type independent on the application side. Instead,
> application code must be aware of the type of the vector, something we seek
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not
> just for testing. As a result, their implementation makes no assumptions
> about the broader row reader and writer, other than that a row index and the
> required value vector are both available.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (DRILL-5324) Provide simplified column reader/writer for use in tests
[ https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873 ]

Paul Rogers edited comment on DRILL-5324 at 3/7/17 6:16 PM:

The idea is to provide a uniform column access interface, similar to how JDBC works (but much simpler!). Assume a row reader defined elsewhere:

{code}
public interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
RowSetReader reader = ...;
assertEquals(10, reader.column(0).getInt());
assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

The generated versions are very type-specific: each value vector type supports only one of the above methods. (The numeric types all convert to/from int or double.) An implementation could certainly create another layer that does type conversion: say from int to String. But, to keep the generated code simple, the generated code implements only a single get or set method. (The others throw an exception if called.)

was (Author: paul-rogers):

The idea is to provide a uniform column access interface, similar to how JDBC works (but much simpler!). Assume a row reader defined elsewhere:

{code}
public interface RowSetReader {
  boolean next();
  ColumnReader column(int colIndex);
}
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
RowSetReader reader = ...;
assertEquals(10, reader.column(0).getInt());
assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

> Provide simplified column reader/writer for use in tests
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Tools, Build & Test
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required
> to perform a specific unit test. Creating these vectors, populating them, and
> retrieving values, is very tedious. The work is so complex that it acts to
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and
> writer. To do that, we need to generate the corresponding column reader and
> writer for each value vector. This ticket focuses on the column-level readers
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from
> {{FieldReader}}. However, these readers do not provide a uniform get/set
> interface that is type independent on the application side. Instead,
> application code must be aware of the type of the vector, something we seek
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not
> just for testing. As a result, their implementation makes no assumptions
> about the broader row reader and writer, other than that a row index and the
> required value vector are both available.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
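As a rough sketch of the "single supported accessor, everything else throws" idea from the comment above, one possible layout is a base class whose accessors all reject the call, plus a type-specific subclass that overrides exactly one of them. All class names here are hypothetical, a plain int[] stands in for a value vector, and the choice of UnsupportedOperationException is an assumption: the comment only says that "an exception" is thrown. The last class shows the optional hand-written conversion layer (int to String) that the comment mentions.

```java
// Hypothetical base class: every typed accessor rejects the call by default.
abstract class AbstractColumnReader {
  public boolean isNull() { return false; }
  public int getInt() { throw unsupported("getInt"); }
  public double getDouble() { throw unsupported("getDouble"); }
  public String getString() { throw unsupported("getString"); }

  private UnsupportedOperationException unsupported(String method) {
    return new UnsupportedOperationException(
        method + "() is not supported by " + getClass().getSimpleName());
  }
}

// Stand-in for a generated reader: it needs only a row index and the "vector"
// (here an int[]), and it implements a single accessor.
final class IntColumnReader extends AbstractColumnReader {
  private final int[] values;
  private final int rowIndex;

  IntColumnReader(int[] values, int rowIndex) {
    this.values = values;
    this.rowIndex = rowIndex;
  }

  @Override
  public int getInt() { return values[rowIndex]; }
}

// Optional extra layer, written by hand rather than generated: converts int to String.
final class IntAsStringReader extends AbstractColumnReader {
  private final AbstractColumnReader delegate;

  IntAsStringReader(AbstractColumnReader delegate) {
    this.delegate = delegate;
  }

  @Override
  public int getInt() { return delegate.getInt(); }

  @Override
  public String getString() { return Integer.toString(delegate.getInt()); }
}
```

Keeping conversion out of the generated classes means the generator emits one override per vector type, while callers that want looser typing can opt in by wrapping the reader.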
[jira] [Created] (DRILL-5325) Implement sub-operator unit tests for managed external sort
Paul Rogers created DRILL-5325:

Summary: Implement sub-operator unit tests for managed external sort
Key: DRILL-5325
URL: https://issues.apache.org/jira/browse/DRILL-5325
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Fix For: 1.11.0

Validate the proposed sub-operator test framework by creating low-level unit tests for the managed version of the external sort.

The external sort has a small number of existing tests, but those tests are quite superficial; the "managed sort" project found many bugs. The managed sort itself was tested with ad-hoc system-level tests created using the new "cluster fixture" framework. But, again, such tests could not reach deep inside the sort code to exercise very specific conditions. As a result, we spent far too much time using QA functional tests to identify specific code issues.

Using the sub-operator unit test framework, we can instead test each bit of functionality at the unit test level. If doing so works, and is practical, it can serve as a model for other operator testing projects.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (DRILL-4963) Issues when overloading Drill native functions with dynamic UDFs
[ https://issues.apache.org/jira/browse/DRILL-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman closed DRILL-4963.

I tested this issue and could not reproduce it. Verified and closed.

> Issues when overloading Drill native functions with dynamic UDFs
>
> Key: DRILL-4963
> URL: https://issues.apache.org/jira/browse/DRILL-4963
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.9.0
> Reporter: Roman
> Assignee: Arina Ielchiieva
> Labels: ready-to-commit
> Fix For: 1.10.0
> Attachments: subquery_udf-1.0.jar, subquery_udf-1.0-sources.jar,
> test_overloading-1.0.jar, test_overloading-1.0-sources.jar
>
> I created a jar file that overloads 3 Drill native functions
> (LOG(VARCHAR-REQUIRED), CURRENT_DATE(VARCHAR-REQUIRED) and
> ABS(VARCHAR-REQUIRED,VARCHAR-REQUIRED)) and registered it as a dynamic UDF.
> If I try to use my functions, I get errors:
> {code:xml}
> SELECT CURRENT_DATE('test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: CURRENT_DATE does not support operand types (CHAR)
> SQL Query null
> {code:xml}
> SELECT ABS('test','test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: ABS does not support operand types (CHAR,CHAR)
> SQL Query null
> {code:xml}
> SELECT LOG('test') FROM (VALUES(1));
> {code}
> Error: SYSTEM ERROR: DrillRuntimeException: Failure while materializing
> expression in constant expression evaluator LOG('test'). Errors:
> Error in expression at index -1. Error: Missing function implementation:
> castTINYINT(VARCHAR-REQUIRED). Full expression: UNKNOWN EXPRESSION.
> But if I rerun all these queries after the "DrillRuntimeException", they run
> correctly. It seems that Drill had not updated the function signatures before
> that error. Also, if I add the jar as a regular UDF (copy the jar to
> /drill_home/jars/3rdparty and restart the drillbits), all queries run
> correctly without errors.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5326) Unexpected/unhandled MinorType value GENERIC_OBJECT
Vitalii Diravka created DRILL-5326:

Summary: Unexpected/unhandled MinorType value GENERIC_OBJECT
Key: DRILL-5326
URL: https://issues.apache.org/jira/browse/DRILL-5326
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 1.10.0
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
Priority: Blocker
Fix For: 1.10.0

DRILL-5301 introduced a new SERVER_META RPC call. The server supports this method only from Drill version 1.10.0 onward; for Drill 1.10.0-SNAPSHOT it is disabled.

When I enabled this method (by upgrading the Drill version to 1.10.0 or 1.11.0-SNAPSHOT), I hit the following exception:

{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}

It appears in several tests (for example, DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).

The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added in DRILL-1126.)

The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name for this "GENERIC_OBJECT" RPC-/protobuf-level data type.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
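To make the proposed fix concrete, the sketch below shows the general shape of a type-name lookup that fails with the reported assertion when a type has no entry, and that stops failing once GENERIC_OBJECT is mapped to "JAVA_OBJECT". It is not Drill's actual code: the MinorType enum here is a small local stand-in, SqlTypeNames is a hypothetical class, and the INT and VARCHAR names are assumed purely for illustration. Only the GENERIC_OBJECT to "JAVA_OBJECT" pairing comes from the issue text.

```java
// Local stand-in for the RPC-/protobuf-level type enum (only a few values).
enum MinorType { INT, VARCHAR, GENERIC_OBJECT, LATE }

// Hypothetical helper mapping an RPC-level type to a SQL type name.
final class SqlTypeNames {
  private SqlTypeNames() {}

  static String sqlTypeNameOf(MinorType type) {
    switch (type) {
      case INT:
        return "INTEGER";            // assumed name, for illustration
      case VARCHAR:
        return "CHARACTER VARYING";  // assumed name, for illustration
      case GENERIC_OBJECT:
        return "JAVA_OBJECT";        // the mapping proposed in this issue
      default:
        // Without the GENERIC_OBJECT case above, that type would land here.
        throw new AssertionError("Unexpected/unhandled MinorType value " + type);
    }
  }
}
```

In this sketch, removing the GENERIC_OBJECT case reproduces the message quoted in the issue for that type, which is why adding a single mapping is enough to get past the first failure.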
[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vitalii Diravka updated DRILL-5326:

Summary: Unit tests failures related to the SERVER_METADTA (was: Unexpected/unhandled MinorType value GENERIC_OBJECT)

> Unit tests failures related to the SERVER_METADTA
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.10.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Blocker
> Fix For: 1.10.0
>
> DRILL-5301 introduced a new SERVER_META RPC call. The server supports this
> method only from Drill version 1.10.0 onward; for Drill 1.10.0-SNAPSHOT
> it is disabled.
> When I enabled this method (by upgrading the Drill version to 1.10.0 or
> 1.11.0-SNAPSHOT), I hit the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example,
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in
> ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-2293) CTAS does not clean up when it fails
[ https://issues.apache.org/jira/browse/DRILL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900149#comment-15900149 ]

Kunal Khatua commented on DRILL-2293:

[~rkins], [~khfaraaz] had tested a similar use case for CTTAS. Perhaps a similar solution can be ported for this issue specifically, if it is still unresolved.

> CTAS does not clean up when it fails
>
> Key: DRILL-2293
> URL: https://issues.apache.org/jira/browse/DRILL-2293
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Reporter: Rahul Challapalli
> Fix For: Future
>
> git.commit.id.abbrev=6676f2d
> Data Set :
> {code}
> {
> "id" : 1,
> "map":{"rm": [
> {"mapid":"m1","mapvalue":{"col1":1,"col2":[0,1,2,3,4,5]},"rptd": [{ "a":
> "foo"},{"b":"boo"}]},
> {"mapid":"m2","mapvalue":{"col1":0,"col2":[]},"rptd": [{ "a":
> "bar"},{"c":1},{"d":4.5}]}
> ]}
> }
> {code}
> The query below fails:
> {code}
> create table rep_map as select d.map from `temp.json` d;
> Query failed: Query stopped., index: -4, length: 4 (expected: range(0,
> 16384)) [ d76e3f74-7e2c-406f-a7fd-5efc68227e75 on qa-node190.qa.lab:31010 ]
> {code}
> However, Drill created a folder 'rep_map', and the folder contained a broken
> parquet file.
> {code}
> create table rep_map as select d.map from `temp.json` d;
> +-------+---------------------------------+
> | ok    | summary                         |
> +-------+---------------------------------+
> | false | Table 'rep_map' already exists. |
> {code}
> Drill should clean up properly in case of a failure.
> I raised a different issue for the actual failure.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
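One generic way to get the cleanup behavior requested in the issue above is to delete the freshly created table directory whenever the write step fails, so that a retry does not trip over "already exists". The sketch below only illustrates that pattern on the local filesystem with java.nio.file. It is not how Drill's CTAS or CTTAS code is written, and TableWriteCleanup, TableWriter and createTable are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class TableWriteCleanup {
  private TableWriteCleanup() {}

  // Hypothetical callback that writes the table's files into the given directory.
  interface TableWriter {
    void writeInto(Path tableDir) throws IOException;
  }

  // Creates the table directory and removes it again if the write fails.
  static void createTable(Path tableDir, TableWriter writer) throws IOException {
    Files.createDirectory(tableDir); // throws if the directory already exists
    try {
      writer.writeInto(tableDir);
    } catch (IOException | RuntimeException e) {
      try {
        deleteRecursively(tableDir);
      } catch (IOException | UncheckedIOException cleanupError) {
        e.addSuppressed(cleanupError); // keep the original failure as the primary error
      }
      throw e;
    }
  }

  // Deletes children before parents by visiting paths in reverse order.
  static void deleteRecursively(Path root) throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

With this shape, a writer that produces a partial file and then throws leaves no 'rep_map' directory behind, so the same statement can simply be run again.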
[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka updated DRILL-5326: --- Description: 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it is disabled. When I enabled this method (by way of upgrading drill version to 1.10.0 or 1.11.0-SNAPSHOT) I found the following exception: {code} java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT {code} It appears in several tests (for example in DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added in DRILL-1126). The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for this "GENERIC_OBJECT" RPC-/protobuf-level data type. 2. After the fixing first one the mentioned above test still fails by reason of the incorrect "NullCollation" value in the "ServerMetaProvider". According to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the default val should be NC_HIGH (NULL is the highest value). was: In DRILL-5301 a new SERVER_META rpc call was introduced. The server will support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it is disabled. When I enabled this method (by way of upgrading drill version to 1.10.0 or 1.11.0-SNAPSHOT) I found the following exception: {code} java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT {code} It appears in several tests (for example in DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added in DRILL-1126). 
The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT > it is disabled. > When I enabled this method (by way of upgrading drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in > the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After the fixing first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900155#comment-15900155 ]

ASF GitHub Bot commented on DRILL-5326:

GitHub user vdiravka opened a pull request:

    https://github.com/apache/drill/pull/775

    DRILL-5326: Unit tests failures related to the SERVER_METADTA

    - adding of the sql type name for the "GENERIC_OBJECT";
    - changing "NullCollation" in the "ServerMetaProvider" to the correct default value;

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-5326

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/775.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #775

commit e7ca7650fa1bcc32638cfe4aade96aa56406a362
Author: Vitalii Diravka
Date: 2017-03-07T20:53:03Z

    DRILL-5326: Unit tests failures related to the SERVER_METADTA

    - adding of the sql type name for the "GENERIC_OBJECT";
    - changing "NullCollation" in the "ServerMetaProvider" to the correct default value;

> Unit tests failures related to the SERVER_METADTA
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.10.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Blocker
> Fix For: 1.10.0
>
> 1. DRILL-5301 introduced a new SERVER_META RPC call. The server supports
> this method only from Drill version 1.10.0 onward; for Drill 1.10.0-SNAPSHOT
> it is disabled.
> When I enabled this method (by upgrading the Drill version to 1.10.0 or
> 1.11.0-SNAPSHOT), I hit the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example,
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in
> ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the
> default value should be NC_HIGH (NULL is the highest value).

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalii Diravka updated DRILL-5326: --- Description: 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it is disabled. When I enabled this method (by way of upgrading drill version to 1.10.0 or 1.11.0-SNAPSHOT) I found the following exception: {code} java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT {code} It appears in several tests (for example in DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added in DRILL-1126). The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for this "GENERIC_OBJECT" RPC-/protobuf-level data type. 2. After fixing the first one the mentioned above test still fails by reason of the incorrect "NullCollation" value in the "ServerMetaProvider". According to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the default val should be NC_HIGH (NULL is the highest value). was: 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it is disabled. When I enabled this method (by way of upgrading drill version to 1.10.0 or 1.11.0-SNAPSHOT) I found the following exception: {code} java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT {code} It appears in several tests (for example in DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added in DRILL-1126). 
The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for this "GENERIC_OBJECT" RPC-/protobuf-level data type. 2. After the fixing first one the mentioned above test still fails by reason of the incorrect "NullCollation" value in the "ServerMetaProvider". According to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the default val should be NC_HIGH (NULL is the highest value). > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT > it is disabled. > When I enabled this method (by way of upgrading drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in > the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (DRILL-4851) TPCDS Query 64 just hang there at "starting" state
[ https://issues.apache.org/jira/browse/DRILL-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dechang Gu resolved DRILL-4851. --- Resolution: Duplicate duplicate of DRILL-4347 > TPCDS Query 64 just hang there at "starting" state > -- > > Key: DRILL-4851 > URL: https://issues.apache.org/jira/browse/DRILL-4851 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.8.0 > Environment: REL 6.0 >Reporter: Dechang Gu >Assignee: Jinfeng Ni > > TPC-DS Query 64 on SF100 (views on parquet files) hung there at "starting" > state. drillbit.log on the foreman node show the following info: > 2016-08-17 14:26:00,785 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.drill.exec.work.foreman.Foreman - Query text for query id > 284b2996-d25a-d9af-178c-143b32ea6969: WITH cs_ui AS (SELECT cs_item_sk, > Sum(cs_ext_list_price) AS sale, Sum(cr_refunded_cash + cr_reversed_charge + > cr_store_credit) AS refund FROM catalog_sales, catalog_returns WHERE > cs_item_sk = cr_item_sk AND cs_order_number = cr_order_number GROUP BY > cs_item_sk HAVING Sum(cs_ext_list_price) > 2 * Sum( cr_refunded_cash + > cr_reversed_charge + cr_store_credit)), cross_sales AS (SELECT i_product_name > product_name, i_item_sk item_sk, s_store_name > store_name, s_zip store_zip, ad1.ca_street_number > b_street_number, ad1.ca_street_name b_streen_name, ad1.ca_city > b_city, ad1.ca_zip b_zip, ad2.ca_street_number c_street_number, > ad2.ca_street_name c_street_name, ad2.ca_cityc_city, > ad2.ca_zip c_zip, d1.d_year AS syear, d2.d_year > AS fsyear, d3.d_year s2year, Count(*) cnt, > Sum(ss_wholesale_cost) s1, Sum(ss_list_price) s2, Sum(ss_coupon_amt) > s3 FROM store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, > date_dim d3, store, customer, customer_dd > emographics cd1, customer_demographics cd2, promotion, household_demographics > hd1, household_demographics hd2, customer_address ad1, customer_address ad2, > income_band 
ib1, income_band ib2, item WHERE ss_store_sk = s_stt > ore_sk AND ss_sold_date_sk = d1.d_date_sk AND ss_customer_sk = c_customer_sk > AND ss_cdemo_sk = cd1.cd_demo_sk AND ss_hdemo_sk = hd1.hd_demo_sk AND > ss_addr_sk = ad1.ca_address_sk AND ss_item_sk = i_item_sk AND ss_item_skk > = sr_item_sk AND ss_ticket_number = sr_ticket_number AND ss_item_sk = > cs_ui.cs_item_sk AND c_current_cdemo_sk = cd2.cd_demo_sk AND > c_current_hdemo_sk = hd2.hd_demo_sk AND c_current_addr_sk = ad2.ca_address_sk > AND c_firr > st_sales_date_sk = d2.d_date_sk AND c_first_shipto_date_sk = d3.d_date_sk AND > ss_promo_sk = p_promo_sk AND hd1.hd_income_band_sk = ib1.ib_income_band_sk > AND hd2.hd_income_band_sk = ib2.ib_income_band_sk AND cd1.cd_maritt > al_status <> cd2.cd_marital_status AND i_color IN ( 'cyan', 'peach', 'blush', > 'frosted', 'powder', 'orange' ) AND i_current_price BETWEEN 58 AND 58 + 10 > AND i_current_price BETWEEN 58 + 1 AND 58 + 15 GROUP BY i_productt > _name, i_item_sk, s_store_name, s_zip, ad1.ca_street_number, > ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number, > ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year) > SELECT cs1.prr > oduct_name, cs1.store_name, cs1.store_zip, cs1.b_street_number, > cs1.b_streen_name, cs1.b_city, cs1.b_zip, cs1.c_street_number, > cs1.c_street_name, cs1.c_city, cs1.c_zip, cs1.syear, cs1.cnt, cs1.s1, cs1.s2, > cs1.s3, cs2.s11 > , cs2.s2, cs2.s3, cs2.syear, cs2.cnt FROM cross_sales cs1, cross_sales cs2 > WHERE cs1.item_sk = cs2.item_sk AND cs1.syear = 2001 AND cs2.syear = 2001 + > 1 AND cs2.cnt <= cs1.cnt AND cs1.store_name = cs2.store_name AND > cs1.store_zip = cs2.store_zip ORDER BY cs1.product_name, cs1.store_name, > cs2.cnt > 2016-08-17 14:26:04,045 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 1 ms to get file statuses > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > 
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Time: 4ms total, 4.753323mm > s avg, 4ms max. > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Earliest start: 12.261000 > μs, Latest start: 12.261000 μs, Average start: 12.261000 μs . > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 6 ms
[jira] [Closed] (DRILL-4851) TPCDS Query 64 just hang there at "starting" state
[ https://issues.apache.org/jira/browse/DRILL-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dechang Gu closed DRILL-4851. - Duplicate of DRILL-4347, hence close this one > TPCDS Query 64 just hang there at "starting" state > -- > > Key: DRILL-4851 > URL: https://issues.apache.org/jira/browse/DRILL-4851 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.8.0 > Environment: REL 6.0 >Reporter: Dechang Gu >Assignee: Jinfeng Ni > > TPC-DS Query 64 on SF100 (views on parquet files) hung there at "starting" > state. drillbit.log on the foreman node show the following info: > 2016-08-17 14:26:00,785 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.drill.exec.work.foreman.Foreman - Query text for query id > 284b2996-d25a-d9af-178c-143b32ea6969: WITH cs_ui AS (SELECT cs_item_sk, > Sum(cs_ext_list_price) AS sale, Sum(cr_refunded_cash + cr_reversed_charge + > cr_store_credit) AS refund FROM catalog_sales, catalog_returns WHERE > cs_item_sk = cr_item_sk AND cs_order_number = cr_order_number GROUP BY > cs_item_sk HAVING Sum(cs_ext_list_price) > 2 * Sum( cr_refunded_cash + > cr_reversed_charge + cr_store_credit)), cross_sales AS (SELECT i_product_name > product_name, i_item_sk item_sk, s_store_name > store_name, s_zip store_zip, ad1.ca_street_number > b_street_number, ad1.ca_street_name b_streen_name, ad1.ca_city > b_city, ad1.ca_zip b_zip, ad2.ca_street_number c_street_number, > ad2.ca_street_name c_street_name, ad2.ca_cityc_city, > ad2.ca_zip c_zip, d1.d_year AS syear, d2.d_year > AS fsyear, d3.d_year s2year, Count(*) cnt, > Sum(ss_wholesale_cost) s1, Sum(ss_list_price) s2, Sum(ss_coupon_amt) > s3 FROM store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, > date_dim d3, store, customer, customer_dd > emographics cd1, customer_demographics cd2, promotion, household_demographics > hd1, household_demographics hd2, customer_address ad1, customer_address ad2, > income_band ib1, 
income_band ib2, item WHERE ss_store_sk = s_stt > ore_sk AND ss_sold_date_sk = d1.d_date_sk AND ss_customer_sk = c_customer_sk > AND ss_cdemo_sk = cd1.cd_demo_sk AND ss_hdemo_sk = hd1.hd_demo_sk AND > ss_addr_sk = ad1.ca_address_sk AND ss_item_sk = i_item_sk AND ss_item_skk > = sr_item_sk AND ss_ticket_number = sr_ticket_number AND ss_item_sk = > cs_ui.cs_item_sk AND c_current_cdemo_sk = cd2.cd_demo_sk AND > c_current_hdemo_sk = hd2.hd_demo_sk AND c_current_addr_sk = ad2.ca_address_sk > AND c_firr > st_sales_date_sk = d2.d_date_sk AND c_first_shipto_date_sk = d3.d_date_sk AND > ss_promo_sk = p_promo_sk AND hd1.hd_income_band_sk = ib1.ib_income_band_sk > AND hd2.hd_income_band_sk = ib2.ib_income_band_sk AND cd1.cd_maritt > al_status <> cd2.cd_marital_status AND i_color IN ( 'cyan', 'peach', 'blush', > 'frosted', 'powder', 'orange' ) AND i_current_price BETWEEN 58 AND 58 + 10 > AND i_current_price BETWEEN 58 + 1 AND 58 + 15 GROUP BY i_productt > _name, i_item_sk, s_store_name, s_zip, ad1.ca_street_number, > ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number, > ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year) > SELECT cs1.prr > oduct_name, cs1.store_name, cs1.store_zip, cs1.b_street_number, > cs1.b_streen_name, cs1.b_city, cs1.b_zip, cs1.c_street_number, > cs1.c_street_name, cs1.c_city, cs1.c_zip, cs1.syear, cs1.cnt, cs1.s1, cs1.s2, > cs1.s3, cs2.s11 > , cs2.s2, cs2.s3, cs2.syear, cs2.cnt FROM cross_sales cs1, cross_sales cs2 > WHERE cs1.item_sk = cs2.item_sk AND cs1.syear = 2001 AND cs2.syear = 2001 + > 1 AND cs2.cnt <= cs1.cnt AND cs1.store_name = cs2.store_name AND > cs1.store_zip = cs2.store_zip ORDER BY cs1.product_name, cs1.store_name, > cs2.cnt > 2016-08-17 14:26:04,045 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 1 ms to get file statuses > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > 
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Time: 4ms total, 4.753323mm > s avg, 4ms max. > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Earliest start: 12.261000 > μs, Latest start: 12.261000 μs, Average start: 12.261000 μs . > 2016-08-17 14:26:04,051 ucs-node4.perf.lab > [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 6 ms to read
[jira] [Resolved] (DRILL-4850) TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run
[ https://issues.apache.org/jira/browse/DRILL-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dechang Gu resolved DRILL-4850. --- Resolution: Cannot Reproduce Fix Version/s: 1.10.0 Cannot repro in 1.10.0 (gitid: 3bfb497), hence close it. > TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run > -- > > Key: DRILL-4850 > URL: https://issues.apache.org/jira/browse/DRILL-4850 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.8.0 > Environment: REL 6.0 >Reporter: Dechang Gu >Assignee: Parth Chandra > Fix For: 1.10.0 > > > I run tpcds query 33 on SF100 database, 3 times consecutively. The first one > succeeded, but the 2nd and 3rd runs hit the following error: > > 2016-08-16 20:20:52,530 ucs-node6.perf.lab > [284c27f1-ee13-dfd0-6cbb-37f49810e93f:frag:3:9] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Failure while reading vector. Expected vector class of > org.apache.drill.exec.vector.NullableIntVector but was holding vector class > org.apache.drill.exec.vector.NullableBigIntVector, field= > i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > Fragment 3:9 > [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Failure while reading vector. 
Expected vector class > of org.apache.drill.exec.vector.NullableIntVector but was holding vector > class org.apache.drill.exec.vector.NullableBigIntVector, field= > i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > Fragment 3:9 > [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_65] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_65] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] > Caused by: java.lang.IllegalStateException: Failure while reading vector. 
> Expected vector class of org.apache.drill.exec.vector.NullableIntVector but > was holding vector class org.apache.drill.exec.vector.NullableBigIntVector, > field= i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > at > org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:290) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:178) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:135) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.doSetup(PartitionerTemplate.java:64) > ~[na:na] > at > org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:358) > ~[na:na] > at > org.apache.drill.exec.test.generated.PartitionerGen36655.flushOutgoingBatches(PartitionerTemplate.java:163) > ~[na:na] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$FlushBatchesHandlingClass.execute(PartitionerDecorator.java:266) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.executeMethodLogic(PartitionerDecorator.java:138) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.flushOutgoingBatches(PartitionerDecorator.java:82) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:
[jira] [Closed] (DRILL-4850) TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run
[ https://issues.apache.org/jira/browse/DRILL-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dechang Gu closed DRILL-4850. - Cannot repro in 1.10.0, so close it. > TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run > -- > > Key: DRILL-4850 > URL: https://issues.apache.org/jira/browse/DRILL-4850 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.8.0 > Environment: REL 6.0 >Reporter: Dechang Gu >Assignee: Parth Chandra > Fix For: 1.10.0 > > > I run tpcds query 33 on SF100 database, 3 times consecutively. The first one > succeeded, but the 2nd and 3rd runs hit the following error: > > 2016-08-16 20:20:52,530 ucs-node6.perf.lab > [284c27f1-ee13-dfd0-6cbb-37f49810e93f:frag:3:9] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: > Failure while reading vector. Expected vector class of > org.apache.drill.exec.vector.NullableIntVector but was holding vector class > org.apache.drill.exec.vector.NullableBigIntVector, field= > i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > Fragment 3:9 > [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Failure while reading vector. 
Expected vector class > of org.apache.drill.exec.vector.NullableIntVector but was holding vector > class org.apache.drill.exec.vector.NullableBigIntVector, field= > i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > Fragment 3:9 > [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262) > [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_65] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_65] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] > Caused by: java.lang.IllegalStateException: Failure while reading vector. 
> Expected vector class of org.apache.drill.exec.vector.NullableIntVector but > was holding vector class org.apache.drill.exec.vector.NullableBigIntVector, > field= i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), > i_manufact_id(BIGINT:OPTIONAL)] > at > org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:290) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:178) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:135) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.doSetup(PartitionerTemplate.java:64) > ~[na:na] > at > org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:358) > ~[na:na] > at > org.apache.drill.exec.test.generated.PartitionerGen36655.flushOutgoingBatches(PartitionerTemplate.java:163) > ~[na:na] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$FlushBatchesHandlingClass.execute(PartitionerDecorator.java:266) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.executeMethodLogic(PartitionerDecorator.java:138) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.flushOutgoingBatches(PartitionerDecorator.java:82) > ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.partitionsender.Pa
[jira] [Commented] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900247#comment-15900247 ] Dechang Gu commented on DRILL-4347: --- Check it with the current AD1.10.0 master (gitid 3dfb497), it takes >4 minutes for planning: DURATION: 05 min 54.007 sec PLANNING: 04 min 12.826 sec EXECUTION: 01 min 41.181 sec So someone need to chase the issue further > Planning time for query64 from TPCDS test suite has increased 10 times > compared to 1.4 release > -- > > Key: DRILL-4347 > URL: https://issues.apache.org/jira/browse/DRILL-4347 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.5.0 >Reporter: Victoria Markman >Assignee: Aman Sinha > Fix For: Future > > Attachments: 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0, > 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0, drill4347_jstack.txt > > > mapr-drill-1.5.0.201602012001-1.noarch.rpm > {code} > 0: jdbc:drill:schema=dfs> WITH cs_ui > . . . . . . . . . . . . > AS (SELECT cs_item_sk, > . . . . . . . . . . . . > Sum(cs_ext_list_price) AS sale, > . . . . . . . . . . . . > Sum(cr_refunded_cash + > cr_reversed_charge > . . . . . . . . . . . . > + cr_store_credit) AS refund > . . . . . . . . . . . . > FROM catalog_sales, > . . . . . . . . . . . . > catalog_returns > . . . . . . . . . . . . > WHERE cs_item_sk = cr_item_sk > . . . . . . . . . . . . > AND cs_order_number = > cr_order_number > . . . . . . . . . . . . > GROUP BY cs_item_sk > . . . . . . . . . . . . > HAVING Sum(cs_ext_list_price) > 2 * Sum( > . . . . . . . . . . . . > cr_refunded_cash + > cr_reversed_charge > . . . . . . . . . . . . > + cr_store_credit)), > . . . . . . . . . . . . > cross_sales > . . . . . . . . . . . . > AS (SELECT i_product_name product_name, > . . . . . . . . . . . . > i_item_sk item_sk, > . . . . . . . . . . . . > s_store_name store_name, > . . . . . . . . . . . . > s_zip store_zip, > . . . . . . . . . . . 
. > ad1.ca_street_number > b_street_number, > . . . . . . . . . . . . > ad1.ca_street_name > b_streen_name, > . . . . . . . . . . . . > ad1.ca_cityb_city, > . . . . . . . . . . . . > ad1.ca_zip b_zip, > . . . . . . . . . . . . > ad2.ca_street_number > c_street_number, > . . . . . . . . . . . . > ad2.ca_street_name > c_street_name, > . . . . . . . . . . . . > ad2.ca_cityc_city, > . . . . . . . . . . . . > ad2.ca_zip c_zip, > . . . . . . . . . . . . > d1.d_year AS syear, > . . . . . . . . . . . . > d2.d_year AS fsyear, > . . . . . . . . . . . . > d3.d_year s2year, > . . . . . . . . . . . . > Count(*) cnt, > . . . . . . . . . . . . > Sum(ss_wholesale_cost) s1, > . . . . . . . . . . . . > Sum(ss_list_price) s2, > . . . . . . . . . . . . > Sum(ss_coupon_amt) s3 > . . . . . . . . . . . . > FROM store_sales, > . . . . . . . . . . . . > store_returns, > . . . . . . . . . . . . > cs_ui, > . . . . . . . . . . . . > date_dim d1, > . . . . . . . . . . . . > date_dim d2, > . . . . . . . . . . . . > date_dim d3, > . . . . . . . . . . . . > store, > . . . . . . . . . . . . > customer, > . . . . . . . . . . . . > customer_demographics cd1, > . . . . . . . . . . . . > customer_demographics cd2, > . . . . . . . . . . . . > promotion, > . . . . . . . . . . . . > household_demographics hd1, > . . . . . . . . . . . . > household_demographics hd2, > . . . . . . . . . . . . > customer_address ad1, > . . . . . . . . . . . . > customer_address ad2, > . . . . . . . . . . . . > income_band ib1, > . . . . . . . . . . . . > income_band ib2, > . . . . . . . . . . . . > item > . . . . . . . . . . . . > WHERE ss_store_sk = s_store_sk > . . . . . . . . . . . . > AND ss_sold_date_sk = d1.d_date_sk > . . . . . . . . . . . . > AND ss_customer_sk = c_customer_sk > . . . . . . . . . . . . >
[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zelaine Fong updated DRILL-5326: Reviewer: Jinfeng Ni Assigned Reviewer to [~jni] > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT > it is disabled. > When I enabled this method (by way of upgrading drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in > the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900343#comment-15900343 ] ASF GitHub Bot commented on DRILL-5326: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/775#discussion_r104802660 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java --- @@ -76,7 +76,7 @@ .setReadOnly(false) .setGroupBySupport(GroupBySupport.GB_UNRELATED) .setLikeEscapeClauseSupported(true) - .setNullCollation(NullCollation.NC_AT_END) + .setNullCollation(NullCollation.NC_HIGH) --- End diff -- I'm not completely sure why we should change from NC_AT_END to NC_HIGH instead of keeping NC_AT_END. I thought Drill uses ASC as the default ordering, and NULLS LAST as the default null collation for ASC. This is consistent with what Oracle [1] and Postgres [2] do: ASC / NULLS LAST is the default option. 1. http://docs.oracle.com/javadb/10.6.2.1/ref/rrefsqlj13658.html 2. https://www.postgresql.org/docs/9.4/static/queries-order.html > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT > it is disabled. > When I enabled this method (by way of upgrading drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). 
> The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in > the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
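The difference between the two null collations debated in the review comment above can be sketched in plain Java. This is an illustration of the general "nulls sorted high" versus "nulls sorted at end" semantics (as in JDBC's DatabaseMetaData.nullsAreSortedHigh() and nullsAreSortedAtEnd()), not Drill's sort implementation; the NullCollation enum here is a hypothetical stand-in that only borrows the constant names from the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullCollationSketch {
    // Hypothetical stand-in; only the two constants from the diff are modeled.
    enum NullCollation { NC_HIGH, NC_AT_END }

    static Comparator<Integer> comparator(NullCollation collation, boolean ascending) {
        Comparator<Integer> natural = Comparator.naturalOrder();
        switch (collation) {
            case NC_HIGH: {
                // NULL behaves like the largest value, so reversing the sort
                // direction also moves NULLs from the end to the start.
                Comparator<Integer> nullsHigh = Comparator.nullsLast(natural);
                return ascending ? nullsHigh : nullsHigh.reversed();
            }
            case NC_AT_END: {
                // NULLs are placed last regardless of the sort direction.
                Comparator<Integer> values = ascending ? natural : natural.reversed();
                return Comparator.nullsLast(values);
            }
            default:
                throw new AssertionError(collation);
        }
    }

    static List<Integer> sorted(List<Integer> input, NullCollation collation, boolean ascending) {
        List<Integer> out = new ArrayList<>(input);
        out.sort(comparator(collation, ascending));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, null, 1, 2);
        for (NullCollation nc : NullCollation.values()) {
            System.out.println(nc + " ASC:  " + sorted(data, nc, true));
            System.out.println(nc + " DESC: " + sorted(data, nc, false));
        }
    }
}
```

Under these definitions the two settings agree for ascending sorts and differ only for descending ones, so observing ASC / NULLS LAST alone does not distinguish between them.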
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900354#comment-15900354 ] Jinfeng Ni commented on DRILL-5326: --- [~laurentgo], can you please take a look at this PR as well, since it's related to the change you made in DRILL-5301, and it's a blocking issue for Drill 1.10.0 RC0? Thanks. > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT > it is disabled. > When I enabled this method (by way of upgrading drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in > the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception
Chun Chang created DRILL-5327: - Summary: Hash aggregate can return empty batch which can cause schema change exception Key: DRILL-5327 URL: https://issues.apache.org/jira/browse/DRILL-5327 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.10.0 Reporter: Chun Chang Priority: Blocker Hash aggregate can return empty batches which cause drill to throw schema change exception (not handling this type of schema change). This is not a new bug. But a recent hash function change (a theoretically correct change) may have increased the chance of hitting this issue. I don't have scientific data to support my claim (in fact I don't believe it's the case), but a regular regression run used to pass fails now due to this bug. My concern is that existing drill users out there may have queries that used to work but fail now. It will be difficult to explain why the new release is better for them. I put this bug as blocker so we can discuss it before releasing 1.10. {noformat} /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql Query: -- start query 66 in stream 0 using template query66.tpl SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, ship_carriers, year1, Sum(jan_sales) AS jan_sales, Sum(feb_sales) AS feb_sales, Sum(mar_sales) AS mar_sales, Sum(apr_sales) AS apr_sales, Sum(may_sales) AS may_sales, Sum(jun_sales) AS jun_sales, Sum(jul_sales) AS jul_sales, Sum(aug_sales) AS aug_sales, Sum(sep_sales) AS sep_sales, Sum(oct_sales) AS oct_sales, Sum(nov_sales) AS nov_sales, Sum(dec_sales) AS dec_sales, Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, Sum(jun_sales / w_warehouse_sq_ft) AS 
jun_sales_per_sq_foot, Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, Sum(jan_net) AS jan_net, Sum(feb_net) AS feb_net, Sum(mar_net) AS mar_net, Sum(apr_net) AS apr_net, Sum(may_net) AS may_net, Sum(jun_net) AS jun_net, Sum(jul_net) AS jul_net, Sum(aug_net) AS aug_net, Sum(sep_net) AS sep_net, Sum(oct_net) AS oct_net, Sum(nov_net) AS nov_net, Sum(dec_net) AS dec_net FROM (SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, 'ZOUROS' || ',' || 'ZHOU' AS ship_carriers, d_yearAS year1, Sum(CASE WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jan_sales, Sum(CASE WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS feb_sales, Sum(CASE WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS mar_sales, Sum(CASE WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS apr_sales,
[jira] [Commented] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception
[ https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900357#comment-15900357 ] Chun Chang commented on DRILL-5327: --- This is related to DRILL-3991
[jira] [Assigned] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception
[ https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zelaine Fong reassigned DRILL-5327: --- Assignee: Boaz Ben-Zvi [~ben-zvi] - can you confirm that this is due to your changes for DRILL-5293.
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900366#comment-15900366 ]

ASF GitHub Bot commented on DRILL-5326:
---------------------------------------

Github user zfong commented on a diff in the pull request:

    https://github.com/apache/drill/pull/775#discussion_r104805550

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java ---
    @@ -76,7 +76,7 @@
           .setReadOnly(false)
           .setGroupBySupport(GroupBySupport.GB_UNRELATED)
           .setLikeEscapeClauseSupported(true)
    -      .setNullCollation(NullCollation.NC_AT_END)
    +      .setNullCollation(NullCollation.NC_HIGH)
    --- End diff --

    @jinfengni - See @vdiravka's comments in the Jira. The Drill documentation at https://drill.apache.org/docs/order-by-clause/#usage-notes says NULLs sort highest. If the doc is wrong, then we should fix the doc.

> Unit tests failures related to the SERVER_METADTA
> -------------------------------------------------
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.10.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Blocker
> Fix For: 1.10.0
>
> 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will
> support this method only from Drill version 1.10.0 onwards. For Drill
> 1.10.0-SNAPSHOT it is disabled.
> When I enabled this method (by upgrading the Drill version to 1.10.0 or
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in
> ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the
> default value should be NC_HIGH (NULL is the highest value).

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900378#comment-15900378 ]

ASF GitHub Bot commented on DRILL-5326:
---------------------------------------

Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/775#discussion_r104807540

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java ---
    @@ -76,7 +76,7 @@
           .setReadOnly(false)
           .setGroupBySupport(GroupBySupport.GB_UNRELATED)
           .setLikeEscapeClauseSupported(true)
    -      .setNullCollation(NullCollation.NC_AT_END)
    +      .setNullCollation(NullCollation.NC_HIGH)
    --- End diff --

    My understanding is NULL collation should be specified together with ASC/DESC. ASC/NULL LAST as the default option essentially implies NC_HIGH. However, I'm not sure if we just specify NC_HIGH alone, or ASC/NC_AT_END combined.
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900399#comment-15900399 ]

ASF GitHub Bot commented on DRILL-5326:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/775#discussion_r104809964

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java ---
    @@ -76,7 +76,7 @@
           .setReadOnly(false)
           .setGroupBySupport(GroupBySupport.GB_UNRELATED)
           .setLikeEscapeClauseSupported(true)
    -      .setNullCollation(NullCollation.NC_AT_END)
    +      .setNullCollation(NullCollation.NC_HIGH)
    --- End diff --

    When sorting in Drill, the detailed sort spec is set in the {{ExternalSort}} operator definition. This thing is a bit complex. One can control sort order (ASC, DESC) and nulls position (LOW, HIGH, UNSPECIFIED). Data sorts according to ASC, DESC. Nulls sort as follows:

    HIGH: last if ASC, first if DESC
    LOW: first if ASC, last if DESC
    UNSPECIFIED: always high

    If the planner has no way of setting the nulls ordering from a SQL query, then the value is UNSPECIFIED, which means nulls always sort last as Jinfeng explained.
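The HIGH and LOW rules listed in the comment above can be sketched as a small comparator. This is only an illustration under stated assumptions: the names `NullOrderingSketch`, `Direction` and `NullsPosition` are hypothetical and are not Drill's `ExternalSort` code, and UNSPECIFIED is assumed here to behave like HIGH ("always high"), which is the point the thread is still discussing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the null-ordering rules; not Drill code.
final class NullOrderingSketch {
  enum Direction { ASC, DESC }
  enum NullsPosition { HIGH, LOW, UNSPECIFIED }

  static Comparator<Integer> comparator(Direction dir, NullsPosition nulls) {
    // Assumption: UNSPECIFIED is treated like HIGH.
    final boolean nullIsHigh = nulls != NullsPosition.LOW;
    Comparator<Integer> ascending = (a, b) -> {
      if (a == null && b == null) return 0;
      if (a == null) return nullIsHigh ? 1 : -1;  // null compares above or below every value
      if (b == null) return nullIsHigh ? -1 : 1;
      return Integer.compare(a, b);
    };
    // DESC is the exact reverse of ASC, so a "high" null moves to the front.
    return dir == Direction.ASC ? ascending : ascending.reversed();
  }

  static List<Integer> sort(List<Integer> values, Direction dir, NullsPosition nulls) {
    List<Integer> copy = new ArrayList<>(values);
    copy.sort(comparator(dir, nulls));
    return copy;
  }

  public static void main(String[] args) {
    List<Integer> data = Arrays.asList(42, null, 116, 50, null);
    System.out.println(sort(data, Direction.ASC, NullsPosition.HIGH));  // [42, 50, 116, null, null]
    System.out.println(sort(data, Direction.DESC, NullsPosition.HIGH)); // [null, null, 116, 50, 42]
    System.out.println(sort(data, Direction.ASC, NullsPosition.LOW));   // [null, null, 42, 50, 116]
    System.out.println(sort(data, Direction.DESC, NullsPosition.LOW));  // [116, 50, 42, null, null]
  }
}
```

Under this reading, "nulls high" and "nulls at end" agree for ASC and differ only for DESC, which is why the choice between NC_HIGH and NC_AT_END matters.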
[jira] [Created] (DRILL-5328) Trim down physical plan size - replace StoragePluginConfig with storage name
Chunhui Shi created DRILL-5328:
----------------------------------

Summary: Trim down physical plan size - replace StoragePluginConfig with storage name
Key: DRILL-5328
URL: https://issues.apache.org/jira/browse/DRILL-5328
Project: Apache Drill
Issue Type: Improvement
Reporter: Chunhui Shi

For a physical plan, we currently pass the StoragePluginConfig as part of the plan; the destination then uses the config to fetch the storage plugin from StoragePluginRegistry. However, we could also fetch a storage plugin by its name, which is identical on all Drillbits.

In the example below, a simple physical plan of 150 lines, the storage plugin config takes 60 lines. In a typical large system, the FileSystem StoragePluginConfig could be more than 500 lines, so this improvement should save the cost of passing a larger physical plan among nodes.

0: jdbc:drill:zk=10.10.88.126:5181> explain plan for select * from dfs.tmp.employee1 where last_name='Blumberg';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        Project(T1¦¦*=[$0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($1, 'Blumberg')])
00-05              Project(T1¦¦*=[$0], last_name=[$1])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/employee1/0_0_0.parquet]], selectionRoot=/tmp/employee1, numFiles=1, usedMetadataFile=true, cacheFileRoot=/tmp/employee1, columns=[`*`]]])
| { "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ ], "queue" : 0, "resultMode" : "EXEC" }, "graph" : [ { "pop" : "parquet-scan", "@id" : 6, "userName" : "root", "entries" : [ { "path" : "/tmp/employee1/0_0_0.parquet" } ], "storage" : { "type" : "file", "enabled" : true, "connection" : "maprfs:///", "config" : null, "workspaces" : { "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null }, "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null }, "shi" : { "location" : "/user/shi", "writable" : true, "defaultInputFormat" : null }, 
"dir700" : { "location" : "/user/shi/dir700", "writable" : true, "defaultInputFormat" : null }, "dir775" : { "location" : "/user/shi/dir775", "writable" : true, "defaultInputFormat" : null }, "xyz" : { "location" : "/user/xyz", "writable" : true, "defaultInputFormat" : null } }, "formats" : { "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" }, "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," }, "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" }, "parquet" : { "type" : "parquet" }, "json" : { "type" : "json", "extensions" : [ "json" ] }, "maprdb" : { "type" : "maprdb" } } }, "format" : { "type" : "parquet" }, "columns" : [ "`*`" ], "selectionRoot" : "/tmp/employee1", "filter" : "true", "fileSet" : [ "/tmp/employee1/0_0_0.parquet" ], "files" : [ "/tmp/employee1/0_0_0.parquet" ], "cost" : 1155.0 }, { "pop" : "project", "@id" : 5, "exprs" : [ { "ref" : "`T1¦¦*`", "expr" : "`*`" }, { "ref" : "`last_name`", "expr" : "`last_name`" } ], "child" : 6, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 1155.0 }, { "pop" : "filter", "@id" : 4, "child" : 5, "expr" : "equal(`last_name`, 'Blumberg') ", "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 173.25 }, { "pop" : "selection-vector-remover", "@id" : 3, "child" : 4, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 173.25 }, { "pop" : "project", "@id" : 2, "exprs" : [ { "ref" : "`T1¦¦*`", "expr" : "`T1¦¦*`" } ], "child" : 3, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 173.25 }, { "pop" : "project", "@id" : 1, "exprs" : [ { "ref" : "`*`", "expr" : "`T1¦¦*`" } ], "child" : 2, "initialAllocation" : 100, "maxAllocation" : 100, "cost" : 173.25 }, { "pop" : "screen", "@id" : 0, "child" : 1, "initialAllocation" : 100, "maxAllocation" : 100, "co
[jira] [Commented] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception
[ https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900405#comment-15900405 ]

Boaz Ben-Zvi commented on DRILL-5327:
-------------------------------------

DRILL-5293 has only exposed this existing bug (e.g., seen before when MapR-DB storage was used). The underlying cause is a hardcoded decision to mark the schema of a schema-less empty batch as an INT, which conflicts with the existing varchar schema (probably `w_warehouse_name`). There were two rows/records distributed to one batch, and none to the second, which was thus empty. With a different hashing, the two rows were split among the two batches, hence none was empty.

Another familiar symptom: this bug is intermittent, reflecting the race between the batches. When the empty batch arrives first (at the Hash Aggr), there is no schema change when the second arrives, because we can change INT into VARCHAR.

Also the relation to DRILL-3991 is highly speculative; that other Jira has to do with coping with an actual schema change. There is some slight chance that by such coping we could overcome the empty batch problem, though it is not likely (e.g., once the schema was set to varchar, then comes an empty batch with our default as INT: can we force INT upon all those varchars?)
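The order dependence described in the comment above can be illustrated with a toy model. This is a sketch under stated assumptions, not Drill's hash aggregate code: the class `EmptyBatchSchemaSketch` and its methods are hypothetical, the column is assumed to be VARCHAR whenever a batch has rows, and an empty batch is assumed to report the hardcoded INT type.

```java
// Toy model of the empty-batch race; hypothetical names, not Drill code.
final class EmptyBatchSchemaSketch {
  enum ColumnType { INT, VARCHAR }

  // Assumption taken from the comment: a batch without rows has no real
  // schema and is reported with a hardcoded INT type.
  static ColumnType reportedType(int rowCount) {
    return rowCount == 0 ? ColumnType.INT : ColumnType.VARCHAR;
  }

  // Returns true if the downstream operator can consume the batches in this
  // order, false if it would hit a schema change.
  static boolean accepts(int... batchRowCounts) {
    ColumnType current = null;
    int rowsSeen = 0;
    for (int rowCount : batchRowCounts) {
      ColumnType incoming = reportedType(rowCount);
      // A type switch is harmless only while no rows have been consumed yet.
      if (current != null && current != incoming && rowsSeen > 0) {
        return false;
      }
      current = incoming;
      rowsSeen += rowCount;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(accepts(0, 2)); // empty batch first: true
    System.out.println(accepts(2, 0)); // empty batch second: false
    System.out.println(accepts(1, 1)); // rows split across batches: true
  }
}
```

The third case corresponds to the pre-DRILL-5293 hashing, where neither batch was empty and the problem stayed hidden.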
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900411#comment-15900411 ]

Vitalii Diravka commented on DRILL-5326:
----------------------------------------

Example of the row order sorting with nulls:
{code}
0: jdbc:drill:zk=local> select a from dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet` limit 5;
+--------+
|   a    |
+--------+
| null   |
| null   |
| 67985  |
| null   |
| null   |
+--------+
5 rows selected (0.153 seconds)
0: jdbc:drill:zk=local> select a from dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet` order by `a` limit 5;
+------+
|  a   |
+------+
| 42   |
| 50   |
| 95   |
| 116  |
| 116  |
+------+
5 rows selected (0.248 seconds)
0: jdbc:drill:zk=local> select a from dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet` order by `a` DESC limit 5;
+-------+
|   a   |
+-------+
| null  |
| null  |
| null  |
| null  |
| null  |
+-------+
5 rows selected (0.247 seconds)
{code}
[jira] [Commented] (DRILL-5328) Trim down physical plan size - replace StoragePluginConfig with storage name
[ https://issues.apache.org/jira/browse/DRILL-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900415#comment-15900415 ]

Paul Rogers commented on DRILL-5328:
------------------------------------

Careful not to introduce race conditions.

* Submit plan onto Drillbit A, with one storage plugin config.
* At the same time, alter the storage plugin on Drillbit B.

The query is planned, and will execute, on Drillbit A based on the old config. The query will execute on Drillbit B with the new config. Eventually, Drillbit A will learn of the changed config, but it takes time. This creates a race condition between the query submission and the plugin updates.

This issue is very similar to the issue that arises in Dynamic UDFs. We've had trouble getting the design right. We plan to move to an MVCC model to finally resolve all the race conditions.

MVCC could be used here as well. Store storage plugin configs as versions. Now, the above race condition is resolved:

* Query is planned on Drillbit A using version 17 of, say, "dfs."
* User modifies plugins on Drillbit B to create version 18.
* When the query executes on Drillbit B, it uses (dfs, 17), and so uses the same version of the information with which it was planned.
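The (name, version) lookup suggested in the comment above could look roughly like the following. This is a minimal in-process sketch, assuming configs are plain strings; `VersionedPluginConfigs` and its methods are hypothetical and are not the `StoragePluginRegistry` API, and distributing versions across Drillbits is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical versioned config store; not Drill's StoragePluginRegistry.
final class VersionedPluginConfigs {
  // plugin name -> (version number -> config text)
  private final Map<String, NavigableMap<Integer, String>> byName = new HashMap<>();

  // Stores the config as a new version; older versions remain readable.
  synchronized int update(String name, String config) {
    NavigableMap<Integer, String> versions = byName.get(name);
    if (versions == null) {
      versions = new TreeMap<>();
      byName.put(name, versions);
    }
    int next = versions.isEmpty() ? 1 : versions.lastKey() + 1;
    versions.put(next, config);
    return next;
  }

  // The planner would record this number in the plan next to the plugin name.
  synchronized int latestVersion(String name) {
    NavigableMap<Integer, String> versions = byName.get(name);
    if (versions == null) {
      throw new IllegalArgumentException("unknown plugin: " + name);
    }
    return versions.lastKey();
  }

  // Execution resolves exactly the version the plan was built with.
  synchronized String get(String name, int version) {
    NavigableMap<Integer, String> versions = byName.get(name);
    String config = versions == null ? null : versions.get(version);
    if (config == null) {
      throw new IllegalArgumentException("unknown config: (" + name + ", " + version + ")");
    }
    return config;
  }

  public static void main(String[] args) {
    VersionedPluginConfigs configs = new VersionedPluginConfigs();
    int planned = configs.update("dfs", "connection=old");
    configs.update("dfs", "connection=new");            // concurrent edit elsewhere
    System.out.println(configs.get("dfs", planned));    // connection=old
    System.out.println(configs.latestVersion("dfs"));   // 2
  }
}
```

With this shape the plan only needs to carry the plugin name and a version number instead of the full config, which is the size reduction the issue asks for.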
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900414#comment-15900414 ]

ASF GitHub Bot commented on DRILL-5326:
---------------------------------------

Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/775#discussion_r104812618

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java ---
    @@ -76,7 +76,7 @@
           .setReadOnly(false)
           .setGroupBySupport(GroupBySupport.GB_UNRELATED)
           .setLikeEscapeClauseSupported(true)
    -      .setNullCollation(NullCollation.NC_AT_END)
    +      .setNullCollation(NullCollation.NC_HIGH)
    --- End diff --

    @jinfengni Agree with Paul. The difference between NC_HIGH and NC_AT_END shows up in the DESC case. I checked that Drill uses NC_HIGH for sorting. Please see the example in [jira](https://issues.apache.org/jira/browse/DRILL-5326?focusedCommentId=15900411&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15900411), since GitHub doesn't show formatted code correctly in comments.
[jira] [Updated] (DRILL-5089) Skip initializing all enabled storage plugins for every query
[ https://issues.apache.org/jira/browse/DRILL-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Girish updated DRILL-5089:
-----------------------------------
Priority: Critical (was: Major)

> Skip initializing all enabled storage plugins for every query
> -------------------------------------------------------------
>
> Key: DRILL-5089
> URL: https://issues.apache.org/jira/browse/DRILL-5089
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.9.0
> Reporter: Abhishek Girish
> Assignee: Chunhui Shi
> Priority: Critical
>
> In a query's lifecycle, an attempt is made to initialize each enabled storage
> plugin while building the schema tree. This is done regardless of the actual
> plugins involved within a query.
> Sometimes, when one or more of the enabled storage plugins have issues
> (either due to misconfiguration, or because the underlying datasource is slow
> or down), the overall query time increases drastically, most likely due to
> the attempt being made to register schemas from a faulty plugin.
> For example, when a jdbc plugin is configured with SQL Server, and at one
> point the underlying SQL Server db goes down, any Drill query starting to
> execute at that point and beyond begins to slow down drastically.
> We must skip registering unrelated schemas (& workspaces) for a query.
[jira] [Commented] (DRILL-5303) Drillbits fail to start when Drill server built with JDK 8 is deployed on a JDK 7 environment
[ https://issues.apache.org/jira/browse/DRILL-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900416#comment-15900416 ] Abhishek Girish commented on DRILL-5303: Cool, thanks [~laurentgo] > Drillbits fail to start when Drill server built with JDK 8 is deployed on a > JDK 7 environment > - > > Key: DRILL-5303 > URL: https://issues.apache.org/jira/browse/DRILL-5303 > Project: Apache Drill > Issue Type: Bug > Components: Server, Tools, Build & Test >Affects Versions: 1.9.0 >Reporter: Abhishek Girish > > When Drill is built on a node configured with JDK 8 and is then deployed in a > JDK 7 environment, Drillbits fail to start and the following errors are seen > in Drillbit.out: > {code} > Exception in thread "main" java.lang.NoSuchMethodError: > java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView; > at > org.apache.drill.exec.coord.ClusterCoordinator.drillbitRegistered(ClusterCoordinator.java:85) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.updateEndpoints(ZKClusterCoordinator.java:266) > at > org.apache.drill.exec.coord.zk.ZKClusterCoordinator.start(ZKClusterCoordinator.java:135) > at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:117) > at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:292) > at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:272) > at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:268) > {code} > Workaround is to match the Java versions of build and deployment environments. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
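For background on the NoSuchMethodError above: in JDK 8, ConcurrentHashMap.keySet() returns the new KeySetView type, so a call compiled against JDK 8 through a ConcurrentHashMap-typed variable is bound to a method signature that JDK 7 does not have. A common source-level way to avoid that binding is to declare the variable by its interface type, so the call resolves to Map.keySet(). The sketch below illustrates that idea only; EndpointRegistry is a hypothetical class, and this is not necessarily the change made in Drill (the workaround stated in the issue is to match the Java versions).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: the field is declared as ConcurrentMap, not
// ConcurrentHashMap, so keySet() is compiled as a call to the interface
// method returning Set, which exists on both JDK 7 and JDK 8.
final class EndpointRegistry {
  private final ConcurrentMap<String, Boolean> endpoints =
      new ConcurrentHashMap<String, Boolean>();

  void register(String endpoint) {
    endpoints.put(endpoint, Boolean.TRUE);
  }

  Set<String> registered() {
    return endpoints.keySet();
  }
}
```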
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900421#comment-15900421 ] ASF GitHub Bot commented on DRILL-5326: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/775#discussion_r104813282 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java --- @@ -76,7 +76,7 @@ .setReadOnly(false) .setGroupBySupport(GroupBySupport.GB_UNRELATED) .setLikeEscapeClauseSupported(true) - .setNullCollation(NullCollation.NC_AT_END) + .setNullCollation(NullCollation.NC_HIGH) --- End diff -- I'm not sure if the last sentence is correct. If the user specifies DESC and the null collation is unspecified, then nulls should sort first, no (if always HIGH)? NC_HIGH then seems to be the correct value (unless the default can be changed in the planner configuration). > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from Drill version 1.10.0. For Drill 1.10.0-SNAPSHOT > it is disabled. > When I enabled this method (by way of upgrading the Drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in > ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was > added in DRILL-1126.) 
> The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-3365) Query with window function on large dataset fails with "IOException: Mkdirs failed to create spill directory"
[ https://issues.apache.org/jira/browse/DRILL-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-3365: --- Priority: Major (was: Minor) > Query with window function on large dataset fails with "IOException: Mkdirs > failed to create spill directory" > - > > Key: DRILL-3365 > URL: https://issues.apache.org/jira/browse/DRILL-3365 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.1.0 >Reporter: Abhishek Girish > Fix For: Future > > > Dataset: TPC-DS SF100 Parquet > Query: > {code:sql} > SELECT sum(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY > ss.ss_customer_sk) AS PartialSum FROM store_sales ss GROUP BY > ss.ss_net_paid_inc_tax, ss.ss_store_sk, ss.ss_customer_sk LIMIT 20; > java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: > java.io.IOException: Mkdirs failed to create > /tmp/drill/spill/2a74ac18-0679-ab99-26c6-af41b9af7f4e/major_fragment_1/minor_fragment_17/operator_4 > (exists=false, cwd=file:///opt/mapr/drill/drill-1.1.0/bin) > Fragment 1:17 > [Error Id: 4905b400-fc0f-4287-beba-d1ca18359986 on abhi5.qa.lab:31010] > at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73) > at > sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85) > at sqlline.TableOutputFormat.print(TableOutputFormat.java:116) > at sqlline.SqlLine.print(SqlLine.java:1583) > at sqlline.Commands.execute(Commands.java:852) > at sqlline.Commands.sql(Commands.java:751) > at sqlline.SqlLine.dispatch(SqlLine.java:738) > at sqlline.SqlLine.begin(SqlLine.java:612) > at sqlline.SqlLine.start(SqlLine.java:366) > at sqlline.SqlLine.main(SqlLine.java:259) > {code} > Was unable to find corresponding logs. This was consistently seen via JDBC > program and sqlline. > After I restarted Drillbits the issue seems to have been resolved. But wanted > to report this anyway. 
Possible explanation is DRILL-2917 (one or more > drillbits were in an inconsistent state)
[jira] [Created] (DRILL-5329) External sort does not support TinyInt type
Paul Rogers created DRILL-5329: -- Summary: External sort does not support TinyInt type Key: DRILL-5329 URL: https://issues.apache.org/jira/browse/DRILL-5329 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Paul Rogers A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (DRILL-3111) Drill UI should support fast schema return and streaming results
[ https://issues.apache.org/jira/browse/DRILL-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish resolved DRILL-3111. Resolution: Fixed Fix Version/s: (was: Future) > Drill UI should support fast schema return and streaming results > > > Key: DRILL-3111 > URL: https://issues.apache.org/jira/browse/DRILL-3111 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.0.0 >Reporter: Abhishek Girish > > On sqlline, a query which returns several hundred rows, supports fast schema > return and streams results as they are fetched. > Drill UI doesn't support either of these. It waits until all results are > fetched and displays them at once. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (DRILL-3111) Drill UI should support fast schema return and streaming results
[ https://issues.apache.org/jira/browse/DRILL-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish closed DRILL-3111. -- > Drill UI should support fast schema return and streaming results > > > Key: DRILL-3111 > URL: https://issues.apache.org/jira/browse/DRILL-3111 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.0.0 >Reporter: Abhishek Girish > > On sqlline, a query which returns several hundred rows, supports fast schema > return and streams results as they are fetched. > Drill UI doesn't support either of these. It waits until all results are > fetched and displays them at once. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5329) External sort does not support TinyInt type
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900431#comment-15900431 ] Paul Rogers commented on DRILL-5329: Failure for TinyInt: {code} ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access remote function registry [registry] java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119) ~[classes/:na] at org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.syncWithRemoteRegistry(FunctionImplementationRegistry.java:320) [classes/:na] at org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.findDrillFunction(FunctionImplementationRegistry.java:164) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:352) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:1) [classes/:na] at org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:131) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:106) [classes/:na] at org.apache.drill.exec.expr.fn.FunctionGenerationHelper.getOrderingComparator(FunctionGenerationHelper.java:84) [classes/:na] {code} > External sort does not support TinyInt type > --- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. 
> The following types fail: > * TinyInt
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Summary: External sort does not support "obscure" numeric types (was: External sort does not support TinyInt type) > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-5330) NPE in FunctionImplementationRegistry.functionReplacement()
Paul Rogers created DRILL-5330: -- Summary: NPE in FunctionImplementationRegistry.functionReplacement() Key: DRILL-5330 URL: https://issues.apache.org/jira/browse/DRILL-5330 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.11.0 The code in {{FunctionImplementationRegistry.functionReplacement()}} will produce an NPE if ever it is called: {code} if (optionManager != null && optionManager.getOption( ExecConstants.CAST_TO_NULLABLE_NUMERIC).bool_val ... {code} If an option manager is provided, then get the specified option. The option manager will contain a value for that option only if the user has explicitly set that option. Suppose the user had not set the option. Then the return from {{getOption()}} will be null. The next thing we do is *assume* that the option exists and is a boolean by dereferencing the option. This will trigger an NPE. This NPE was seen in detail-level unit tests. The proper way to handle such options is to use an option validator. Strangely, one actually exists in {{ExecConstants}}: {code} String CAST_TO_NULLABLE_NUMERIC = "drill.exec.functions.cast_empty_string_to_null"; OptionValidator CAST_TO_NULLABLE_NUMERIC_OPTION = new BooleanValidator(CAST_TO_NULLABLE_NUMERIC, false); {code} Then do: {code} optionManager.getOption( ExecConstants.CAST_TO_NULLABLE_NUMERIC_OPTION) {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
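The difference between the two lookups described in DRILL-5330 can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: Options and BoolOption are hypothetical stand-ins for Drill's option manager and BooleanValidator, not the real classes; only the option name string is taken from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

final class OptionLookupSketch {
  // Hypothetical stand-in for a boolean validator: option name plus default.
  static final class BoolOption {
    final String name;
    final boolean defaultValue;

    BoolOption(String name, boolean defaultValue) {
      this.name = name;
      this.defaultValue = defaultValue;
    }
  }

  // Hypothetical stand-in for an option manager that stores only the
  // values a user has explicitly set.
  static final class Options {
    private final Map<String, Boolean> explicit = new HashMap<>();

    void set(String name, boolean value) {
      explicit.put(name, value);
    }

    // Name-based lookup: returns null when the option was never set, so
    // dereferencing the result unconditionally can throw an NPE.
    Boolean getRaw(String name) {
      return explicit.get(name);
    }

    // Validator-based lookup: falls back to the validator's default.
    boolean get(BoolOption option) {
      Boolean value = explicit.get(option.name);
      return value != null ? value : option.defaultValue;
    }
  }

  static final BoolOption CAST_EMPTY_STRING_TO_NULL =
      new BoolOption("drill.exec.functions.cast_empty_string_to_null", false);
}
```

The design point is that the default lives with the option definition, so callers never see a missing value.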
[jira] [Commented] (DRILL-5330) NPE in FunctionImplementationRegistry.functionReplacement()
[ https://issues.apache.org/jira/browse/DRILL-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900451#comment-15900451 ] Paul Rogers commented on DRILL-5330: To fix this, {{ExecConstants}} must change: {code} BooleanValidator CAST_TO_NULLABLE_NUMERIC_OPTION = new BooleanValidator(CAST_TO_NULLABLE_NUMERIC, false); {code} The {{BooleanValidator}} is required to select the correct overloaded method in the option manager. (This may be why the original author didn't use the validator in the first place...) > NPE in FunctionImplementationRegistry.functionReplacement() > --- > > Key: DRILL-5330 > URL: https://issues.apache.org/jira/browse/DRILL-5330 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.11.0 > > > The code in {{FunctionImplementationRegistry.functionReplacement()}} will > produce an NPE if ever it is called: > {code} > if (optionManager != null > && optionManager.getOption( >ExecConstants.CAST_TO_NULLABLE_NUMERIC).bool_val > ... > {code} > If an option manager is provided, then get the specified option. The option > manager will contain a value for that option only if the user has explicitly > set that option. Suppose the user had not set the option. Then the return > from {{getOption()}} will be null. > The next thing we do is *assume* that the option exists and is a boolean by > dereferencing the option. This will trigger an NPE. This NPE was seen in > detail-level unit tests. > The proper way to handle such options is to use an option validator. 
> Strangely, one actually exists in {{ExecConstants}}: > {code} > String CAST_TO_NULLABLE_NUMERIC = > "drill.exec.functions.cast_empty_string_to_null"; > OptionValidator CAST_TO_NULLABLE_NUMERIC_OPTION = new > BooleanValidator(CAST_TO_NULLABLE_NUMERIC, false); > {code} > Then do: > {code} > optionManager.getOption( > ExecConstants.CAST_TO_NULLABLE_NUMERIC_OPTION) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900452#comment-15900452 ] ASF GitHub Bot commented on DRILL-5326: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/775#discussion_r104816830 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java --- @@ -76,7 +76,7 @@ .setReadOnly(false) .setGroupBySupport(GroupBySupport.GB_UNRELATED) .setLikeEscapeClauseSupported(true) - .setNullCollation(NullCollation.NC_AT_END) + .setNullCollation(NullCollation.NC_HIGH) --- End diff -- If only the NC is specified here (no sort order), then NC_HIGH looks more reasonable. The change looks fine to me. > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from Drill version 1.10.0. For Drill 1.10.0-SNAPSHOT > it is disabled. > When I enabled this method (by way of upgrading the Drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in > ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was > added in DRILL-1126.) > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. 
After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
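The "NULL is the highest value" reading discussed in this thread can be illustrated with plain java.util comparators: if NULL is treated as the highest value, an ascending sort puts nulls last and a descending sort puts them first. This is a sketch of that semantics only, not Drill's sort implementation; NullsHighSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class NullsHighSketch {
  // Ascending order with NULL treated as greater than every value.
  private static final Comparator<Integer> ASC_NULLS_HIGH =
      Comparator.nullsLast(Comparator.<Integer>naturalOrder());

  // Returns a sorted copy; nulls end up last.
  static List<Integer> sortAsc(List<Integer> values) {
    List<Integer> copy = new ArrayList<>(values);
    copy.sort(ASC_NULLS_HIGH);
    return copy;
  }

  // Returns a sorted copy in descending order; because NULL is the highest
  // value, reversing the comparator moves nulls to the front.
  static List<Integer> sortDesc(List<Integer> values) {
    List<Integer> copy = new ArrayList<>(values);
    copy.sort(ASC_NULLS_HIGH.reversed());
    return copy;
  }
}
```

For the input [3, null, 1], sortAsc yields [1, 3, null] and sortDesc yields [null, 3, 1], which matches the point raised in the review comment that DESC with an unspecified null collation should sort nulls first.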
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900460#comment-15900460 ] ASF GitHub Bot commented on DRILL-5326: --- Github user vdiravka commented on the issue: https://github.com/apache/drill/pull/775 @laurentgo I changed the names for the server meta method. > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from Drill version 1.10.0. For Drill 1.10.0-SNAPSHOT > it is disabled. > When I enabled this method (by way of upgrading the Drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in > ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was > added in DRILL-1126.) > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value).
[jira] [Created] (DRILL-5331) NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs disabled
Paul Rogers created DRILL-5331: -- Summary: NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs disabled Key: DRILL-5331 URL: https://issues.apache.org/jira/browse/DRILL-5331 Project: Apache Drill Issue Type: Bug Affects Versions: 1.10.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.11.0 Drill provides the Dynamic UDF (DUDF) functionality. DUDFs can be disabled using the following option in {{ExecConstants}}: {code} String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic"; BooleanValidator USE_DYNAMIC_UDFS = new BooleanValidator(USE_DYNAMIC_UDFS_KEY, true); {code} In a unit test, we created a setup in which we wish to use only the local function registry; no DUDF support is needed. Run the code. The following code is invoked when asking for a non-existent function: {code} public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, FunctionCall functionCall) { ... if (holder == null) { syncWithRemoteRegistry(version.get()); List updatedFunctions = localFunctionRegistry.getMethods(newFunctionName, version); holder = functionResolver.getBestMatch(updatedFunctions, functionCall); } {code} The result is an NPE: {code} ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access remote function registry [registry] java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119) ~[classes/:na] {code} The fix is simply to add a DUDF-enabled check: {code} if (holder == null) { boolean useDynamicUdfs = optionManager != null && optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS); if (useDynamicUdfs) { syncWithRemoteRegistry(version.get()); ... {code}
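The guard proposed in DRILL-5331 can be exercised in isolation with a small model. The sketch below is an assumption-laden illustration, not Drill's FunctionImplementationRegistry: FunctionLookupSketch, RemoteRegistry and the string-valued "holders" are hypothetical, and the remote registry is deliberately allowed to be null to mimic a test setup without DUDF support.

```java
import java.util.HashMap;
import java.util.Map;

final class FunctionLookupSketch {
  // Hypothetical stand-in for the remote (dynamic UDF) registry.
  interface RemoteRegistry {
    void syncInto(Map<String, String> local);
  }

  private final Map<String, String> local = new HashMap<>();
  private final boolean useDynamicUdfs;
  private final RemoteRegistry remote; // may be null when DUDFs are not set up

  FunctionLookupSketch(boolean useDynamicUdfs, RemoteRegistry remote) {
    this.useDynamicUdfs = useDynamicUdfs;
    this.remote = remote;
  }

  // Looks up a function locally; only consults the remote registry when
  // dynamic UDFs are enabled, which is the check the issue proposes.
  String find(String name) {
    String holder = local.get(name);
    if (holder == null && useDynamicUdfs) {
      remote.syncInto(local);
      holder = local.get(name);
    }
    return holder;
  }
}
```

With the flag off, a lookup for a non-existent function simply returns null instead of touching the missing remote registry.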
[jira] [Updated] (DRILL-5331) NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs disabled
[ https://issues.apache.org/jira/browse/DRILL-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5331: --- Description: Drill provides the Dynamic UDF (DUDF) functionality. DUDFs can be disabled using the following option in {{ExecConstants}}: {code} String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic"; BooleanValidator USE_DYNAMIC_UDFS = new BooleanValidator(USE_DYNAMIC_UDFS_KEY, true); {code} In a unit test, we created a setup in which we wish to use only the local function registry; no DUDF support is needed. Run the code. The following code is invoked when asking for a non-existent function: {code} public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, FunctionCall functionCall) { ... if (holder == null) { syncWithRemoteRegistry(version.get()); List updatedFunctions = localFunctionRegistry.getMethods(newFunctionName, version); holder = functionResolver.getBestMatch(updatedFunctions, functionCall); } {code} The result is an NPE: {code} ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access remote function registry [registry] java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119) ~[classes/:na] {code} The fix is simply to add a DUDF-enabled check: {code} if (holder == null) { boolean useDynamicUdfs = optionManager != null && optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS); if (useDynamicUdfs) { syncWithRemoteRegistry(version.get()); ... {code} Then, disable dynamic UDFs for the test case by setting {{ExecConstants.USE_DYNAMIC_UDFS}} to false. was: Drill provides the Dynamic UDF (DUDF) functionality. 
DUDFs can be disabled using the following option in {{ExecConstants}}: {code} String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic"; BooleanValidator USE_DYNAMIC_UDFS = new BooleanValidator(USE_DYNAMIC_UDFS_KEY, true); {code} In a unit test, we created a setup in which we wish to use only the local function registry; no DUDF support is needed. Run the code. The following code is invoked when asking for a non-existent function: {code} public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, FunctionCall functionCall) { ... if (holder == null) { syncWithRemoteRegistry(version.get()); List updatedFunctions = localFunctionRegistry.getMethods(newFunctionName, version); holder = functionResolver.getBestMatch(updatedFunctions, functionCall); } {code} The result is an NPE: {code} ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access remote function registry [registry] java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119) ~[classes/:na] {code} The fix is simply to add a DUDF-enabled check: {code} if (holder == null) { boolean useDynamicUdfs = optionManager != null && optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS); if (useDynamicUdfs) { syncWithRemoteRegistry(version.get()); ... {code} > NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs > disabled > -- > > Key: DRILL-5331 > URL: https://issues.apache.org/jira/browse/DRILL-5331 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.11.0 > > > Drill provides the Dynamic UDF (DUDF) functionality. 
DUDFs can be disabled > using the following option in {{ExecConstants}}: > {code} > String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic"; > BooleanValidator USE_DYNAMIC_UDFS = new > BooleanValidator(USE_DYNAMIC_UDFS_KEY, true); > {code} > In a unit test, we created a setup in which we wish to use only the local > function registry; no DUDF support is needed. Run the code. The following > code is invoked when asking for a non-existent function: > {code} > public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, > FunctionCall functionCall) { > ... > if (holder == null) { > syncWithRemoteRegistry(version.get()); > List updatedFunctions = > localFunctionRegistry.getMethods(newFunctionName, version); > holder = functionResolver.getBestMatch(updatedFunctions, functionCall); > } > {code} > The result is an NPE: > {code} > ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access > remote function registry [registry] > java.lang.NullPointerException: null > at > org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900476#comment-15900476 ] ASF GitHub Bot commented on DRILL-5326: --- Github user laurentgo commented on the issue: https://github.com/apache/drill/pull/775 LGTM > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from Drill version 1.10.0. For Drill 1.10.0-SNAPSHOT > it is disabled. > When I enabled this method (by way of upgrading the Drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The cause is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in > ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was > added in DRILL-1126.) > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first one the mentioned above test still fails by reason > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default val should be NC_HIGH (NULL is the highest value).
[jira] [Issue Comment Deleted] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Comment: was deleted (was: Failure for TinyInt: {code} ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access remote function registry [registry] java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119) ~[classes/:na] at org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.syncWithRemoteRegistry(FunctionImplementationRegistry.java:320) [classes/:na] at org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.findDrillFunction(FunctionImplementationRegistry.java:164) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:352) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:1) [classes/:na] at org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:131) [classes/:na] at org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:106) [classes/:na] at org.apache.drill.exec.expr.fn.FunctionGenerationHelper.getOrderingComparator(FunctionGenerationHelper.java:84) [classes/:na] {code} ) > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. 
> The following types fail: > * TinyInt > * UInt1 > The failure manifests in one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) > * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20 pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation.
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > The failure manifests in one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) 
> * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 The failure manifests on one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) 
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20-pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > The failure manifests on one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) > * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > By contrast, the (working) Int type produces the correct results: > {code} > #, row #, key, value > 0(3): 0, "3" > 1(10): 1, "10" > 2(17): 2, "17" > 3(4): 3, "4" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. 
> A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 The types that work include: * Int The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 The failure manifests on one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) 
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20-pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > * SmallInt > * UInt2 > * UInt4 > The types that work include: > * Int > The failure manifests on one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) 
> * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > By contrast, the (working) Int type produces the correct results: > {code} > #, row #, key, value > 0(3): 0, "3" > 1(10): 1, "10" > 2(17): 2, "17" > 3(4): 3, "4" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA
[ https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900509#comment-15900509 ] ASF GitHub Bot commented on DRILL-5326: --- Github user jinfengni commented on the issue: https://github.com/apache/drill/pull/775 @laurentgo , @vdiravka , thanks. I'll run regression & merge. +1 > Unit tests failures related to the SERVER_METADTA > - > > Key: DRILL-5326 > URL: https://issues.apache.org/jira/browse/DRILL-5326 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Affects Versions: 1.10.0 >Reporter: Vitalii Diravka >Assignee: Vitalii Diravka >Priority: Blocker > Fix For: 1.10.0 > > > 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will > support this method only from Drill version 1.10.0 onward. For Drill 1.10.0-SNAPSHOT > it is disabled. > When I enabled this method (by upgrading the Drill version to 1.10.0 or > 1.11.0-SNAPSHOT) I found the following exception: > {code} > java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT > {code} > It appears in several tests (for example in > DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh). > The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in > ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was > added in DRILL-1126). > The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name > for this "GENERIC_OBJECT" RPC-/protobuf-level data type. > 2. After fixing the first issue, the above-mentioned test still fails because > of the incorrect "NullCollation" value in the "ServerMetaProvider". According > to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the > default value should be NC_HIGH (NULL is the highest value). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 * UInt8 * Var16Char The types that work include: * Int * BigInt * Float4 * Float8 * VarChar The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 The types that work include: * Int The failure manifests on one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) 
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20-pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > * SmallInt > * UInt2 > * UInt4 > * UInt8 > * Var16Char > The types that work include: > * Int > * BigInt > * Float4 > * Float8 > * VarChar > The failure manifests on one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) 
> * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > By contrast, the (working) Int type produces the correct results: > {code} > #, row #, key, value > 0(3): 0, "3" > 1(10): 1, "10" > 2(17): 2, "17" > 3(4): 3, "4" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 * UInt8 * Var16Char * VarBinary The types that work include: * Int * BigInt * Float4 * Float8 * VarChar The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 * UInt8 * Var16Char The types that work include: * Int * BigInt * Float4 * Float8 * VarChar The failure manifests on one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) 
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20-pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > * SmallInt > * UInt2 > * UInt4 > * UInt8 > * Var16Char > * VarBinary > The types that work include: > * Int > * BigInt > * Float4 > * Float8 > * VarChar > The failure manifests on one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) 
> * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > By contrast, the (working) Int type produces the correct results: > {code} > #, row #, key, value > 0(3): 0, "3" > 1(10): 1, "10" > 2(17): 2, "17" > 3(4): 3, "4" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
[ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5329: --- Description: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 * UInt8 * Var16Char The types that work include: * Int * BigInt * Float4 * Float8 * VarChar * VarBinary The failure manifests in one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) * If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20 pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. was: A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type. The following types fail: * TinyInt * UInt1 * SmallInt * UInt2 * UInt4 * UInt8 * Var16Char * VarBinary The types that work include: * Int * BigInt * Float4 * Float8 * VarChar The failure manifests on one of two ways: * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.) 
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting in the sort not actually being done: Sorting a set of 20-pseudo-random rows produces the following output: {code} #, row #, key, value 0(0): 11, "0" 1(1): 14, "1" 2(2): 17, "2" 3(3): 0, "3" {code} By contrast, the (working) Int type produces the correct results: {code} #, row #, key, value 0(3): 0, "3" 1(10): 1, "10" 2(17): 2, "17" 3(4): 3, "4" {code} The first number is the row index, the second is the row pointed to by the sv2 (which should be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field. A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation. > External sort does not support "obscure" numeric types > -- > > Key: DRILL-5329 > URL: https://issues.apache.org/jira/browse/DRILL-5329 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers > > A unit test was created to exercise the "Sorter" mechanism within the > External Sort, which is used to sort each incoming batch. The sorter was > tested with each Drill data type. > The following types fail: > * TinyInt > * UInt1 > * SmallInt > * UInt2 > * UInt4 > * UInt8 > * Var16Char > The types that work include: > * Int > * BigInt > * Float4 > * Float8 > * VarChar > * VarBinary > The failure manifests on one of two ways: > * If dynamic UDFs are enabled, the query crashes with an NPE. (See > DRILL-5331.) 
> * If dynamic UDFs are disabled, the generated code silently skips the > comparison step, resulting in the sort not actually being done: > Sorting a set of 20-pseudo-random rows produces the following output: > {code} > #, row #, key, value > 0(0): 11, "0" > 1(1): 14, "1" > 2(2): 17, "2" > 3(3): 0, "3" > {code} > By contrast, the (working) Int type produces the correct results: > {code} > #, row #, key, value > 0(3): 0, "3" > 1(10): 1, "10" > 2(17): 2, "17" > 3(4): 3, "4" > {code} > The first number is the row index, the second is the row pointed to by the > sv2 (which should be written to create sort order). Sort was done ASC, > NULLS_HIGH, by the key field. > A strong concern here is that there is no error or other warning to the user > that Drill cannot sort this type; Drill just silently declines to perform the > operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
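As an aid to reading the listings in the report, the sketch below walks an indirection vector like the sv2 and checks whether the keys it points to come out in ascending order. The class and method names are hypothetical, plain int arrays stand in for Drill's vectors, and only the four failing rows shown in the report are used; the "repaired" sv2 is an assumption about what a working sort would write for those four rows alone.

```java
// Hypothetical helper, not Drill code: plain arrays stand in for the key
// vector and for the sv2 (selection vector) that records the sort order.
final class Sv2OrderCheck {

    // Returns true if reading the keys through sv2 yields ascending order.
    static boolean isSortedAscending(int[] keys, int[] sv2) {
        for (int i = 1; i < sv2.length; i++) {
            if (keys[sv2[i - 1]] > keys[sv2[i]]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Keys of rows 0..3 as shown in the failing listing.
        int[] keys = {11, 14, 17, 0};

        // Failing case: the sv2 is still the identity mapping 0(0), 1(1), ...
        int[] untouched = {0, 1, 2, 3};
        System.out.println(isSortedAscending(keys, untouched));

        // Assumed result of a working sort over just these four rows.
        int[] repaired = {3, 0, 1, 2};
        System.out.println(isSortedAscending(keys, repaired));
    }
}
```

A check of this kind in the unit test would turn the silent failure into an explicit one, which is the concern the report raises.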
[jira] [Assigned] (DRILL-5165) wrong results - LIMIT ALL and OFFSET clause in same query
[ https://issues.apache.org/jira/browse/DRILL-5165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunhui Shi reassigned DRILL-5165: -- Assignee: Chunhui Shi > wrong results - LIMIT ALL and OFFSET clause in same query > - > > Key: DRILL-5165 > URL: https://issues.apache.org/jira/browse/DRILL-5165 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.10.0 >Reporter: Khurram Faraaz >Assignee: Chunhui Shi >Priority: Critical > > This issue was reported by a user on Drill's user list. > Drill 1.10.0 commit ID : bbcf4b76 > I tried a similar query on apache Drill 1.10.0 and Drill returns wrong > results when compared to Postgres, for a query that uses LIMIT ALL and OFFSET > clause in the same query. We need to file a JIRA to track this issue. > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by 1 limit > all offset 10; > +--+ > | col_int | > +--+ > +--+ > No rows selected (0.211 seconds) > 0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by col_int > limit all offset 10; > +--+ > | col_int | > +--+ > +--+ > No rows selected (0.24 seconds) > {noformat} > Query => select col_int from typeall_l limit all offset 10; > Drill 1.10.0 returns 85 rows > whereas for same query, > postgres=# select col_int from typeall_l limit all offset 10; > Postgres 9.3 returns 95 rows, which is the correct expected result. 
> Query plan for above query that returns wrong results > {noformat} > 0: jdbc:drill:schema=dfs.tmp> explain plan for select col_int from typeall_l > limit all offset 10; > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(col_int=[$0]) > 00-02SelectionVectorRemover > 00-03 Limit(offset=[10]) > 00-04Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath > [path=maprfs:///tmp/typeall_l]], selectionRoot=maprfs:/tmp/typeall_l, > numFiles=1, usedMetadataFile=false, columns=[`col_int`]]]) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
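The expected semantics can be sketched in a few lines: LIMIT ALL places no cap on the row count, so only the OFFSET should remove rows. The helper below is hypothetical and is not Drill's Limit operator. The table size of 105 rows is an assumption inferred from Postgres returning 95 rows at OFFSET 10; the report does not state it.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical model of "LIMIT ALL OFFSET n" over an in-memory list.
final class LimitAllOffset {

    // LIMIT ALL means "no limit", so the result is every row after the offset.
    static <T> List<T> limitAllOffset(List<T> rows, long offset) {
        return rows.stream().skip(offset).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed table size: 105 rows (inferred, not stated in the report).
        List<Integer> rows = IntStream.range(0, 105).boxed().collect(Collectors.toList());
        System.out.println(limitAllOffset(rows, 10).size());
    }
}
```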
[jira] [Created] (DRILL-5332) DateVector support uses questionable conversions to a time
Paul Rogers created DRILL-5332: -- Summary: DateVector support uses questionable conversions to a time Key: DRILL-5332 URL: https://issues.apache.org/jira/browse/DRILL-5332 Project: Apache Drill Issue Type: Bug Affects Versions: 1.9.0 Reporter: Paul Rogers The following code in {{DateVector}} is worrisome: {code} @Override public DateTime getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); return date; } {code} This code takes a date/time value stored in a value vector, converts it to UTC, then strips the time zone and replaces it with local time. The result is a "timestamp" in Java format (milliseconds since the epoch), but not really: it is really the time since the epoch, as if the epoch had started in the local time zone rather than UTC. This is the kind of fun & games that people used to do in Java with the {{Date}} type before the advent of Joda-Time (and the migration of Joda into Java 8.) It is, in short, very bad practice and nearly impossible to get right. Further, converting a pure date (since this is a {{DateVector}}) into a date/time is fraught with peril. A date has no corresponding time. 1 AM on Friday in one time zone might be 11 PM on Thursday in another. Converting from dates to times is very difficult. If the {{DateVector}} corresponds to a date, then it should be a simple date with no implied time zone and no implied relationship to time. If there is to be a mapping of time, it must be to a {{LocalTime}} (in Joda and Java 8) that has no implied time zone. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
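The effect described above can be reproduced with the JDK alone. The sketch below is not the Drill code: it uses java.time, where {{withZoneSameLocal}} plays the role that {{withZoneRetainFields}} plays in Joda, and the class name, method name and the fixed "America/Los_Angeles" zone are illustrative assumptions (the original uses the JVM default zone).

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical demo of the "read as UTC, then retain fields in local zone" pattern.
final class RetainFieldsDemo {

    // Reads the stored millis as a UTC date/time, keeps the wall-clock fields,
    // swaps in another zone, and returns the resulting epoch millis.
    static long shiftedMillis(long storedMillis, ZoneId localZone) {
        ZonedDateTime utc = Instant.ofEpochMilli(storedMillis).atZone(ZoneOffset.UTC);
        ZonedDateTime local = utc.withZoneSameLocal(localZone);
        return local.toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        // A fixed zone is used here so the output does not depend on the machine.
        ZoneId zone = ZoneId.of("America/Los_Angeles");
        long stored = 0L; // midnight, 1970-01-01 UTC
        // The returned instant is no longer the stored one: it has moved by
        // the zone's UTC offset on that date.
        System.out.println(shiftedMillis(stored, zone) - stored);
    }
}
```

The returned object therefore denotes a different instant on every machine whose default zone differs, which is the hazard the report points at.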
[jira] [Updated] (DRILL-5332) DateVector support uses questionable conversions to a time
[ https://issues.apache.org/jira/browse/DRILL-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated DRILL-5332: --- Description: The following code in {{DateVector}} is worrisome: {code} @Override public DateTime getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); return date; } {code} This code takes a date/time value stored in a value vector, converts it to UTC, then strips the time zone and replaces it with local time. The result is a "timestamp" in Java format (milliseconds since the epoch), but not really: it is really the time since the epoch, as if the epoch had started in the local time zone rather than UTC. This is the kind of fun & games that people used to do in Java with the {{Date}} type before the advent of Joda-Time (and the migration of Joda into Java 8.) It is, in short, very bad practice and nearly impossible to get right. Further, converting a pure date (since this is a {{DateVector}}) into a date/time is fraught with peril. A date has no corresponding time. 1 AM on Friday in one time zone might be 11 PM on Thursday in another. Converting from dates to times is very difficult. If the {{DateVector}} corresponds to a date, then it should be a simple date with no implied time zone and no implied relationship to time. If there is to be a mapping of time, it must be to a {{LocalTime}} (in Joda and Java 8) that has no implied time zone. Note that this code directly contradicts the statement in [Drill documentation|http://drill.apache.org/docs/date-time-and-timestamp/]: "Drill stores values in Coordinated Universal Time (UTC)." Actually, even the documentation is questionable: given the above issues, what does it mean to store a date in UTC? 
was: The following code in {{DateVector}} is worrisome: {code} @Override public DateTime getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); return date; } {code} This code takes a date/time value stored in a value vector, converts it to UTC, then strips the time zone and replaces it with local time. The result it a "timestamp" in Java format (seconds since the epoch), but not really, it really the time since the epoch, as if the epoch had started in the local time zone rather than UTC. This is the kind of fun & games that people used to do in Java with the {{Date}} type before the advent of Joda time (and the migration of Joda into Java 8.) It is, in short, very bad practice and nearly impossible to get right. Further, converting a pure date (since this is a {{DateVector}}) into a date/time is fraught with peril. A date has no corresponding time. 1 AM on Friday in one time zone might be 11 PM on Thursday in another. Converting from dates to times is very difficult. If the {{DateVector}} corresponds to a date, then it should be simple date with no implied time zone and no implied relationship to time. If there is to be a mapping of time, it must be to a {{LocalTime}} (in Joda and Java 8) that has no implied time zone. 
> DateVector support uses questionable conversions to a time > -- > > Key: DRILL-5332 > URL: https://issues.apache.org/jira/browse/DRILL-5332 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.9.0 >Reporter: Paul Rogers > > The following code in {{DateVector}} is worrisome: > {code} > @Override > public DateTime getObject(int index) { > org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), > org.joda.time.DateTimeZone.UTC); > date = > date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); > return date; > } > {code} > This code takes a date/time value stored in a value vector, converts it to > UTC, then strips the time zone and replaces it with local time. The result it > a "timestamp" in Java format (seconds since the epoch), but not really, it > really the time since the epoch, as if the epoch had started in the local > time zone rather than UTC. > This is the kind of fun & games that people used to do in Java with the > {{Date}} type before the advent of Joda time (and the migration of Joda into > Java 8.) > It is, in short, very bad practice and nearly impossible to get right. > Further, converting a pure date (since this is a {{DateVector}}) into a > date/time is fraught with peril. A date has no corresponding time. 1 AM on > Friday in one time zone might be 11 PM on Thursday in another. Converting > from dates to times is very difficult. > If
[jira] [Created] (DRILL-5333) Documentation error in TIME data type description
Paul Rogers created DRILL-5333:
--

Summary: Documentation error in TIME data type description
Key: DRILL-5333
URL: https://issues.apache.org/jira/browse/DRILL-5333
Project: Apache Drill
Issue Type: Bug
Components: Documentation
Affects Versions: 1.9.0
Reporter: Paul Rogers
Priority: Minor

Consider the following description of the TIME data type from the [documentation|http://drill.apache.org/docs/supported-data-types/]:

{quote}
TIME 24-hour based time before or after January 1, 2001 in hours, minutes, seconds format: HH:mm:ss 22:55:55.23
{quote}

First, TIME has no associated date, so there can be no limitation on the days that can be represented. (If I tell you the bank closes at 5 PM, that statement is not just true after Jan. 1, 2001 -- it is true for as long as the bank exists.)

Second, the example implies that Drill stores fractions of a second (milliseconds), which is consistent with the implementation of the {{TimeVector}} data type. But the format, {{HH:mm:ss}}, suggests that the granularity is seconds.

Finally, Time, as stored internally, has no format: it is a number; the number of milliseconds since the epoch. The format only comes into play when converting to or from text.

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
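The mismatch between the documented format and the documented example can be shown with a short sketch. This is an illustration only, not Drill code: it uses {{java.time}}, and it assumes, for the sake of the example, that the stored number is a millisecond count measured from midnight. The class and method names are hypothetical.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class; not part of Drill.
final class TimeFormatDemo {

    // The pattern given in the documentation: whole seconds only.
    static final DateTimeFormatter SECONDS = DateTimeFormatter.ofPattern("HH:mm:ss");

    // A pattern that matches the precision of the documented example value.
    static final DateTimeFormatter MILLIS = DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    // Assumption: the stored value counts milliseconds from midnight.
    static LocalTime fromMillisOfDay(int millisOfDay) {
        return LocalTime.ofNanoOfDay(millisOfDay * 1_000_000L);
    }

    static int toMillisOfDay(LocalTime time) {
        return (int) (time.toNanoOfDay() / 1_000_000L);
    }

    public static void main(String[] args) {
        // Text to number: the documented example parses to a plain integer.
        int stored = toMillisOfDay(LocalTime.parse("22:55:55.23"));
        System.out.println("stored value: " + stored);

        // Number to text: the format is chosen only at this point.
        LocalTime time = fromMillisOfDay(stored);
        System.out.println("HH:mm:ss     -> " + SECONDS.format(time));
        System.out.println("HH:mm:ss.SSS -> " + MILLIS.format(time));
    }
}
```

Formatting with the documented pattern silently drops the fraction, which is why the pattern and the example in the documentation cannot both be right.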
[jira] [Created] (DRILL-5334) Questionable code in the TimeVector class
Paul Rogers created DRILL-5334:
--

Summary: Questionable code in the TimeVector class
Key: DRILL-5334
URL: https://issues.apache.org/jira/browse/DRILL-5334
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Paul Rogers

The {{TimeVector}} class, which holds Time data, should hold a simple local time with no associated date or time zone. (A local time cannot be converted to UTC without a date, since the conversion depends on whether daylight saving time is in effect.)

But, the implementation of {{TimeVector}} uses the following very questionable code:

{code}
@Override
public DateTime getObject(int index) {
    org.joda.time.DateTime time = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
    time = time.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
    return time;
}
{code}

That is, we convert a date-less, local time into a Joda UTC DateTime object, then reset the time zone to local time. This is abusing the Joda library and is the very kind of fun & games that Joda was designed to prevent.

The conversion of a time into Joda should use the {{LocalTime}} class.

In fact, according to [Oracle|http://www.oracle.com/technetwork/articles/java/jf14-date-time-2125367.html], the following is the mapping from ANSI SQL date/time types to Java 8 (and thus Joda) classes:

||ANSI SQL||Java SE 8||
|DATE|LocalDate|
|TIME|LocalTime|
|TIMESTAMP|LocalDateTime|
|TIME WITH TIMEZONE|OffsetTime|
|TIMESTAMP WITH TIMEZONE|OffsetDateTime|

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
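The parenthetical claim above, that a local time cannot be converted to UTC without a date, can be checked with a small sketch. This is not Drill code: it uses the Java 8 {{LocalTime}} class from the mapping table, and the class name, method name, sample dates and the America/Los_Angeles zone are assumptions chosen for illustration.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical demo class; not part of Drill.
final class LocalTimeNeedsDate {

    // Converts a wall-clock time to UTC. A date is required because the
    // zone's offset depends on whether daylight saving time is in effect.
    static LocalTime toUtc(LocalTime time, LocalDate date, ZoneId zone) {
        return time.atDate(date)
                   .atZone(zone)
                   .withZoneSameInstant(ZoneOffset.UTC)
                   .toLocalTime();
    }

    public static void main(String[] args) {
        LocalTime closing = LocalTime.of(17, 0);       // "the bank closes at 5 PM"
        ZoneId zone = ZoneId.of("America/Los_Angeles"); // assumed example zone

        // Same local time, two different dates, two different UTC times.
        System.out.println("winter: " + toUtc(closing, LocalDate.of(2017, 1, 15), zone));
        System.out.println("summer: " + toUtc(closing, LocalDate.of(2017, 7, 15), zone));
    }
}
```

Because the two results differ, a value that is only a time of day has no single UTC equivalent, which is the argument for keeping it as a zone-free {{LocalTime}} rather than forcing it through a zoned date/time.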