[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898909#comment-15898909
 ] 

ASF GitHub Bot commented on DRILL-5316:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104606833
  
--- Diff: contrib/native/client/src/clientlib/zookeeperClient.cpp ---
@@ -138,6 +138,11 @@ int ZookeeperClient::getAllDrillbits(const std::string& connectStr, std::vector<
 DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "\t Unshuffled Drillbit id: " << drillbits[i] << std::endl;)
 }
 }
+else{
--- End diff --

Agreed. This should be handled in the caller (i.e. DrillClient). If the returned 
vector size is zero, we should check that in DrillClient and close the client 
connection with the error `ERR_CONN_ZKNODBIT`. Something like below:

`return handleConnError(CONN_INVALID_INPUT, getMessage(ERR_CONN_ZKNODBIT, pathToDrill.c_str()));`


> C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children 
> completed with ZOK
> 
>
> Key: DRILL-5316
> URL: https://issues.apache.org/jira/browse/DRILL-5316
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Reporter: Rob Wu
>Priority: Critical
>
> When connecting to a drillbit with ZooKeeper, occasionally the C++ client would 
> crash without any reason.
> A further look into the code revealed that during the call 
> rc=zoo_get_children(p_zh.get(), m_path.c_str(), 0, &drillbitsVector); 
> zoo_get_children returns ZOK (0) but drillbitsVector.count is 0.
> This causes drillbits to stay empty and thus 
> causes err = zook.getEndPoint(drillbits[drillbits.size() -1], endpoint); to 
> crash.
> A size check should be done to prevent this from happening.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898916#comment-15898916
 ] 

ASF GitHub Bot commented on DRILL-5316:
---

Github user superbstreak commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104607726
  
--- Diff: contrib/native/client/src/clientlib/zookeeperClient.cpp ---
@@ -138,6 +138,11 @@ int ZookeeperClient::getAllDrillbits(const std::string& connectStr, std::vector<
 DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "\t Unshuffled Drillbit id: " << drillbits[i] << std::endl;)
 }
 }
+else{
--- End diff --

Thanks, both. Yeah, makes sense; I'll make the change.







[jira] [Commented] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899081#comment-15899081
 ] 

ASF GitHub Bot commented on DRILL-5316:
---

Github user superbstreak commented on a diff in the pull request:

https://github.com/apache/drill/pull/772#discussion_r104627768
  
--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -86,6 +86,9 @@ connectionStatus_t DrillClientImpl::connect(const char* connStr, DrillUserProper
 std::vector<std::string> drillbits;
 int err = zook.getAllDrillbits(hostPortStr, drillbits);
 if(!err){
+if (drillbits.empty()){
+return handleConnError(CONN_INVALID_INPUT, getMessage(ERR_CONN_ZKNODBIT));
--- End diff --

Double-check CONN_INVALID_INPUT usage.







[jira] [Created] (DRILL-5317) Drill Configuration to access S3 buckets in Mumbai region

2017-03-07 Thread shivamurthy.dee...@gmail.com (JIRA)
shivamurthy.dee...@gmail.com created DRILL-5317:
---

 Summary: Drill Configuration to access S3 buckets in Mumbai region
 Key: DRILL-5317
 URL: https://issues.apache.org/jira/browse/DRILL-5317
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.8.0
Reporter: shivamurthy.dee...@gmail.com


I am able to access and query S3 buckets in the US Standard region, but not able 
to access/query buckets in the Mumbai region. Is there any specific configuration 
that needs to be enabled in Drill?





[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-03-07 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-4335:

Reviewer: Sudheesh Katkam

Assigned Reviewer to [~sudheeshkatkam]

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Created] (DRILL-5318) Create a sub-operator test framework

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5318:
--

 Summary: Create a sub-operator test framework
 Key: DRILL-5318
 URL: https://issues.apache.org/jira/browse/DRILL-5318
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: Future


Drill provides two unit test frameworks for whole-server, SQL-based testing: 
the original {{BaseTestQuery}} and the newer {{ClusterFixture}}. Both use the 
{{TestBuilder}} mechanism to build system-level functional tests that run 
queries and check results.

Jason provided an operator-level test framework based, in part, on mocks: 

As Drill operators become more complex, we have a crying need for true 
unit-level tests at a level below the whole system and below operators. That 
is, we need to test the individual pieces that, together, form the operator.

This umbrella ticket includes a number of tasks needed to create the 
sub-operator framework. Our intention is that, over time, as we find the need 
to revisit existing operators, or create new ones, we can employ the 
sub-operator test framework to exercise code at a finer granularity than is 
possible prior to this framework.





[jira] [Created] (DRILL-5319) Refactor OperatorContext and OptionsManager for unit testing

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5319:
--

 Summary: Refactor OperatorContext and OptionsManager for unit 
testing
 Key: DRILL-5319
 URL: https://issues.apache.org/jira/browse/DRILL-5319
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: Future


Roll-up task for two refactorings, see the sub-tasks for details. This ticket 
allows a single PR for the two different refactorings since the work heavily 
overlaps.





[jira] [Created] (DRILL-5320) Refactor OptionManager to allow better unit testing

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5320:
--

 Summary: Refactor OptionManager to allow better unit testing
 Key: DRILL-5320
 URL: https://issues.apache.org/jira/browse/DRILL-5320
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: Future


The {{OptionManager}} interface serves two purposes:

* Create and modify options
* Access option values

The implementations of this class are integrated with the rest of Drill, 
making it difficult to use the classes in isolation in unit testing. Further, 
since operators are given the full interface, each operator has the ability to 
modify options, so each unit test must either verify that no modification 
is, in fact, done, or must track down modifications and test them.

For operator and sub-operator unit tests we need a simpler interface. As it 
turns out, most low-level uses of {{OptionManager}} are all read-only. This 
allows a simple refactoring to enhance unit testability: create a new 
super-interface {{OptionSet}}, which provides only the read-only methods.

Then, refactor low-level classes (code generation, compilers, and so on) to use 
the restricted {{OptionSet}} interface.

Finally, for unit tests, create a trivial, map-based implementation that can be 
populated as needed for each specific test.





[jira] [Created] (DRILL-5321) Refactor FragmentContext for unit testing

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5321:
--

 Summary: Refactor FragmentContext for unit testing
 Key: DRILL-5321
 URL: https://issues.apache.org/jira/browse/DRILL-5321
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: Future


Each operator has visibility to the {{FragmentContext}} class. 
{{FragmentContext}} provides access to all of Drill internals: the Drillbit 
context, the network interfaces, RPC messages and so on.

Further, all the code generation mechanisms require a {{FragmentContext}} 
object.

This structure creates a large barrier to unit testing. To test, say, a 
particular bit of generated code, we must have the entire Drillbit running so 
we can obtain a {{FragmentContext}}. Clearly, this is less than ideal.

Upon inspection, it turns out that the {{FragmentContext}} is mostly needed, by 
many operators, to generate code. Of the many methods in {{FragmentContext}}, 
code generation uses only six.

The solution is to create a new super-interface, {{CodeGenContext}}, which 
holds those six methods. The {{CodeGenContext}} can be easily re-implemented 
for unit testing.

Then, modify all the code-generation classes that currently take 
{{FragmentContext}} to take {{CodeGenContext}} instead.

Since {{FragmentContext}} derives from {{CodeGenContext}}, existing operator 
code "just works."





[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5319:
---
Summary: Refactor FragmentContext and OptionManager for unit testing  (was: 
Refactor OperatorContext and OptionsManager for unit testing)

> Refactor FragmentContext and OptionManager for unit testing
> ---
>
> Key: DRILL-5319
> URL: https://issues.apache.org/jira/browse/DRILL-5319
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> Roll-up task for two refactorings, see the sub-tasks for details. This ticket 
> allows a single PR for the two different refactorings since the work heavily 
> overlaps.





[jira] [Commented] (DRILL-5320) Refactor OptionManager to allow better unit testing

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899821#comment-15899821
 ] 

Paul Rogers commented on DRILL-5320:


Proposed new interface (comments omitted):

{code}
public interface OptionSet {
  OptionValue getOption(String name);
  boolean getOption(TypeValidators.BooleanValidator validator);
  double getOption(TypeValidators.DoubleValidator validator);
  long getOption(TypeValidators.LongValidator validator);
  String getOption(TypeValidators.StringValidator validator);
}
{code}
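A map-backed implementation of this read-only interface is what the ticket envisions for tests. Below is a minimal, self-contained Java sketch of the idea; the names and the use of plain {{Object}} values are illustrative only (Drill's real interface resolves values through {{TypeValidators}}):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a trivial, map-based OptionSet stand-in for unit
// tests. Plain Object values keep the example self-contained; Drill's real
// OptionSet resolves typed values through validators.
interface OptionSet {
  Object getOption(String name);
}

class TestOptionSet implements OptionSet {
  private final Map<String, Object> values = new HashMap<>();

  // Populate per test; code under test sees only the read-only interface.
  public TestOptionSet set(String name, Object value) {
    values.put(name, value);
    return this;
  }

  @Override
  public Object getOption(String name) {
    return values.get(name);
  }
}
```

A test would populate a {{TestOptionSet}} and hand it to the class under test as an {{OptionSet}}, with no running Drillbit required.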


> Refactor OptionManager to allow better unit testing
> ---
>
> Key: DRILL-5320
> URL: https://issues.apache.org/jira/browse/DRILL-5320
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> The {{OptionManager}} interface serves two purposes:
> * Create and modify options
> * Access option values
> The implementations  of this class are integrated with the rest of Drill, 
> making it difficult to use the classes in isolation in unit testing. Further, 
> since operators are given the full interface, the operator has the ability to 
> modify options, and so each unit test should either verify that no 
> modification is, in fact, done, or must track down modifications and test 
> them.
> For operator and sub-operator unit tests we need a simpler interface. As it 
> turns out, most low-level uses of {{OptionManager}} are all read-only. This 
> allows a simple refactoring to enhance unit testability: create a new 
> super-interface {{OptionSet}}, which provides only the read-only methods.
> Then, refactor low-level classes (code generation, compilers, and so on) to 
> use the restricted {{OptionSet}} interface.
> Finally, for unit tests, create a trivial, map-based implementation that can 
> be populated as needed for each specific test.





[jira] [Commented] (DRILL-5321) Refactor FragmentContext for unit testing

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899824#comment-15899824
 ] 

Paul Rogers commented on DRILL-5321:


Proposed new interface (comments omitted):

{code}
public interface CodeGenContext {
  FunctionImplementationRegistry getFunctionRegistry();
  OptionSet getOptionSet();
  <T> T getImplementationClass(final ClassGenerator<T> cg)
      throws ClassTransformationException, IOException;
  <T> T getImplementationClass(final CodeGenerator<T> cg)
      throws ClassTransformationException, IOException;
  <T> List<T> getImplementationClass(final ClassGenerator<T> cg, final int instanceCount)
      throws ClassTransformationException, IOException;
  <T> List<T> getImplementationClass(final CodeGenerator<T> cg, final int instanceCount)
      throws ClassTransformationException, IOException;
}
{code}


> Refactor FragmentContext for unit testing
> -
>
> Key: DRILL-5321
> URL: https://issues.apache.org/jira/browse/DRILL-5321
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> Each operator has visibility to the {{FragmentContext}} class. 
> {{FragmentContext}} provides access to all of Drill internals: the Drillbit 
> context, the network interfaces, RPC messages and so on.
> Further, all the code generation mechanisms require a {{FragmentContext}} 
> object.
> This structure creates a large barrier to unit testing. To test, say, a 
> particular bit of generated code, we must have the entire Drillbit running so 
> we can obtain a {{FragmentContext}}. Clearly, this is less than ideal.
> Upon inspection, it turns out that the {{FragmentContext}} is mostly needed, 
> by many operators, to generate code. Of the many methods in 
> {{FragmentContext}}, code generation uses only six.
> The solution is to create a new super-interface, {{CodeGenContext}}, which 
> holds those six methods. The {{CodeGenContext}} can be easily re-implemented 
> for unit testing.
> Then, modify all the code-generation classes that currently take 
> {{FragmentContext}} to take {{CodeGenContext}} instead.
> Since {{FragmentContext}} derives from {{CodeGenContext}}, existing operator 
> code "just works."





[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5319:
---
Description: Roll-up task for two refactorings, see the sub-tasks for 
details. This ticket allows a single PR for the two different refactorings 
since the work heavily overlaps. See DRILL-5320 and DRILL-5321 for details.  
(was: Roll-up task for two refactorings, see the sub-tasks for details. This 
ticket allows a single PR for the two different refactorings since the work 
heavily overlaps.)

> Refactor FragmentContext and OptionManager for unit testing
> ---
>
> Key: DRILL-5319
> URL: https://issues.apache.org/jira/browse/DRILL-5319
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> Roll-up task for two refactorings, see the sub-tasks for details. This ticket 
> allows a single PR for the two different refactorings since the work heavily 
> overlaps. See DRILL-5320 and DRILL-5321 for details.





[jira] [Created] (DRILL-5322) Provide an OperatorFixture for sub-operator unit testing setup

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5322:
--

 Summary: Provide an OperatorFixture for sub-operator unit testing 
setup
 Key: DRILL-5322
 URL: https://issues.apache.org/jira/browse/DRILL-5322
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


We recently created various "fixture" classes to assist with system-level 
testing: {{LogFixture}}, {{ClusterFixture}} and {{ClientFixture}}. Each handles 
the tedious work of setting up the conditions to run certain kinds of tests.

In the same way, we need an {{OperatorFixture}} to set up the low-level bits 
and pieces needed for operator-level, and sub-operator-level unit testing.

The {{DrillConfig}} is used by both the system-level and operator-level 
fixtures. So, pull the config-setup tasks out of the (cluster) {{FixtureBuilder}} 
(which should be renamed) and into a new {{ConfigBuilder}}. Leave the existing 
methods in {{FixtureBuilder}}, but modify them to be wrappers around the new 
config builder.
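
The proposed split might look like the following sketch. Everything here is hypothetical apart from the class names {{FixtureBuilder}} and {{ConfigBuilder}} themselves; a plain map stands in for Drill's real {{DrillConfig}}:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed split: config-setup logic lives in
// ConfigBuilder; the cluster-level FixtureBuilder keeps its existing methods
// as thin wrappers around it. A Map stands in for the real DrillConfig.
class ConfigBuilder {
  private final Map<String, Object> props = new HashMap<>();

  public ConfigBuilder put(String key, Object value) {
    props.put(key, value);
    return this;
  }

  public Map<String, Object> build() {
    return new HashMap<>(props);
  }
}

class FixtureBuilder {
  private final ConfigBuilder configBuilder = new ConfigBuilder();

  // Existing method preserved, now delegating to the shared config builder.
  public FixtureBuilder configProperty(String key, Object value) {
    configBuilder.put(key, value);
    return this;
  }

  public Map<String, Object> config() {
    return configBuilder.build();
  }
}
```

This way, operator-level tests can use {{ConfigBuilder}} directly while existing cluster tests remain unchanged.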





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-03-07 Thread Laurent Goujon (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899831#comment-15899831
 ] 

Laurent Goujon commented on DRILL-4335:
---

It looks, from the initial implementation, like there is a lot of byte 
copying involved. Is there any estimate/benchmark to quantify the impact on 
throughput? Wouldn't an SSL/TLS implementation be a more performant alternative 
here (because of its direct integration into Netty)?






[jira] [Updated] (DRILL-5318) Create a sub-operator test framework

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5318:
---
Affects Version/s: (was: 1.10.0)
   1.11.0

> Create a sub-operator test framework
> 
>
> Key: DRILL-5318
> URL: https://issues.apache.org/jira/browse/DRILL-5318
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Drill provides two unit test frameworks for whole-server, SQL-based testing: 
> the original {{BaseTestQuery}} and the newer {{ClusterFixture}}. Both use the 
> {{TestBuilder}} mechanism to build system-level functional tests that run 
> queries and check results.
> Jason provided an operator-level test framework based, in part on mocks: 
> As Drill operators become more complex, we have a crying need for true 
> unit-level tests at a level below the whole system and below operators. That 
> is, we need to test the individual pieces that, together, form the operator.
> This umbrella ticket includes a number of tasks needed to create the 
> sub-operator framework. Our intention is that, over time, as we find the need 
> to revisit existing operators, or create new ones, we can employ the 
> sub-operator test framework to exercise code at a finer granularity than is 
> possible prior to this framework.





[jira] [Updated] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5319:
---
Fix Version/s: (was: Future)
   1.11.0

> Refactor FragmentContext and OptionManager for unit testing
> ---
>
> Key: DRILL-5319
> URL: https://issues.apache.org/jira/browse/DRILL-5319
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Roll-up task for two refactorings, see the sub-tasks for details. This ticket 
> allows a single PR for the two different refactorings since the work heavily 
> overlaps. See DRILL-5320 and DRILL-5321 for details.





[jira] [Updated] (DRILL-5320) Refactor OptionManager to allow better unit testing

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5320:
---
Affects Version/s: (was: 1.10.0)
   1.11.0
Fix Version/s: (was: Future)
   1.11.0

> Refactor OptionManager to allow better unit testing
> ---
>
> Key: DRILL-5320
> URL: https://issues.apache.org/jira/browse/DRILL-5320
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> The {{OptionManager}} interface serves two purposes:
> * Create and modify options
> * Access option values
> The implementations  of this class are integrated with the rest of Drill, 
> making it difficult to use the classes in isolation in unit testing. Further, 
> since operators are given the full interface, the operator has the ability to 
> modify options, and so each unit test should either verify that no 
> modification is, in fact, done, or must track down modifications and test 
> them.
> For operator and sub-operator unit tests we need a simpler interface. As it 
> turns out, most low-level uses of {{OptionManager}} are all read-only. This 
> allows a simple refactoring to enhance unit testability: create a new 
> super-interface {{OptionSet}}, which provides only the read-only methods.
> Then, refactor low-level classes (code generation, compilers, and so on) to 
> use the restricted {{OptionSet}} interface.
> Finally, for unit tests, create a trivial, map-based implementation that can 
> be populated as needed for each specific test.





[jira] [Updated] (DRILL-5318) Create a sub-operator test framework

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5318:
---
Fix Version/s: (was: Future)
   1.11.0

> Create a sub-operator test framework
> 
>
> Key: DRILL-5318
> URL: https://issues.apache.org/jira/browse/DRILL-5318
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Drill provides two unit test frameworks for whole-server, SQL-based testing: 
> the original {{BaseTestQuery}} and the newer {{ClusterFixture}}. Both use the 
> {{TestBuilder}} mechanism to build system-level functional tests that run 
> queries and check results.
> Jason provided an operator-level test framework based, in part on mocks: 
> As Drill operators become more complex, we have a crying need for true 
> unit-level tests at a level below the whole system and below operators. That 
> is, we need to test the individual pieces that, together, form the operator.
> This umbrella ticket includes a number of tasks needed to create the 
> sub-operator framework. Our intention is that, over time, as we find the need 
> to revisit existing operators, or create new ones, we can employ the 
> sub-operator test framework to exercise code at a finer granularity than is 
> possible prior to this framework.





[jira] [Created] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5323:
--

 Summary: Provide test tools to create, populate and compare row 
sets
 Key: DRILL-5323
 URL: https://issues.apache.org/jira/browse/DRILL-5323
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


Operators work with individual row sets. A row set is a collection of records 
stored as column vectors. (Drill uses various terms for this concept. A record 
batch is a row set with an operator implementation wrapped around it. A vector 
container is a row set, but with much functionality left as an exercise for the 
developer. And so on.)

To simplify tests, we need a {{TestRowSet}} concept that wraps a 
{{VectorContainer}} and provides easy ways to:

* Define a schema for the row set.
* Create a set of vectors that implement the schema.
* Populate the row set with test data via code.
* Add an SV2 to the row set.
* Pass the row set to operator components (such as generated code blocks.)
* Compare the results of the operation with an expected result set.
* Dispose of the underlying direct memory when work is done.
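
One plausible shape for such an API, sketched with illustrative stand-in classes (these are not Drill's real classes; rows here are plain object arrays, and vectors, SV2 support, and memory management are omitted):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the intended test API shape: define a schema, add
// rows fluently, then compare against an expected row set.
class TestRowSet {
  private final List<String> schema;
  private final List<Object[]> rows = new ArrayList<>();

  TestRowSet(String... columns) {
    schema = Arrays.asList(columns);
  }

  TestRowSet add(Object... row) {
    rows.add(row);
    return this;
  }

  // Deep comparison: same schema, same row count, same values per row.
  boolean matches(TestRowSet expected) {
    if (!schema.equals(expected.schema) || rows.size() != expected.rows.size()) {
      return false;
    }
    for (int i = 0; i < rows.size(); i++) {
      if (!Arrays.equals(rows.get(i), expected.rows.get(i))) {
        return false;
      }
    }
    return true;
  }
}
```

A test would build an input row set, run it through the component under test, and then call {{matches}} against a hand-built expected row set.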





[jira] [Updated] (DRILL-5321) Refactor FragmentContext for unit testing

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5321:
---
Affects Version/s: (was: 1.10.0)
   1.11.0
Fix Version/s: (was: Future)
   1.11.0

> Refactor FragmentContext for unit testing
> -
>
> Key: DRILL-5321
> URL: https://issues.apache.org/jira/browse/DRILL-5321
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Each operator has visibility to the {{FragmentContext}} class. 
> {{FragmentContext}} provides access to all of Drill internals: the Drillbit 
> context, the network interfaces, RPC messages and so on.
> Further, all the code generation mechanisms require a {{FragmentContext}} 
> object.
> This structure creates a large barrier to unit testing. To test, say, a 
> particular bit of generated code, we must have the entire Drillbit running so 
> we can obtain a {{FragmentContext}}. Clearly, this is less than ideal.
> Upon inspection, it turns out that the {{FragmentContext}} is mostly needed, 
> by many operators, to generate code. Of the many methods in 
> {{FragmentContext}}, code generation uses only six.
> The solution is to create a new super-interface, {{CodeGenContext}}, which 
> holds those six methods. The {{CodeGenContext}} can be easily re-implemented 
> for unit testing.
> Then, modify all the code-generation classes that currently take 
> {{FragmentContext}} to take {{CodeGenContext}} instead.
> Since {{FragmentContext}} derives from {{CodeGenContext}}, existing operator 
> code "just works."





[jira] [Created] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5324:
--

 Summary: Provide simplified column reader/writer for use in tests
 Key: DRILL-5324
 URL: https://issues.apache.org/jira/browse/DRILL-5324
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


In support of DRILL-, we wish to provide a very easy way to work with row 
sets. See the comment section for examples of the target API.

Drill provides over 100 different value vectors, any of which may be required 
to perform a specific unit test. Creating these vectors, populating them, and 
retrieving values, is very tedious. The work is so complex that it acts to 
discourage developers from writing such tests.

To simplify the task, we wish to provide a simplified row set reader and 
writer. To do that, we need to generate the corresponding column reader and 
writer for each value vector. This ticket focuses on the column-level readers 
and writers, and the required code generation.





[jira] [Updated] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5324:
---
Description: 
In support of DRILL-5323, we wish to provide a very easy way to work with row 
sets. See the comment section for examples of the target API.

Drill provides over 100 different value vectors, any of which may be required 
to perform a specific unit test. Creating these vectors, populating them, and 
retrieving values, is very tedious. The work is so complex that it acts to 
discourage developers from writing such tests.

To simplify the task, we wish to provide a simplified row set reader and 
writer. To do that, we need to generate the corresponding column reader and 
writer for each value vector. This ticket focuses on the column-level readers 
and writers, and the required code generation.

  was:
In support of DRILL-, we wish to provide a very easy way to work with row 
sets. See the comment section for examples of the target API.

Drill provides over 100 different value vectors, any of which may be required 
to perform a specific unit test. Creating these vectors, populating them, and 
retrieving values, is very tedious. The work is so complex that it acts to 
discourage developers from writing such tests.

To simplify the task, we wish to provide a simplified row set reader and 
writer. To do that, we need to generate the corresponding column reader and 
writer for each value vector. This ticket focuses on the column-level readers 
and writers, and the required code generation.


> Provide simplified column reader/writer for use in tests
> 
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row 
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required 
> to perform a specific unit test. Creating these vectors, populating them, and 
> retrieving values, is very tedious. The work is so complex that it acts to 
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and 
> writer. To do that, we need to generate the corresponding column reader and 
> writer for each value vector. This ticket focuses on the column-level readers 
> and writers, and the required code generation.





[jira] [Updated] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5324:
---
Description: 
In support of DRILL-5323, we wish to provide a very easy way to work with row 
sets. See the comment section for examples of the target API.

Drill provides over 100 different value vectors, any of which may be required 
to perform a specific unit test. Creating these vectors, populating them, and 
retrieving values, is very tedious. The work is so complex that it acts to 
discourage developers from writing such tests.

To simplify the task, we wish to provide a simplified row set reader and 
writer. To do that, we need to generate the corresponding column reader and 
writer for each value vector. This ticket focuses on the column-level readers 
and writers, and the required code generation.

Drill already provides vector readers and writers derived from {{FieldReader}}. 
However, these readers do not provide a uniform get/set interface that is type 
independent on the application side. Instead, application code must be aware of 
the type of the vector, something we seek to avoid for test code.

The reader and writer classes are designed to be used in many contexts, not 
just for testing. As a result, their implementation makes no assumptions about 
the broader row reader and writer, other than that a row index and the required 
value vector are both available. 

  was:
In support of DRILL-5323, we wish to provide a very easy way to work with row 
sets. See the comment section for examples of the target API.

Drill provides over 100 different value vectors, any of which may be required 
to perform a specific unit test. Creating these vectors, populating them, and 
retrieving values, is very tedious. The work is so complex that it acts to 
discourage developers from writing such tests.

To simplify the task, we wish to provide a simplified row set reader and 
writer. To do that, we need to generate the corresponding column reader and 
writer for each value vector. This ticket focuses on the column-level readers 
and writers, and the required code generation.


> Provide simplified column reader/writer for use in tests
> 
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row 
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required 
> to perform a specific unit test. Creating these vectors, populating them, and 
> retrieving values, is very tedious. The work is so complex that it acts to 
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and 
> writer. To do that, we need to generate the corresponding column reader and 
> writer for each value vector. This ticket focuses on the column-level readers 
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from 
> {{FieldReader}}. However, these readers do not provide a uniform get/set 
> interface that is type independent on the application side. Instead, 
> application code must be aware of the type of the vector, something we seek 
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not 
> just for testing. As a result, their implementation makes no assumptions 
> about the broader row reader and writer, other than that a row index and the 
> required value vector are both available. 





[jira] [Commented] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873
 ] 

Paul Rogers commented on DRILL-5324:


The idea is to provide a uniform column access interface, similar to how JDBC 
works (but much simpler!). Assume a row reader defined elsewhere:

{code}
  public interface RowSetReader {
boolean next();
ColumnReader column(int colIndex);
  }
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
  RowSetReader reader = ...;
  assertEquals(10, reader.column(0).getInt());
  assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}


> Provide simplified column reader/writer for use in tests
> 
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row 
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required 
> to perform a specific unit test. Creating these vectors, populating them, and 
> retrieving values, is very tedious. The work is so complex that it acts to 
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and 
> writer. To do that, we need to generate the corresponding column reader and 
> writer for each value vector. This ticket focuses on the column-level readers 
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from 
> {{FieldReader}}. However, these readers do not provide a uniform get/set 
> interface that is type independent on the application side. Instead, 
> application code must be aware of the type of the vector, something we seek 
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not 
> just for testing. As a result, their implementation makes no assumptions 
> about the broader row reader and writer, other than that a row index and the 
> required value vector are both available. 





[jira] [Comment Edited] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873
 ] 

Paul Rogers edited comment on DRILL-5324 at 3/7/17 6:13 PM:


The idea is to provide a uniform column access interface, similar to how JDBC 
works (but much simpler!). Assume a row reader defined elsewhere:

{code}
  public interface RowSetReader {
boolean next();
ColumnReader column(int colIndex);
  }
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
  RowSetReader reader = ...;
  assertEquals(10, reader.column(0).getInt());
  assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}



was (Author: paul-rogers):
The idea is to provide a uniform column access interface, similar to how JDBC 
works (but much simpler!). Assume a row reader defined elsewhere:

{code}
  public interface RowSetReader {
boolean next();
ColumnReader column(int colIndex);
  }
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
  RowSetReader reader = ...;
  assertEquals(10, reader.column(0).getInt());
  assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}


> Provide simplified column reader/writer for use in tests
> 
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row 
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required 
> to perform a specific unit test. Creating these vectors, populating them, and 
> retrieving values, is very tedious. The work is so complex that it acts to 
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and 
> writer. To do that, we need to generate the corresponding column reader and 
> writer for each value vector. This ticket focuses on the column-level readers 
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from 
> {{FieldReader}}. However, these readers do not provide a uniform get/set 
> interface that is type independent on the application side. Instead, 
> application code must be aware of the type of the vector, something we seek 
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not 
> just for testing. As a result, their implementation makes no assumptions 
> about the broader row reader and writer, other than that a row index and the 
> required value vector are both available. 





[jira] [Comment Edited] (DRILL-5324) Provide simplified column reader/writer for use in tests

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899873#comment-15899873
 ] 

Paul Rogers edited comment on DRILL-5324 at 3/7/17 6:16 PM:


The idea is to provide a uniform column access interface, similar to how JDBC 
works (but much simpler!). Assume a row reader defined elsewhere:

{code}
  public interface RowSetReader {
boolean next();
ColumnReader column(int colIndex);
  }
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
  RowSetReader reader = ...;
  assertEquals(10, reader.column(0).getInt());
  assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}

The generated versions are very type-specific: each value vector type supports 
only one of the above methods. (The numeric types all convert to/from int or 
double.) An implementation could certainly create another layer that does type 
conversion: say from int to String. But, to keep the generated code simple, the 
generated code implements only a single get or set method. (The others throw an 
exception if called.)
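The "single get or set method, others throw" behavior described above can be sketched like this. The class names and the int-array backing store are hypothetical illustrations, not Drill's actual generated code, which operates on real value vectors.

```java
// Hypothetical sketch of a generated, type-specific column accessor.
// A shared base class throws for every accessor; each generated subclass
// overrides only the one method its vector type supports.
abstract class AbstractColumnReader {
  public int getInt() { throw new UnsupportedOperationException(); }
  public long getLong() { throw new UnsupportedOperationException(); }
  public double getDouble() { throw new UnsupportedOperationException(); }
  public String getString() { throw new UnsupportedOperationException(); }
}

// "Generated" reader for an INT column: implements only getInt().
// The int[] stands in for the underlying value vector.
class IntColumnReader extends AbstractColumnReader {
  private final int[] values;
  private int rowIndex;

  IntColumnReader(int[] values) { this.values = values; }
  void setRowIndex(int rowIndex) { this.rowIndex = rowIndex; }

  @Override
  public int getInt() { return values[rowIndex]; }
}
```

Calling {{getString()}} on such a reader fails fast with an exception, which keeps the generated code simple while still surfacing type mismatches immediately in tests.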


was (Author: paul-rogers):
The idea is to provide a uniform column access interface, similar to how JDBC 
works (but much simpler!). Assume a row reader defined elsewhere:

{code}
  public interface RowSetReader {
boolean next();
ColumnReader column(int colIndex);
  }
{code}

Then, we can read values as follows. Assume a schema of (Int, VarChar):

{code}
  RowSetReader reader = ...;
  assertEquals(10, reader.column(0).getInt());
  assertEquals("foo", reader.column(1).getString());
{code}

Proposed interfaces (without comments):

{code}
public enum ValueType {
  INTEGER, LONG, DOUBLE, STRING, BYTES
}

public interface ColumnReader {
  ValueType getType();
  boolean isNull();
  int getInt();
  long getLong();
  double getDouble();
  String getString();
  byte[] getBytes();
}

public interface ColumnWriter {
  ValueType getType();
  void setNull();
  void setInt(int value);
  void setLong(long value);
  void setDouble(double value);
  void setString(String value);
  void setBytes(byte[] value);
}
{code}


> Provide simplified column reader/writer for use in tests
> 
>
> Key: DRILL-5324
> URL: https://issues.apache.org/jira/browse/DRILL-5324
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> In support of DRILL-5323, we wish to provide a very easy way to work with row 
> sets. See the comment section for examples of the target API.
> Drill provides over 100 different value vectors, any of which may be required 
> to perform a specific unit test. Creating these vectors, populating them, and 
> retrieving values, is very tedious. The work is so complex that it acts to 
> discourage developers from writing such tests.
> To simplify the task, we wish to provide a simplified row set reader and 
> writer. To do that, we need to generate the corresponding column reader and 
> writer for each value vector. This ticket focuses on the column-level readers 
> and writers, and the required code generation.
> Drill already provides vector readers and writers derived from 
> {{FieldReader}}. However, these readers do not provide a uniform get/set 
> interface that is type independent on the application side. Instead, 
> application code must be aware of the type of the vector, something we seek 
> to avoid for test code.
> The reader and writer classes are designed to be used in many contexts, not 
> just for testing. As a result, their implementation makes no assumptions 
> about the broader row reader and writer, other than that a row index and the 
> required value vector are both available. 





[jira] [Created] (DRILL-5325) Implement sub-opertor unit tests for managed external sort

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5325:
--

 Summary: Implement sub-opertor unit tests for managed external sort
 Key: DRILL-5325
 URL: https://issues.apache.org/jira/browse/DRILL-5325
 Project: Apache Drill
  Issue Type: Sub-task
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


Validate the proposed sub-operator test framework by creating low-level unit 
tests for the managed version of the external sort.

The external sort has a small number of existing tests, but those tests are 
quite superficial; the "managed sort" project found many bugs. The managed sort 
itself was tested with ad-hoc system-level tests created using the new "cluster 
fixture" framework. But, again, such tests could not reach deep inside the sort 
code to exercise very specific conditions.

As a result, we spent far too much time using QA functional tests to identify 
specific code issues.

Using the sub-operator unit test framework, we can instead test each bit of 
functionality at the unit test level.

If doing so works, and is practical, it can serve as a model for other operator 
testing projects.





[jira] [Closed] (DRILL-4963) Issues when overloading Drill native functions with dynamic UDFs

2017-03-07 Thread Roman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman closed DRILL-4963.


I have tested this issue and it could not be reproduced. Verified and closed.

> Issues when overloading Drill native functions with dynamic UDFs
> 
>
> Key: DRILL-4963
> URL: https://issues.apache.org/jira/browse/DRILL-4963
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.9.0
>Reporter: Roman
>Assignee: Arina Ielchiieva
>  Labels: ready-to-commit
> Fix For: 1.10.0
>
> Attachments: subquery_udf-1.0.jar, subquery_udf-1.0-sources.jar, 
> test_overloading-1.0.jar, test_overloading-1.0-sources.jar
>
>
> I created jar file which overloads 3 DRILL native functions 
> (LOG(VARCHAR-REQUIRED), CURRENT_DATE(VARCHAR-REQUIRED) and 
> ABS(VARCHAR-REQUIRED,VARCHAR-REQUIRED)) and registered it as dynamic UDF.
> If I try to use my functions I will get errors:
> {code:xml}
> SELECT CURRENT_DATE('test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: CURRENT_DATE does not support operand types (CHAR)
> SQL Query null
> {code:xml}
> SELECT ABS('test','test') FROM (VALUES(1));
> {code}
> Error: FUNCTION ERROR: ABS does not support operand types (CHAR,CHAR)
> SQL Query null
> {code:xml}
> SELECT LOG('test') FROM (VALUES(1));
> {code}
> Error: SYSTEM ERROR: DrillRuntimeException: Failure while materializing 
> expression in constant expression evaluator LOG('test').  Errors: 
> Error in expression at index -1.  Error: Missing function implementation: 
> castTINYINT(VARCHAR-REQUIRED).  Full expression: UNKNOWN EXPRESSION.
> But if I rerun all these queries after the "DrillRuntimeException", they will run 
> correctly. It seems that Drill has not updated the function signature before 
> that error. Also if I add the jar as a regular UDF (copy the jar to 
> /drill_home/jars/3rdparty and restart drillbits), all queries run 
> correctly without errors.





[jira] [Created] (DRILL-5326) Unexpected/unhandled MinorType value GENERIC_OBJECT

2017-03-07 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-5326:
--

 Summary: Unexpected/unhandled MinorType value GENERIC_OBJECT
 Key: DRILL-5326
 URL: https://issues.apache.org/jira/browse/DRILL-5326
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Affects Versions: 1.10.0
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
Priority: Blocker
 Fix For: 1.10.0


In DRILL-5301 a new SERVER_META RPC call was introduced. The server supports 
this method only from Drill version 1.10.0 onward; for Drill 1.10.0-SNAPSHOT it 
is disabled. 
When I enabled this method (by upgrading the Drill version to 1.10.0 or 
1.11.0-SNAPSHOT) I found the following exception:
{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}
It appears in several tests (for example in 
DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
in DRILL-1126.)

The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name for 
this "GENERIC_OBJECT" RPC-/protobuf-level data type.
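The proposed mapping amounts to adding one more case to the type-name lookup. The sketch below is hypothetical: the real fix lives in Drill's metadata code, and the small enum here only mimics the RPC-level MinorType values involved.

```java
// Hypothetical sketch of mapping an RPC-/protobuf-level MinorType to a
// SQL type name, with the previously unhandled GENERIC_OBJECT case added.
enum MinorType { INT, VARCHAR, GENERIC_OBJECT }

class SqlTypeNames {
  static String toSqlTypeName(MinorType type) {
    switch (type) {
      case INT:            return "INTEGER";
      case VARCHAR:        return "CHARACTER VARYING";
      case GENERIC_OBJECT: return "JAVA_OBJECT";  // the newly handled case
      default:
        // Mirrors the AssertionError seen in the failing tests.
        throw new AssertionError("Unexpected/unhandled MinorType value " + type);
    }
  }
}
```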






[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5326:
---
Summary: Unit tests failures related to the SERVER_METADTA  (was: 
Unexpected/unhandled MinorType value GENERIC_OBJECT)

> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> In DRILL-5301 a new SERVER_META rpc call was introduced. The server will 
> support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT 
> it is disabled. 
> When I enabled this method (by way of upgrading drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in 
> the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was 
> added in DRILL-1126).
> The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.





[jira] [Commented] (DRILL-2293) CTAS does not clean up when it fails

2017-03-07 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900149#comment-15900149
 ] 

Kunal Khatua commented on DRILL-2293:
-

[~rkins], [~khfaraaz] had tested a similar use case for CTTAS. Perhaps a 
similar solution can be ported for this specifically, if it is still unresolved. 

> CTAS does not clean up when it fails
> 
>
> Key: DRILL-2293
> URL: https://issues.apache.org/jira/browse/DRILL-2293
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
> Fix For: Future
>
>
> git.commit.id.abbrev=6676f2d
> Data Set :
> {code}
> {
>   "id" : 1,
>   "map":{"rm": [
> {"mapid":"m1","mapvalue":{"col1":1,"col2":[0,1,2,3,4,5]},"rptd": [{ "a": 
> "foo"},{"b":"boo"}]},
> {"mapid":"m2","mapvalue":{"col1":0,"col2":[]},"rptd": [{ "a": 
> "bar"},{"c":1},{"d":4.5}]}
>   ]}
> }
> {code}
> The below query fails :
> {code}
> create table rep_map as select d.map from `temp.json` d;
> Query failed: Query stopped., index: -4, length: 4 (expected: range(0, 
> 16384)) [ d76e3f74-7e2c-406f-a7fd-5efc68227e75 on qa-node190.qa.lab:31010 ]
> {code}
> However drill created a folder 'rep_map' and the folder contained a broken 
> parquet file. 
> {code}
> create table rep_map as select d.map from `temp.json` d;
> +++
> | ok |  summary   |
> +++
> | false  | Table 'rep_map' already exists. |
> {code}
> Drill should clean up properly in case of a failure.
> I raised a different issue for the actual failure.





[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5326:
---
Description: 
1. In DRILL-5301 a new SERVER_META RPC call was introduced. The server supports 
this method only from Drill version 1.10.0 onward; for Drill 1.10.0-SNAPSHOT it 
is disabled. 
When I enabled this method (by upgrading the Drill version to 1.10.0 or 
1.11.0-SNAPSHOT) I found the following exception:
{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}
It appears in several tests (for example in 
DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
in DRILL-1126.)

The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name for 
this "GENERIC_OBJECT" RPC-/protobuf-level data type.

2. After fixing the first issue, the test mentioned above still fails because of 
an incorrect "NullCollation" value in the "ServerMetaProvider". According to 
the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
default value should be NC_HIGH (NULL is the highest value).


  was:
In DRILL-5301 a new SERVER_META rpc call was introduced. The server will 
support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it 
is disabled. 
When I enabled this method (by way of upgrading drill version to 1.10.0 or 
1.11.0-SNAPSHOT) I found the following exception:
{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}
It appears in several tests (for example in 
DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in 
the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added 
in DRILL-1126).

The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for 
this "GENERIC_OBJECT" RPC-/protobuf-level data type.



> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will 
> support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT 
> it is disabled. 
> When I enabled this method (by way of upgrading drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in 
> the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was 
> added in DRILL-1126).
> The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After the fixing first one the mentioned above test still fails by reason 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default val should be NC_HIGH (NULL is the highest value).





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900155#comment-15900155
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

GitHub user vdiravka opened a pull request:

https://github.com/apache/drill/pull/775

DRILL-5326: Unit tests failures related to the SERVER_METADTA

- adding of the sql type name for the "GENERIC_OBJECT";
- changing "NullCollation" in the "ServerMetaProvider" to the correct 
default value;

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vdiravka/drill DRILL-5326

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/775.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #775


commit e7ca7650fa1bcc32638cfe4aade96aa56406a362
Author: Vitalii Diravka 
Date:   2017-03-07T20:53:03Z

DRILL-5326: Unit tests failures related to the SERVER_METADTA
- adding of the sql type name for the "GENERIC_OBJECT";
- changing "NullCollation" in the "ServerMetaProvider" to the correct 
default value;




> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports 
> this method only from Drill version 1.10.0; for 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the 
> default value should be NC_HIGH (NULL is the highest value).
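The proposed fix for the first issue can be sketched as follows — a minimal, hypothetical mapping (the enum and method names below are illustrative stand-ins, not Drill's actual Types API) showing how the previously unhandled GENERIC_OBJECT minor type would map to the "JAVA_OBJECT" SQL type name instead of hitting the AssertionError:

```java
// Minimal sketch of the proposed fix: map every RPC-level minor type to a
// SQL type name, including the previously unhandled GENERIC_OBJECT.
// The enum and method here are illustrative stand-ins, not Drill's real API.
public class SqlTypeNames {
    enum MinorType { INT, BIGINT, VARCHAR, GENERIC_OBJECT }

    static String getSqlTypeName(MinorType type) {
        switch (type) {
            case INT:            return "INTEGER";
            case BIGINT:         return "BIGINT";
            case VARCHAR:        return "CHARACTER VARYING";
            case GENERIC_OBJECT: return "JAVA_OBJECT"; // the DRILL-5326 fix
            default:
                // Before the fix, GENERIC_OBJECT fell through to here.
                throw new AssertionError("Unexpected/unhandled MinorType value " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(getSqlTypeName(MinorType.GENERIC_OBJECT)); // prints JAVA_OBJECT
    }
}
```

With this mapping in place, the server-metadata conversion list no longer trips the assertion when it encounters a GENERIC_OBJECT column.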



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5326:
---
Description: 
1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
method only from Drill version 1.10.0; for 1.10.0-SNAPSHOT it is disabled. 
When I enabled this method (by upgrading the Drill version to 1.10.0 or 
1.11.0-SNAPSHOT) I found the following exception:
{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}
It appears in several tests (for example in 
DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
in DRILL-1126.)

The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name for 
this "GENERIC_OBJECT" RPC-/protobuf-level data type.

2. After fixing the first issue, the test mentioned above still fails because of 
the incorrect "NullCollation" value in the "ServerMetaProvider". According to 
the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the 
default value should be NC_HIGH (NULL is the highest value).


  was:
1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server will 
support this method only from 1.10.0 drill version. For drill 1.10.0-SNAPHOT it 
is disabled. 
When I enabled this method (by way of upgrading drill version to 1.10.0 or 
1.11.0-SNAPSHOT) I found the following exception:
{code}
java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
{code}
It appears in several tests (for example in 
DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
The reason of it is "GENERIC_OBJECT" RPC-/protobuf-level type is appeared in 
the ServerMetadata#ConvertSupportList. (Supporting of GENERIC_OBJECT was added 
in DRILL-1126).

The proposed solution is to add the appropriate "JAVA_OBJECT" sql type name for 
this "GENERIC_OBJECT" RPC-/protobuf-level data type.

2. After the fixing first one the mentioned above test still fails by reason of 
the incorrect "NullCollation" value in the "ServerMetaProvider". According to 
the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
default val should be NC_HIGH (NULL is the highest value).



> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports 
> this method only from Drill version 1.10.0; for 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the 
> default value should be NC_HIGH (NULL is the highest value).





[jira] [Resolved] (DRILL-4851) TPCDS Query 64 just hang there at "starting" state

2017-03-07 Thread Dechang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dechang Gu resolved DRILL-4851.
---
Resolution: Duplicate

duplicate of DRILL-4347

> TPCDS Query 64 just hang there at "starting" state
> --
>
> Key: DRILL-4851
> URL: https://issues.apache.org/jira/browse/DRILL-4851
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
> Environment: REL 6.0
>Reporter: Dechang Gu
>Assignee: Jinfeng Ni
>
> TPC-DS Query 64 on SF100 (views on parquet files) hung at the "starting" 
> state. The drillbit.log on the foreman node shows the following info:
> 2016-08-17 14:26:00,785 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 284b2996-d25a-d9af-178c-143b32ea6969: WITH cs_ui AS (SELECT cs_item_sk, 
> Sum(cs_ext_list_price) AS sale, Sum(cr_refunded_cash + cr_reversed_charge + 
> cr_store_credit) AS refund FROM   catalog_sales, catalog_returns WHERE  
> cs_item_sk = cr_item_sk AND cs_order_number = cr_order_number GROUP  BY 
> cs_item_sk HAVING Sum(cs_ext_list_price) > 2 * Sum( cr_refunded_cash + 
> cr_reversed_charge + cr_store_credit)), cross_sales AS (SELECT i_product_name 
> product_name, i_item_sk  item_sk, s_store_name   
> store_name, s_zip  store_zip, ad1.ca_street_number   
> b_street_number, ad1.ca_street_name b_streen_name, ad1.ca_city
> b_city, ad1.ca_zip b_zip, ad2.ca_street_number   c_street_number, 
> ad2.ca_street_name c_street_name, ad2.ca_cityc_city, 
> ad2.ca_zip c_zip, d1.d_year  AS syear, d2.d_year  
> AS fsyear, d3.d_year  s2year, Count(*)   cnt, 
> Sum(ss_wholesale_cost) s1, Sum(ss_list_price) s2, Sum(ss_coupon_amt) 
> s3 FROM   store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, 
> date_dim d3, store, customer, customer_demographics 
> cd1, customer_demographics cd2, promotion, household_demographics 
> hd1, household_demographics hd2, customer_address ad1, customer_address ad2, 
> income_band ib1, income_band ib2, item WHERE  ss_store_sk = s_store_sk 
> AND ss_sold_date_sk = d1.d_date_sk AND ss_customer_sk = c_customer_sk 
> AND ss_cdemo_sk = cd1.cd_demo_sk AND ss_hdemo_sk = hd1.hd_demo_sk AND 
> ss_addr_sk = ad1.ca_address_sk AND ss_item_sk = i_item_sk AND ss_item_sk 
> = sr_item_sk AND ss_ticket_number = sr_ticket_number AND ss_item_sk = 
> cs_ui.cs_item_sk AND c_current_cdemo_sk = cd2.cd_demo_sk AND 
> c_current_hdemo_sk = hd2.hd_demo_sk AND c_current_addr_sk = ad2.ca_address_sk 
> AND 
> c_first_sales_date_sk = d2.d_date_sk AND c_first_shipto_date_sk = d3.d_date_sk AND 
> ss_promo_sk = p_promo_sk AND hd1.hd_income_band_sk = ib1.ib_income_band_sk 
> AND hd2.hd_income_band_sk = ib2.ib_income_band_sk AND cd1.cd_marital_status 
> <> cd2.cd_marital_status AND i_color IN ( 'cyan', 'peach', 'blush', 
> 'frosted', 'powder', 'orange' ) AND i_current_price BETWEEN 58 AND 58 + 10 
> AND i_current_price BETWEEN 58 + 1 AND 58 + 15 GROUP  BY i_product_name, 
> i_item_sk, s_store_name, s_zip, ad1.ca_street_number, 
> ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number, 
> ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year) 
> SELECT cs1.product_name, 
> cs1.store_name, cs1.store_zip, cs1.b_street_number, 
> cs1.b_streen_name, cs1.b_city, cs1.b_zip, cs1.c_street_number, 
> cs1.c_street_name, cs1.c_city, cs1.c_zip, cs1.syear, cs1.cnt, cs1.s1, cs1.s2, 
> cs1.s3, cs2.s1, 
> cs2.s2, cs2.s3, cs2.syear, cs2.cnt FROM   cross_sales cs1, cross_sales cs2 
> WHERE  cs1.item_sk = cs2.item_sk AND cs1.syear = 2001 AND cs2.syear = 2001 + 
> 1 AND cs2.cnt <= cs1.cnt AND cs1.store_name = cs2.store_name AND
> cs1.store_zip = cs2.store_zip ORDER  BY cs1.product_name, cs1.store_name, 
> cs2.cnt
> 2016-08-17 14:26:04,045 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Took 1 ms to get file statuses
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Time: 4ms total, 4.753323 
> ms avg, 4ms max.
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Earliest start: 12.261000
> μs, Latest start: 12.261000 μs, Average start: 12.261000 μs .
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Took 6 ms

[jira] [Closed] (DRILL-4851) TPCDS Query 64 just hang there at "starting" state

2017-03-07 Thread Dechang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dechang Gu closed DRILL-4851.
-

Duplicate of DRILL-4347; closing this one.

> TPCDS Query 64 just hang there at "starting" state
> --
>
> Key: DRILL-4851
> URL: https://issues.apache.org/jira/browse/DRILL-4851
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
> Environment: REL 6.0
>Reporter: Dechang Gu
>Assignee: Jinfeng Ni
>
> TPC-DS Query 64 on SF100 (views on parquet files) hung at the "starting" 
> state. The drillbit.log on the foreman node shows the following info:
> 2016-08-17 14:26:00,785 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 284b2996-d25a-d9af-178c-143b32ea6969: WITH cs_ui AS (SELECT cs_item_sk, 
> Sum(cs_ext_list_price) AS sale, Sum(cr_refunded_cash + cr_reversed_charge + 
> cr_store_credit) AS refund FROM   catalog_sales, catalog_returns WHERE  
> cs_item_sk = cr_item_sk AND cs_order_number = cr_order_number GROUP  BY 
> cs_item_sk HAVING Sum(cs_ext_list_price) > 2 * Sum( cr_refunded_cash + 
> cr_reversed_charge + cr_store_credit)), cross_sales AS (SELECT i_product_name 
> product_name, i_item_sk  item_sk, s_store_name   
> store_name, s_zip  store_zip, ad1.ca_street_number   
> b_street_number, ad1.ca_street_name b_streen_name, ad1.ca_city
> b_city, ad1.ca_zip b_zip, ad2.ca_street_number   c_street_number, 
> ad2.ca_street_name c_street_name, ad2.ca_cityc_city, 
> ad2.ca_zip c_zip, d1.d_year  AS syear, d2.d_year  
> AS fsyear, d3.d_year  s2year, Count(*)   cnt, 
> Sum(ss_wholesale_cost) s1, Sum(ss_list_price) s2, Sum(ss_coupon_amt) 
> s3 FROM   store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, 
> date_dim d3, store, customer, customer_demographics 
> cd1, customer_demographics cd2, promotion, household_demographics 
> hd1, household_demographics hd2, customer_address ad1, customer_address ad2, 
> income_band ib1, income_band ib2, item WHERE  ss_store_sk = s_store_sk 
> AND ss_sold_date_sk = d1.d_date_sk AND ss_customer_sk = c_customer_sk 
> AND ss_cdemo_sk = cd1.cd_demo_sk AND ss_hdemo_sk = hd1.hd_demo_sk AND 
> ss_addr_sk = ad1.ca_address_sk AND ss_item_sk = i_item_sk AND ss_item_sk 
> = sr_item_sk AND ss_ticket_number = sr_ticket_number AND ss_item_sk = 
> cs_ui.cs_item_sk AND c_current_cdemo_sk = cd2.cd_demo_sk AND 
> c_current_hdemo_sk = hd2.hd_demo_sk AND c_current_addr_sk = ad2.ca_address_sk 
> AND 
> c_first_sales_date_sk = d2.d_date_sk AND c_first_shipto_date_sk = d3.d_date_sk AND 
> ss_promo_sk = p_promo_sk AND hd1.hd_income_band_sk = ib1.ib_income_band_sk 
> AND hd2.hd_income_band_sk = ib2.ib_income_band_sk AND cd1.cd_marital_status 
> <> cd2.cd_marital_status AND i_color IN ( 'cyan', 'peach', 'blush', 
> 'frosted', 'powder', 'orange' ) AND i_current_price BETWEEN 58 AND 58 + 10 
> AND i_current_price BETWEEN 58 + 1 AND 58 + 15 GROUP  BY i_product_name, 
> i_item_sk, s_store_name, s_zip, ad1.ca_street_number, 
> ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number, 
> ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year) 
> SELECT cs1.product_name, 
> cs1.store_name, cs1.store_zip, cs1.b_street_number, 
> cs1.b_streen_name, cs1.b_city, cs1.b_zip, cs1.c_street_number, 
> cs1.c_street_name, cs1.c_city, cs1.c_zip, cs1.syear, cs1.cnt, cs1.s1, cs1.s2, 
> cs1.s3, cs2.s1, 
> cs2.s2, cs2.s3, cs2.syear, cs2.cnt FROM   cross_sales cs1, cross_sales cs2 
> WHERE  cs1.item_sk = cs2.item_sk AND cs1.syear = 2001 AND cs2.syear = 2001 + 
> 1 AND cs2.cnt <= cs1.cnt AND cs1.store_name = cs2.store_name AND
> cs1.store_zip = cs2.store_zip ORDER  BY cs1.product_name, cs1.store_name, 
> cs2.cnt
> 2016-08-17 14:26:04,045 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Took 1 ms to get file statuses
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Time: 4ms total, 4.753323 
> ms avg, 4ms max.
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Earliest start: 12.261000
> μs, Latest start: 12.261000 μs, Average start: 12.261000 μs .
> 2016-08-17 14:26:04,051 ucs-node4.perf.lab 
> [284b2996-d25a-d9af-178c-143b32ea6969:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - Took 6 ms to read

[jira] [Resolved] (DRILL-4850) TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run

2017-03-07 Thread Dechang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dechang Gu resolved DRILL-4850.
---
   Resolution: Cannot Reproduce
Fix Version/s: 1.10.0

Cannot reproduce in 1.10.0 (gitid: 3bfb497); closing it.

> TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run
> --
>
> Key: DRILL-4850
> URL: https://issues.apache.org/jira/browse/DRILL-4850
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
> Environment: REL 6.0 
>Reporter: Dechang Gu
>Assignee: Parth Chandra
> Fix For: 1.10.0
>
>
> I ran TPC-DS query 33 on the SF100 database three times consecutively. The 
> first run succeeded, but the 2nd and 3rd runs hit the following error:
>  
> 2016-08-16 20:20:52,530 ucs-node6.perf.lab 
> [284c27f1-ee13-dfd0-6cbb-37f49810e93f:frag:3:9] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
> Failure while reading vector.  Expected vector class of 
> org.apache.drill.exec.vector.NullableIntVector but was holding vector class 
> org.apache.drill.exec.vector.NullableBigIntVector, field= 
> i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> Fragment 3:9
> [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Failure while reading vector.  Expected vector class 
> of org.apache.drill.exec.vector.NullableIntVector but was holding vector 
> class org.apache.drill.exec.vector.NullableBigIntVector, field= 
> i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> Fragment 3:9
> [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: java.lang.IllegalStateException: Failure while reading vector.  
> Expected vector class of org.apache.drill.exec.vector.NullableIntVector but 
> was holding vector class org.apache.drill.exec.vector.NullableBigIntVector, 
> field= i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> at 
> org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:290)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:178)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:135)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.doSetup(PartitionerTemplate.java:64)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:358)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655.flushOutgoingBatches(PartitionerTemplate.java:163)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$FlushBatchesHandlingClass.execute(PartitionerDecorator.java:266)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.executeMethodLogic(PartitionerDecorator.java:138)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.flushOutgoingBatches(PartitionerDecorator.java:82)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:
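The IllegalStateException above comes from a defensive class check when a vector is looked up by id: the container holds a NullableBigIntVector while the caller's generated code expects a NullableIntVector (i.e., the column's type changed between runs). The guard can be sketched generically — the class and method names below are hypothetical stand-ins, not Drill's actual VectorContainer API:

```java
import java.util.HashMap;
import java.util.Map;

public class VectorContainerSketch {
    // Stand-ins for Drill's value-vector classes (hypothetical).
    static class ValueVector {}
    static class NullableIntVector extends ValueVector {}
    static class NullableBigIntVector extends ValueVector {}

    private final Map<Integer, ValueVector> vectors = new HashMap<>();

    void add(int fieldId, ValueVector v) { vectors.put(fieldId, v); }

    // Mirrors the failing check: the caller states which vector class its
    // generated code was compiled against; a mismatch (e.g. after a schema
    // change from INT to BIGINT) fails fast instead of corrupting reads.
    <T extends ValueVector> T getValueVectorById(int fieldId, Class<T> expected) {
        ValueVector held = vectors.get(fieldId);
        if (held == null || !expected.isInstance(held)) {
            throw new IllegalStateException("Failure while reading vector. Expected vector class of "
                + expected.getName() + " but was holding vector class "
                + (held == null ? "null" : held.getClass().getName()));
        }
        return expected.cast(held);
    }

    public static void main(String[] args) {
        VectorContainerSketch c = new VectorContainerSketch();
        c.add(7, new NullableBigIntVector());
        try {
            c.getValueVectorById(7, NullableIntVector.class); // mismatched expectation
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

This is why the error appears only on later runs: the first run establishes one schema in the generated partitioner code, and a subsequent run supplies the other vector type for the same field.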

[jira] [Closed] (DRILL-4850) TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run

2017-03-07 Thread Dechang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dechang Gu closed DRILL-4850.
-

Cannot reproduce in 1.10.0, so closing it.

> TPCDS Query 33 failed in the second and 3rd runs, but succeeded in the 1st run
> --
>
> Key: DRILL-4850
> URL: https://issues.apache.org/jira/browse/DRILL-4850
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
> Environment: REL 6.0 
>Reporter: Dechang Gu
>Assignee: Parth Chandra
> Fix For: 1.10.0
>
>
> I ran TPC-DS query 33 on the SF100 database three times consecutively. The 
> first run succeeded, but the 2nd and 3rd runs hit the following error:
>  
> 2016-08-16 20:20:52,530 ucs-node6.perf.lab 
> [284c27f1-ee13-dfd0-6cbb-37f49810e93f:frag:3:9] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
> Failure while reading vector.  Expected vector class of 
> org.apache.drill.exec.vector.NullableIntVector but was holding vector class 
> org.apache.drill.exec.vector.NullableBigIntVector, field= 
> i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> Fragment 3:9
> [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Failure while reading vector.  Expected vector class 
> of org.apache.drill.exec.vector.NullableIntVector but was holding vector 
> class org.apache.drill.exec.vector.NullableBigIntVector, field= 
> i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> Fragment 3:9
> [Error Id: 7fc06ab9-6c63-402b-a1b4-465526aa7dc7 on ucs-node6.perf.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:262)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: java.lang.IllegalStateException: Failure while reading vector.  
> Expected vector class of org.apache.drill.exec.vector.NullableIntVector but 
> was holding vector class org.apache.drill.exec.vector.NullableBigIntVector, 
> field= i_manufact_id(BIGINT:OPTIONAL)[$bits$(UINT1:REQUIRED), 
> i_manufact_id(BIGINT:OPTIONAL)]
> at 
> org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:290)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:178)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById(UnorderedReceiverBatch.java:135)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.doSetup(PartitionerTemplate.java:64)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:358)
>  ~[na:na]
> at 
> org.apache.drill.exec.test.generated.PartitionerGen36655.flushOutgoingBatches(PartitionerTemplate.java:163)
>  ~[na:na]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$FlushBatchesHandlingClass.execute(PartitionerDecorator.java:266)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.executeMethodLogic(PartitionerDecorator.java:138)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator.flushOutgoingBatches(PartitionerDecorator.java:82)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.Pa

[jira] [Commented] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release

2017-03-07 Thread Dechang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900247#comment-15900247
 ] 

Dechang Gu commented on DRILL-4347:
---

Checked with the current 1.10.0 master (gitid 3dfb497); planning takes more 
than 4 minutes:
DURATION: 05 min 54.007 sec
PLANNING: 04 min 12.826 sec
EXECUTION: 01 min 41.181 sec

So someone needs to chase this issue further.

> Planning time for query64 from TPCDS test suite has increased 10 times 
> compared to 1.4 release
> --
>
> Key: DRILL-4347
> URL: https://issues.apache.org/jira/browse/DRILL-4347
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.5.0
>Reporter: Victoria Markman
>Assignee: Aman Sinha
> Fix For: Future
>
> Attachments: 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0, 
> 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0, drill4347_jstack.txt
>
>
> mapr-drill-1.5.0.201602012001-1.noarch.rpm
> {code}
> 0: jdbc:drill:schema=dfs> WITH cs_ui
> . . . . . . . . . . . . >  AS (SELECT cs_item_sk,
> . . . . . . . . . . . . > Sum(cs_ext_list_price) AS sale,
> . . . . . . . . . . . . > Sum(cr_refunded_cash + 
> cr_reversed_charge
> . . . . . . . . . . . . > + cr_store_credit) AS refund
> . . . . . . . . . . . . >  FROM   catalog_sales,
> . . . . . . . . . . . . > catalog_returns
> . . . . . . . . . . . . >  WHERE  cs_item_sk = cr_item_sk
> . . . . . . . . . . . . > AND cs_order_number = 
> cr_order_number
> . . . . . . . . . . . . >  GROUP  BY cs_item_sk
> . . . . . . . . . . . . >  HAVING Sum(cs_ext_list_price) > 2 * Sum(
> . . . . . . . . . . . . > cr_refunded_cash + 
> cr_reversed_charge
> . . . . . . . . . . . . > + cr_store_credit)),
> . . . . . . . . . . . . >  cross_sales
> . . . . . . . . . . . . >  AS (SELECT i_product_name product_name,
> . . . . . . . . . . . . > i_item_sk  item_sk,
> . . . . . . . . . . . . > s_store_name   store_name,
> . . . . . . . . . . . . > s_zip  store_zip,
> . . . . . . . . . . . . > ad1.ca_street_number   
> b_street_number,
> . . . . . . . . . . . . > ad1.ca_street_name 
> b_streen_name,
> . . . . . . . . . . . . > ad1.ca_cityb_city,
> . . . . . . . . . . . . > ad1.ca_zip b_zip,
> . . . . . . . . . . . . > ad2.ca_street_number   
> c_street_number,
> . . . . . . . . . . . . > ad2.ca_street_name 
> c_street_name,
> . . . . . . . . . . . . > ad2.ca_cityc_city,
> . . . . . . . . . . . . > ad2.ca_zip c_zip,
> . . . . . . . . . . . . > d1.d_year  AS syear,
> . . . . . . . . . . . . > d2.d_year  AS fsyear,
> . . . . . . . . . . . . > d3.d_year  s2year,
> . . . . . . . . . . . . > Count(*)   cnt,
> . . . . . . . . . . . . > Sum(ss_wholesale_cost) s1,
> . . . . . . . . . . . . > Sum(ss_list_price) s2,
> . . . . . . . . . . . . > Sum(ss_coupon_amt) s3
> . . . . . . . . . . . . >  FROM   store_sales,
> . . . . . . . . . . . . > store_returns,
> . . . . . . . . . . . . > cs_ui,
> . . . . . . . . . . . . > date_dim d1,
> . . . . . . . . . . . . > date_dim d2,
> . . . . . . . . . . . . > date_dim d3,
> . . . . . . . . . . . . > store,
> . . . . . . . . . . . . > customer,
> . . . . . . . . . . . . > customer_demographics cd1,
> . . . . . . . . . . . . > customer_demographics cd2,
> . . . . . . . . . . . . > promotion,
> . . . . . . . . . . . . > household_demographics hd1,
> . . . . . . . . . . . . > household_demographics hd2,
> . . . . . . . . . . . . > customer_address ad1,
> . . . . . . . . . . . . > customer_address ad2,
> . . . . . . . . . . . . > income_band ib1,
> . . . . . . . . . . . . > income_band ib2,
> . . . . . . . . . . . . > item
> . . . . . . . . . . . . >  WHERE  ss_store_sk = s_store_sk
> . . . . . . . . . . . . > AND ss_sold_date_sk = d1.d_date_sk
> . . . . . . . . . . . . > AND ss_customer_sk = c_customer_sk
> . . . . . . . . . . . . >   

[jira] [Updated] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-5326:

Reviewer: Jinfeng Ni

Assigned Reviewer to [~jni]

> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports 
> this method only from Drill version 1.10.0; for 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the 
> default value should be NC_HIGH (NULL is the highest value).





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900343#comment-15900343
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104802660
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

I'm not completely sure why we should change from NC_AT_END to NC_HIGH. I 
thought Drill uses ASC as the default ordering, and NULLS LAST as the default 
null collation for ASC. This is consistent with what Oracle [1] and Postgres 
[2] do: ASC / NULLS LAST is the default option.  

1. http://docs.oracle.com/javadb/10.6.2.1/ref/rrefsqlj13658.html
2. https://www.postgresql.org/docs/9.4/static/queries-order.html
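The distinction being debated here can be shown concretely: NC_AT_END places nulls last regardless of sort direction, while NC_HIGH treats NULL as the largest value, so nulls land last for ASC but first for DESC. A small, self-contained Java sketch of the two behaviors using plain comparators (the boolean flags and method are illustrative, not Drill code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NullCollationDemo {

    // NC_AT_END: nulls are always placed last, regardless of direction.
    // NC_HIGH: NULL compares as the highest value, so it sorts last for
    // ASC but first for DESC. (Names borrowed from the discussion; this is
    // a plain-Java illustration, not Drill's NullCollation enum.)
    static List<Integer> sort(List<Integer> in, boolean desc, boolean nullHigh) {
        Comparator<Integer> base = Comparator.naturalOrder();
        if (desc) base = base.reversed();
        Comparator<Integer> cmp = nullHigh
                ? (desc ? Comparator.nullsFirst(base) : Comparator.nullsLast(base)) // NC_HIGH
                : Comparator.nullsLast(base);                                       // NC_AT_END
        List<Integer> out = new ArrayList<>(in);
        out.sort(cmp);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, null, 1);
        System.out.println(sort(data, false, true));  // ASC,  NC_HIGH   -> [1, 3, null]
        System.out.println(sort(data, true, true));   // DESC, NC_HIGH   -> [null, 3, 1]
        System.out.println(sort(data, true, false));  // DESC, NC_AT_END -> [3, 1, null]
    }
}
```

Note that the two settings only differ for descending sorts, which is why tests such as testNullsAreSortedMethodsSaySortedHigh can distinguish them.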



> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports 
> this method only from Drill version 1.10.0; for 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes], the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900354#comment-15900354
 ] 

Jinfeng Ni commented on DRILL-5326:
---

[~laurentgo], can you please take a look at this PR as well? It's related to 
the change you made in DRILL-5301, and it's a blocking issue for Drill 1.10.0 
RC0. Thanks. 


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
> method only from Drill version 1.10.0; for Drill 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
> in DRILL-1126.)
> The proposed solution is to add the corresponding "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the above-mentioned test still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception

2017-03-07 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5327:
-

 Summary: Hash aggregate can return empty batch which can cause 
schema change exception
 Key: DRILL-5327
 URL: https://issues.apache.org/jira/browse/DRILL-5327
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.10.0
Reporter: Chun Chang
Priority: Blocker


Hash aggregate can return empty batches, which cause Drill to throw a schema 
change exception (it does not handle this type of schema change). This is not a 
new bug, but a recent hash function change (a theoretically correct change) may 
have increased the chance of hitting this issue. I don't have scientific data 
to support my claim (in fact I don't believe it's the case), but a regression 
run that used to pass now fails due to this bug. My concern is that existing 
Drill users may have queries that used to work but fail now; it will be 
difficult to explain why the new release is better for them. I filed this bug 
as a blocker so we can discuss it before releasing 1.10.

{noformat}
/root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql
Query: 
-- start query 66 in stream 0 using template query66.tpl 
SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   ship_carriers, 
   year1,
   Sum(jan_sales) AS jan_sales, 
   Sum(feb_sales) AS feb_sales, 
   Sum(mar_sales) AS mar_sales, 
   Sum(apr_sales) AS apr_sales, 
   Sum(may_sales) AS may_sales, 
   Sum(jun_sales) AS jun_sales, 
   Sum(jul_sales) AS jul_sales, 
   Sum(aug_sales) AS aug_sales, 
   Sum(sep_sales) AS sep_sales, 
   Sum(oct_sales) AS oct_sales, 
   Sum(nov_sales) AS nov_sales, 
   Sum(dec_sales) AS dec_sales, 
   Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
   Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
   Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
   Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
   Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
   Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
   Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
   Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
   Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
   Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
   Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
   Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
   Sum(jan_net)   AS jan_net, 
   Sum(feb_net)   AS feb_net, 
   Sum(mar_net)   AS mar_net, 
   Sum(apr_net)   AS apr_net, 
   Sum(may_net)   AS may_net, 
   Sum(jun_net)   AS jun_net, 
   Sum(jul_net)   AS jul_net, 
   Sum(aug_net)   AS aug_net, 
   Sum(sep_net)   AS sep_net, 
   Sum(oct_net)   AS oct_net, 
   Sum(nov_net)   AS nov_net, 
   Sum(dec_net)   AS dec_net 
FROM   (SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   'ZOUROS' 
   || ',' 
   || 'ZHOU' AS ship_carriers, 
   d_year AS year1, 
   Sum(CASE 
 WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS jan_sales, 
   Sum(CASE 
 WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS feb_sales, 
   Sum(CASE 
 WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS mar_sales, 
   Sum(CASE 
 WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS apr_sales, 
  

[jira] [Commented] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception

2017-03-07 Thread Chun Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900357#comment-15900357
 ] 

Chun Chang commented on DRILL-5327:
---

This is related to DRILL-3991

> Hash aggregate can return empty batch which can cause schema change exception
> -
>
> Key: DRILL-5327
> URL: https://issues.apache.org/jira/browse/DRILL-5327
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Priority: Blocker
>
> Hash aggregate can return empty batches, which cause Drill to throw a schema 
> change exception (it does not handle this type of schema change). This is not 
> a new bug, but a recent hash function change (a theoretically correct change) 
> may have increased the chance of hitting this issue. I don't have scientific 
> data to support my claim (in fact I don't believe it's the case), but a 
> regression run that used to pass now fails due to this bug. My concern is that 
> existing Drill users may have queries that used to work but fail now; it will 
> be difficult to explain why the new release is better for them. I filed this 
> bug as a blocker so we can discuss it before releasing 1.10.
> {noformat}
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql
> Query: 
> -- start query 66 in stream 0 using template query66.tpl 
> SELECT w_warehouse_name, 
>w_warehouse_sq_ft, 
>w_city, 
>w_county, 
>w_state, 
>w_country, 
>ship_carriers, 
>year1,
>Sum(jan_sales) AS jan_sales, 
>Sum(feb_sales) AS feb_sales, 
>Sum(mar_sales) AS mar_sales, 
>Sum(apr_sales) AS apr_sales, 
>Sum(may_sales) AS may_sales, 
>Sum(jun_sales) AS jun_sales, 
>Sum(jul_sales) AS jul_sales, 
>Sum(aug_sales) AS aug_sales, 
>Sum(sep_sales) AS sep_sales, 
>Sum(oct_sales) AS oct_sales, 
>Sum(nov_sales) AS nov_sales, 
>Sum(dec_sales) AS dec_sales, 
>Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
>Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
>Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
>Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
>Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
>Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
>Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
>Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
>Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
>Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
>Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
>Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
>Sum(jan_net)   AS jan_net, 
>Sum(feb_net)   AS feb_net, 
>Sum(mar_net)   AS mar_net, 
>Sum(apr_net)   AS apr_net, 
>Sum(may_net)   AS may_net, 
>Sum(jun_net)   AS jun_net, 
>Sum(jul_net)   AS jul_net, 
>Sum(aug_net)   AS aug_net, 
>Sum(sep_net)   AS sep_net, 
>Sum(oct_net)   AS oct_net, 
>Sum(nov_net)   AS nov_net, 
>Sum(dec_net)   AS dec_net 
> FROM   (SELECT w_warehouse_name, 
>w_warehouse_sq_ft, 
>w_city, 
>w_county, 
>w_state, 
>w_country, 
>'ZOUROS' 
>|| ',' 
>|| 'ZHOU' AS ship_carriers, 
>d_year AS year1, 
>Sum(CASE 
>  WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity 
>  ELSE 0 
>END)  AS jan_sales, 
>Sum(CASE 
>  WHEN d_moy = 2 THEN ws_ext_sales_price * ws_

[jira] [Assigned] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception

2017-03-07 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-5327:
---

Assignee: Boaz Ben-Zvi

[~ben-zvi] - can you confirm that this is due to your changes for DRILL-5293?

> Hash aggregate can return empty batch which can cause schema change exception
> -
>
> Key: DRILL-5327
> URL: https://issues.apache.org/jira/browse/DRILL-5327
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
>
> Hash aggregate can return empty batches, which cause Drill to throw a schema 
> change exception (it does not handle this type of schema change). This is not 
> a new bug, but a recent hash function change (a theoretically correct change) 
> may have increased the chance of hitting this issue. I don't have scientific 
> data to support my claim (in fact I don't believe it's the case), but a 
> regression run that used to pass now fails due to this bug. My concern is that 
> existing Drill users may have queries that used to work but fail now; it will 
> be difficult to explain why the new release is better for them. I filed this 
> bug as a blocker so we can discuss it before releasing 1.10.
> {noformat}
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql
> Query: 
> -- start query 66 in stream 0 using template query66.tpl 
> SELECT w_warehouse_name, 
>w_warehouse_sq_ft, 
>w_city, 
>w_county, 
>w_state, 
>w_country, 
>ship_carriers, 
>year1,
>Sum(jan_sales) AS jan_sales, 
>Sum(feb_sales) AS feb_sales, 
>Sum(mar_sales) AS mar_sales, 
>Sum(apr_sales) AS apr_sales, 
>Sum(may_sales) AS may_sales, 
>Sum(jun_sales) AS jun_sales, 
>Sum(jul_sales) AS jul_sales, 
>Sum(aug_sales) AS aug_sales, 
>Sum(sep_sales) AS sep_sales, 
>Sum(oct_sales) AS oct_sales, 
>Sum(nov_sales) AS nov_sales, 
>Sum(dec_sales) AS dec_sales, 
>Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
>Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
>Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
>Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
>Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
>Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
>Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
>Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
>Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
>Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
>Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
>Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
>Sum(jan_net)   AS jan_net, 
>Sum(feb_net)   AS feb_net, 
>Sum(mar_net)   AS mar_net, 
>Sum(apr_net)   AS apr_net, 
>Sum(may_net)   AS may_net, 
>Sum(jun_net)   AS jun_net, 
>Sum(jul_net)   AS jul_net, 
>Sum(aug_net)   AS aug_net, 
>Sum(sep_net)   AS sep_net, 
>Sum(oct_net)   AS oct_net, 
>Sum(nov_net)   AS nov_net, 
>Sum(dec_net)   AS dec_net 
> FROM   (SELECT w_warehouse_name, 
>w_warehouse_sq_ft, 
>w_city, 
>w_county, 
>w_state, 
>w_country, 
>'ZOUROS' 
>|| ',' 
>|| 'ZHOU' AS ship_carriers, 
>d_year AS year1, 
>Sum(CASE 
>  WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity 
>  ELSE 0 
>END)  AS jan_sales, 
>Sum(CASE 
> 

[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900366#comment-15900366
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user zfong commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104805550
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

@jinfengni  - See @vdiravka's comments in the Jira.  The Drill 
documentation at https://drill.apache.org/docs/order-by-clause/#usage-notes 
says NULLs sort highest.  If the doc is wrong, then we should fix the doc. 


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
> method only from Drill version 1.10.0; for Drill 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
> in DRILL-1126.)
> The proposed solution is to add the corresponding "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the above-mentioned test still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900378#comment-15900378
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104807540
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

My understanding is that NULL collation should be specified together with 
ASC/DESC. ASC/NULLS LAST as the default option essentially implies NC_HIGH. 
However, I'm not sure whether we should specify NC_HIGH alone, or the 
ASC/NC_AT_END combination.   


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
> method only from Drill version 1.10.0; for Drill 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
> in DRILL-1126.)
> The proposed solution is to add the corresponding "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the above-mentioned test still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900399#comment-15900399
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104809964
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

When sorting in Drill, the detailed sort spec is set in the 
{{ExternalSort}} operator definition. This thing is a bit complex. One can 
control sort order (ASC, DESC) and nulls position (LOW, HIGH, UNSPECIFIED).

Data sorts according to ASC, DESC.
Nulls sort as follows:

HIGH: last if ASC, first if DESC
LOW: first if ASC, last if DESC
UNSPECIFIED: always high

If the planner has no way of setting the nulls ordering from a SQL query, 
then the value is UNSPECIFIED, which means nulls always sort last as Jinfeng 
explained.
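
The rules above can be sketched as a small comparator factory; this is a 
hypothetical helper mirroring the description, not Drill's actual sort 
implementation:

```java
import java.util.Comparator;

public class SortSpecDemo {
    enum NullsPos { LOW, HIGH, UNSPECIFIED }

    // Hypothetical helper encoding the rules above (not Drill's actual API):
    //   HIGH        -> nulls last if ASC, first if DESC
    //   LOW         -> nulls first if ASC, last if DESC
    //   UNSPECIFIED -> treated as HIGH
    static Comparator<Integer> comparator(boolean asc, NullsPos nulls) {
        Comparator<Integer> base =
                asc ? Comparator.<Integer>naturalOrder() : Comparator.<Integer>reverseOrder();
        boolean nullsHigh = (nulls != NullsPos.LOW);  // UNSPECIFIED acts as HIGH
        boolean nullsLast = (nullsHigh == asc);       // position depends on direction
        return nullsLast ? Comparator.nullsLast(base) : Comparator.nullsFirst(base);
    }
}
```

With this encoding, "nulls high" simply means NULL compares greater than any 
value, and its first/last position falls out of the ASC/DESC direction.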


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
> method only from Drill version 1.10.0; for Drill 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
> in DRILL-1126.)
> The proposed solution is to add the corresponding "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the above-mentioned test still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5328) Trim down physical plan size - replace StoragePluginConfig with storage name

2017-03-07 Thread Chunhui Shi (JIRA)
Chunhui Shi created DRILL-5328:
--

 Summary: Trim down physical plan size - replace 
StoragePluginConfig with storage name
 Key: DRILL-5328
 URL: https://issues.apache.org/jira/browse/DRILL-5328
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Chunhui Shi


For a physical plan, we currently pass the StoragePluginConfig as part of the 
plan; the destination then uses the config to fetch the storage plugin from the 
StoragePluginRegistry. However, we can also fetch a storage plugin by its name, 
which is identical across all Drillbits. 

In the example of a simple 150-line physical plan shown below, the storage 
plugin config takes 60 lines. In a typical large system, a FileSystem 
StoragePluginConfig could be >500 lines, so this improvement should reduce the 
cost of passing large physical plans among nodes.

0: jdbc:drill:zk=10.10.88.126:5181> explain plan for select * from 
dfs.tmp.employee1 where last_name='Blumberg';
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(*=[$0])
00-02Project(T1¦¦*=[$0])
00-03  SelectionVectorRemover
00-04Filter(condition=[=($1, 'Blumberg')])
00-05  Project(T1¦¦*=[$0], last_name=[$1])
00-06Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=/tmp/employee1/0_0_0.parquet]], 
selectionRoot=/tmp/employee1, numFiles=1, usedMetadataFile=true, 
cacheFileRoot=/tmp/employee1, columns=[`*`]]])
 | {
  "head" : {
"version" : 1,
"generator" : {
  "type" : "ExplainHandler",
  "info" : ""
},
"type" : "APACHE_DRILL_PHYSICAL",
"options" : [ ],
"queue" : 0,
"resultMode" : "EXEC"
  },
  "graph" : [ {
"pop" : "parquet-scan",
"@id" : 6,
"userName" : "root",
"entries" : [ {
  "path" : "/tmp/employee1/0_0_0.parquet"
} ],
"storage" : {
  "type" : "file",
  "enabled" : true,
  "connection" : "maprfs:///",
  "config" : null,
  "workspaces" : {
"root" : {
  "location" : "/",
  "writable" : false,
  "defaultInputFormat" : null
},
"tmp" : {
  "location" : "/tmp",
  "writable" : true,
  "defaultInputFormat" : null
},
"shi" : {
  "location" : "/user/shi",
  "writable" : true,
  "defaultInputFormat" : null
},
"dir700" : {
  "location" : "/user/shi/dir700",
  "writable" : true,
  "defaultInputFormat" : null
},
"dir775" : {
  "location" : "/user/shi/dir775",
  "writable" : true,
  "defaultInputFormat" : null
},
"xyz" : {
  "location" : "/user/xyz",
  "writable" : true,
  "defaultInputFormat" : null
}
  },
  "formats" : {
"psv" : {
  "type" : "text",
  "extensions" : [ "tbl" ],
  "delimiter" : "|"
},
"csv" : {
  "type" : "text",
  "extensions" : [ "csv" ],
  "delimiter" : ","
},
"tsv" : {
  "type" : "text",
  "extensions" : [ "tsv" ],
  "delimiter" : "\t"
},
"parquet" : {
  "type" : "parquet"
},
"json" : {
  "type" : "json",
  "extensions" : [ "json" ]
},
"maprdb" : {
  "type" : "maprdb"
}
  }
},
"format" : {
  "type" : "parquet"
},
"columns" : [ "`*`" ],
"selectionRoot" : "/tmp/employee1",
"filter" : "true",
"fileSet" : [ "/tmp/employee1/0_0_0.parquet" ],
"files" : [ "/tmp/employee1/0_0_0.parquet" ],
"cost" : 1155.0
  }, {
"pop" : "project",
"@id" : 5,
"exprs" : [ {
  "ref" : "`T1¦¦*`",
  "expr" : "`*`"
}, {
  "ref" : "`last_name`",
  "expr" : "`last_name`"
} ],
"child" : 6,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 1155.0
  }, {
"pop" : "filter",
"@id" : 4,
"child" : 5,
"expr" : "equal(`last_name`, 'Blumberg') ",
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 173.25
  }, {
"pop" : "selection-vector-remover",
"@id" : 3,
"child" : 4,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 173.25
  }, {
"pop" : "project",
"@id" : 2,
"exprs" : [ {
  "ref" : "`T1¦¦*`",
  "expr" : "`T1¦¦*`"
} ],
"child" : 3,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 173.25
  }, {
"pop" : "project",
"@id" : 1,
"exprs" : [ {
  "ref" : "`*`",
  "expr" : "`T1¦¦*`"
} ],
"child" : 2,
"initialAllocation" : 100,
"maxAllocation" : 100,
"cost" : 173.25
  }, {
"pop" : "screen",
"@id" : 0,
"child" : 1,
"initialAllocation" : 100,
"maxAllocation" : 100,
"co

[jira] [Commented] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception

2017-03-07 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900405#comment-15900405
 ] 

Boaz Ben-Zvi commented on DRILL-5327:
-

DRILL-5293 only exposed this existing bug (e.g., it was seen before when 
MapR-DB storage was used).
The underlying cause is a hardcoded decision to mark the schema of a 
schema-less empty batch as INT, which conflicts with the existing VARCHAR 
schema (probably `w_warehouse_name`).
Two rows/records were distributed to one batch, and none to the second, which 
was thus empty. With a different hashing, the two rows were split between the 
two batches, hence neither was empty.
Another familiar symptom: this bug is intermittent, reflecting the race 
between the batches. When the empty batch arrives first (at the Hash Aggr), 
there is no schema change as the second arrives, because INT can be changed 
into VARCHAR. 

Also, the relation to DRILL-3991 is highly speculative; that Jira has to do 
with coping with an actual schema change. There is some slight chance that such 
coping could overcome the empty batch problem, though it is not likely (e.g., 
once the schema is set to VARCHAR and an empty batch then arrives with our INT 
default, can we force INT upon all those VARCHARs?).
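
The failure mode described above can be sketched as a minimal, hypothetical 
model, assuming a hardcoded INT placeholder for schema-less empty batches; 
class and type names below are illustrative, not Drill's actual code:

```java
public class EmptyBatchSchemaDemo {
    enum MinorType { INT, VARCHAR }

    // Assumption from the discussion: a schema-less empty batch is hardcoded
    // to carry an INT column.
    static final MinorType EMPTY_BATCH_DEFAULT = MinorType.INT;

    // true when the receiver can reconcile the incoming batch's type with the
    // schema it has already established.
    static boolean compatible(MinorType established, MinorType incoming) {
        if (established == incoming) return true;
        // The placeholder INT of an empty batch can be widened to VARCHAR when
        // the empty batch arrives first, but an established VARCHAR schema
        // cannot be forced back to INT.
        return established == MinorType.INT && incoming == MinorType.VARCHAR;
    }

    public static void main(String[] args) {
        // Empty batch first: INT -> VARCHAR reconciles fine.
        System.out.println(compatible(EMPTY_BATCH_DEFAULT, MinorType.VARCHAR)); // true
        // Data batch first: VARCHAR -> INT triggers the schema change exception.
        System.out.println(compatible(MinorType.VARCHAR, EMPTY_BATCH_DEFAULT)); // false
    }
}
```

This makes the intermittency concrete: which of the two orderings occurs 
depends on the race between the batches.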


> Hash aggregate can return empty batch which can cause schema change exception
> -
>
> Key: DRILL-5327
> URL: https://issues.apache.org/jira/browse/DRILL-5327
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
>
> Hash aggregate can return empty batches, which cause Drill to throw a schema 
> change exception (it does not handle this type of schema change). This is not 
> a new bug, but a recent hash function change (a theoretically correct change) 
> may have increased the chance of hitting this issue. I don't have scientific 
> data to support my claim (in fact I don't believe it's the case), but a 
> regression run that used to pass now fails due to this bug. My concern is that 
> existing Drill users may have queries that used to work but fail now; it will 
> be difficult to explain why the new release is better for them. I filed this 
> bug as a blocker so we can discuss it before releasing 1.10.
> {noformat}
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql
> Query: 
> -- start query 66 in stream 0 using template query66.tpl 
> SELECT w_warehouse_name, 
>w_warehouse_sq_ft, 
>w_city, 
>w_county, 
>w_state, 
>w_country, 
>ship_carriers, 
>year1,
>Sum(jan_sales) AS jan_sales, 
>Sum(feb_sales) AS feb_sales, 
>Sum(mar_sales) AS mar_sales, 
>Sum(apr_sales) AS apr_sales, 
>Sum(may_sales) AS may_sales, 
>Sum(jun_sales) AS jun_sales, 
>Sum(jul_sales) AS jul_sales, 
>Sum(aug_sales) AS aug_sales, 
>Sum(sep_sales) AS sep_sales, 
>Sum(oct_sales) AS oct_sales, 
>Sum(nov_sales) AS nov_sales, 
>Sum(dec_sales) AS dec_sales, 
>Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
>Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
>Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
>Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
>Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
>Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
>Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
>Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
>Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
>Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
>Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
>Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
>Sum(jan_net)   AS jan_net, 
>Sum(feb_net)   AS feb_net, 
>Sum(mar_net)   AS mar_net, 
>Sum(apr_net)   

[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900411#comment-15900411
 ] 

Vitalii Diravka commented on DRILL-5326:


An example of row ordering with nulls:
{code}
0: jdbc:drill:zk=local> select a from 
dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet`
 limit 5;
++
|   a|
++
| null   |
| null   |
| 67985  |
| null   |
| null   |
++
5 rows selected (0.153 seconds)
0: jdbc:drill:zk=local> select a from 
dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet`
 order by `a` limit 5;
+--+
|  a   |
+--+
| 42   |
| 50   |
| 95   |
| 116  |
| 116  |
+--+
5 rows selected (0.248 seconds)
0: jdbc:drill:zk=local> select a from 
dfs.`/home/vitalii/IdeaProjects/drill/exec/java-exec/src/test/resources/parquet/data.snappy.parquet`
 order by `a` DESC limit 5;
+---+
|   a   |
+---+
| null  |
| null  |
| null  |
| null  |
| null  |
+---+
5 rows selected (0.247 seconds)
{code}

> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. DRILL-5301 introduced a new SERVER_META rpc call. The server supports this 
> method only from Drill version 1.10.0; for Drill 1.10.0-SNAPSHOT it is 
> disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was added 
> in DRILL-1126.)
> The proposed solution is to add the corresponding "JAVA_OBJECT" sql type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the above-mentioned test still fails because 
> of the incorrect "NullCollation" value in the "ServerMetaProvider". According 
> to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5328) Trim down physical plan size - replace StoragePluginConfig with storage name

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900415#comment-15900415
 ] 

Paul Rogers commented on DRILL-5328:


Careful not to introduce race conditions.

* Submit plan onto Drillbit A, with one storage plugin config.
* At the same time, alter the storage plugin on Drillbit B.

The query is planned, and will execute, on Drillbit A based on the old config. 
The query will execute on Drillbit B with the new config.

Eventually, Drillbit A will learn of the changed config, but it takes time. 
This creates a race condition between the query submission and the plugin 
updates.

This issue is very similar to the issue that arises in Dynamic UDFs. We've had 
trouble getting the design right. We plan to move to an MVCC model to finally 
resolve all the race conditions.

MVCC could be used here as well. Store storage plugin configs as versions. Now, 
the above race condition is resolved:

* Query is planned on Drillbit A using version 17 of, say, "dfs."
* User modifies plugins on Drillbit B to create version 18.
* When the query executes on Drillbit B, it uses (dfs, 17), and so uses the 
same version of the information with which it was planned.
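The MVCC approach described above can be sketched as a version-keyed config store: updates create new versions instead of mutating in place, so a plan built against (name, version) resolves the identical config on every Drillbit. All class and method names below are illustrative, not Drill's actual API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an MVCC-style storage plugin config store.
class VersionedConfigStore {
    private final Map<String, Map<Long, String>> configs = new ConcurrentHashMap<>();
    private final Map<String, AtomicLong> versions = new ConcurrentHashMap<>();

    /** Store a new version of the named config; returns the new version number. */
    long update(String name, String config) {
        long v = versions.computeIfAbsent(name, n -> new AtomicLong()).incrementAndGet();
        configs.computeIfAbsent(name, n -> new ConcurrentHashMap<>()).put(v, config);
        return v;
    }

    /** Fetch the exact version the plan was built against, or null if unknown. */
    String get(String name, long version) {
        Map<Long, String> byVersion = configs.get(name);
        return byVersion == null ? null : byVersion.get(version);
    }
}
```

With this, the race above resolves itself: the plan records ("dfs", 17); even after a concurrent update creates version 18, the executing Drillbit calls `get("dfs", 17)` and sees the config the query was planned with.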

> Trim down physical plan size - replace StoragePluginConfig with storage name
> 
>
> Key: DRILL-5328
> URL: https://issues.apache.org/jira/browse/DRILL-5328
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Chunhui Shi
>
> For a physical plan, we now pass the StoragePluginConfig as part of the plan; 
> the destination then uses the config to fetch the storage plugin from the 
> StoragePluginRegistry. However, we could also fetch a storage plugin by its 
> name, which is identical across all Drillbits. 
> In the example of a simple 150-line physical plan shown below, the storage 
> plugin config takes 60 lines. In a typical large system, a FileSystem 
> StoragePluginConfig could be >500 lines, so this improvement should save the 
> cost of passing large physical plans among nodes.
> 0: jdbc:drill:zk=10.10.88.126:5181> explain plan for select * from 
> dfs.tmp.employee1 where last_name='Blumberg';
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(*=[$0])
> 00-02Project(T1¦¦*=[$0])
> 00-03  SelectionVectorRemover
> 00-04Filter(condition=[=($1, 'Blumberg')])
> 00-05  Project(T1¦¦*=[$0], last_name=[$1])
> 00-06Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/employee1/0_0_0.parquet]], 
> selectionRoot=/tmp/employee1, numFiles=1, usedMetadataFile=true, 
> cacheFileRoot=/tmp/employee1, columns=[`*`]]])
>  | {
>   "head" : {
> "version" : 1,
> "generator" : {
>   "type" : "ExplainHandler",
>   "info" : ""
> },
> "type" : "APACHE_DRILL_PHYSICAL",
> "options" : [ ],
> "queue" : 0,
> "resultMode" : "EXEC"
>   },
>   "graph" : [ {
> "pop" : "parquet-scan",
> "@id" : 6,
> "userName" : "root",
> "entries" : [ {
>   "path" : "/tmp/employee1/0_0_0.parquet"
> } ],
> "storage" : {
>   "type" : "file",
>   "enabled" : true,
>   "connection" : "maprfs:///",
>   "config" : null,
>   "workspaces" : {
> "root" : {
>   "location" : "/",
>   "writable" : false,
>   "defaultInputFormat" : null
> },
> "tmp" : {
>   "location" : "/tmp",
>   "writable" : true,
>   "defaultInputFormat" : null
> },
> "shi" : {
>   "location" : "/user/shi",
>   "writable" : true,
>   "defaultInputFormat" : null
> },
> "dir700" : {
>   "location" : "/user/shi/dir700",
>   "writable" : true,
>   "defaultInputFormat" : null
> },
> "dir775" : {
>   "location" : "/user/shi/dir775",
>   "writable" : true,
>   "defaultInputFormat" : null
> },
> "xyz" : {
>   "location" : "/user/xyz",
>   "writable" : true,
>   "defaultInputFormat" : null
> }
>   },
>   "formats" : {
> "psv" : {
>   "type" : "text",
>   "extensions" : [ "tbl" ],
>   "delimiter" : "|"
> },
> "csv" : {
>   "type" : "text",
>   "extensions" : [ "csv" ],
>   "delimiter" : ","
> },
> "tsv" : {
>   "type" : "text",
>   "extensions" : [ "tsv" ],
>   "delimiter" : "\t"
> },
> "parquet" : {
>   "type" : "parquet"
> },
> "json" : {
>   "type" : "json",
>   "extensions" : [ "json" ]
> },
> "maprdb" : {
>   "type" : "maprdb"
> }
>   }
> },
> "fo

[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900414#comment-15900414
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104812618
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

@jinfengni Agree with Paul. The difference between NC_HIGH and NC_AT_END shows 
up in the DESC case. I checked that Drill uses NC_HIGH for sorting. Please see 
it at 
[jira](https://issues.apache.org/jira/browse/DRILL-5326?focusedCommentId=15900411&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15900411),
 since github doesn't show formatted code correctly in the comments.



> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>





[jira] [Updated] (DRILL-5089) Skip initializing all enabled storage plugins for every query

2017-03-07 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-5089:
---
Priority: Critical  (was: Major)

> Skip initializing all enabled storage plugins for every query
> -
>
> Key: DRILL-5089
> URL: https://issues.apache.org/jira/browse/DRILL-5089
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 1.9.0
>Reporter: Abhishek Girish
>Assignee: Chunhui Shi
>Priority: Critical
>
> In a query's lifecycle, an attempt is made to initialize each enabled storage 
> plugin while building the schema tree. This is done regardless of the actual 
> plugins involved in the query. 
> Sometimes, when one or more of the enabled storage plugins have issues - 
> either due to misconfiguration or the underlying datasource being slow or 
> down - the overall query time increases drastically, most likely due to the 
> attempt to register schemas from a faulty plugin.
> For example, when a jdbc plugin is configured with SQL Server, and at some 
> point the underlying SQL Server db goes down, any Drill query starting to 
> execute at that point and beyond slows down drastically. 
> We must skip registering unrelated schemas (& workspaces) for a query. 





[jira] [Commented] (DRILL-5303) Drillbits fail to start when Drill server built with JDK 8 is deployed on a JDK 7 environment

2017-03-07 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900416#comment-15900416
 ] 

Abhishek Girish commented on DRILL-5303:


Cool, thanks [~laurentgo]

> Drillbits fail to start when Drill server built with JDK 8 is deployed on a 
> JDK 7 environment
> -
>
> Key: DRILL-5303
> URL: https://issues.apache.org/jira/browse/DRILL-5303
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Tools, Build & Test
>Affects Versions: 1.9.0
>Reporter: Abhishek Girish
>
> When Drill is built on a node configured with JDK 8 and is then deployed in a 
> JDK 7 environment, Drillbits fail to start and the following errors are seen 
> in Drillbit.out:
> {code}
> Exception in thread "main" java.lang.NoSuchMethodError: 
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> at 
> org.apache.drill.exec.coord.ClusterCoordinator.drillbitRegistered(ClusterCoordinator.java:85)
> at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.updateEndpoints(ZKClusterCoordinator.java:266)
> at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.start(ZKClusterCoordinator.java:135)
> at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:117)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:292)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:272)
> at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:268)
> {code}
> Workaround is to match the Java versions of build and deployment environments.
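The error above comes from JDK 8's covariant return type for {{ConcurrentHashMap.keySet()}}: code compiled on JDK 8 is linked against the JDK-8-only {{KeySetView}} return type, which does not exist on JDK 7, so the call fails at link time with {{NoSuchMethodError}}. A minimal sketch of the pattern (the class name is illustrative; the real fix, as the ticket notes, is to match build and deployment JDKs, but typing the receiver as {{Map}} avoids the JDK-8-only linkage at the source level):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class KeySetCompat {
    // Compiled on JDK 8, this call is linked with the descriptor
    // ()Ljava/util/concurrent/ConcurrentHashMap$KeySetView; -- a method
    // that JDK 7's ConcurrentHashMap does not have, hence NoSuchMethodError.
    static Set<String> jdk8OnlyLinkage(ConcurrentHashMap<String, Integer> chm) {
        return chm.keySet();
    }

    // Calling keySet() through the Map interface links against
    // Map.keySet()Ljava/util/Set;, which exists on both JDK 7 and JDK 8.
    static Set<String> portableLinkage(Map<String, Integer> chm) {
        return chm.keySet();
    }
}
```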





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900421#comment-15900421
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104813282
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

I'm not sure the last sentence is correct. If the user specified DESC, and the 
null collation is unspecified, then nulls should sort first, no? (if NULL is 
always HIGH). NC_HIGH then seems the correct value (unless the default can be 
changed in the planner configuration).
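The distinction can be illustrated with plain Java comparators, which is not Drill code but models the two collations: under NC_HIGH, NULL is the highest value, so ORDER BY ... DESC returns nulls first (as in the sqlline transcript for DRILL-5326); under NC_AT_END, nulls stay last regardless of sort direction.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative model of NC_HIGH vs NC_AT_END null collation for DESC sorts.
class NullCollationDemo {
    /** NC_HIGH: NULL is the highest value -> last in ASC, FIRST in DESC. */
    static Integer[] sortDescNullsHigh(Integer[] in) {
        Integer[] out = in.clone();
        // Reverse of "values ascending, nulls highest/last".
        Arrays.sort(out, Comparator.nullsLast(Comparator.<Integer>naturalOrder()).reversed());
        return out;
    }

    /** NC_AT_END: NULL always sorts at the end, regardless of direction. */
    static Integer[] sortDescNullsAtEnd(Integer[] in) {
        Integer[] out = in.clone();
        // Values descending, but nulls pinned to the end.
        Arrays.sort(out, Comparator.nullsLast(Comparator.<Integer>naturalOrder().reversed()));
        return out;
    }
}
```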


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>





[jira] [Updated] (DRILL-3365) Query with window function on large dataset fails with "IOException: Mkdirs failed to create spill directory"

2017-03-07 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-3365:
---
Priority: Major  (was: Minor)

> Query with window function on large dataset fails with "IOException: Mkdirs 
> failed to create spill directory"
> -
>
> Key: DRILL-3365
> URL: https://issues.apache.org/jira/browse/DRILL-3365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.1.0
>Reporter: Abhishek Girish
> Fix For: Future
>
>
> Dataset: TPC-DS SF100 Parquet
> Query: 
> {code:sql}
> SELECT sum(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY 
> ss.ss_customer_sk) AS PartialSum FROM store_sales ss GROUP BY 
> ss.ss_net_paid_inc_tax, ss.ss_store_sk, ss.ss_customer_sk  LIMIT 20;
> java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
> java.io.IOException: Mkdirs failed to create 
> /tmp/drill/spill/2a74ac18-0679-ab99-26c6-af41b9af7f4e/major_fragment_1/minor_fragment_17/operator_4
>  (exists=false, cwd=file:///opt/mapr/drill/drill-1.1.0/bin)
> Fragment 1:17
> [Error Id: 4905b400-fc0f-4287-beba-d1ca18359986 on abhi5.qa.lab:31010]
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
>   at sqlline.SqlLine.print(SqlLine.java:1583)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:738)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Was unable to find corresponding logs. This was consistently seen via JDBC 
> program and sqlline. 
> After I restarted Drillbits the issue seems to have been resolved. But wanted 
> to report this anyway. Possible explanation is DRILL-2917 (one or more 
> drillbits were in an inconsistent state)





[jira] [Created] (DRILL-5329) External sort does not support TinyInt type

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5329:
--

 Summary: External sort does not support TinyInt type
 Key: DRILL-5329
 URL: https://issues.apache.org/jira/browse/DRILL-5329
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers


A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt






[jira] [Resolved] (DRILL-3111) Drill UI should support fast schema return and streaming results

2017-03-07 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-3111.

   Resolution: Fixed
Fix Version/s: (was: Future)

> Drill UI should support fast schema return and streaming results
> 
>
> Key: DRILL-3111
> URL: https://issues.apache.org/jira/browse/DRILL-3111
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.0.0
>Reporter: Abhishek Girish
>
> On sqlline, a query which returns several hundred rows, supports fast schema 
> return and streams results as they are fetched. 
> Drill UI doesn't support either of these. It waits until all results are 
> fetched and displays them at once. 





[jira] [Closed] (DRILL-3111) Drill UI should support fast schema return and streaming results

2017-03-07 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish closed DRILL-3111.
--

> Drill UI should support fast schema return and streaming results
> 
>
> Key: DRILL-3111
> URL: https://issues.apache.org/jira/browse/DRILL-3111
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.0.0
>Reporter: Abhishek Girish
>
> On sqlline, a query which returns several hundred rows, supports fast schema 
> return and streams results as they are fetched. 
> Drill UI doesn't support either of these. It waits until all results are 
> fetched and displays them at once. 





[jira] [Commented] (DRILL-5329) External sort does not support TinyInt type

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900431#comment-15900431
 ] 

Paul Rogers commented on DRILL-5329:


Failure for TinyInt:

{code}
ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
remote function registry [registry]
java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119)
 ~[classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.syncWithRemoteRegistry(FunctionImplementationRegistry.java:320)
 [classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.findDrillFunction(FunctionImplementationRegistry.java:164)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:352)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:1)
 [classes/:na]
at 
org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60) 
[classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:131)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:106)
 [classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionGenerationHelper.getOrderingComparator(FunctionGenerationHelper.java:84)
 [classes/:na]
{code}


> External sort does not support TinyInt type
> ---
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Summary: External sort does not support "obscure" numeric types  (was: 
External sort does not support TinyInt type)

> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt





[jira] [Created] (DRILL-5330) NPE in FunctionImplementationRegistry.functionReplacement()

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5330:
--

 Summary: NPE in 
FunctionImplementationRegistry.functionReplacement()
 Key: DRILL-5330
 URL: https://issues.apache.org/jira/browse/DRILL-5330
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


The code in {{FunctionImplementationRegistry.functionReplacement()}} will 
produce an NPE if ever it is called:

{code}
  if (optionManager != null
  && optionManager.getOption(
   ExecConstants.CAST_TO_NULLABLE_NUMERIC).bool_val
  ...
{code}

If an option manager is provided, the code fetches the specified option. The 
option manager contains a value for that option only if the user has explicitly 
set it. If the user has not set the option, {{getOption()}} returns null.

The next thing we do is *assume* that the option exists and is a boolean by 
dereferencing the option. This will trigger an NPE. This NPE was seen in 
detail-level unit tests.

The proper way to handle such options is to use an option validator. Strangely, 
one actually exists in {{ExecConstants}}:

{code}
  String CAST_TO_NULLABLE_NUMERIC = 
"drill.exec.functions.cast_empty_string_to_null";
  OptionValidator CAST_TO_NULLABLE_NUMERIC_OPTION = new 
BooleanValidator(CAST_TO_NULLABLE_NUMERIC, false);
{code}

Then do:

{code}
optionManager.getOption(
 ExecConstants.CAST_TO_NULLABLE_NUMERIC_OPTION)
{code}






[jira] [Commented] (DRILL-5330) NPE in FunctionImplementationRegistry.functionReplacement()

2017-03-07 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900451#comment-15900451
 ] 

Paul Rogers commented on DRILL-5330:


To fix this, {{ExecConstants}} must change:

{code}
  BooleanValidator CAST_TO_NULLABLE_NUMERIC_OPTION = new 
BooleanValidator(CAST_TO_NULLABLE_NUMERIC, false);
{code}

The {{BooleanValidator}} is required to select the correct overloaded method in 
the option manager. (This may be why the original author didn't use the 
validator in the first place...)

> NPE in FunctionImplementationRegistry.functionReplacement()
> ---
>
> Key: DRILL-5330
> URL: https://issues.apache.org/jira/browse/DRILL-5330
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900452#comment-15900452
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/775#discussion_r104816830
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/metadata/ServerMetaProvider.java
 ---
@@ -76,7 +76,7 @@
   .setReadOnly(false)
   .setGroupBySupport(GroupBySupport.GB_UNRELATED)
   .setLikeEscapeClauseSupported(true)
-  .setNullCollation(NullCollation.NC_AT_END)
+  .setNullCollation(NullCollation.NC_HIGH)
--- End diff --

If we only specify the null collation here (with no sort order), then NC_HIGH 
looks more reasonable. The change looks fine to me.


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900460#comment-15900460
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/775
  
@laurentgo I changed the names for the server meta method. 


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>





[jira] [Created] (DRILL-5331) NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs disabled

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5331:
--

 Summary: NPE in FunctionImplementationRegistry.findDrillFunction() 
if dynamic UDFs disabled
 Key: DRILL-5331
 URL: https://issues.apache.org/jira/browse/DRILL-5331
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.11.0


Drill provides the Dynamic UDF (DUDF) functionality. DUDFs can be disabled 
using the following option in {{ExecConstants}}:

{code}
  String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";
  BooleanValidator USE_DYNAMIC_UDFS = new 
BooleanValidator(USE_DYNAMIC_UDFS_KEY, true);
{code}

In a unit test, we created a setup in which we use only the local function 
registry; no DUDF support is needed. When the code runs, the following code 
is invoked when asking for a non-existent function:

{code}
  public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, 
FunctionCall functionCall) {
...
if (holder == null) {
  syncWithRemoteRegistry(version.get());
  List<DrillFuncHolder> updatedFunctions = 
localFunctionRegistry.getMethods(newFunctionName, version);
  holder = functionResolver.getBestMatch(updatedFunctions, functionCall);
}
{code}

The result is an NPE:

{code}
ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
remote function registry [registry]
java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119)
 ~[classes/:na]
{code}

The fix is simply to add a DUDF-enabled check:

{code}
if (holder == null) {
  boolean useDynamicUdfs = optionManager != null && 
optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS);
  if (useDynamicUdfs) {
syncWithRemoteRegistry(version.get());
...
{code}
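The guard above boils down to a null-safe, option-gated predicate. A standalone sketch of just that check (the interface here is a hypothetical stand-in for Drill's real OptionManager, reduced to the one lookup this fix needs):

```java
class DudfGuardDemo {
    /** Minimal stand-in for the option-manager lookup used by the fix. */
    interface OptionManager {
        boolean getOption(String key);
    }

    static final String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";

    /** The remote-registry sync should run only when an option manager is
     *  present AND dynamic UDFs are enabled; both a null manager and a
     *  disabled option skip the sync, avoiding the NPE path entirely. */
    static boolean shouldSyncWithRemote(OptionManager optionManager) {
        return optionManager != null
            && optionManager.getOption(USE_DYNAMIC_UDFS_KEY);
    }
}
```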





[jira] [Updated] (DRILL-5331) NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs disabled

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5331:
---
Description: 
Drill provides the Dynamic UDF (DUDF) functionality. DUDFs can be disabled 
using the following option in {{ExecConstants}}:

{code}
  String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";
  BooleanValidator USE_DYNAMIC_UDFS = new 
BooleanValidator(USE_DYNAMIC_UDFS_KEY, true);
{code}

In a unit test, we created a setup in which we use only the local function 
registry; no DUDF support is needed. When the code runs, the following code 
is invoked when asking for a non-existent function:

{code}
  public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, 
FunctionCall functionCall) {
...
if (holder == null) {
  syncWithRemoteRegistry(version.get());
  List<DrillFuncHolder> updatedFunctions = 
localFunctionRegistry.getMethods(newFunctionName, version);
  holder = functionResolver.getBestMatch(updatedFunctions, functionCall);
}
{code}

The result is an NPE:

{code}
ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
remote function registry [registry]
java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119)
 ~[classes/:na]
{code}

The fix is simply to add a DUDF-enabled check:

{code}
if (holder == null) {
  boolean useDynamicUdfs = optionManager != null && 
optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS);
  if (useDynamicUdfs) {
syncWithRemoteRegistry(version.get());
...
{code}

Then, disable dynamic UDFs for the test case by setting 
{{ExecConstants.USE_DYNAMIC_UDFS}} to false.
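The guard can be sketched as a standalone snippet. The {{OptionManager}} interface below is a stand-in for illustration, not Drill's actual class; only the option key matches {{ExecConstants.USE_DYNAMIC_UDFS_KEY}} from the description above:

```java
// Standalone sketch of the DRILL-5331 fix: gate the remote-registry sync
// behind the dynamic-UDF option. Types are stand-ins, not Drill's API.
public class DudfGuardSketch {

    /** Stand-in for Drill's OptionManager; may be null in unit-test setups. */
    interface OptionManager {
        boolean getOption(String key);
    }

    static final String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";

    /** True only when an option manager exists AND dynamic UDFs are on. */
    static boolean shouldSyncWithRemoteRegistry(OptionManager optionManager) {
        return optionManager != null
            && optionManager.getOption(USE_DYNAMIC_UDFS_KEY);
    }

    public static void main(String[] args) {
        // No option manager (local-registry-only test): no sync, no NPE.
        System.out.println(shouldSyncWithRemoteRegistry(null));         // false
        // Option manager present, DUDFs enabled: sync is attempted.
        System.out.println(shouldSyncWithRemoteRegistry(key -> true));  // true
        // Option manager present, DUDFs disabled: sync is skipped.
        System.out.println(shouldSyncWithRemoteRegistry(key -> false)); // false
    }
}
```

The key point is the null check: in the failing test there is no option manager at all, so the guard must short-circuit before consulting the option.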

  was:
Drill provides the Dynamic UDF (DUDF) functionality. DUFDs can be disabled 
using the following option in {{ExecConstants}}:

{code}
  String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";
  BooleanValidator USE_DYNAMIC_UDFS = new 
BooleanValidator(USE_DYNAMIC_UDFS_KEY, true);
{code}

In a unit test, we created a setup in which we wish to use only the local 
function registry, no DUDF support is needed. Run the code. The following code 
is invoked when asking for a non-existent function:

{code}
  public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, 
FunctionCall functionCall) {
...
if (holder == null) {
  syncWithRemoteRegistry(version.get());
  List updatedFunctions = 
localFunctionRegistry.getMethods(newFunctionName, version);
  holder = functionResolver.getBestMatch(updatedFunctions, functionCall);
}
{code}

The result is an NPE:

{code}
ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
remote function registry [registry]
java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119)
 ~[classes/:na]
{code}

The fix is simply to add a DUDF-enabled check:

{code}
if (holder == null) {
  boolean useDynamicUdfs = optionManager != null && 
optionManager.getOption(ExecConstants.USE_DYNAMIC_UDFS);
  if (useDynamicUdfs) {
syncWithRemoteRegistry(version.get());
...
{code}


> NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic UDFs 
> disabled
> --
>
> Key: DRILL-5331
> URL: https://issues.apache.org/jira/browse/DRILL-5331
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Drill provides the Dynamic UDF (DUDF) functionality. DUDFs can be disabled 
> using the following option in {{ExecConstants}}:
> {code}
>   String USE_DYNAMIC_UDFS_KEY = "exec.udf.use_dynamic";
>   BooleanValidator USE_DYNAMIC_UDFS = new 
> BooleanValidator(USE_DYNAMIC_UDFS_KEY, true);
> {code}
> In a unit test, we created a setup in which we wish to use only the local 
> function registry; no DUDF support is needed. When the code runs, the 
> following code path is invoked when asking for a non-existent function:
> {code}
>   public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, 
> FunctionCall functionCall) {
> ...
> if (holder == null) {
>   syncWithRemoteRegistry(version.get());
>   List<DrillFuncHolder> updatedFunctions = 
> localFunctionRegistry.getMethods(newFunctionName, version);
>   holder = functionResolver.getBestMatch(updatedFunctions, functionCall);
> }
> {code}
> The result is an NPE:
> {code}
> ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
> remote function registry [registry]
> java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry

[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900476#comment-15900476
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user laurentgo commented on the issue:

https://github.com/apache/drill/pull/775
  
LGTM


> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server 
> supports this method only from Drill version 1.10.0 onward; for Drill 
> 1.10.0-SNAPSHOT it is disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails 
> because of an incorrect "NullCollation" value in the "ServerMetaProvider". 
> According to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).
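The NC_HIGH default mentioned above (NULL sorts as the highest value) can be sketched with a null-high comparator. This is an illustration of the ordering semantics only, not Drill's ServerMetaProvider code:

```java
import java.util.Arrays;
import java.util.Comparator;

public class NullsHighSketch {
    // Comparator treating null as the highest value (NC_HIGH): in an
    // ascending sort, nulls come last, matching Drill's documented default.
    static final Comparator<Integer> NULLS_HIGH_ASC =
        Comparator.nullsLast(Comparator.naturalOrder());

    public static void main(String[] args) {
        Integer[] vals = {3, null, 1, null, 2};
        Arrays.sort(vals, NULLS_HIGH_ASC);
        System.out.println(Arrays.toString(vals));  // [1, 2, 3, null, null]
    }
}
```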





[jira] [Issue Comment Deleted] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Comment: was deleted

(was: Failure for TinyInt:

{code}
ERROR o.a.d.e.e.f.r.RemoteFunctionRegistry - Problem during trying to access 
remote function registry [registry]
java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.getRegistryVersion(RemoteFunctionRegistry.java:119)
 ~[classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.syncWithRemoteRegistry(FunctionImplementationRegistry.java:320)
 [classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.findDrillFunction(FunctionImplementationRegistry.java:164)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:352)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer$AbstractMaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:1)
 [classes/:na]
at 
org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60) 
[classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:131)
 [classes/:na]
at 
org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:106)
 [classes/:na]
at 
org.apache.drill.exec.expr.fn.FunctionGenerationHelper.getOrderingComparator(FunctionGenerationHelper.java:84)
 [classes/:na]
{code}
)

> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.
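The sv2 indirection described above can be sketched in a few lines: the sort writes row indices into a selection vector instead of moving the rows themselves. The names and data below are illustrative and do not use Drill's actual SelectionVector2 API:

```java
import java.util.Arrays;
import java.util.Comparator;

public class Sv2SortSketch {

    /** Sorts row indices by key, ASC, without moving the underlying rows. */
    static Integer[] buildSv2(int[] keys) {
        Integer[] sv2 = new Integer[keys.length];
        for (int i = 0; i < sv2.length; i++) {
            sv2[i] = i;  // start with the identity mapping
        }
        Arrays.sort(sv2, Comparator.comparingInt(i -> keys[i]));
        return sv2;
    }

    public static void main(String[] args) {
        int[] keys = {11, 14, 17, 0, 3};  // made-up per-row keys
        Integer[] sv2 = buildSv2(keys);
        // Row 3 holds the smallest key, so sv2[0] points at row 3; a broken
        // sorter would leave sv2 as the identity mapping, as in the output
        // shown in the description.
        for (int i = 0; i < sv2.length; i++) {
            System.out.println(i + "(" + sv2[i] + "): " + keys[sv2[i]]);
        }
    }
}
```

In the ticket's notation, this prints `0(3): 0`, `1(4): 3`, and so on: position in sort order, then the sv2 entry, then the key it points at.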

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt



> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1

The failure manifests on one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20-pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.


> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4

The types that work include:

* Int

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1

The failure manifests on one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20-pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.


> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> * SmallInt
> * UInt2
> * UInt4
> The types that work include:
> * Int
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Commented] (DRILL-5326) Unit tests failures related to the SERVER_METADTA

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900509#comment-15900509
 ] 

ASF GitHub Bot commented on DRILL-5326:
---

Github user jinfengni commented on the issue:

https://github.com/apache/drill/pull/775
  
@laurentgo , @vdiravka , thanks. I'll run regression & merge.

+1



> Unit tests failures related to the SERVER_METADTA
> -
>
> Key: DRILL-5326
> URL: https://issues.apache.org/jira/browse/DRILL-5326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.10.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Blocker
> Fix For: 1.10.0
>
>
> 1. In DRILL-5301 a new SERVER_META rpc call was introduced. The server 
> supports this method only from Drill version 1.10.0 onward; for Drill 
> 1.10.0-SNAPSHOT it is disabled. 
> When I enabled this method (by upgrading the Drill version to 1.10.0 or 
> 1.11.0-SNAPSHOT) I found the following exception:
> {code}
> java.lang.AssertionError: Unexpected/unhandled MinorType value GENERIC_OBJECT
> {code}
> It appears in several tests (for example in 
> DatabaseMetadataTest#testNullsAreSortedMethodsSaySortedHigh).
> The reason is that the "GENERIC_OBJECT" RPC-/protobuf-level type appears in 
> the ServerMetadata#ConvertSupportList. (Support for GENERIC_OBJECT was 
> added in DRILL-1126.)
> The proposed solution is to add the appropriate "JAVA_OBJECT" SQL type name 
> for this "GENERIC_OBJECT" RPC-/protobuf-level data type.
> 2. After fixing the first issue, the test mentioned above still fails 
> because of an incorrect "NullCollation" value in the "ServerMetaProvider". 
> According to the [doc|https://drill.apache.org/docs/order-by-clause/#usage-notes] the 
> default value should be NC_HIGH (NULL is the highest value).





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4

The types that work include:

* Int

The failure manifests on one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20-pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.


> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> * SmallInt
> * UInt2
> * UInt4
> * UInt8
> * Var16Char
> The types that work include:
> * Int
> * BigInt
> * Float4
> * Float8
> * VarChar
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char
* VarBinary

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar

The failure manifests on one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20-pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which should be written to create sort order). Sort was done ASC, NULLS_HIGH, 
by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.


> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> * SmallInt
> * UInt2
> * UInt4
> * UInt8
> * Var16Char
> * VarBinary
> The types that work include:
> * Int
> * BigInt
> * Float4
> * Float8
> * VarChar
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which should be written to create sort order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5329:
---
Description: 
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar
* VarBinary

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which, read in order, should yield the sorted order). Sort was done ASC, 
NULLS_HIGH, by the key field.
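The sv2 indirection can be sketched as follows (a minimal, hypothetical illustration in plain Java, not Drill's actual generated code):

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of a selection-vector (sv2) sort: the sort writes an
// array of row indices, and reading rows through that array yields sorted order.
public class Sv2Demo {
    // Returns sv such that keys[sv[0]] <= keys[sv[1]] <= ... (ASC).
    public static int[] sortIndirect(int[] keys) {
        Integer[] boxed = new Integer[keys.length];
        for (int i = 0; i < keys.length; i++) {
            boxed[i] = i;                         // identity vector before sorting
        }
        Arrays.sort(boxed, Comparator.comparingInt(i -> keys[i]));
        int[] sv = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            sv[i] = boxed[i];
        }
        return sv;
    }

    public static void main(String[] args) {
        // First four keys from the failing run; a working comparator yields
        // [3, 0, 1, 2], while a skipped comparison leaves the identity [0, 1, 2, 3].
        System.out.println(Arrays.toString(sortIndirect(new int[]{11, 14, 17, 0})));
    }
}
```

With keys {11, 14, 17, 0}, a correct sort produces sv = [3, 0, 1, 2], matching the working Int output; the failing output corresponds to the untouched identity vector.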

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.

  was:
A unit test was created to exercise the "Sorter" mechanism within the External 
Sort, which is used to sort each incoming batch. The sorter was tested with 
each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char
* VarBinary

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the 
comparison step, resulting in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 
(which, read in order, should yield the sorted order). Sort was done ASC, 
NULLS_HIGH, by the key field.

A strong concern here is that there is no error or other warning to the user 
that Drill cannot sort this type; Drill just silently declines to perform the 
operation.


> External sort does not support "obscure" numeric types
> --
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the 
> External Sort, which is used to sort each incoming batch. The sorter was 
> tested with each Drill data type.
> The following types fail:
> * TinyInt
> * UInt1
> * SmallInt
> * UInt2
> * UInt4
> * UInt8
> * Var16Char
> The types that work include:
> * Int
> * BigInt
> * Float4
> * Float8
> * VarChar
> * VarBinary
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See 
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the 
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the row index, the second is the row pointed to by the 
> sv2 (which, read in order, should yield the sorted order). Sort was done ASC, 
> NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user 
> that Drill cannot sort this type; Drill just silently declines to perform the 
> operation.





[jira] [Assigned] (DRILL-5165) wrong results - LIMIT ALL and OFFSET clause in same query

2017-03-07 Thread Chunhui Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunhui Shi reassigned DRILL-5165:
--

Assignee: Chunhui Shi

> wrong results - LIMIT ALL and OFFSET clause in same query
> -
>
> Key: DRILL-5165
> URL: https://issues.apache.org/jira/browse/DRILL-5165
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.10.0
>Reporter: Khurram Faraaz
>Assignee: Chunhui Shi
>Priority: Critical
>
> This issue was reported by a user on Drill's user list.
> Drill 1.10.0 commit ID : bbcf4b76
> I tried a similar query on Apache Drill 1.10.0, and Drill returns wrong 
> results when compared to Postgres for a query that uses LIMIT ALL and OFFSET 
> clauses in the same query. We need to file a JIRA to track this issue.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by 1 limit 
> all offset 10;
> +--+
> | col_int  |
> +--+
> +--+
> No rows selected (0.211 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by col_int 
> limit all offset 10;
> +--+
> | col_int  |
> +--+
> +--+
> No rows selected (0.24 seconds)
> {noformat}
> Query => select col_int from typeall_l limit all offset 10;
> Drill 1.10.0 returns 85 rows
> whereas for the same query,
> postgres=# select col_int from typeall_l limit all offset 10;
> Postgres 9.3 returns 95 rows, which is the correct expected result.
> Query plan for above query that returns wrong results
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select col_int from typeall_l 
> limit all offset 10;
> +--+--+
> | text | json |
> +--+--+
> | 00-00Screen
> 00-01  Project(col_int=[$0])
> 00-02SelectionVectorRemover
> 00-03  Limit(offset=[10])
> 00-04Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///tmp/typeall_l]], selectionRoot=maprfs:/tmp/typeall_l, 
> numFiles=1, usedMetadataFile=false, columns=[`col_int`]]])
> {noformat} 





[jira] [Created] (DRILL-5332) DateVector support uses questionable conversions to a time

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5332:
--

 Summary: DateVector support uses questionable conversions to a time
 Key: DRILL-5332
 URL: https://issues.apache.org/jira/browse/DRILL-5332
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Paul Rogers


The following code in {{DateVector}} is worrisome:

{code}
@Override
public DateTime getObject(int index) {
    org.joda.time.DateTime date =
        new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
    date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
    return date;
}
{code}

This code takes a date/time value stored in a value vector, converts it to UTC, 
then strips the time zone and replaces it with local time. The result is a 
"timestamp" in Java format (milliseconds since the epoch), but not really: it is 
really the time since the epoch as if the epoch had started in the local time 
zone rather than UTC.

This is the kind of fun & games that people used to do in Java with the 
{{Date}} type before the advent of Joda-Time (and the migration of Joda 
concepts into Java 8).

It is, in short, very bad practice and nearly impossible to get right.

Further, converting a pure date (since this is a {{DateVector}}) into a 
date/time is fraught with peril. A date has no corresponding time. 1 AM on 
Friday in one time zone might be 11 PM on Thursday in another. Converting from 
dates to times is very difficult.

If the {{DateVector}} corresponds to a date, then it should be a simple date 
with no implied time zone and no implied relationship to time. If there is to 
be a mapping to a time, it must be to a {{LocalTime}} (in Joda and Java 8) that 
has no implied time zone.
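As a hedged sketch (using java.time rather than Joda, and assuming the vector stores each date as milliseconds at UTC midnight; the class and method names are hypothetical), a pure date can be recovered without any time-zone round trip:

```java
import java.time.LocalDate;

// Hypothetical sketch: recover a pure date from millis-at-UTC-midnight
// without converting through a zoned date/time.
public class DateVectorSketch {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    public static LocalDate toLocalDate(long utcMidnightMillis) {
        // floorDiv keeps pre-1970 dates correct (plain division would round toward zero)
        return LocalDate.ofEpochDay(Math.floorDiv(utcMidnightMillis, MILLIS_PER_DAY));
    }

    public static void main(String[] args) {
        System.out.println(toLocalDate(0L));              // 1970-01-01
        System.out.println(toLocalDate(MILLIS_PER_DAY));  // 1970-01-02
    }
}
```

No zone is ever attached, so the same stored value yields the same date on every machine.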





[jira] [Updated] (DRILL-5332) DateVector support uses questionable conversions to a time

2017-03-07 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5332:
---
Description: 
The following code in {{DateVector}} is worrisome:

{code}
@Override
public DateTime getObject(int index) {
    org.joda.time.DateTime date =
        new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
    date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
    return date;
}
{code}

This code takes a date/time value stored in a value vector, converts it to UTC, 
then strips the time zone and replaces it with local time. The result is a 
"timestamp" in Java format (milliseconds since the epoch), but not really: it is 
really the time since the epoch as if the epoch had started in the local time 
zone rather than UTC.

This is the kind of fun & games that people used to do in Java with the 
{{Date}} type before the advent of Joda-Time (and the migration of Joda 
concepts into Java 8).

It is, in short, very bad practice and nearly impossible to get right.

Further, converting a pure date (since this is a {{DateVector}}) into a 
date/time is fraught with peril. A date has no corresponding time. 1 AM on 
Friday in one time zone might be 11 PM on Thursday in another. Converting from 
dates to times is very difficult.

If the {{DateVector}} corresponds to a date, then it should be a simple date 
with no implied time zone and no implied relationship to time. If there is to 
be a mapping to a time, it must be to a {{LocalTime}} (in Joda and Java 8) that 
has no implied time zone.

Note that this code directly contradicts the statement in [Drill 
documentation|http://drill.apache.org/docs/date-time-and-timestamp/]: "Drill 
stores values in Coordinated Universal Time (UTC)." Actually, even the 
documentation is questionable: given the above issues, what does it mean to 
store a date in UTC?

  was:
The following code in {{DateVector}} is worrisome:

{code}
@Override
public DateTime getObject(int index) {
    org.joda.time.DateTime date =
        new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
    date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
    return date;
}
{code}

This code takes a date/time value stored in a value vector, converts it to UTC, 
then strips the time zone and replaces it with local time. The result is a 
"timestamp" in Java format (milliseconds since the epoch), but not really: it is 
really the time since the epoch as if the epoch had started in the local time 
zone rather than UTC.

This is the kind of fun & games that people used to do in Java with the 
{{Date}} type before the advent of Joda-Time (and the migration of Joda 
concepts into Java 8).

It is, in short, very bad practice and nearly impossible to get right.

Further, converting a pure date (since this is a {{DateVector}}) into a 
date/time is fraught with peril. A date has no corresponding time. 1 AM on 
Friday in one time zone might be 11 PM on Thursday in another. Converting from 
dates to times is very difficult.

If the {{DateVector}} corresponds to a date, then it should be a simple date 
with no implied time zone and no implied relationship to time. If there is to 
be a mapping to a time, it must be to a {{LocalTime}} (in Joda and Java 8) that 
has no implied time zone.


> DateVector support uses questionable conversions to a time
> --
>
> Key: DRILL-5332
> URL: https://issues.apache.org/jira/browse/DRILL-5332
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>
> The following code in {{DateVector}} is worrisome:
> {code}
> @Override
> public DateTime getObject(int index) {
>     org.joda.time.DateTime date =
>         new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
>     date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
>     return date;
> }
> {code}
> This code takes a date/time value stored in a value vector, converts it to 
> UTC, then strips the time zone and replaces it with local time. The result is 
> a "timestamp" in Java format (milliseconds since the epoch), but not really: 
> it is really the time since the epoch as if the epoch had started in the 
> local time zone rather than UTC.
> This is the kind of fun & games that people used to do in Java with the 
> {{Date}} type before the advent of Joda-Time (and the migration of Joda 
> concepts into Java 8).
> It is, in short, very bad practice and nearly impossible to get right.
> Further, converting a pure date (since this is a {{DateVector}}) into a 
> date/time is fraught with peril. A date has no corresponding time. 1 AM on 
> Friday in one time zone might be 11 PM on Thursday in another. Converting 
> from dates to times is very difficult.
> If 

[jira] [Created] (DRILL-5333) Documentation error in TIME data type description

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5333:
--

 Summary: Documentation error in TIME data type description
 Key: DRILL-5333
 URL: https://issues.apache.org/jira/browse/DRILL-5333
 Project: Apache Drill
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Paul Rogers
Priority: Minor


Consider the following description of the TIME data type from the 
[documentation|http://drill.apache.org/docs/supported-data-types/]:

{quote}
TIME

24-hour based time before or after January 1, 2001 in hours, minutes, seconds 
format: HH:mm:ss

22:55:55.23
{quote}

First, TIME has no associated date, so there can be no limitation on the days 
that can be represented. (If I tell you the bank closes at 5 PM, that statement 
is not just true after Jan. 1, 2001 -- it is true for as long as the bank 
exists.)

Second, the example implies that Drill stores milliseconds, which is consistent 
with the implementation of the {{TimeVector}} data type. But the format 
suggests that the granularity is seconds.

Finally, TIME, as stored internally, has no format: it is a number, the number 
of milliseconds since the epoch. The format only comes into play when 
converting to or from text.
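The point can be illustrated with a hedged java.time sketch (class and method names are hypothetical): the stored value is just a count of milliseconds within the day, and second vs. millisecond granularity is purely a choice made at text-conversion time:

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: the same stored number renders at different
// granularities depending only on the text-conversion pattern.
public class TimeFormatSketch {
    public static String format(long millisOfDay, String pattern) {
        LocalTime t = LocalTime.ofNanoOfDay(millisOfDay * 1_000_000L);
        return t.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        long millis = (22 * 3600 + 55 * 60 + 55) * 1000L + 230;  // 22:55:55.230
        System.out.println(format(millis, "HH:mm:ss"));      // seconds granularity: 22:55:55
        System.out.println(format(millis, "HH:mm:ss.SSS"));  // millisecond granularity: 22:55:55.230
    }
}
```

The documentation's example value (22:55:55.23) and its stated format (HH:mm:ss) correspond to the two patterns above; the stored number itself carries no format at all.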





[jira] [Created] (DRILL-5334) Questionable code in the TimeVector class

2017-03-07 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5334:
--

 Summary: Questionable code in the TimeVector class
 Key: DRILL-5334
 URL: https://issues.apache.org/jira/browse/DRILL-5334
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Paul Rogers


The {{TimeVector}} class, which holds Time data, should hold a simple local 
time with no associated date or time zone. (A local time cannot be converted to 
UTC without a date, since the conversion depends on whether daylight saving 
time is in effect.)

But, the implementation of {{TimeVector}} uses the following very questionable 
code:

{code}
@Override
public DateTime getObject(int index) {
    org.joda.time.DateTime time =
        new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
    time = time.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
    return time;
}
{code}

That is, we convert a date-less local time into a Joda UTC DateTime object, 
then reset the time zone to local time. This is abusing the Joda library and is 
the very kind of fun & games that Joda was designed to prevent.
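A hedged java.time sketch (the java.time analogue of Joda's {{withZoneRetainFields}}; the class and method names are hypothetical) shows why the zone swap is dangerous: keeping the wall-clock fields while changing the zone silently shifts the underlying instant:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch: withZoneSameLocal keeps the wall-clock fields but
// re-anchors them in a different zone, changing the epoch-millis value.
public class ZoneSwapSketch {
    public static long swappedMillis(long millis, String zone) {
        ZonedDateTime utc = Instant.ofEpochMilli(millis).atZone(ZoneId.of("UTC"));
        return utc.withZoneSameLocal(ZoneId.of(zone)).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        // For a zone five hours behind UTC (Etc/GMT+5 in tz-database sign
        // convention), the "same" wall-clock time is an instant 5 hours later.
        System.out.println(swappedMillis(0L, "Etc/GMT+5"));  // 18000000
    }
}
```

The value returned by such a getter therefore depends on the JVM's default time zone, which is exactly the non-determinism the issue describes.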

The conversion of a time into Joda should use the {{LocalTime}} class.

In fact, according to 
[Oracle|http://www.oracle.com/technetwork/articles/java/jf14-date-time-2125367.html],
 the following is the mapping from ANSI SQL date/time types to Java 8 (and thus 
Joda) classes:

||ANSI SQL||Java SE 8||
|DATE|LocalDate|
|TIME|LocalTime|
|TIMESTAMP|LocalDateTime|
|TIME WITH TIMEZONE|OffsetTime|
|TIMESTAMP WITH TIMEZONE|OffsetDateTime|



