[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009433#comment-16009433
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/823


> Extend physical operator test framework to test mini plans consisting of 
> multiple operators
> ---
>
>         Key: DRILL-5459
>         URL: https://issues.apache.org/jira/browse/DRILL-5459
>     Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>    Reporter: Jinfeng Ni
>    Assignee: Jinfeng Ni
>      Labels: ready-to-commit
>
> DRILL-4437 introduced a unit test framework for testing a non-scan physical 
> operator. A JSON reader is used implicitly to supply the inputs to the 
> physical operator under test. 
> The framework needs to be extended to cover two scenarios:
> 1. We need a way to test the scan operator with different record readers. 
> Drill supports a variety of data sources, and it is important to make sure 
> every record reader works properly according to the defined protocol.
> 2. We need a way to test a so-called mini-plan (aka plan fragment) consisting 
> of multiple non-scan operators. 
> For the second need, an alternative is to rely on a SQL statement and the 
> query planner. However, that approach depends directly on the query planner: 
> 1) any planner change may affect the test case and lead to a different plan, 
> and 2) it is not always easy to force the planner to produce the desired plan 
> fragment for testing.
> In particular, it would be good to have a relatively easy way to specify a 
> mini-plan with a couple of targeted physical operators. 
> This JIRA tracks the work to extend the unit test framework introduced in 
> DRILL-4437.
>  
> Related work: DRILL-5318 introduced a sub-operator test fixture, which mainly 
> targets testing at the sub-operator level. The framework in DRILL-4437 and 
> this extension focus on the operator level, or on multiple operator levels, 
> where execution goes through RecordBatch API calls. 
> As in DRILL-4437, we are going to use JMockit to mock required objects such 
> as the fragment context, operator context, etc. 
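
To make the mini-plan idea in the description concrete, here is a minimal, self-contained sketch of a hand-built operator tree that is executed and checked without any SQL statement or planner. It does not use Drill's classes: ToyOperator, ToyScan, ToyFilter, ToyUnionAll and ToyMiniPlan are hypothetical stand-ins invented for illustration, and rows are plain integers rather than record batches with a schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for an operator: returns batches until exhausted (null). */
interface ToyOperator {
  List<Integer> nextBatch();
}

/** Leaf operator that replays fixed batches, playing the role of a scan. */
final class ToyScan implements ToyOperator {
  private final Iterator<List<Integer>> batches;

  ToyScan(List<List<Integer>> batches) {
    this.batches = batches.iterator();
  }

  @Override
  public List<Integer> nextBatch() {
    return batches.hasNext() ? batches.next() : null;
  }
}

/** Keeps only the rows of each child batch that satisfy the condition. */
final class ToyFilter implements ToyOperator {
  private final ToyOperator child;
  private final Predicate<Integer> condition;

  ToyFilter(ToyOperator child, Predicate<Integer> condition) {
    this.child = child;
    this.condition = condition;
  }

  @Override
  public List<Integer> nextBatch() {
    List<Integer> in = child.nextBatch();
    if (in == null) {
      return null;
    }
    List<Integer> out = new ArrayList<>();
    for (Integer row : in) {
      if (condition.test(row)) {
        out.add(row);
      }
    }
    return out;
  }
}

/** Emits every batch of the left child, then every batch of the right child. */
final class ToyUnionAll implements ToyOperator {
  private final ToyOperator left;
  private final ToyOperator right;

  ToyUnionAll(ToyOperator left, ToyOperator right) {
    this.left = left;
    this.right = right;
  }

  @Override
  public List<Integer> nextBatch() {
    List<Integer> batch = left.nextBatch();
    return batch != null ? batch : right.nextBatch();
  }
}

final class ToyMiniPlan {
  /** Builds the tree by hand: UnionAll(Filter(Scan), Scan). No planner is involved. */
  static ToyOperator build() {
    ToyOperator left = new ToyFilter(
        new ToyScan(Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5))),
        v -> v % 2 == 1);
    ToyOperator right = new ToyScan(Arrays.asList(Arrays.asList(10, 20)));
    return new ToyUnionAll(left, right);
  }

  /** Pulls batches from the root until it is exhausted and collects all rows. */
  static List<Integer> drain(ToyOperator root) {
    List<Integer> rows = new ArrayList<>();
    for (List<Integer> b = root.nextBatch(); b != null; b = root.nextBatch()) {
      rows.addAll(b);
    }
    return rows;
  }

  public static void main(String[] args) {
    System.out.println(drain(build()));
  }
}
```

Because the tree is assembled directly, a planner change cannot alter the shape of the plan under test, which is the dependency the description wants to avoid.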



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008927#comment-16008927
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/823
  
+1




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008878#comment-16008878
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r11670
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java
 ---
@@ -0,0 +1,439 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import mockit.NonStrictExpectations;
+import org.apache.drill.DrillTestWrapper;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.rpc.NamedThreadFactory;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator;
+import org.apache.drill.exec.store.parquet.ParquetReaderUtility;
+import org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader;
+import org.apache.drill.exec.util.TestUtilities;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.hadoop.CodecFactory;
+import org.apache.parquet.hadoop.ParquetFileReader;
+import org.apache.parquet.hadoop.metadata.ParquetMetadata;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+import static org.apache.drill.exec.physical.unit.TestMiniPlan.fs;
+
+/**
+ * A MiniPlanUnitTestBase extends PhysicalOpUnitTestBase to construct a MiniPlan (aka plan fragment)
+ * in the form of a physical operator tree, and to verify both the expected schema and output row results.
+ * Steps to construct a unit test:
+ * 1. Call PopBuilder / ScanPopBuilder to construct the MiniPlan.
+ * 2. Create a MiniPlanTestBuilder, and specify the expected schema and baseline values, or specify
+ * that no batch is expected.
+ */
+
+public class MiniPlanUnitTestBase extends PhysicalOpUnitTestBase {
+
+  private final ExecutorService scanExecutor = Executors.newFixedThreadPool(2, new NamedThreadFactory("scan-"));
+
+  public class MiniPlanTestBuilder {
--- End diff --

Changed this to a static class. 
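
For readers less familiar with the distinction behind this comment, the following is a minimal sketch of an inner class versus a static nested class in Java. OuterBase, InnerBuilder, NestedBuilder and NestedClassDemo are hypothetical names used only for illustration; they are not the classes from the pull request.

```java
/** Hypothetical outer class standing in for a test base class. */
class OuterBase {
  private final String name;

  OuterBase(String name) {
    this.name = name;
  }

  /** Inner (non-static) class: each instance holds a hidden reference to an OuterBase. */
  class InnerBuilder {
    String describe() {
      return "inner of " + name; // reads the enclosing instance's field
    }
  }

  /** Static nested class: no enclosing instance, so outer state is passed in explicitly. */
  static class NestedBuilder {
    private final String owner;

    NestedBuilder(String owner) {
      this.owner = owner;
    }

    String describe() {
      return "nested for " + owner;
    }
  }
}

class NestedClassDemo {
  static String inner() {
    OuterBase outer = new OuterBase("plan");
    // An inner class can only be created through an existing outer instance.
    OuterBase.InnerBuilder builder = outer.new InnerBuilder();
    return builder.describe();
  }

  static String nested() {
    // A static nested class is created like any top-level class.
    return new OuterBase.NestedBuilder("plan").describe();
  }

  public static void main(String[] args) {
    System.out.println(inner());
    System.out.println(nested());
  }
}
```

A static nested class is usually preferred for builders that do not need the enclosing object's state, since it avoids the implicit outer reference.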



[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008877#comment-16008877
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r116336861
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
--- End diff --

Added. 




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008879#comment-16008879
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r116336012
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
+
+  protected static DrillFileSystem fs;
+
+  @BeforeClass
+  public static void initFS() throws Exception {
+    Configuration conf = new Configuration();
+    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "local");
+    fs = new DrillFileSystem(conf);
+  }
+
+  @Test
+  @Ignore("A bug in JsonRecordReader handling empty file")
--- End diff --

Good points. Added JIRA numbers for the two disabled test cases. Will 
enable the test cases when those JIRAs are fixed. 





[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008881#comment-16008881
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r116337155
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
+
+  protected static DrillFileSystem fs;
+
+  @BeforeClass
+  public static void initFS() throws Exception {
+    Configuration conf = new Configuration();
+    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "local");
--- End diff --

Modified per review comment. Also modified several existing test cases that 
used "local". 




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008880#comment-16008880
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r116335475
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/PhysicalOpUnitTestBase.java
 ---
@@ -82,6 +81,8 @@
  * Look! Doesn't extend BaseTestQuery!!
  */
 public class PhysicalOpUnitTestBase extends ExecTest {
+  public static long INIT_ALLOCATION = 10_000_000l;
+  public static long MAX_ALLOCATION = 15_000_000L;
--- End diff --

I adopted the suggested code refactoring in AbstractBase, and now use the two 
constant values defined there. 

I also set maxAllocation to 15M for 
BasicPhysicalOpUnitTest.testExternalSort(), since 15M was the original value 
Jason used for that test case, and the default of 10M would cause an OOM 
there. 
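
As a rough illustration of the pattern described in this reply (a shared default limit with a per-test override), here is a minimal sketch. ToyOpTestBuilder and its methods are hypothetical and are not the Drill test builder API; the 10M default and the 15M override are taken from the comment above, and the 12M request is an arbitrary example value.

```java
/** Hypothetical test builder with a default memory limit that a single test can raise. */
final class ToyOpTestBuilder {
  /** Default limit in bytes; "10M" as mentioned in the review reply. */
  static final long DEFAULT_MAX_ALLOCATION = 10_000_000L;

  private long maxAllocation = DEFAULT_MAX_ALLOCATION;

  /** Overrides the limit for one test and returns the builder for chaining. */
  ToyOpTestBuilder maxAllocation(long bytes) {
    this.maxAllocation = bytes;
    return this;
  }

  /** Returns true if a request of the given size fits under the configured limit. */
  boolean canAllocate(long requestedBytes) {
    return requestedBytes <= maxAllocation;
  }

  public static void main(String[] args) {
    long sortNeeds = 12_000_000L; // arbitrary size between the two limits
    System.out.println(new ToyOpTestBuilder().canAllocate(sortNeeds));
    System.out.println(new ToyOpTestBuilder().maxAllocation(15_000_000L).canAllocate(sortNeeds));
  }
}
```

Keeping the default in one constant and overriding it only where a test needs more memory keeps the exception visible at the call site.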





[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008799#comment-16008799
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r11618
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java
 ---
@@ -0,0 +1,439 @@
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import mockit.NonStrictExpectations;
+import org.apache.drill.DrillTestWrapper;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.rpc.NamedThreadFactory;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator;
+import org.apache.drill.exec.store.parquet.ParquetReaderUtility;
+import org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader;
+import org.apache.drill.exec.util.TestUtilities;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.hadoop.CodecFactory;
+import org.apache.parquet.hadoop.ParquetFileReader;
+import org.apache.parquet.hadoop.metadata.ParquetMetadata;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+import static org.apache.drill.exec.physical.unit.TestMiniPlan.fs;
+
+/**
+ * A MiniPlanUnitTestBase extends PhysicalOpUnitTestBase to construct a MiniPlan (aka plan fragment)
+ * in the form of a physical operator tree, and to verify both the expected schema and output row results.
+ * Steps to construct a unit test:
+ * 1. Call PopBuilder / ScanPopBuilder to construct the MiniPlan.
+ * 2. Create a MiniPlanTestBuilder, and specify the expected schema and baseline values, or specify
+ * that no batch is expected.
+ */
+
+public class MiniPlanUnitTestBase extends PhysicalOpUnitTestBase {
--- End diff --

Thanks for the suggestion. I'll leave it as a future improvement, since it 
requires refactoring the new class as well as the existing physical operator 
test framework. 



[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999110#comment-15999110
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115103846
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
+
+  protected static DrillFileSystem fs;
+
+  @BeforeClass
+  public static void initFS() throws Exception {
+    Configuration conf = new Configuration();
+    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "local");
--- End diff --

I believe the modern way to specify the default file system is to pass 
`FileSystem.DEFAULT_FS` (that is, `file:///`) as the value, not the legacy `"local"`.


> Extend physical operator test framework to test mini plans consisting of 
> multiple operators
> ---
>
> Key: DRILL-5459
> URL: https://issues.apache.org/jira/browse/DRILL-5459
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> DRILL-4437 introduced a unit test framework to test a non-scan physical 
> operator. A JSON reader is implicitly used to specify the inputs to the 
> physical operator under test. 
> This unit test framework needs to be extended for two scenarios:
> 1. We need a way to test the scan operator with different record readers. Drill 
> supports a variety of data sources, and it's important to make sure every 
> record reader works properly according to the defined protocol.
> 2. We need a way to test a so-called mini-plan (aka plan fragment) consisting 
> of multiple non-scan operators. 
> For the second need, an alternative is to rely on SQL statements and the query 
> planner. However, that approach depends directly on the query planner: 1) 
> any planner change may affect the test case and lead to a different plan, and 2) 
> it's not always easy to force the planner to produce the desired plan fragment 
> for testing.
> In particular, it would be good to have a relatively easy way to specify a 
> mini-plan with a couple of targeted physical operators. 
> This JIRA is created to track the work to extend the unit test framework in 
> DRILL-4437.
>  
> Related work: DRILL-5318 introduced a sub-operator test fixture, which mainly 
> targets testing at the sub-operator level. The framework in DRILL-4437 and this 
> extension focus on the operator level, or multiple operator levels, where 
> execution goes through RecordBatch's API calls. 
> As in DRILL-4437, we are going to use JMockit to mock required objects such as 
> the fragment context, operator context, etc. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999108#comment-15999108
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115103744
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
--- End diff --

Maybe a Javadoc comment to say what this thing tests? It tests the mini 
plan, but in what way?




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999107#comment-15999107
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115102896
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java
 ---
@@ -0,0 +1,439 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import mockit.NonStrictExpectations;
+import org.apache.drill.DrillTestWrapper;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.rpc.NamedThreadFactory;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator;
+import org.apache.drill.exec.store.parquet.ParquetReaderUtility;
+import org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader;
+import org.apache.drill.exec.util.TestUtilities;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.hadoop.CodecFactory;
+import org.apache.parquet.hadoop.ParquetFileReader;
+import org.apache.parquet.hadoop.metadata.ParquetMetadata;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+import static org.apache.drill.exec.physical.unit.TestMiniPlan.fs;
+
+/**
+ * A MiniPlanUnitTestBase extends PhysicalOpUnitTestBase, to construct MiniPlan (aka plan fragment).
+ * in the form of physical operator tree, and verify both the expected schema and output row results.
+ * Steps to construct a unit:
+ * 1. Call PopBuilder / ScanPopBuilder to construct the MiniPlan
+ * 2. Create a MiniPlanTestBuilder, and specify the expected schema and base line values, or if there
+ * is no batch expected.
+ */
+
+public class MiniPlanUnitTestBase extends PhysicalOpUnitTestBase {
--- End diff --

While there is nothing wrong with putting this stuff in a test base class, 
it is constraining. A test can have only one superclass. This makes it hard to 
create a hybrid test using different frameworks.

Not necessary for this PR, but a future improvement would be to move this 
into a "fixture" that can be used in any test class.
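
To make the suggestion concrete, here is a minimal sketch of the fixture idea in plain Java. Every name in it (`MiniPlanFixture`, `OtherFrameworkBase`, `HybridTest`) is hypothetical and stands in for real test infrastructure; none of it is Drill's API. The point is only that a fixture is composed into a test, so the test remains free to extend some other base class.

```java
import java.util.ArrayList;
import java.util.List;

final class FixtureSketch {

  /** Hypothetical fixture: owns the setup a base class would otherwise provide. */
  static final class MiniPlanFixture implements AutoCloseable {
    private final List<String> log = new ArrayList<>();
    private boolean closed;

    MiniPlanFixture() {
      log.add("setup");
    }

    /** Stand-in for running a plan; it only records the request. */
    String run(String planName) {
      if (closed) {
        throw new IllegalStateException("fixture already closed");
      }
      log.add("run:" + planName);
      return planName + ":ok";
    }

    List<String> log() {
      return log;
    }

    boolean isClosed() {
      return closed;
    }

    @Override
    public void close() {
      closed = true;
      log.add("teardown");
    }
  }

  /** Some unrelated (hypothetical) base class the test has to extend. */
  static class OtherFrameworkBase {
    String cluster() {
      return "cluster";
    }
  }

  /** The test extends one framework's base class and composes the fixture. */
  static final class HybridTest extends OtherFrameworkBase {
    String runHybrid(MiniPlanFixture fixture) {
      return cluster() + "/" + fixture.run("union-filter");
    }
  }

  public static void main(String[] args) {
    MiniPlanFixture fixture = new MiniPlanFixture();
    // try-with-resources guarantees teardown even if the test body throws.
    try (MiniPlanFixture f = fixture) {
      System.out.println(new HybridTest().runHybrid(f));
    }
    System.out.println(fixture.log());
  }
}
```

With inheritance, `HybridTest` could not use both frameworks at once; with composition the fixture is just another object the test creates and closes.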



[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999112#comment-15999112
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115102970
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java
 ---
@@ -0,0 +1,439 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import mockit.NonStrictExpectations;
+import org.apache.drill.DrillTestWrapper;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.rpc.NamedThreadFactory;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator;
+import org.apache.drill.exec.store.parquet.ParquetReaderUtility;
+import org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader;
+import org.apache.drill.exec.util.TestUtilities;
+import org.apache.hadoop.fs.Path;
+import org.apache.parquet.hadoop.CodecFactory;
+import org.apache.parquet.hadoop.ParquetFileReader;
+import org.apache.parquet.hadoop.metadata.ParquetMetadata;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+import static org.apache.drill.exec.physical.unit.TestMiniPlan.fs;
+
+/**
+ * A MiniPlanUnitTestBase extends PhysicalOpUnitTestBase, to construct MiniPlan (aka plan fragment).
+ * in the form of physical operator tree, and verify both the expected schema and output row results.
+ * Steps to construct a unit:
+ * 1. Call PopBuilder / ScanPopBuilder to construct the MiniPlan
+ * 2. Create a MiniPlanTestBuilder, and specify the expected schema and base line values, or if there
+ * is no batch expected.
+ */
+
+public class MiniPlanUnitTestBase extends PhysicalOpUnitTestBase {
+
+  private final ExecutorService scanExecutor = Executors.newFixedThreadPool(2, new NamedThreadFactory("scan-"));
+
+  public class MiniPlanTestBuilder {
--- End diff --

Can this class be static so it can be used from other places rather than 
just subclasses of this class?

Again, not critical for this PR, but a future improvement.
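
For readers less familiar with the distinction, the following is a small illustrative sketch in plain Java. The class names (`TestBase`, `InnerBuilder`, `StaticBuilder`) are hypothetical and are not the classes in this PR. It shows why a non-static inner builder is tied to an instance of its enclosing class, while a static nested builder can be created by any caller.

```java
final class NestedBuilderSketch {

  /** Stand-in for a test base class (hypothetical, not Drill's). */
  static class TestBase {
    private final String name;

    TestBase(String name) {
      this.name = name;
    }

    /** Inner (non-static) builder: implicitly bound to an enclosing TestBase. */
    class InnerBuilder {
      String go() {
        return "inner built by " + name;
      }
    }

    /** Static nested builder: everything it needs is passed in explicitly. */
    static class StaticBuilder {
      private String owner = "nobody";

      StaticBuilder owner(String owner) {
        this.owner = owner;
        return this;
      }

      String go() {
        return "static built for " + owner;
      }
    }
  }

  public static void main(String[] args) {
    // An inner class can only be instantiated through an outer instance.
    TestBase base = new TestBase("base");
    TestBase.InnerBuilder inner = base.new InnerBuilder();
    System.out.println(inner.go());

    // A static nested class needs no TestBase instance at all.
    System.out.println(new TestBase.StaticBuilder().owner("any caller").go());
  }
}
```

The trade-off is that a static builder loses implicit access to the enclosing test's state, so anything it relies on would have to be handed to it through its own methods or constructor.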



[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999109#comment-15999109
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115103872
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
+
+  protected static DrillFileSystem fs;
+
+  @BeforeClass
+  public static void initFS() throws Exception {
+    Configuration conf = new Configuration();
+    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "local");
+    fs = new DrillFileSystem(conf);
+  }
+
+  @Test
+  @Ignore("A bug in JsonRecordReader handling empty file")
--- End diff --

Could you file a JIRA for that bug and cite its ID in this `@Ignore` message?




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999113#comment-15999113
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115103967
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/TestMiniPlan.java
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.physical.unit;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.apache.drill.exec.physical.config.Filter;
+import org.apache.drill.exec.physical.config.UnionAll;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.test.rowSet.SchemaBuilder;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class TestMiniPlan extends MiniPlanUnitTestBase {
+
+  protected static DrillFileSystem fs;
+
+  @BeforeClass
+  public static void initFS() throws Exception {
+    Configuration conf = new Configuration();
+    conf.set(FileSystem.FS_DEFAULT_NAME_KEY, "local");
+    fs = new DrillFileSystem(conf);
+  }
+
+  @Test
+  @Ignore("A bug in JsonRecordReader handling empty file")
+  public void testEmptyInput() throws Exception {
+    String emptyFile = FileUtils.getResourceAsFile("/project/pushdown/empty.json").toURI().toString();
+
+    RecordBatch scanBatch = new JsonScanBuilder()
+        .fileSystem(fs)
+        .inputPaths(Lists.newArrayList(emptyFile))
+        .build();
+
+    new MiniPlanTestBuilder()
+        .root(scanBatch)
+        .expectZeroBatch(true)
+        .go();
+  }
+
+  @Test
+  public void testSimpleParquetScan() throws Exception {
+    String file = FileUtils.getResourceAsFile("/tpchmulti/region/01.parquet").toURI().toString();
+
+    RecordBatch scanBatch = new ParquetScanBuilder()
+        .fileSystem(fs)
+        .columnsToRead("R_REGIONKEY")
+        .inputPaths(Lists.newArrayList(file))
+        .build();
+
+    BatchSchema expectedSchema = new SchemaBuilder()
+        .add("R_REGIONKEY", TypeProtos.MinorType.BIGINT)
+        .build();
+
+    new MiniPlanTestBuilder()
+        .root(scanBatch)
+        .expectedSchema(expectedSchema)
+        .baselineValues(0L)
+        .baselineValues(1L)
+        .go();
+  }
+
+  @Test
+  public void testSimpleJson() throws Exception {
+    List<String> jsonBatches = Lists.newArrayList(
+        "{\"a\":100}"
+    );
+
+    RecordBatch scanBatch = new JsonScanBuilder()
+        .jsonBatches(jsonBatches)
+        .build();
+
+    BatchSchema expectedSchema = new SchemaBuilder()
+        .addNullable("a", TypeProtos.MinorType.BIGINT)
+        .build();
+
+    new MiniPlanTestBuilder()
+        .root(scanBatch)
+        .expectedSchema(expectedSchema)
+        .baselineValues(100L)
+        .go();
+  }
+
+  @Test
+  public void testUnionFilter() throws Exception {
+    List<String> leftJsonBatches = Lists.newArrayList(
+        "[{\"a\": 5, \"b\" : 1 }]",
+        "[{\"a\": 5, \"b\" : 5},{\"a\": 3, \"b\" : 8}]",
+        "[{\"a\": 40, \"b\" : 3},{\"a\": 13, \"b\" : 100}]");
+
+    List<String> rightJsonBatches = Lists.newArrayList(
+        "[{\"a\": 5, \"b\" : 10 }]",
+        "[{\"a\": 50, \"b\" : 100}]");
+
+    RecordBatch batch = new PopBuilder()
+

[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999106#comment-15999106
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115102606
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/DrillTestWrapper.java ---
@@ -307,19 +308,43 @@ public void close() throws Exception {
   }
 
   /**
+   * Iterate over batches, and combine the batches into a map, where key is schema path, and value is
+   * the list of column values across all the batches.
--- End diff --

Thanks for adding the Javadoc comments! Very helpful.
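
As an illustration of what the quoted Javadoc describes, here is a rough sketch in plain Java. It is not the `DrillTestWrapper` code: each batch is modelled as a simple map from column path to that batch's values, and the method name `combine` is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CombineBatchesSketch {

  /**
   * Appends every batch's column values to one list per column.
   * Each batch maps a column path to the values it holds for that column.
   */
  static Map<String, List<Object>> combine(Iterable<Map<String, List<Object>>> batches) {
    Map<String, List<Object>> combined = new LinkedHashMap<>();
    for (Map<String, List<Object>> batch : batches) {
      for (Map.Entry<String, List<Object>> column : batch.entrySet()) {
        combined.computeIfAbsent(column.getKey(), k -> new ArrayList<>())
            .addAll(column.getValue());
      }
    }
    return combined;
  }

  public static void main(String[] args) {
    Map<String, List<Object>> first = new LinkedHashMap<>();
    first.put("a", Arrays.<Object>asList(5L, 3L));
    first.put("b", Arrays.<Object>asList(1L, 8L));

    Map<String, List<Object>> second = new LinkedHashMap<>();
    second.put("a", Arrays.<Object>asList(40L));
    second.put("b", Arrays.<Object>asList(100L));

    // prints {a=[5, 3, 40], b=[1, 8, 100]}
    System.out.println(combine(Arrays.asList(first, second)));
  }
}
```

Keeping insertion order (`LinkedHashMap`) and batch order makes the combined columns line up row by row, which is what a baseline comparison would rely on.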




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999111#comment-15999111
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/823#discussion_r115103600
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/PhysicalOpUnitTestBase.java
 ---
@@ -82,6 +81,8 @@
  * Look! Doesn't extend BaseTestQuery!!
  */
 public class PhysicalOpUnitTestBase extends ExecTest {
+  public static long INIT_ALLOCATION = 10_000_000l;
+  public static long MAX_ALLOCATION = 15_000_000L;
--- End diff --

Should we use the values already defined in `AbstractBase`? To do that, 
perhaps refactor the code from:

```
public abstract class AbstractBase implements PhysicalOperator{
  ...
  protected long initialAllocation = 1_000_000L;
  protected long maxAllocation = 10_000_000_000L;
```

To:

```
public abstract class AbstractBase implements PhysicalOperator {
  ...
  public static final long INIT_ALLOCATION = 1_000_000L;
  public static final long MAX_ALLOCATION = 10_000_000_000L;
  protected long initialAllocation = INIT_ALLOCATION;
  protected long maxAllocation = MAX_ALLOCATION;
```

Actually, I think I did that in my much-delayed external sort PR for unit 
testing...
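
A self-contained sketch of how that refactoring might be consumed from test code follows. The classes here (`OperatorBase`, `FilterLikeOperator`, `TestAllocator`) are hypothetical stand-ins rather than Drill's `AbstractBase` or its test base class; only the two numeric values are taken from the snippet above, and marking the constants `final` is an assumption of this sketch.

```java
final class AllocationDefaultsSketch {

  /** Stand-in for the operator base class; values copied from the review comment. */
  abstract static class OperatorBase {
    public static final long INIT_ALLOCATION = 1_000_000L;
    public static final long MAX_ALLOCATION = 10_000_000_000L;

    // Instance fields start from the shared defaults but stay overridable.
    protected long initialAllocation = INIT_ALLOCATION;
    protected long maxAllocation = MAX_ALLOCATION;

    long getInitialAllocation() {
      return initialAllocation;
    }

    long getMaxAllocation() {
      return maxAllocation;
    }
  }

  /** A concrete operator that keeps the defaults. */
  static final class FilterLikeOperator extends OperatorBase { }

  /** Test-side helper that reuses the shared constants instead of redefining them. */
  static final class TestAllocator {
    final long init = OperatorBase.INIT_ALLOCATION;
    final long max = OperatorBase.MAX_ALLOCATION;
  }

  public static void main(String[] args) {
    FilterLikeOperator op = new FilterLikeOperator();
    TestAllocator allocator = new TestAllocator();
    // Both sides now agree by construction.
    System.out.println(op.getInitialAllocation() == allocator.init);
    System.out.println(op.getMaxAllocation() == allocator.max);
  }
}
```

With a single definition, the test framework and the operators cannot drift apart the way two independently maintained pairs of literals can.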




[jira] [Commented] (DRILL-5459) Extend physical operator test framework to test mini plans consisting of multiple operators

2017-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992401#comment-15992401
 ] 

ASF GitHub Bot commented on DRILL-5459:
---

GitHub user jinfengni opened a pull request:

https://github.com/apache/drill/pull/823

DRILL-5459: Extend physical operator test framework to test mini plans consisting of multiple operators.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jinfengni/incubator-drill DRILL-5459

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/823.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #823


commit 100d78791d7a10c5b791c4cc660a57b1d9f3a0d3
Author: Jinfeng Ni 
Date:   2017-04-22T00:34:15Z

DRILL-5459: Extend physical operator test framework to test mini plans 
consisting of multiple operators.






