[jira] [Commented] (DRILL-6373) Refactor the Result Set Loader to prepare for Union, List support

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536535#comment-16536535
 ] 

ASF GitHub Bot commented on DRILL-6373:
---

paul-rogers edited a comment on issue #1244: DRILL-6373: Refactor Result Set 
Loader for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-403358757
 
 
   Rebased on latest master. Cherry-picked the fix from DRILL-6585 to this 
branch.
   
   @vrozov or @ppadma - can you kick off a pre-commit build to see if the 
DRILL-6585 fix resolves the failure we saw earlier? Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Refactor the Result Set Loader to prepare for Union, List support
> -
>
> Key: DRILL-6373
> URL: https://issues.apache.org/jira/browse/DRILL-6373
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6585) PartitionSender clones vectors, but shares field metadata

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536533#comment-16536533
 ] 

ASF GitHub Bot commented on DRILL-6585:
---

paul-rogers commented on issue #1367: DRILL-6585: PartitionSender clones 
vectors, but shares field metadata
URL: https://github.com/apache/drill/pull/1367#issuecomment-403358142
 
 
   @vrozov, please review. This is the fix for the issue we discussed a few 
weeks ago. 




> PartitionSender clones vectors, but shares field metadata
> 
>
> Key: DRILL-6585
> URL: https://issues.apache.org/jira/browse/DRILL-6585
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
>
> See the discussion for [PR #1244 for 
> DRILL-6373|https://github.com/apache/drill/pull/1244].
> The PartitionSender clones vectors. But, it does so by reusing the 
> {{MaterializedField}} from the original vector. Though the original authors 
> of {{MaterializedField}} apparently meant it to be immutable, later changes 
> for maps and unions ended up changing it to add members.
> When cloning a map, we get the original map materialized field, then start 
> doctoring it up as we add the cloned map members. This screws up the original 
> map vector's metadata.
> The solution is to clone an empty version of the materialized field when 
> creating a new vector.
> However, much code creates vectors from a perfectly valid, unique 
> materialized field, so we want to add a new method for the ill-behaved 
> callers, such as PartitionSender, that create a new vector from an existing 
> vector's materialized field without cloning it.





[jira] [Commented] (DRILL-6585) PartitionSender clones vectors, but shares field metadata

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536528#comment-16536528
 ] 

ASF GitHub Bot commented on DRILL-6585:
---

paul-rogers opened a new pull request #1367: DRILL-6585: PartitionSender clones 
vectors, but shares field metadata
URL: https://github.com/apache/drill/pull/1367
 
 
   See the discussion for [PR #1244 for 
DRILL-6373](https://github.com/apache/drill/pull/1244).
   
   The PartitionSender clones vectors. But, it does so by reusing the 
MaterializedField from the original vector. Though the original authors of 
MaterializedField apparently meant it to be immutable, later changes for maps 
and unions ended up changing it to add members.
   
   When cloning a map, we get the original map materialized field, then start 
doctoring it up as we add the cloned map members. This screws up the original 
map vector's metadata.
   
   The fix is to clone an empty version of the materialized field when 
creating a new vector.
   
   However, much code creates vectors from a perfectly valid, unique 
materialized field, so that path is left alone. Instead, a new method, 
`TypeHelper.getClonedVector()`, handles the "new vector from the materialized 
field of an existing vector" case for ill-behaved callers such as 
PartitionSender, which previously created the new vector without cloning the 
field. Modified the partition sender to use this method.
   
   Moved an existing method to group the "new vector" functions.




[jira] [Created] (DRILL-6585) PartitionSender clones vectors, but shares field metadata

2018-07-08 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6585:
--

 Summary: PartitionSender clones vectors, but shares field metadata
 Key: DRILL-6585
 URL: https://issues.apache.org/jira/browse/DRILL-6585
 Project: Apache Drill
 Issue Type: Bug
 Affects Versions: 1.13.0
 Reporter: Paul Rogers
 Assignee: Paul Rogers


See the discussion for [PR #1244 for 
DRILL-6373|https://github.com/apache/drill/pull/1244].

The PartitionSender clones vectors. But, it does so by reusing the 
{{MaterializedField}} from the original vector. Though the original authors of 
{{MaterializedField}} apparently meant it to be immutable, later changes for 
maps and unions ended up changing it to add members.

When cloning a map, we get the original map materialized field, then start 
doctoring it up as we add the cloned map members. This screws up the original 
map vector's metadata.

The solution is to clone an empty version of the materialized field when 
creating a new vector.

However, much code creates vectors from a perfectly valid, unique 
materialized field, so we want to add a new method for the ill-behaved 
callers, such as PartitionSender, that create a new vector from an existing 
vector's materialized field without cloning it.





[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536465#comment-16536465
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860634
 
 

 ##
 File path: distribution/Dockerfile
 ##
 @@ -0,0 +1,35 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM centos:7
+
+# Project version defined in pom.xml is passed as an argument
+ARG VERSION
+
+# JDK 8 is a pre-requisite to run Drill ; 'which' package is needed for drill-config.sh
+RUN yum install -y java-1.8.0-openjdk-devel which ; yum clean all ; rm -rf /var/cache/yum
+
+# The drill tarball is generated upon building the Drill project
+COPY target/apache-drill-$VERSION.tar.gz /tmp
+
+# Drill binaries are extracted into the '/opt/drill' directory
+RUN mkdir /opt/drill
+RUN tar -xvzf /tmp/apache-drill-$VERSION.tar.gz --directory=/opt/drill --strip-components 1
+
+# Starts Drill in embedded mode and connects to Sqlline
+ENTRYPOINT /opt/drill/bin/drill-embedded
 
 Review comment:
   How best to configure a true Drill server? Would the user really want to 
bind the ZK address, say, at container build time (as we suggested above, by 
copying in the user's `drill-override.conf` file)?
   
   Or, should we think about how to pass in config? In K8s, config can be 
passed in via config maps and mounted into the proper location.
   
   Doing so, however, points out a flaw in the site directory structure: config 
files go into `$DRILL_SITE` but jars go into `$DRILL_SITE/jars`. If 
`$DRILL_SITE` is mounted from a config map, we hide the jars.
   
   We may need to modify the site directory so that the scripts first look in 
`$DRILL_SITE/conf`, then in `$DRILL_SITE`. In this case, the config map would 
be mounted into `$DRILL_SITE/conf`.
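
The lookup order proposed here could be sketched roughly as follows. This is only an illustration under stated assumptions: `find_site_config` is a hypothetical helper, not a function in Drill's launch scripts, and the demo runs against a throwaway directory rather than a real site directory.

```shell
#!/bin/sh
# Sketch of the proposed lookup order: prefer $DRILL_SITE/conf, then fall
# back to $DRILL_SITE. find_site_config is a hypothetical helper and is not
# part of Drill's scripts.
find_site_config() {
  site_dir="$1"   # value of $DRILL_SITE
  file_name="$2"  # e.g. drill-override.conf
  for dir in "$site_dir/conf" "$site_dir"; do
    if [ -f "$dir/$file_name" ]; then
      printf '%s\n' "$dir/$file_name"
      return 0
    fi
  done
  return 1  # not found in either location
}

# Demo against a throwaway site directory.
demo_site="$(mktemp -d)"
mkdir -p "$demo_site/conf" "$demo_site/jars"

# Existing layout: the config file sits directly in the site directory.
touch "$demo_site/drill-override.conf"
legacy_hit="$(find_site_config "$demo_site" drill-override.conf)"

# Proposed layout: a config map mounted at conf/ takes precedence.
touch "$demo_site/conf/drill-override.conf"
mounted_hit="$(find_site_config "$demo_site" drill-override.conf)"

echo "site directory only: $legacy_hit"
echo "conf/ also present:  $mounted_hit"
rm -rf "$demo_site"
```

With this order, mounting a config map at `conf/` would leave `jars/` visible, and existing site directories without a `conf/` subdirectory would keep working.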
   




> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Timothy Farkas
> Assignee: Abhishek Girish
> Priority: Major
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>






[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536468#comment-16536468
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860402
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   We've all learned that Drill logs are essential to figure out what went 
wrong. As configured, these will end up in `/opt/drill/log` in the container's 
writeable layer. The data is lost when the container exits.
   
   Might want to configure the container to write logs to, say, 
`/var/log/drill`, then encourage the user to do a bind or volume mount to that 
location in order to persist logs after the container exits.
   
   Alternatively, for use in a system such as K8s, configure the logs to write 
to stdout so that the K8s log system can capture the log output, display it to 
the user (`kubectl logs `) or route it to the log aggregation system.
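
A minimal sketch of the bind-mount option, assuming the image's launch scripts honor a `DRILL_LOG_DIR` environment variable. The image tag `drill-image:example` and the host path are placeholders, not names from this pull request. The script only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Placeholders, not taken from the pull request.
IMAGE="${IMAGE:-drill-image:example}"            # hypothetical image tag
HOST_LOG_DIR="${HOST_LOG_DIR:-$PWD/drill-logs}"  # host directory that outlives the container
CONTAINER_LOG_DIR="/var/log/drill"               # location suggested in the review

# Assemble the command: bind-mount the host directory over the container's
# log directory and point Drill at it (assumes the scripts read DRILL_LOG_DIR).
# Paths containing spaces would need extra quoting.
run_cmd="docker run -it -v $HOST_LOG_DIR:$CONTAINER_LOG_DIR -e DRILL_LOG_DIR=$CONTAINER_LOG_DIR $IMAGE"

# Print rather than execute, so the command can be reviewed first.
echo "$run_cmd"
```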




[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536470#comment-16536470
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860792
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   The `Dockerfile` does not expose ports:
   
   ```
   EXPOSE 8047/tcp
   ```
   
   The above is for the web console port. Should we expose others?




[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536472#comment-16536472
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860927
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   Should we provide a `README.md` file? Explain that this particular 
`Dockerfile`:
   
   * Runs an embedded Drillbit
   * Works only for data already on the class path
   * That the user must ssh(?) into the container?
   
   Explain how to do additional configuration (as discussed in other comments).
   
   Explain how to create a production `Dockerfile`.
   
   Otherwise, the user must be a bit of a Docker and Drill expert to work out 
what would be required, and it won't be clear what goal this particular file 
is trying to achieve.




[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536469#comment-16536469
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200861000
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   The entry point launches an embedded Drillbit. That in turn fires up Sqlline 
which has an interactive console.
   
   It would seem that the only way to use this image is to launch it in an 
interactive session:
   ```
   docker run -it 
   ```
   
   This seems like a handy trick for trying Drill, but it is very limited for 
production. In any event, the `README.md` file should probably explain how to 
use the image.




[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536466#comment-16536466
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860239
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   This `Dockerfile` is only for an embedded Drillbit, which is fine for 
playing around.
   
   Should we also offer an example production `Dockerfile`? In such a file, the 
following would also be needed:
   
   * Copy in custom jars (UDFs, custom storage plugins).
   * Copy in custom libraries (PAM, etc.)
   * Specify custom config.
   
   Doing the above will be easier if the user's files are not placed in the 
`$DRILL_HOME/conf` and other directories. That is, we want to use the site 
directory feature.
   
   Maybe allow a `/opt/drill-site` location and pass `--site /opt/drill-site` 
on the command line for the entry point.
   
   Then, the `Dockerfile` can provide examples of how to copy the various kinds 
of files to the proper site directory locations:
   
   * `/opt/drill-site/` - `drill-override.conf`, `core-site.xml`, `logback.xml`
   * `/opt/drill-site/jars/` - UDF and storage-plugin jars
   * `/opt/drill-site/lib/` - native libraries
   
   Note that the functionality for Drill to find things in these site directory 
locations already exists. All we're doing here is showing the user how to use 
them.
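
As a rough illustration of that layout, the script below stages a site directory that a `Dockerfile` could then copy to `/opt/drill-site`. It is a sketch under assumptions: `stage_site` is a hypothetical helper, the file names are empty stand-ins, the sorting by file extension is a simplification, and the jar subdirectory is written as `jars`, the name used for `$DRILL_SITE/jars`.

```shell
#!/bin/sh
# stage_site is a hypothetical helper: it sorts a flat directory of user
# files into the site-directory layout by file extension.
stage_site() {
  src="$1"   # flat directory holding the user's files
  site="$2"  # site directory to create
  mkdir -p "$site/jars" "$site/lib"
  for f in "$src"/*; do
    [ -f "$f" ] || continue
    case "$f" in
      *.jar)        cp "$f" "$site/jars/" ;;  # UDF and storage-plugin jars
      *.so)         cp "$f" "$site/lib/"  ;;  # native libraries
      *.conf|*.xml) cp "$f" "$site/"      ;;  # drill-override.conf, core-site.xml, logback.xml
    esac
  done
}

# Demo with empty stand-in files.
src_dir="$(mktemp -d)"
site_dir="$(mktemp -d)/drill-site"
touch "$src_dir/drill-override.conf" "$src_dir/logback.xml" \
      "$src_dir/my-udfs.jar" "$src_dir/libexample.so"
stage_site "$src_dir" "$site_dir"

# List what landed where.
find "$site_dir" -type f | sort
```

The staged directory would then be selected at start-up with `--site /opt/drill-site`, as suggested above.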




[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536471#comment-16536471
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860662
 
 

 ##
 File path: distribution/Dockerfile
 ##
 
 Review comment:
   How will secrets be handled? MapR tickets or Kerberos certificates?
   
   What other configuration is needed?




> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>






[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536467#comment-16536467
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

paul-rogers commented on a change in pull request #1348: DRILL-6346: Create an 
Official Drill Docker Container
URL: https://github.com/apache/drill/pull/1348#discussion_r200860284
 
 

 ##
 File path: distribution/Dockerfile
 ##
 @@ -0,0 +1,35 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM centos:7
+
+# Project version defined in pom.xml is passed as an argument
+ARG VERSION
+
+# JDK 8 is a pre-requisite to run Drill ; 'which' package is needed for drill-config.sh
+RUN yum install -y java-1.8.0-openjdk-devel which ; yum clean all ; rm -rf /var/cache/yum
+
+# The drill tarball is generated upon building the Drill project
+COPY target/apache-drill-$VERSION.tar.gz /tmp
+
+# Drill binaries are extracted into the '/opt/drill' directory
+RUN mkdir /opt/drill
+RUN tar -xvzf /tmp/apache-drill-$VERSION.tar.gz --directory=/opt/drill --strip-components 1
 
 Review comment:
   A handy trick I learned is to do the container setup in a script, say 
`setup.sh`, that is copied into the container and run. This avoids creating a 
bunch of extra Docker layers as the set of steps gets larger.




> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>






[jira] [Created] (DRILL-6584) Implementing bitmap Indexes

2018-07-08 Thread mehran (JIRA)
mehran created DRILL-6584:
-

 Summary: Implementing bitmap Indexes
 Key: DRILL-6584
 URL: https://issues.apache.org/jira/browse/DRILL-6584
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: mehran


I understand that you have your own development priorities, and the support 
for multiple plugins for Drill connections is also appreciated.

Your default storage engine is Parquet, which is very good for its kind of 
purposes.

Would it be possible to bring forward the implementation of an index (roaring 
bitmap indexes, similar to Druid)?

Or at least to write a guideline for developing an index on Drill?

Full scans are one of Drill's big problems. If that were solved, Drill would 
be the best SQL engine and could replace databases in many use cases.

 

Thank you in advance.

 





[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536086#comment-16536086
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on issue #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#issuecomment-403284219
 
 
   @amansinha100 RuntimeFilterManager has been changed to support the left-deep 
tree case which you mentioned in the JIRA. Please also see the JIRA reply and 
review the updates. Thanks.




> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.14.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>
> This feature adds support for JPPD (Join Predicate Push Down). It will 
> improve HashJoin and Broadcast HashJoin performance by reducing the number 
> of rows sent across the network and the memory consumed. This feature is 
> already supported by Impala, which calls it RuntimeFilter 
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
>  The first PR will try to push down a bloom filter from the HashJoin node to 
> Parquet’s scan node. The proposed basic procedure is as follows:
>  # The HashJoin build side accumulates the rows of the equality join 
> condition to construct a bloom filter. It then sends the bloom filter to the 
> foreman node.
>  # The foreman node passively accepts the bloom filters from all the 
> fragments that have the HashJoin operator. It then aggregates the bloom 
> filters to form a global bloom filter.
>  # The foreman node broadcasts the global bloom filter to all the probe-side 
> scan nodes, which may already have sent partial data to the hash join nodes 
> (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node. 
> It will filter the remaining rows against the bloom filter.
>  
> To implement the execution flow above, the main new concepts are described below:
>       1. RuntimeFilter
> It is a filter container which may contain a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> It wraps the logic to send the hash join’s bloom filter to the foreman. The 
> serialized bloom filter will be sent through the data tunnel. This object 
> will be instantiated by the FragmentExecutor and passed to the 
> FragmentContext, so the HashJoin operator can obtain it through the 
> FragmentContext.
>       3. RuntimeFilterRequestHandler
> It is responsible for accepting a SendRuntimeFilterRequest RPC and extracting 
> the actual BloomFilter from the network. It then hands this filter to the 
> WorkerBee’s new interface, registerRuntimeFilter.
> Another RPC type is BroadcastRuntimeFilterRequest. It will register the 
> accepted global bloom filter with the WorkerBee through the 
> registerRuntimeFilter method. The filter is then propagated to the 
> FragmentContext, through which the probe-side scan node can fetch the 
> aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman will instantiate a RuntimeFilterManager. It will indirectly get 
> every RuntimeFilter through the WorkerBee. Once all the BloomFilters have 
> been accepted and aggregated, it will broadcast the aggregated bloom filter 
> to all the probe-side scan nodes through the data tunnel using a 
> BroadcastRuntimeFilterRequest RPC.
>       5. RuntimeFilterEnableOption
> A global option will be added to decide whether to enable this new feature.
>  
> Suggestions and advice are welcome. The related PR will be presented as 
> soon as possible.
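
The build, aggregate and probe steps described in the issue can be sketched in Java as follows. This is only an illustrative sketch under stated assumptions, not the code from the PR: the class `RuntimeBloomFilterSketch`, its methods, the use of plain `long` join keys, and the double-hashing scheme are all hypothetical and are not Drill's actual BloomFilter or RuntimeFilter API. It assumes every fragment builds its partial filter with the same size and hash count, so the foreman can combine them with a bitwise OR.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical, simplified bloom filter; NOT Drill's actual BloomFilter class. */
final class RuntimeBloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    RuntimeBloomFilterSketch(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(long key, int i) {
        int h1 = Long.hashCode(key);
        int h2 = Long.hashCode(key * 0x9E3779B97F4A7C15L + 1L);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Build side: record one join key. */
    void add(long key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** Probe side: false means the key was definitely never added. */
    boolean mightContain(long key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Foreman side: OR the partial filters from all fragments into a global one. */
    static RuntimeBloomFilterSketch aggregate(List<RuntimeBloomFilterSketch> partials) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("no partial filters to aggregate");
        }
        RuntimeBloomFilterSketch first = partials.get(0);
        RuntimeBloomFilterSketch global =
            new RuntimeBloomFilterSketch(first.numBits, first.numHashes);
        for (RuntimeBloomFilterSketch p : partials) {
            if (p.numBits != global.numBits || p.numHashes != global.numHashes) {
                throw new IllegalArgumentException("partial filters must share parameters");
            }
            global.bits.or(p.bits);
        }
        return global;
    }

    /** Scan side: keep only the probe keys that may have a match on the build side. */
    static long[] filterProbeKeys(long[] probeKeys, RuntimeBloomFilterSketch globalFilter) {
        return Arrays.stream(probeKeys).filter(globalFilter::mightContain).toArray();
    }
}
```

Because a bloom filter never produces false negatives, the scan can safely drop any row whose key fails `mightContain`. Rows that pass may still be false positives, so the hash join itself must still evaluate the join condition.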


