[jira] [Updated] (DRILL-4600) Document Public Compatibility Commitments

2016-04-12 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-4600:
--
Summary: Document Public Compatibility Commitments  (was: Document Public 
Compatibility Comittments)

> Document Public Compatibility Commitments
> -
>
> Key: DRILL-4600
> URL: https://issues.apache.org/jira/browse/DRILL-4600
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4600) Document Public Compatibility Comittments

2016-04-12 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-4600:
--
Fix Version/s: 2.0.0

> Document Public Compatibility Comittments
> -
>
> Key: DRILL-4600
> URL: https://issues.apache.org/jira/browse/DRILL-4600
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
> Fix For: 2.0.0
>
>






[jira] [Commented] (DRILL-4600) Document Public Compatibility Comittments

2016-04-12 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238387#comment-15238387
 ] 

Jacques Nadeau commented on DRILL-4600:
---

Initial proposal: 
https://gist.github.com/jacques-n/47604a47c1a3b3ebf18bf2357763e890

> Document Public Compatibility Comittments
> -
>
> Key: DRILL-4600
> URL: https://issues.apache.org/jira/browse/DRILL-4600
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
>






[jira] [Created] (DRILL-4600) Document Public Compatibility Comittments

2016-04-12 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-4600:
-

 Summary: Document Public Compatibility Comittments
 Key: DRILL-4600
 URL: https://issues.apache.org/jira/browse/DRILL-4600
 Project: Apache Drill
  Issue Type: New Feature
Reporter: Jacques Nadeau
Assignee: Jacques Nadeau








[jira] [Commented] (DRILL-4589) Reduce planning time for file system partition pruning by reducing filter evaluation overhead

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238311#comment-15238311
 ] 

ASF GitHub Bot commented on DRILL-4589:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/468


> Reduce planning time for file system partition pruning by reducing filter 
> evaluation overhead
> -
>
> Key: DRILL-4589
> URL: https://issues.apache.org/jira/browse/DRILL-4589
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>
> When Drill is used to query hundreds of thousands, or even millions, of files 
> organized into multi-level directories, users typically provide a 
> partition filter such as: dir0 = something and dir1 = something2 and ...
> For such queries, we saw that query planning time could be unacceptably long, 
> due to three main overheads: 1) expanding and getting the list of files, 2) 
> evaluating the partition filter, 3) getting the metadata, in the case of parquet 
> files for which a metadata cache file is not available. 
> DRILL-2517 targets the third overhead. As follow-up work after 
> DRILL-2517, we plan to reduce the filter evaluation overhead. For now, 
> partition filter evaluation is applied at the file level. In many cases, we saw 
> that the number of leaf subdirectories is significantly lower than the number of 
> files. Since all the files under the same leaf subdirectory share the same 
> directory metadata, we should apply the filter evaluation at the leaf 
> subdirectory level. By doing that, we could reduce the CPU overhead of 
> evaluating the filter, and the memory overhead as well.
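
The idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the planner code from the pull request: the class name LeafDirPruning, the helper leafDir, and the use of plain path strings and a Predicate as the "partition filter" are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: hypothetical names, not Drill's planner code. */
final class LeafDirPruning {

  /** Returns the parent directory of a file path, e.g. "/t/2015/Q1/a.parquet" -> "/t/2015/Q1". */
  static String leafDir(String filePath) {
    int slash = filePath.lastIndexOf('/');
    return slash < 0 ? "" : filePath.substring(0, slash);
  }

  /**
   * Evaluates the partition filter once per distinct leaf directory rather
   * than once per file, then keeps every file under a matching directory.
   */
  static List<String> prune(List<String> files, Predicate<String> dirFilter) {
    // Group files by leaf directory, preserving encounter order.
    Map<String, List<String>> byDir = new LinkedHashMap<>();
    for (String f : files) {
      byDir.computeIfAbsent(leafDir(f), k -> new ArrayList<>()).add(f);
    }
    List<String> kept = new ArrayList<>();
    for (Map.Entry<String, List<String>> e : byDir.entrySet()) {
      if (dirFilter.test(e.getKey())) { // one evaluation per directory
        kept.addAll(e.getValue());
      }
    }
    return kept;
  }
}
```

With N files spread over D leaf directories, the filter runs D times instead of N times, which is where the CPU saving described above would come from when D is much smaller than N.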





[jira] [Updated] (DRILL-4591) Extend config system with distrib, site, node property files

2016-04-12 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-4591:
---
Description: 
Today Drill provides the drill-override.conf file to set Drill properties, and 
the drill-env.sh file to provide custom launch properties. Today, most users 
seem to have a copy of DRILL_HOME per node, and thus they copy these two files 
per-node. The result is that the two files act as both the overall "site" 
configuration (for all nodes) and the "per-node" configuration for that one 
node.

In addition, some distributions of Drill (such as MapR), modify the "user" 
config files with settings for that distribution. Now, the same files hold 
settings for the distribution, site and node.

The approach works, but is awkward. Ideally, provide the option to have three 
sets of files: for the distribution, site, and node.

The proposal is to extend configuration to provide additional levels:

* Drill defaults (drill default and module conf files, code in drill-config.sh)
* Distribution settings (special JVM settings, say)
* Site settings (standard log or spill file locations)
* Node settings
* Launch settings (environment variables, -Dname=value options)

The improvement becomes more important if a user employs NFS, MapR FS or YARN 
to automatically deploy the site-wide files. In that case, the site files 
cannot also act as per-node files.

The improvement also simplifies upgrades. Today, users must copy customizations 
from an old to a new install. With the revision, Drill files are completely 
separated from user files, making upgrades (of software) easier.

For backward compatibility, the site and node directories are optional and 
ignored if the environment variables are not set. The site and node config 
files should be optional: skip them if they do not exist.


  was:
Today Drill provides the drill-override.conf file to set Drill properties, and 
the drill-env.sh file to provide custom launch properties. Today, most users 
seem to have a copy of DRILL_HOME per node, and thus they copy these two files 
per-node. The result is that the two files act as both the overall "site" 
configuration (for all nodes) and the "per-node" configuration for that one 
node.

In addition, some distributions of Drill (such as MapR), modify the "user" 
config files with settings for that distribution. Now, the same files hold 
settings for the distribution, site and node.

The approach works, but is awkward. Ideally, provide the option to have three 
sets of files: for the distribution, site, and node.

Let's assume each set resides in its own directory. (Merging the three sets 
into a single directory would simply shift the sync problems from files to 
directories.)

* DRILL_HOME/conf: Distribution files
* DRILL_SITE_DIR/conf: Site-wide files
* DRILL_NODE_DIR/conf: Node-specific files

Each directory might contain its own drill-override.conf, drill-env.sh files. 
(The idea extends to site jar files as well by adding a jars directory under 
DRILL_SITE_DIR.)

Configuration now provides additional levels:

* Drill defaults (drill default and module conf files, code in drill-config.sh)
* Distribution settings (special JVM settings, say)
* Site settings (standard log or spill file locations)
* Node settings
* Launch settings (environment variables, -Dname=value options)

The improvement becomes more important if a user employs NFS, MapR FS or YARN 
to automatically deploy the site-wide files. In that case, the site files 
cannot also act as per-node files.

The improvement also simplifies upgrades. Today, users must copy customizations 
from an old to a new install. With the new system, Drill files are completely 
separated from user files, making upgrades (of software) trivial.

Note that the current version of Drill does allow users to put config files in 
/etc/drill/conf, but that location replaces $DRILL_HOME/conf; the user must 
still start with the Distribution-specific files, and must merge any new 
distribution changes in each new release.

For backward compatibility, the site and node directories are optional and 
ignored if the environment variables are not set. The site and node config 
files should be optional: skip them if they do not exist (or, for node files, 
skip them if DRILL_NODE_CONF_DIR is not set.)



> Extend config system with distrib, site, node property files
> 
>
> Key: DRILL-4591
> URL: https://issues.apache.org/jira/browse/DRILL-4591
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
> Attachments: Drill-on-YARNDirectoryStructures.pdf
>
>
> Today Drill provides the drill-override.conf file to set Drill properties, 
> and the drill-env.sh file to provide custom launch properties. Today, most 
> users seem to have a copy of DRILL_HOME per node,

[jira] [Commented] (DRILL-4591) Extend config system with distrib, site, node property files

2016-04-12 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238290#comment-15238290
 ] 

Paul Rogers commented on DRILL-4591:


Sorry, early code name. The actual name (unless we change it again) is 
"Drill-on-YARN." Will be fixed in next doc. revision.

> Extend config system with distrib, site, node property files
> 
>
> Key: DRILL-4591
> URL: https://issues.apache.org/jira/browse/DRILL-4591
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
> Attachments: Drill-on-YARNDirectoryStructures.pdf
>
>
> Today Drill provides the drill-override.conf file to set Drill properties, 
> and the drill-env.sh file to provide custom launch properties. Today, most 
> users seem to have a copy of DRILL_HOME per node, and thus they copy these 
> two files per-node. The result is that the two files act as both the overall 
> "site" configuration (for all nodes) and the "per-node" configuration for 
> that one node.
> In addition, some distributions of Drill (such as MapR), modify the "user" 
> config files with settings for that distribution. Now, the same files hold 
> settings for the distribution, site and node.
> The approach works, but is awkward. Ideally, provide the option to have three 
> sets of files: for the distribution, site, and node.
> Let's assume each set resides in its own directory. (Merging the three sets 
> into a single directory would simply shift the sync problems from files to 
> directories.)
> * DRILL_HOME/conf: Distribution files
> * DRILL_SITE_DIR/conf: Site-wide files
> * DRILL_NODE_DIR/conf: Node-specific files
> Each directory might contain its own drill-override.conf, drill-env.sh files. 
> (The idea extends to site jar files as well by adding a jars directory under 
> DRILL_SITE_DIR.)
> Configuration now provides additional levels:
> * Drill defaults (drill default and module conf files, code in 
> drill-config.sh)
> * Distribution settings (special JVM settings, say)
> * Site settings (standard log or spill file locations)
> * Node settings
> * Launch settings (environment variables, -Dname=value options)
> The improvement becomes more important if a user employs NFS, MapR FS or YARN 
> to automatically deploy the site-wide files. In that case, the site files 
> cannot also act as per-node files.
> The improvement also simplifies upgrades. Today, users must copy 
> customizations from an old to a new install. With the new system, Drill 
> files are completely separated from user files, making upgrades (of software) 
> trivial.
> Note that the current version of Drill does allow users to put config files 
> in /etc/drill/conf, but that location replaces $DRILL_HOME/conf; the user 
> must still start with the Distribution-specific files, and must merge any new 
> distribution changes in each new release.
> For backward compatibility, the site and node directories are optional and 
> ignored if the environment variables are not set. The site and node config 
> files should be optional: skip them if they do not exist (or, for node files, 
> skip them if DRILL_NODE_CONF_DIR is not set.)
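
The layering described in the issue can be sketched as a simple ordered merge. This is a minimal illustration, not Drill's actual config loader: it assumes that the levels are listed in order of increasing precedence (defaults, distribution, site, node, launch), which the description implies but does not state outright, and the class name LayeredConfig and the property keys used with it are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the proposed precedence, not Drill's config loader. */
final class LayeredConfig {

  /**
   * Merges layers in order of increasing precedence. A null layer stands for
   * an optional file or directory that is absent (for example, no site or
   * node directory configured) and is simply skipped.
   */
  static Map<String, String> resolve(List<Map<String, String>> layers) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> layer : layers) {
      if (layer != null) {
        merged.putAll(layer); // later layers override earlier ones
      }
    }
    return merged;
  }
}
```

Passing the layers as (defaults, distribution, site, node, launch) with null for any missing optional layer gives the backward-compatible behavior the description asks for: when no site or node layer is supplied, the result is the same as today's single set of files.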





[jira] [Commented] (DRILL-4565) Document entire set of Drill config properties including default values

2016-04-12 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238278#comment-15238278
 ] 

Jacques Nadeau commented on DRILL-4565:
---

I think this should probably start with cleaning up the no-longer used values.

> Document entire set of Drill config properties including default values
> ---
>
> Key: DRILL-4565
> URL: https://issues.apache.org/jira/browse/DRILL-4565
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Paul Rogers
>Assignee: Bridget Bevens
>Priority: Minor
>
> We should prepare a file or web page that lists all Drill properties, their 
> defaults and an explanation. The developers should provide this file which we 
> can then include in the documentation.
> Here’s an example from Hadoop: 
> https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
>  It isn’t very pretty, but, as a Hadoop newbie, I find I keep returning to it 
> again and again.
> For Drill, it might be handy to include:
> * Name
> * Start-up or runtime option
> * Description
> * Default value
> * Environment variable (if any)





[jira] [Commented] (DRILL-4591) Extend config system with distrib, site, node property files

2016-04-12 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238267#comment-15238267
 ] 

Jacques Nadeau commented on DRILL-4591:
---

What is Drill-major?

Why is it called major?

> Extend config system with distrib, site, node property files
> 
>
> Key: DRILL-4591
> URL: https://issues.apache.org/jira/browse/DRILL-4591
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
> Attachments: Drill-on-YARNDirectoryStructures.pdf
>
>
> Today Drill provides the drill-override.conf file to set Drill properties, 
> and the drill-env.sh file to provide custom launch properties. Today, most 
> users seem to have a copy of DRILL_HOME per node, and thus they copy these 
> two files per-node. The result is that the two files act as both the overall 
> "site" configuration (for all nodes) and the "per-node" configuration for 
> that one node.
> In addition, some distributions of Drill (such as MapR), modify the "user" 
> config files with settings for that distribution. Now, the same files hold 
> settings for the distribution, site and node.
> The approach works, but is awkward. Ideally, provide the option to have three 
> sets of files: for the distribution, site, and node.
> Let's assume each set resides in its own directory. (Merging the three sets 
> into a single directory would simply shift the sync problems from files to 
> directories.)
> * DRILL_HOME/conf: Distribution files
> * DRILL_SITE_DIR/conf: Site-wide files
> * DRILL_NODE_DIR/conf: Node-specific files
> Each directory might contain its own drill-override.conf, drill-env.sh files. 
> (The idea extends to site jar files as well by adding a jars directory under 
> DRILL_SITE_DIR.)
> Configuration now provides additional levels:
> * Drill defaults (drill default and module conf files, code in 
> drill-config.sh)
> * Distribution settings (special JVM settings, say)
> * Site settings (standard log or spill file locations)
> * Node settings
> * Launch settings (environment variables, -Dname=value options)
> The improvement becomes more important if a user employs NFS, MapR FS or YARN 
> to automatically deploy the site-wide files. In that case, the site files 
> cannot also act as per-node files.
> The improvement also simplifies upgrades. Today, users must copy 
> customizations from an old to a new install. With the new system, Drill 
> files are completely separated from user files, making upgrades (of software) 
> trivial.
> Note that the current version of Drill does allow users to put config files 
> in /etc/drill/conf, but that location replaces $DRILL_HOME/conf; the user 
> must still start with the Distribution-specific files, and must merge any new 
> distribution changes in each new release.
> For backward compatibility, the site and node directories are optional and 
> ignored if the environment variables are not set. The site and node config 
> files should be optional: skip them if they do not exist (or, for node files, 
> skip them if DRILL_NODE_CONF_DIR is not set.)





[jira] [Closed] (DRILL-4563) Add HOCON reference to Drill docs

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens closed DRILL-4563.
-

Page updated in web site. 

> Add HOCON reference to Drill docs
> -
>
> Key: DRILL-4563
> URL: https://issues.apache.org/jira/browse/DRILL-4563
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.6.0
>Reporter: Paul Rogers
>Assignee: Bridget Bevens
>Priority: Minor
>
> On http://drill.apache.org/docs/start-up-options/ We say:
> Drill’s start-up options reside in a HOCON configuration file format
> Perhaps change the word “HOCON” to a link. Have it point to the HOCON docs 
> at: https://github.com/typesafehub/config/blob/master/HOCON.md





[jira] [Resolved] (DRILL-4563) Add HOCON reference to Drill docs

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-4563.
---
Resolution: Fixed

Updated the page based on comments in this ticket. 

> Add HOCON reference to Drill docs
> -
>
> Key: DRILL-4563
> URL: https://issues.apache.org/jira/browse/DRILL-4563
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.6.0
>Reporter: Paul Rogers
>Assignee: Bridget Bevens
>Priority: Minor
>
> On http://drill.apache.org/docs/start-up-options/ We say:
> Drill’s start-up options reside in a HOCON configuration file format
> Perhaps change the word “HOCON” to a link. Have it point to the HOCON docs 
> at: https://github.com/typesafehub/config/blob/master/HOCON.md





[jira] [Closed] (DRILL-3612) Doc. says logging configuration at /conf/logback.xml

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens closed DRILL-3612.
-
Assignee: Bridget Bevens

Fixed issue. Closing ticket.

> Doc. says logging configuration at /conf/logback.xml
> 
>
> Key: DRILL-3612
> URL: https://issues.apache.org/jira/browse/DRILL-3612
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
> Fix For: Future
>
>
> The Drill documentation page at 
> https://drill.apache.org/docs/log-and-debug-introduction/ says:
> bq. Logback behavior is defined by configurations set in /conf/logback.xml. 
> Isn't the location really "conf/logback.xml" relative to Drill's root 
> installation directory (the apache-drill-n.n.n directory)?





[jira] [Resolved] (DRILL-3612) Doc. says logging configuration at /conf/logback.xml

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-3612.
---
Resolution: Fixed

Missing back ticks around file path so the full path wasn't displaying on the 
web page. Fixed this issue.

> Doc. says logging configuration at /conf/logback.xml
> 
>
> Key: DRILL-3612
> URL: https://issues.apache.org/jira/browse/DRILL-3612
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
> Fix For: Future
>
>
> The Drill documentation page at 
> https://drill.apache.org/docs/log-and-debug-introduction/ says:
> bq. Logback behavior is defined by configurations set in /conf/logback.xml. 
> Isn't the location really "conf/logback.xml" relative to Drill's root 
> installation directory (the apache-drill-n.n.n directory)?





[jira] [Closed] (DRILL-4440) Host file location for Windows incorrect in doc

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens closed DRILL-4440.
-

change is published 

> Host file location for Windows incorrect in doc
> ---
>
> Key: DRILL-4440
> URL: https://issues.apache.org/jira/browse/DRILL-4440
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Andries Engelbrecht
>Assignee: Bridget Bevens
>Priority: Minor
>
> The hosts file location on the page
> https://drill.apache.org/docs/installing-the-driver-on-windows/
> shows /etc/hosts, which is for Linux/Mac.
> It should point to 
> \Windows\system32\drivers\etc\hosts 
> for Windows systems.





[jira] [Resolved] (DRILL-4440) Host file location for Windows incorrect in doc

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-4440.
---
Resolution: Fixed

changed host file path

> Host file location for Windows incorrect in doc
> ---
>
> Key: DRILL-4440
> URL: https://issues.apache.org/jira/browse/DRILL-4440
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Andries Engelbrecht
>Assignee: Bridget Bevens
>Priority: Minor
>
> The hosts file location on the page
> https://drill.apache.org/docs/installing-the-driver-on-windows/
> shows /etc/hosts, which is for Linux/Mac.
> It should point to 
> \Windows\system32\drivers\etc\hosts 
> for Windows systems.





[jira] [Updated] (DRILL-2678) JSON appender not packaged with Drill

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens updated DRILL-2678:
--
Component/s: (was: Documentation)

> JSON appender not packaged with Drill
> -
>
> Key: DRILL-2678
> URL: https://issues.apache.org/jira/browse/DRILL-2678
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Bridget Bevens
> Fix For: Future
>
>
> Drill needs JSON appender and JAR file to log in JSON mode





[jira] [Updated] (DRILL-2357) Remove Incubator tag for Category on Apache Drill JIRA page

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens updated DRILL-2357:
--
Component/s: (was: Documentation)

> Remove Incubator tag for Category on Apache Drill JIRA page
> ---
>
> Key: DRILL-2357
> URL: https://issues.apache.org/jira/browse/DRILL-2357
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Abhishek Girish
> Fix For: Future
>
>
> Link: 
> https://issues.apache.org/jira/browse/DRILL/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel
>  
> Category still shows up as "Incubator". Should be changed to "Drill"





[jira] [Commented] (DRILL-4527) Remove unnecessary code: DrillAvgVarianceConvertlet.java

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237950#comment-15237950
 ] 

ASF GitHub Bot commented on DRILL-4527:
---

Github user vkorukanti commented on the pull request:

https://github.com/apache/drill/pull/441#issuecomment-209095889
  
LGTM, +1


> Remove unnecessary code:  DrillAvgVarianceConvertlet.java
> -
>
> Key: DRILL-4527
> URL: https://issues.apache.org/jira/browse/DRILL-4527
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Reporter: MinJi Kim
>Assignee: MinJi Kim
>
> DrillConvertletTable is used as a way to have custom functions.  For example, 
> for EXTRACT(), DrillConvertletTable.get() returns DrillExtractConvertlet, 
> which returns a custom RexNode for the extract function.  
> On the other hand, DrillAvgVarianceConvertlet is never used.  
> stddev/avg/variance functions are handled by DrillAggregateRule and 
> DrillReduceAggregatesRule.
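
The lookup pattern described above can be illustrated with a toy table. This is a sketch in plain Java with hypothetical stand-ins only: it does not use the Calcite or Drill types, a "convertlet" here is just a function from a call's text to a converted expression's text, and the fallback to a standard conversion for unregistered operators is an assumption rather than something the issue states.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: a toy lookup table, not DrillConvertletTable. */
final class ConvertletTableSketch {

  // Custom convertlets keyed by operator name (hypothetical representation).
  private final Map<String, Function<String, String>> custom = new HashMap<>();

  void register(String operator, Function<String, String> convertlet) {
    custom.put(operator, convertlet);
  }

  /** Returns the custom convertlet if one is registered, else an assumed standard one. */
  Function<String, String> get(String operator) {
    Function<String, String> c = custom.get(operator);
    return c != null ? c : call -> "standard(" + call + ")";
  }
}
```

In this picture, an entry that is registered but never looked up (or a convertlet class that is never registered at all) is dead code, which is the situation the issue reports for DrillAvgVarianceConvertlet.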





[jira] [Commented] (DRILL-4446) Improve current fragment parallelization module

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237942#comment-15237942
 ] 

ASF GitHub Bot commented on DRILL-4446:
---

Github user vkorukanti commented on the pull request:

https://github.com/apache/drill/pull/403#issuecomment-209094651
  
Updated patch,

Addressed review comments and also added a couple of negative tests.


> Improve current fragment parallelization module
> ---
>
> Key: DRILL-4446
> URL: https://issues.apache.org/jira/browse/DRILL-4446
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.5.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.7.0
>
>
> The current fragment parallelizer, {{SimpleParallelizer.java}}, can’t correctly 
> handle the case where an operator has a mandatory scheduling requirement for 
> a set of DrillbitEndpoints and an affinity for each DrillbitEndpoint (i.e., what 
> portion of the total tasks is to be scheduled on each DrillbitEndpoint). It 
> assumes that scheduling requirements are soft (except in one case, Mux and 
> DeMux, where there is a mandatory parallelization requirement of 1 unit). 
> An example is:
> The cluster has 3 nodes, each running a Drillbit and a storage service. Data for a 
> table is only present at the storage services on two nodes. So a GroupScan needs 
> to be scheduled on these two nodes in order to read the data. The storage service 
> doesn't support reading data from a remote node (or doing so is costly).
> Inserting the mandatory scheduling requirements within existing 
> SimpleParallelizer is not sufficient as you may end up with a plan that has a 
> fragment with two GroupScans each having its own hard parallelization 
> requirements.
> The proposal is:
> Add a property to each operator that tells which parallelization 
> implementation to use. Most operators (such as Project or Filter) don't have 
> any particular strategy; they depend on the incoming operator. Existing 
> operators that have requirements (all existing GroupScans) default 
> to the current parallelizer, {{SimpleParallelizer}}. {{Screen}} defaults to the new 
> mandatory assignment parallelizer. It is possible that a generated PhysicalPlan 
> has a fragment with operators having different parallelization 
> strategies. In that case, an exchange is inserted between operators where a 
> change in parallelization strategy is required.
> Will send a detailed design doc.
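
The last step of the proposal (inserting an exchange where the strategy changes) can be sketched on a simplified linear chain of operators. This is only an illustration under stated assumptions, not the code in the pull request: the names StrategyExchangeSketch, Op and Strategy are hypothetical, a real plan is a tree rather than a list, and a null strategy models an operator that simply follows its incoming operator.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical types, not Drill's physical plan classes. */
final class StrategyExchangeSketch {

  enum Strategy { SIMPLE, MANDATORY }

  static final class Op {
    final String name;
    final Strategy strategy; // null: no preference, follows the incoming operator

    Op(String name, Strategy strategy) {
      this.name = name;
      this.strategy = strategy;
    }
  }

  /**
   * Walks the chain from leaf (e.g. a scan) to root (e.g. Screen) and inserts
   * an "Exchange" marker wherever the effective strategy changes.
   */
  static List<String> insertExchanges(List<Op> chain) {
    List<String> plan = new ArrayList<>();
    Strategy current = null; // strategy in effect for the fragment so far
    for (Op op : chain) {
      if (op.strategy != null) {
        if (current != null && op.strategy != current) {
          plan.add("Exchange"); // fragment boundary between strategies
        }
        current = op.strategy;
      }
      plan.add(op.name);
    }
    return plan;
  }
}
```

For a chain GroupScan (SIMPLE), Filter, Project, Screen (MANDATORY), the sketch places the exchange directly below Screen, so each resulting fragment contains operators with a single strategy.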





[jira] [Commented] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237889#comment-15237889
 ] 

ASF GitHub Bot commented on DRILL-3714:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/463#discussion_r59445000
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -84,7 +115,7 @@ public void operationComplete(ChannelFuture future) 
throws Exception {
   if (!future.isSuccess()) {
 removeFromMap(coordinationId);
 if (future.channel().isActive()) {
-   throw new RpcException("Future failed") ;
+  throw new RpcException("Future failed");
--- End diff --

This has existed since this commit (Sept 2013): 
https://github.com/apache/drill/commit/3f101aabd05f174e8bfc9dbdd2590066a1e937f4

I'm not inclined to change it in this patch as it seems out of scope. The 
change probably makes sense but let's explore in a different jira to ensure 
we're thinking through the ramifications.


> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Jacques Nadeau
>Priority: Critical
> Fix For: 1.7.0
>
> Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, 
> jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>
>
> This is a variation of DRILL-3705 with the difference of drill behavior when 
> hitting OOM condition.
> Query runs out of memory during execution and remains in 
> "CANCELLATION_REQUESTED" state until drillbit is bounced.
> Client (sqlline in this case) never gets a response from the server.
> Reproduction details:
> Single node drillbit installation.
> DRILL_MAX_DIRECT_MEMORY="8G"
> DRILL_HEAP="4G"
> Run this query on TPCDS SF100 data set
> {code}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 
> LIMIT 10;
> {code}
> drillbit.log
> {code}
> 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
> 2015-08-26 16:55:50,498 [BitServer-5] WARN  
> o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 
> took longer than 500ms.  Actual duration was 2569ms.
> 2015-08-26 16:56:31,086 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventEx

[jira] [Commented] (DRILL-4596) Drill should do version check among drillbits

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237875#comment-15237875
 ] 

ASF GitHub Bot commented on DRILL-4596:
---

Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/474#issuecomment-209081311
  
In general, I'm against version number checking. We did that in the code 
early on but we should be moving towards a capabilities flag approach.

I also agree with Paul's mention of DRILL-4286: I don't think worrying 
about rolling upgrades makes sense until we resolve the issues around 
decommissioning.
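
As a rough illustration of the capabilities-flag idea (as opposed to comparing version strings), here is a minimal Java sketch. The `Capability` enum, its members, and `CapabilityCheck.canJoin` are hypothetical names made up for this example; none of them is Drill's actual API.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; not part of Drill. */
final class CapabilityCheck {

  /** Hypothetical feature flags a drillbit might advertise when registering. */
  enum Capability { RPC_V2, SPILL_TO_DISK, DYNAMIC_UDF }

  /**
   * A joining node is accepted if it advertises every capability the cluster
   * requires, regardless of its version string.
   */
  static boolean canJoin(Set<Capability> required, Set<Capability> advertised) {
    return advertised.containsAll(required);
  }

  public static void main(String[] args) {
    Set<Capability> required = EnumSet.of(Capability.RPC_V2);
    Set<Capability> newer = EnumSet.of(Capability.RPC_V2, Capability.SPILL_TO_DISK);
    Set<Capability> older = EnumSet.noneOf(Capability.class);
    System.out.println(canJoin(required, newer));
    System.out.println(canJoin(required, older));
  }
}
```

The point of the design is that two drillbits with different version numbers can still interoperate as long as the required capabilities are present on both.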


> Drill should do version check among drillbits
> -
>
> Key: DRILL-4596
> URL: https://issues.apache.org/jira/browse/DRILL-4596
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Before registering a new drillbit in ZooKeeper, we should do a version check and 
> make sure all the running drillbits are on the same version.
> Using drillbits of different versions can lead to unexpected results.





[jira] [Updated] (DRILL-4591) Extend config system with distrib, site, node property files

2016-04-12 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-4591:
---
Attachment: Drill-on-YARNDirectoryStructures.pdf

Draft design proposal. Proposal explains suggested Drill changes in the context 
of the Drill-on-YARN project.

> Extend config system with distrib, site, node property files
> 
>
> Key: DRILL-4591
> URL: https://issues.apache.org/jira/browse/DRILL-4591
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
> Attachments: Drill-on-YARNDirectoryStructures.pdf
>
>
> Today Drill provides the drill-override.conf file to set Drill properties, 
> and the drill-env.sh file to provide custom launch properties. Today, most 
> users seem to have a copy of DRILL_HOME per node, and thus they copy these 
> two files per-node. The result is that the two files act as both the overall 
> "site" configuration (for all nodes) and the "per-node" configuration for 
> that one node.
> In addition, some distributions of Drill (such as MapR), modify the "user" 
> config files with settings for that distribution. Now, the same files hold 
> settings for the distribution, site and node.
> The approach works, but is awkward. Ideally, provide the option to have three 
> sets of files: for the distribution, site, and node.
> Let's assume each set resides in its own directory. (Merging the three sets 
> into a single directory would simply shift the sync problems from files to 
> directories.)
> * DRILL_HOME/conf: Distribution files
> * DRILL_SITE_DIR/conf: Site-wide files
> * DRILL_NODE_DIR/conf: Node-specific files
> Each directory might contain its own drill-override.conf, drill-env.sh files. 
> (The idea extends to site jar files as well by adding a jars directory under 
> DRILL_SITE_DIR.)
> Configuration now provides additional levels:
> * Drill defaults (drill default and module conf files, code in 
> drill-config.sh)
> * Distribution settings (special JVM settings, say)
> * Site settings (standard log or spill file locations)
> * Node settings
> * Launch settings (environment variables, -Dname=value options)
> The improvement becomes more important if a user employs NFS, MapR FS or YARN 
> to automatically deploy the site-wide files. In that case, the site files 
> cannot also act as per-node files.
> The improvement also simplifies upgrades. Today, users must copy 
> customizations from an old to a new install. With the new system, Drill 
> files are completely separated from user files, making upgrades (of software) 
> trivial.
> Note that the current version of Drill does allow users to put config files 
> in /etc/drill/conf, but that location replaces $DRILL_HOME/conf; the user 
> must still start with the Distribution-specific files, and must merge any new 
> distribution changes in each new release.
> For backward compatibility, the site and node directories are optional and 
> ignored if the environment variables are not set. The site and node config 
> files should be optional: skip them if they do not exist (or, for node files, 
> skip them if DRILL_NODE_CONF_DIR is not set.)
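
The levels listed in the description (Drill defaults, distribution, site, node, launch) can be sketched as an ordered merge. This is only an illustration in Java, assuming that each later level overrides the one before it. `LayeredConfig`, `resolve`, and the sample property names are hypothetical, and Drill's actual config loading is not shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of layered settings; not Drill's actual config loader. */
final class LayeredConfig {

  /**
   * Merges layers in order; a key set in a later layer replaces the value
   * from an earlier one. A null layer (for example, an unset site or node
   * directory) is skipped, which keeps those layers optional.
   */
  static Map<String, String> resolve(List<Map<String, String>> layers) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> layer : layers) {
      if (layer != null) {
        merged.putAll(layer);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> defaults = new LinkedHashMap<>();
    defaults.put("log.dir", "/var/log/drill"); // made-up property names
    defaults.put("heap", "4G");
    Map<String, String> site = new LinkedHashMap<>();
    site.put("log.dir", "/mnt/logs/drill");
    Map<String, String> node = null; // node directory not configured
    System.out.println(resolve(Arrays.asList(defaults, site, node)));
  }
}
```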





[jira] [Commented] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1523#comment-1523
 ] 

ASF GitHub Bot commented on DRILL-3714:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/463#discussion_r59434232
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -20,51 +20,82 @@
 import io.netty.buffer.ByteBuf;
 import io.netty.channel.ChannelFuture;
 
-import java.util.Map;
-import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
 
 import org.apache.drill.common.exceptions.UserRemoteException;
 import org.apache.drill.exec.proto.UserBitShared.DrillPBError;
 
+import com.carrotsearch.hppc.IntObjectHashMap;
+import com.carrotsearch.hppc.procedures.IntObjectProcedure;
+import com.google.common.base.Preconditions;
+
 /**
- * Manages the creation of rpc futures for a particular socket.
+ * Manages the creation of rpc futures for a particular socket <--> socket
+ * connection. Generally speaking, there will be two threads working with 
this
+ * class (the socket thread and the Request generating thread). 
Synchronization
+ * is simple with the map being the only thing that is protected. 
Everything
+ * else works via Atomic variables.
  */
-public class CoordinationQueue {
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(CoordinationQueue.class);
+class RequestIdMap {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(RequestIdMap.class);
+
+  private final AtomicInteger value = new AtomicInteger();
+  private final AtomicBoolean acceptMessage = new AtomicBoolean(true);
 
-  private final PositiveAtomicInteger circularInt = new 
PositiveAtomicInteger();
-  private final Map> map;
+  /** Access to map must be protected. **/
+  private final IntObjectHashMap> map;
 
-  public CoordinationQueue(int segmentSize, int segmentCount) {
-map = new ConcurrentHashMap>(segmentSize, 
0.75f, segmentCount);
+  public RequestIdMap() {
+map = new IntObjectHashMap>();
   }
 
   void channelClosed(Throwable ex) {
+acceptMessage.set(false);
 if (ex != null) {
-  RpcException e;
-  if (ex instanceof RpcException) {
-e = (RpcException) ex;
-  } else {
-e = new RpcException(ex);
+  final RpcException e = RpcException.mapException(ex);
+  synchronized (map) {
+map.forEach(new Closer(e));
+map.clear();
   }
-  for (RpcOutcome f : map.values()) {
-f.setException(e);
+}
+  }
+
+  private class Closer implements IntObjectProcedure> {
+final RpcException exception;
+
+public Closer(RpcException exception) {
+  this.exception = exception;
+}
+
+@Override
+public void apply(int key, RpcOutcome value) {
+  try{
+value.setException(exception);
+  }catch(Exception e){
+logger.warn("Failure while attempting to fail rpc response.", e);
   }
 }
+
   }
 
-  public  ChannelListenerWithCoordinationId get(RpcOutcomeListener 
handler, Class clazz, RemoteConnection connection) {
-int i = circularInt.getNext();
+  public  ChannelListenerWithCoordinationId 
createNewRpcListener(RpcOutcomeListener handler, Class clazz,
+  RemoteConnection connection) {
+int i = value.incrementAndGet();
 RpcListener future = new RpcListener(handler, clazz, i, 
connection);
-Object old = map.put(i, future);
-if (old != null) {
-  throw new IllegalStateException(
-  "You attempted to reuse a coordination id when the previous 
coordination id has not been removed.  This is likely rpc future callback 
memory leak.");
+final Object old;
+synchronized (map) {
+  Preconditions.checkArgument(acceptMessage.get(),
--- End diff --

I prefer to spend as little time in the synchronized block as possible. If 
we move this up we still need to check in the synchronized block (e.g. we could 
make this double-checked). 
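
For illustration, the double-checked variant mentioned above might look like the sketch below: a cheap check of the flag before taking the lock, then a second check inside it, since the connection can close between the two. It uses a plain `HashMap` and a hypothetical `IdRegistry` class rather than the HPPC map and the types from the diff.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the class under review; not the code in the pull request. */
final class IdRegistry {
  private final AtomicInteger nextId = new AtomicInteger();
  private final AtomicBoolean acceptMessage = new AtomicBoolean(true);
  private final Map<Integer, Object> map = new HashMap<>(); // guarded by synchronized (map)

  int register(Object listener) {
    // First check, outside the lock: fail fast without contending for the monitor.
    checkOpen();
    int id = nextId.incrementAndGet();
    Object old;
    synchronized (map) {
      // Second check, inside the lock: close() may have run since the first check.
      checkOpen();
      old = map.put(id, listener);
    }
    if (old != null) {
      throw new IllegalStateException("Coordination id " + id + " reused");
    }
    return id;
  }

  void close() {
    acceptMessage.set(false);
    synchronized (map) {
      map.clear();
    }
  }

  private void checkOpen() {
    if (!acceptMessage.get()) {
      throw new IllegalStateException("Connection is no longer valid");
    }
  }
}
```

The outer check is only an optimization; correctness still rests on the check made while holding the lock.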


> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporte

[jira] [Updated] (DRILL-4591) Extend config system with distrib, site, node property files

2016-04-12 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-4591:
---
Priority: Major  (was: Minor)

> Extend config system with distrib, site, node property files
> 
>
> Key: DRILL-4591
> URL: https://issues.apache.org/jira/browse/DRILL-4591
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
>
> Today Drill provides the drill-override.conf file to set Drill properties, 
> and the drill-env.sh file to provide custom launch properties. Today, most 
> users seem to have a copy of DRILL_HOME per node, and thus they copy these 
> two files per-node. The result is that the two files act as both the overall 
> "site" configuration (for all nodes) and the "per-node" configuration for 
> that one node.
> In addition, some distributions of Drill (such as MapR), modify the "user" 
> config files with settings for that distribution. Now, the same files hold 
> settings for the distribution, site and node.
> The approach works, but is awkward. Ideally, provide the option to have three 
> sets of files: for the distribution, site, and node.
> Let's assume each set resides in its own directory. (Merging the three sets 
> into a single directory would simply shift the sync problems from files to 
> directories.)
> * DRILL_HOME/conf: Distribution files
> * DRILL_SITE_DIR/conf: Site-wide files
> * DRILL_NODE_DIR/conf: Node-specific files
> Each directory might contain its own drill-override.conf, drill-env.sh files. 
> (The idea extends to site jar files as well by adding a jars directory under 
> DRILL_SITE_DIR.)
> Configuration now provides additional levels:
> * Drill defaults (drill default and module conf files, code in 
> drill-config.sh)
> * Distribution settings (special JVM settings, say)
> * Site settings (standard log or spill file locations)
> * Node settings
> * Launch settings (environment variables, -Dname=value options)
> The improvement becomes more important if a user employs NFS, MapR FS or YARN 
> to automatically deploy the site-wide files. In that case, the site files 
> cannot also act as per-node files.
> The improvement also simplifies upgrades. Today, users must copy 
> customizations from an old to a new install. With the new system, Drill 
> files are completely separated from user files, making upgrades (of software) 
> trivial.
> Note that the current version of Drill does allow users to put config files 
> in /etc/drill/conf, but that location replaces $DRILL_HOME/conf; the user 
> must still start with the Distribution-specific files, and must merge any new 
> distribution changes in each new release.
> For backward compatibility, the site and node directories are optional and 
> ignored if the environment variables are not set. The site and node config 
> files should be optional: skip them if they do not exist (or, for node files, 
> skip them if DRILL_NODE_CONF_DIR is not set.)





[jira] [Commented] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237767#comment-15237767
 ] 

ASF GitHub Bot commented on DRILL-3714:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/463#discussion_r59433373
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -20,51 +20,82 @@
 import io.netty.buffer.ByteBuf;
 import io.netty.channel.ChannelFuture;
 
-import java.util.Map;
-import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
 
 import org.apache.drill.common.exceptions.UserRemoteException;
 import org.apache.drill.exec.proto.UserBitShared.DrillPBError;
 
+import com.carrotsearch.hppc.IntObjectHashMap;
+import com.carrotsearch.hppc.procedures.IntObjectProcedure;
+import com.google.common.base.Preconditions;
+
 /**
- * Manages the creation of rpc futures for a particular socket.
+ * Manages the creation of rpc futures for a particular socket <--> socket
+ * connection. Generally speaking, there will be two threads working with 
this
+ * class (the socket thread and the Request generating thread). 
Synchronization
+ * is simple with the map being the only thing that is protected. 
Everything
+ * else works via Atomic variables.
  */
-public class CoordinationQueue {
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(CoordinationQueue.class);
+class RequestIdMap {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(RequestIdMap.class);
+
+  private final AtomicInteger value = new AtomicInteger();
+  private final AtomicBoolean acceptMessage = new AtomicBoolean(true);
 
-  private final PositiveAtomicInteger circularInt = new 
PositiveAtomicInteger();
-  private final Map> map;
+  /** Access to map must be protected. **/
+  private final IntObjectHashMap> map;
 
-  public CoordinationQueue(int segmentSize, int segmentCount) {
-map = new ConcurrentHashMap>(segmentSize, 
0.75f, segmentCount);
+  public RequestIdMap() {
+map = new IntObjectHashMap>();
   }
 
   void channelClosed(Throwable ex) {
+acceptMessage.set(false);
 if (ex != null) {
-  RpcException e;
-  if (ex instanceof RpcException) {
-e = (RpcException) ex;
-  } else {
-e = new RpcException(ex);
+  final RpcException e = RpcException.mapException(ex);
+  synchronized (map) {
+map.forEach(new Closer(e));
+map.clear();
   }
-  for (RpcOutcome f : map.values()) {
-f.setException(e);
+}
+  }
+
+  private class Closer implements IntObjectProcedure> {
+final RpcException exception;
+
+public Closer(RpcException exception) {
+  this.exception = exception;
+}
+
+@Override
+public void apply(int key, RpcOutcome value) {
+  try{
+value.setException(exception);
+  }catch(Exception e){
+logger.warn("Failure while attempting to fail rpc response.", e);
   }
 }
+
   }
 
-  public  ChannelListenerWithCoordinationId get(RpcOutcomeListener 
handler, Class clazz, RemoteConnection connection) {
-int i = circularInt.getNext();
+  public  ChannelListenerWithCoordinationId 
createNewRpcListener(RpcOutcomeListener handler, Class clazz,
+  RemoteConnection connection) {
+int i = value.incrementAndGet();
 RpcListener future = new RpcListener(handler, clazz, i, 
connection);
-Object old = map.put(i, future);
-if (old != null) {
-  throw new IllegalStateException(
-  "You attempted to reuse a coordination id when the previous 
coordination id has not been removed.  This is likely rpc future callback 
memory leak.");
+final Object old;
+synchronized (map) {
+  Preconditions.checkArgument(acceptMessage.get(),
+  "Attempted to send a message when connection is no longer 
valid.");
+  old = map.put(i, future);
 }
+Preconditions.checkArgument(old == null,
--- End diff --

This is an assertion to ensure that there isn't a bug somewhere.  


> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: E

[jira] [Commented] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237766#comment-15237766
 ] 

ASF GitHub Bot commented on DRILL-3714:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/463#discussion_r59433244
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java ---
@@ -20,51 +20,82 @@
 import io.netty.buffer.ByteBuf;
 import io.netty.channel.ChannelFuture;
 
-import java.util.Map;
-import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.AtomicInteger;
 
 import org.apache.drill.common.exceptions.UserRemoteException;
 import org.apache.drill.exec.proto.UserBitShared.DrillPBError;
 
+import com.carrotsearch.hppc.IntObjectHashMap;
+import com.carrotsearch.hppc.procedures.IntObjectProcedure;
+import com.google.common.base.Preconditions;
+
 /**
- * Manages the creation of rpc futures for a particular socket.
+ * Manages the creation of rpc futures for a particular socket <--> socket
+ * connection. Generally speaking, there will be two threads working with 
this
+ * class (the socket thread and the Request generating thread). 
Synchronization
+ * is simple with the map being the only thing that is protected. 
Everything
+ * else works via Atomic variables.
  */
-public class CoordinationQueue {
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(CoordinationQueue.class);
+class RequestIdMap {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(RequestIdMap.class);
+
+  private final AtomicInteger value = new AtomicInteger();
+  private final AtomicBoolean acceptMessage = new AtomicBoolean(true);
 
-  private final PositiveAtomicInteger circularInt = new 
PositiveAtomicInteger();
--- End diff --

Correct. Originally (long ago), this was structured as an array. However, 
we're using a map now so it doesn't matter.
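
A small sketch of why a positive-only counter is no longer needed: a plain `AtomicInteger` wraps to negative values on overflow, which would be invalid as an array index but works fine as a map key. The class name `WrapDemo` is hypothetical and a standard `HashMap` stands in for the map used in the diff.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo class; not part of the pull request. */
final class WrapDemo {

  /** Returns the id produced right after the counter overflows. */
  static int idAfterOverflow() {
    AtomicInteger counter = new AtomicInteger(Integer.MAX_VALUE);
    return counter.incrementAndGet(); // wraps around to Integer.MIN_VALUE
  }

  public static void main(String[] args) {
    int id = idAfterOverflow();
    Map<Integer, String> byId = new HashMap<>();
    byId.put(id, "listener"); // a negative key is fine for a map
    System.out.println(id + " -> " + byId.get(id));
  }
}
```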


> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Jacques Nadeau
>Priority: Critical
> Fix For: 1.7.0
>
> Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, 
> jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>
>
> This is a variation of DRILL-3705 with the difference of drill behavior when 
> hitting OOM condition.
> Query runs out of memory during execution and remains in 
> "CANCELLATION_REQUESTED" state until drillbit is bounced.
> Client (sqlline in this case) never gets a response from the server.
> Reproduction details:
> Single node drillbit installation.
> DRILL_MAX_DIRECT_MEMORY="8G"
> DRILL_HEAP="4G"
> Run this query on TPCDS SF100 data set
> {code}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 
> LIMIT 10;
> {code}
> drillbit.log
> {code}
> 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
> 2015-08-26 16:55:50,498 [BitServer-5] WARN  
> o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 
> took longer than 500ms.  Actual duration was 2569ms.
> 2015-08-26 16:56:31,086 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.net

[jira] [Created] (DRILL-4599) Create Querying Avro Files content

2016-04-12 Thread Bridget Bevens (JIRA)
Bridget Bevens created DRILL-4599:
-

 Summary: Create Querying Avro Files content 
 Key: DRILL-4599
 URL: https://issues.apache.org/jira/browse/DRILL-4599
 Project: Apache Drill
  Issue Type: Task
  Components: Documentation
Reporter: Bridget Bevens
Assignee: Bridget Bevens


Create content for this page: http://drill.apache.org/docs/querying-avro-files/





[jira] [Resolved] (DRILL-4598) Update Website to Mark Avro Support as Experimental

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-4598.
---
Resolution: Fixed

Updated Avro as requested on 
http://drill.apache.org/docs/querying-a-file-system-introduction/.
Created a Querying Avro Files page and included the suggested statements, but 
it needs content: http://drill.apache.org/docs/querying-avro-files/
I am closing this bug and opening another to track the request for content on 
the avro page. 

> Update Website to Mark Avro Support as Experimental
> ---
>
> Key: DRILL-4598
> URL: https://issues.apache.org/jira/browse/DRILL-4598
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: John Omernik
>Assignee: Bridget Bevens
>
> Per mailing list conversation, we should update the documentation to mark the 
> Avro plugin as experimental. I think there should be two changes. First, 
> on the Querying a File System Introduction page, 
> http://drill.apache.org/docs/querying-a-file-system-introduction/
> have 
> Avro(type: avro)
> be changed to 
> Avro (type:avro) (Experimental See Querying Avro Files)
> with "Querying Avro Files" being a link to the next addition. 
> Second, under Querying a Filesystem, at the same level as "Querying JSON Files" 
> or "Querying Parquet Files", we should add a "Querying Avro Files" page. 
> As an initial change, we should have a basic page that states "Querying Avro 
> Files is an experimental plugin at this time, and there are known issues.  
> Please reference JIRA for more information"
> This will be a good stopgap solution. I will try to come up with a 
> more complete version once I understand how the website component is pulled 
> from the gh-pages site and how I can do a pull request. In the meantime, if 
> someone could update the pages with the stopgap, it would be appreciated. 





[jira] [Closed] (DRILL-4598) Update Website to Mark Avro Support as Experimental

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens closed DRILL-4598.
-

Closing this bug and opening another to track content required on querying avro 
page. 

> Update Website to Mark Avro Support as Experimental
> ---
>
> Key: DRILL-4598
> URL: https://issues.apache.org/jira/browse/DRILL-4598
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: John Omernik
>Assignee: Bridget Bevens
>
> Per mailing list conversation, we should update the documentation to mark the 
> Avro plugin as experimental. I think there should be two changes. First, 
> on the Querying a File System Introduction page, 
> http://drill.apache.org/docs/querying-a-file-system-introduction/
> have 
> Avro(type: avro)
> be changed to 
> Avro (type:avro) (Experimental See Querying Avro Files)
> with "Querying Avro Files" being a link to the next addition. 
> Second, under Querying a Filesystem, at the same level as "Querying JSON Files" 
> or "Querying Parquet Files", we should add a "Querying Avro Files" page. 
> As an initial change, we should have a basic page that states "Querying Avro 
> Files is an experimental plugin at this time, and there are known issues.  
> Please reference JIRA for more information"
> This will be a good stopgap solution. I will try to come up with a 
> more complete version once I understand how the website component is pulled 
> from the gh-pages site and how I can do a pull request. In the meantime, if 
> someone could update the pages with the stopgap, it would be appreciated. 





[jira] [Assigned] (DRILL-4598) Update Website to Mark Avro Support as Experimental

2016-04-12 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens reassigned DRILL-4598:
-

Assignee: Bridget Bevens

> Update Website to Mark Avro Support as Experimental
> ---
>
> Key: DRILL-4598
> URL: https://issues.apache.org/jira/browse/DRILL-4598
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.5.0
>Reporter: John Omernik
>Assignee: Bridget Bevens
>
> Per mailing list conversation, we should update the documentation to mark the 
> Avro plugin as experimental. I think there should be two changes. First, 
> on the Querying a File System Introduction page, 
> http://drill.apache.org/docs/querying-a-file-system-introduction/
> have 
> Avro(type: avro)
> be changed to 
> Avro (type:avro) (Experimental See Querying Avro Files)
> with "Querying Avro Files" being a link to the next addition. 
> Second, under Querying a Filesystem, at the same level as "Querying JSON Files" 
> or "Querying Parquet Files", we should add a "Querying Avro Files" page. 
> As an initial change, we should have a basic page that states "Querying Avro 
> Files is an experimental plugin at this time, and there are known issues.  
> Please reference JIRA for more information"
> This will be a good stopgap solution. I will try to come up with a 
> more complete version once I understand how the website component is pulled 
> from the gh-pages site and how I can do a pull request. In the meantime, if 
> someone could update the pages with the stopgap, it would be appreciated. 





[jira] [Closed] (DRILL-4457) Difference in results returned by window function over BIGINT data

2016-04-12 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4457.
-

> Difference in results returned by window function over BIGINT data
> --
>
> Key: DRILL-4457
> URL: https://issues.apache.org/jira/browse/DRILL-4457
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.6.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.6.0
>
>
> Difference in results returned by window function query over same data on 
> Drill vs on Postgres.
> Drill 1.6.0 commit ID 6d5f4983
> {noformat}
> Verification Failures:
> /root/public_framework/drill-test-framework/framework/resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.q
> Query:
> SELECT FIRST_VALUE(c3) OVER(PARTITION BY c8 ORDER BY c1 RANGE BETWEEN CURRENT 
> ROW AND CURRENT ROW) FROM `t_alltype.parquet`
>  Expected number of rows: 145
> Actual number of rows from Drill: 145
>  Number of matching rows: 143
>   Number of rows missing: 2
>Number of rows unexpected: 2
> These rows are not expected (first 10):
> 36022570792
> 21011901540311080
> These rows are missing (first 10):
> null (2 time(s))
> {noformat}
> Here is the difference in results, Drill 1.6.0 returns 36022570792 whereas 
> Postgres returns null, and another difference is that Drill returns 
> 21011901540311080 whereas Postgres returns null.
> {noformat}
> [root@centos-01 drill-output]# diff -cb 
> RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10\:36\:42_UTC_2016 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> *** RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10:36:42_UTC_2016   
> 2016-03-01 10:36:43.012382649 +
> --- 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> 2016-03-01 10:32:56.605677914 +
> ***
> *** 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! 36022570792
>   584831936
>   37102817894137256
>   61958708627376736
> --- 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! null
>   584831936
>   37102817894137256
>   61958708627376736
> ***
> *** 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! 21011901540311080
>   17990322900862228
>   61608051272
>   3136812789494
> --- 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! null
>   17990322900862228
>   61608051272
>   3136812789494
> {noformat}





[jira] [Commented] (DRILL-4457) Difference in results returned by window function over BIGINT data

2016-04-12 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237711#comment-15237711
 ] 

Khurram Faraaz commented on DRILL-4457:
---

Test added here 
Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.q

> Difference in results returned by window function over BIGINT data
> --
>
> Key: DRILL-4457
> URL: https://issues.apache.org/jira/browse/DRILL-4457
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.6.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
>Assignee: Deneche A. Hakim
>  Labels: window_function
> Fix For: 1.6.0
>
>
> Difference in results returned by window function query over same data on 
> Drill vs on Postgres.
> Drill 1.6.0 commit ID 6d5f4983
> {noformat}
> Verification Failures:
> /root/public_framework/drill-test-framework/framework/resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.q
> Query:
> SELECT FIRST_VALUE(c3) OVER(PARTITION BY c8 ORDER BY c1 RANGE BETWEEN CURRENT 
> ROW AND CURRENT ROW) FROM `t_alltype.parquet`
>  Expected number of rows: 145
> Actual number of rows from Drill: 145
>  Number of matching rows: 143
>   Number of rows missing: 2
>Number of rows unexpected: 2
> These rows are not expected (first 10):
> 36022570792
> 21011901540311080
> These rows are missing (first 10):
> null (2 time(s))
> {noformat}
> Here is the difference in results, Drill 1.6.0 returns 36022570792 whereas 
> Postgres returns null, and another difference is that Drill returns 
> 21011901540311080 whereas Postgres returns null.
> {noformat}
> [root@centos-01 drill-output]# diff -cb 
> RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10\:36\:42_UTC_2016 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> *** RBCRACR_RBCRACR_bgint_6.output_Tue_Mar_01_10:36:42_UTC_2016   
> 2016-03-01 10:36:43.012382649 +
> --- 
> ../resources/Functional/window_functions/frameclause/RBCRACR/RBCRACR_bgint_6.e
> 2016-03-01 10:32:56.605677914 +
> ***
> *** 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! 36022570792
>   584831936
>   37102817894137256
>   61958708627376736
> --- 55,61 
>   5424751352
>   3734160392
>   36022570792
> ! null
>   584831936
>   37102817894137256
>   61958708627376736
> ***
> *** 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! 21011901540311080
>   17990322900862228
>   61608051272
>   3136812789494
> --- 64,70 
>   29537626363643852
>   52598911986023288
>   21011901540311080
> ! null
>   17990322900862228
>   61608051272
>   3136812789494
> {noformat}
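
For readers unfamiliar with the frame used in the failing query: RANGE BETWEEN CURRENT ROW AND CURRENT ROW covers the current row plus its peers, that is, all rows in the partition with the same ORDER BY value. FIRST_VALUE then returns the value from the first row of that peer group, which may be null. The sketch below emulates this for a single, already sorted partition. It is not Drill code; the class name and the sample data are invented for illustration, and it makes no claim about which engine's output above is the correct one.

```java
import java.util.Arrays;

/** Illustrative only: emulates FIRST_VALUE over RANGE BETWEEN CURRENT ROW AND CURRENT ROW. */
final class CurrentRowRangeFrame {

  /**
   * orderKeys must already be sorted (one partition). For each row, the frame is
   * the group of rows sharing its order key, and the result is the value of the
   * first row in that group, null included.
   */
  static Long[] firstValue(int[] orderKeys, Long[] values) {
    Long[] out = new Long[values.length];
    int i = 0;
    while (i < values.length) {
      int end = i;
      // Advance to the end of the current peer group.
      while (end < values.length && orderKeys[end] == orderKeys[i]) {
        end++;
      }
      // Every peer sees the same frame, so every peer gets the same answer.
      Arrays.fill(out, i, end, values[i]);
      i = end;
    }
    return out;
  }

  public static void main(String[] args) {
    int[] keys = {1, 2, 2, 3};
    Long[] vals = {5L, null, 7L, 9L};
    System.out.println(Arrays.toString(firstValue(keys, vals))); // [5, null, null, 9]
  }
}
```

Note that ORDER BY alone does not fix which peer comes first within a group of ties, so the value picked for such a group depends on how the engine happens to order the tied rows.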





[jira] [Closed] (DRILL-4263) add support for RANGE BETWEEN CURRENT ROW AND CURRENT ROW

2016-04-12 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4263.
-

> add support for RANGE BETWEEN CURRENT ROW AND CURRENT ROW
> -
>
> Key: DRILL-4263
> URL: https://issues.apache.org/jira/browse/DRILL-4263
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>






[jira] [Commented] (DRILL-4263) add support for RANGE BETWEEN CURRENT ROW AND CURRENT ROW

2016-04-12 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237697#comment-15237697
 ] 

Khurram Faraaz commented on DRILL-4263:
---

Tests added here Functional/window_functions/frameclause/RBCRACR

> add support for RANGE BETWEEN CURRENT ROW AND CURRENT ROW
> -
>
> Key: DRILL-4263
> URL: https://issues.apache.org/jira/browse/DRILL-4263
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>






[jira] [Commented] (DRILL-4262) add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

2016-04-12 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237695#comment-15237695
 ] 

Khurram Faraaz commented on DRILL-4262:
---

Tests added here Functional/window_functions/frameclause/RBUPACR

> add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
> 
>
> Key: DRILL-4262
> URL: https://issues.apache.org/jira/browse/DRILL-4262
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>
> RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is already supported as the 
> default frame when an ORDER clause is present in the window definition.
> We need to add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
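
To make the distinction between the two frames concrete, here is a small sketch that computes a running SUM under each of them over one already sorted partition. It is only an emulation written for this note, not Drill's implementation; the class name, method names, and data are hypothetical. The two frames differ only when the ORDER BY column has ties: ROWS stops at the physical current row, while RANGE also includes the current row's peers.

```java
import java.util.Arrays;

/** Illustrative only: emulates two SQL window frames over a single sorted partition. */
final class RunningFrames {

  /** ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: sum up to the physical current row. */
  static long[] sumRows(long[] values) {
    long[] out = new long[values.length];
    long running = 0;
    for (int i = 0; i < values.length; i++) {
      running += values[i];
      out[i] = running;
    }
    return out;
  }

  /** RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: the frame also includes peers. */
  static long[] sumRange(int[] orderKeys, long[] values) {
    long[] out = new long[values.length];
    long running = 0;
    int i = 0;
    while (i < values.length) {
      int end = i;
      // Add the whole peer group (equal order keys) before emitting any result.
      while (end < values.length && orderKeys[end] == orderKeys[i]) {
        running += values[end];
        end++;
      }
      Arrays.fill(out, i, end, running);
      i = end;
    }
    return out;
  }

  public static void main(String[] args) {
    int[] keys = {1, 2, 2, 3};
    long[] vals = {10, 20, 30, 40};
    System.out.println(Arrays.toString(sumRows(vals)));        // [10, 30, 60, 100]
    System.out.println(Arrays.toString(sumRange(keys, vals))); // [10, 60, 60, 100]
  }
}
```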





[jira] [Closed] (DRILL-4262) add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

2016-04-12 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4262.
-

> add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
> 
>
> Key: DRILL-4262
> URL: https://issues.apache.org/jira/browse/DRILL-4262
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>
> RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is already supported as the 
> default frame when an ORDER clause is present in the window definition.
> We need to add support for ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW





[jira] [Closed] (DRILL-4261) add support for RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

2016-04-12 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4261.
-

> add support for RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> -
>
> Key: DRILL-4261
> URL: https://issues.apache.org/jira/browse/DRILL-4261
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>






[jira] [Commented] (DRILL-4261) add support for RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

2016-04-12 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237691#comment-15237691
 ] 

Khurram Faraaz commented on DRILL-4261:
---

Test added here Functional/window_functions/frameclause/RBUPAUF

> add support for RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> -
>
> Key: DRILL-4261
> URL: https://issues.apache.org/jira/browse/DRILL-4261
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>






[jira] [Closed] (DRILL-4260) Adding support for some custom window frames

2016-04-12 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4260.
-

> Adding support for some custom window frames
> 
>
> Key: DRILL-4260
> URL: https://issues.apache.org/jira/browse/DRILL-4260
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>
> Current implementation of window functions (<1.6) only supports the default 
> frame. We want to add support for the FRAME clause. 
> This is an umbrella task to track the progress while adding all remaining 
> frames.





[jira] [Commented] (DRILL-4260) Adding support for some custom window frames

2016-04-12 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237687#comment-15237687
 ] 

Khurram Faraaz commented on DRILL-4260:
---

Tests are added here, 
framework/resources/Functional/window_functions/frameclause/
[test@cent-1 frameclause]# ls
defaultFrame  multipl_wnwds  RBCRACR  RBUPACR  RBUPAUF   subQueries

> Adding support for some custom window frames
> 
>
> Key: DRILL-4260
> URL: https://issues.apache.org/jira/browse/DRILL-4260
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.6.0
>
>
> Current implementation of window functions (<1.6) only supports the default 
> frame. We want to add support for the FRAME clause. 
> This is an umbrella task to track the progress while adding all remaining 
> frames.





[jira] [Commented] (DRILL-3036) Mongo plugin doesn't pushdown filter with a project

2016-04-12 Thread Mike Fulke (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237572#comment-15237572
 ] 

Mike Fulke commented on DRILL-3036:
---

Hi - can anybody comment any further on this one, before we start looking into 
it?


> Mongo plugin doesn't pushdown filter with a project
> ---
>
> Key: DRILL-3036
> URL: https://issues.apache.org/jira/browse/DRILL-3036
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - MongoDB
>Affects Versions: 0.9.0
>Reporter: Adam Gilmore
>Assignee: Adam Gilmore
> Attachments: DRILL-3036.1.patch.txt
>
>
> Exacerbated by DRILL-2732, if a project exists between the scan and the 
> filter, the current Mongo pushdown filter rule will not push the filter down 
> into the scan.
> As implemented in DRILL-1950 and for the partition prune optimizer rule, the 
> pushdown filter should handle the above scenario (by implementing the rule 
> for both "filter/scan" and "filter/project/scan" scenarios).





[jira] [Commented] (DRILL-4596) Drill should do version check among drillbits

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237365#comment-15237365
 ] 

ASF GitHub Bot commented on DRILL-4596:
---

Github user paul-rogers commented on the pull request:

https://github.com/apache/drill/pull/474#issuecomment-208959253
  
Some other design issues. The idea of a rolling upgrade presupposes that 
we can shut down a Drillbit, bring up a new one, and the cluster keeps running. 
But, today, bringing down a Drillbit causes all in-flight queries on that node 
to fail. There is no way to mark a node as "quiescent" (up, but not accepting 
new work.) So, a rolling upgrade today would entail a long series of query 
failures as we replace each of, say, 20 or 50 nodes. So, in fact, it is less 
disruptive to take the cluster down, push an upgrade, and bring it back up. 
(See DRILL-4286.)

Back on testing: testing is essential. A feature that allows +/-1 version 
compatibility is not helpful unless someone (other than the user) can certify 
that it works. If the user gets to do the checking, then it is not very 
helpful: safer just to do a full upgrade.

To emphasize an earlier point: there are two distinct issues. One is a 
managed cluster upgrade (the admin can do it with the help of a management 
tool.) The other is the many Drill clients spread across desktops: that is a 
classic desktop software upgrade. Some might be on planes, others locked in 
desks while someone is on vacation. Let's think about how to upgrade JDBC 
drivers and the like given this reality.

Is the compatibility policy number-based or time-based? As an admin, can I expect 
to have a three-month window for upgrades? Or, will it sometimes be one month, 
others four months, depending on who changes what? Should we have a time-based 
policy?


> Drill should do version check among drillbits
> -
>
> Key: DRILL-4596
> URL: https://issues.apache.org/jira/browse/DRILL-4596
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Before registering a new drillbit in ZooKeeper, we should do a version check and 
> make sure all the running drillbits are on the same version.
> Using drillbits of different versions can lead to unexpected results.





[jira] [Commented] (DRILL-4596) Drill should do version check among drillbits

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237312#comment-15237312
 ] 

ASF GitHub Bot commented on DRILL-4596:
---

Github user paul-rogers commented on the pull request:

https://github.com/apache/drill/pull/474#issuecomment-208943460
  
Hi All. Version compatibility is a complex issue. Do we have a design 
document that explains our goals and policy? Is the goal to allow rolling 
updates of clients? (Drill server at, say 1.8, rolling upgrade of client from 
1.7 to 1.8, old clients work)? Or, is it to allow rolling upgrades of servers? 
Both?

MapR customers receive even releases: 1.4, 1.6, 1.8. Would the +/-1 policy 
benefit them?

As I understand it, each Drillbit is to work with another of +/-1 version. 
But what if I bring up Drill 1.6, Drill 1.5 and Drill 1.4? The 1.5 is happy to 
work with both 1.6 and 1.4. But the 1.6 and 1.4 versions will fail only when 
they communicate with one another. When will this communication occur? At 
startup? Or only later when, say, 1.6 tries to send a query to 1.4?

Does this mean that Drillbits should advertise their version in ZooKeeper 
so that we fail fast and can provide a clear error message?
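
As an illustration of the 1.6 / 1.5 / 1.4 scenario, the sketch below checks every pair of versions that the Drillbits might advertise, so the problem surfaces before any query is exchanged. Everything in it is an assumption made for this example: the class and method names are hypothetical, the rule used is "same major version, minor versions at most one apart", and how the versions would actually be published in or read from ZooKeeper is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Drill: applies a +/-1 minor-version rule to a cluster. */
final class ClusterVersionCheck {

  /** Parses "major.minor[.patch][-SUFFIX]" into {major, minor}. */
  static int[] majorMinor(String version) {
    String[] parts = version.split("[.-]");
    return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
  }

  /** Assumed rule: same major version and minor versions at most one apart. */
  static boolean compatible(String a, String b) {
    int[] x = majorMinor(a);
    int[] y = majorMinor(b);
    return x[0] == y[0] && Math.abs(x[1] - y[1]) <= 1;
  }

  /** Returns every pair of advertised versions that violates the rule. */
  static List<String> incompatiblePairs(List<String> advertised) {
    List<String> bad = new ArrayList<>();
    for (int i = 0; i < advertised.size(); i++) {
      for (int j = i + 1; j < advertised.size(); j++) {
        if (!compatible(advertised.get(i), advertised.get(j))) {
          bad.add(advertised.get(i) + " <-> " + advertised.get(j));
        }
      }
    }
    return bad;
  }

  public static void main(String[] args) {
    // 1.5 is compatible with both neighbours, but 1.6 and 1.4 are not with each other.
    System.out.println(incompatiblePairs(Arrays.asList("1.6.0", "1.5.0", "1.4.0")));
    // prints [1.6.0 <-> 1.4.0]
  }
}
```

The point of the example is that a +/-1 rule is not transitive, so a check against a single neighbour is not enough; each Drillbit would have to be compared with all the others.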

Dremio proposes a new 2.0 release that breaks compatibility. Will Drill 1.9 
(say) be compatible with the incompatible Drill 2.0? Should it be?

As others have said, we need to consider wire protocol and semantics. The 
usual solution is protocol negotiation. If a 1.6 client connects to a 1.7 
server, they agree to "speak" 1.6. If a 1.7 client connects to a 1.6 server, 
they also agree to "speak" 1.6. Such a solution has an impact on our messaging 
layer. It increases testing requirements. 
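
The negotiation rule described above (both sides settle on the older of the two versions) can be sketched in a few lines. This is only an illustration of the idea; Drill's RPC layer has no such class, and the names below are made up. Versions are compared numerically per component, so "1.10" is treated as newer than "1.9".

```java
/** Hypothetical sketch of protocol version negotiation; not part of Drill's RPC layer. */
final class ProtocolNegotiation {

  /** Compares "major.minor" strings numerically; negative if a is older than b. */
  static int compare(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    int major = Integer.compare(Integer.parseInt(x[0]), Integer.parseInt(y[0]));
    if (major != 0) {
      return major;
    }
    return Integer.compare(Integer.parseInt(x[1]), Integer.parseInt(y[1]));
  }

  /** Both sides agree to "speak" the older of the two versions. */
  static String negotiate(String clientVersion, String serverVersion) {
    return compare(clientVersion, serverVersion) <= 0 ? clientVersion : serverVersion;
  }

  public static void main(String[] args) {
    System.out.println(negotiate("1.6", "1.7")); // 1.6
    System.out.println(negotiate("1.7", "1.6")); // 1.6
  }
}
```

A real implementation would also need a lower bound: if the older version is one the newer side no longer supports, the handshake should fail with a clear error rather than proceed.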

Drill-on-YARN will provide another way to do server upgrades (ramp up a new 
cluster while ramping down an old one.) Otherwise, YARN will need some way to 
run the same cluster, replacing version X drillbits with version X+1 (while 
still running the version X Application Master).

Are all these issues spelled out in a design doc?

IMHO: let's not try to bug fix our way to success here; let's step back and 
work out a complete design.


> Drill should do version check among drillbits
> -
>
> Key: DRILL-4596
> URL: https://issues.apache.org/jira/browse/DRILL-4596
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Before registering a new drillbit in ZooKeeper, we should do a version check and 
> make sure all the running drillbits are on the same version.
> Using drillbits of different versions can lead to unexpected results.





[jira] [Assigned] (DRILL-3474) Filename should be an available column when querying a directory

2016-04-12 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-3474:
---

Assignee: Arina Ielchiieva

> Filename should be an available column when querying a directory
> 
>
> Key: DRILL-3474
> URL: https://issues.apache.org/jira/browse/DRILL-3474
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.1.0
>Reporter: Jim Scott
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> I could not find another ticket which talks about this ...
> The file name should be a column which can be selected or filtered when 
> querying a directory just like dir0, dir1 are available.





[jira] [Assigned] (DRILL-3559) Make filename available to sql statments just like dirN

2016-04-12 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva reassigned DRILL-3559:
---

Assignee: Arina Ielchiieva

> Make filename available to sql statments just like dirN
> ---
>
> Key: DRILL-3559
> URL: https://issues.apache.org/jira/browse/DRILL-3559
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Affects Versions: 1.1.0
>Reporter: Stefán Baxter
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: Future
>
>






[jira] [Commented] (DRILL-4596) Drill should do version check among drillbits

2016-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236773#comment-15236773
 ] 

ASF GitHub Bot commented on DRILL-4596:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/474#discussion_r59332891
  
--- Diff: common/src/main/java/org/apache/drill/common/util/DrillVersionInfo.java ---
@@ -49,10 +53,52 @@ public static String getVersion() {
         }
       }
     } catch (IOException except) {
-      appVersion = "Unknown";
+      appVersion = UNKNOWN_VERSION;
     }
     return appVersion;
   }
 
+  /**
+   * Compare two Drill versions disregarding build number and comparing only major and minor versions.
+   * Versions are considered to be compatible:
+   * 1. if current version is the same as version to compare.
+   * 2. if current version minor version + 1 is the same as version to compare.
+   */
+  public static boolean isVersionsCompatible(String currentVersion, String versionToCompare) {
+    if (currentVersion != null && currentVersion.equals(versionToCompare)) {
+      return true;
+    }
+
+    BigDecimal currentVersionDecimal = getVersionAsDecimal(currentVersion);
+    BigDecimal versionToCompareDecimal = getVersionAsDecimal(versionToCompare);
+
+    if (currentVersionDecimal != null && versionToCompareDecimal != null) {
--- End diff --

I guess, yes. Ideally all drillbits in a cluster will be on the same 
version; this situation may occur only during rolling upgrades.


> Drill should do version check among drillbits
> -
>
> Key: DRILL-4596
> URL: https://issues.apache.org/jira/browse/DRILL-4596
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Before registering a new drillbit in ZooKeeper, we should do a version check and 
> make sure all the running drillbits are on the same version.
> Using drillbits of different versions can lead to unexpected results.


