[jira] [Commented] (DRILL-3707) Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup

2015-09-02 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727518#comment-14727518
 ] 

Aman Sinha commented on DRILL-3707:
---

+1 on the latest PR.  

> Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup
> ---
>
> Key: DRILL-3707
> URL: https://issues.apache.org/jira/browse/DRILL-3707
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Aman Sinha
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3715) Enable selection vector sv2 and sv4 in hash aggregator

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727597#comment-14727597
 ] 

ASF GitHub Bot commented on DRILL-3715:
---

Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/136#issuecomment-137164063
  
@amithadke, could you look at trying to write a test like this instead of 
checking in a large input file?

There is a session option to turn off hash aggregate; you should be able to 
just duplicate this test, changing the session option. Be sure to change it 
back to the default value after your test, as the session is currently shared 
across tests.


https://github.com/apache/drill/commit/97a63168e93a70c7ed88d2e801dd2ea2e5f1dd74#diff-91bfb8c9a26887fbe594262b6b872c98R295
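The set-then-restore discipline described above can be sketched outside Drill. The snippet below uses a plain map as a hypothetical stand-in for the shared session options store; `planner.enable_hashagg` is Drill's option name for toggling the hash aggregate, but the helpers here are illustrative only, not Drill's test API.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionOptionPattern {
    // A plain map standing in for the shared session options store.
    static final Map<String, Object> SESSION = new HashMap<>();

    public static void main(String[] args) {
        SESSION.put("planner.enable_hashagg", true); // default value

        Object saved = SESSION.get("planner.enable_hashagg");
        SESSION.put("planner.enable_hashagg", false); // disable hash aggregate
        try {
            // ... run the duplicated test here; it now exercises the
            // non-hash aggregation path instead of the hash aggregate ...
        } finally {
            // Restore the default so later tests sharing the session
            // are unaffected, even if the test body throws.
            SESSION.put("planner.enable_hashagg", saved);
        }
        System.out.println(SESSION.get("planner.enable_hashagg")); // prints "true"
    }
}
```

The `finally` block is the important part: without it, a failing test would leave the shared session in a non-default state and poison subsequent tests.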


> Enable selection vector sv2 and sv4 in hash aggregator
> --
>
> Key: DRILL-3715
> URL: https://issues.apache.org/jira/browse/DRILL-3715
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: amit hadke
>Assignee: amit hadke
>Priority: Minor
>
> HashAggregator can already read sv2 and sv4 vectors. Enable support for all 
> of them.
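As background, a selection vector is an index array that lets a filter mark surviving rows without copying them; downstream operators such as an aggregator then read rows through the indirection. Below is a minimal conceptual stand-in, not Drill's actual `SelectionVector2` class:

```java
import java.util.Arrays;

public class SelectionVectorSketch {
    public static void main(String[] args) {
        // A "record batch" of values. Instead of compacting surviving rows
        // after a filter, operators keep an index array (an sv2-style
        // selection vector) pointing at the rows that passed.
        int[] batch = {10, -3, 42, -7, 5};

        // Build the selection vector for the filter "value > 0".
        int[] sv2 = new int[batch.length];
        int count = 0;
        for (int i = 0; i < batch.length; i++) {
            if (batch[i] > 0) {
                sv2[count++] = i;
            }
        }
        sv2 = Arrays.copyOf(sv2, count);

        // A downstream operator reads rows through the indirection rather
        // than assuming a dense batch -- this is what "enable sv2/sv4 in the
        // hash aggregator" amounts to conceptually.
        long sum = 0;
        for (int idx : sv2) {
            sum += batch[idx];
        }
        System.out.println(count + " rows, sum=" + sum); // prints "3 rows, sum=57"
    }
}
```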



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3735) Directory pruning is not happening when number of files is larger than 64k

2015-09-02 Thread Hao Zhu (JIRA)
Hao Zhu created DRILL-3735:
--

 Summary: Directory pruning is not happening when number of files 
is larger than 64k
 Key: DRILL-3735
 URL: https://issues.apache.org/jira/browse/DRILL-3735
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Hao Zhu
Assignee: Jinfeng Ni


When the number of files is larger than the 64k limit, directory pruning does 
not happen. 
We need to increase this limit further to handle most use cases.

My proposal is to separate the code for directory pruning from partition 
pruning. 
Say a parent directory contains 100 directories and 1 million files.
If we only query files from one directory, we should first read the 100 
directories and narrow down to the relevant one; then read the file paths in 
that directory into memory and do the rest of the work.

The current behavior is that Drill first reads all the file paths of those 1 
million files into memory, and then does directory pruning or partition 
pruning. This is neither performance efficient nor memory efficient, and it 
does not scale.
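The two-phase listing proposed above can be sketched with plain `java.nio.file` calls; the directory names and the `dir0`-style predicate below are illustrative, not Drill's planner code:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class DirectoryFirstPruning {
    public static void main(String[] args) throws IOException {
        // Build a tiny partitioned layout: <root>/1991, <root>/1992, <root>/1993,
        // each with two files.
        Path root = Files.createTempDirectory("lineitem");
        for (String year : new String[]{"1991", "1992", "1993"}) {
            Path dir = Files.createDirectories(root.resolve(year));
            Files.createFile(dir.resolve("part-0.tbl"));
            Files.createFile(dir.resolve("part-1.tbl"));
        }

        // Phase 1: list only the directories and apply the dir0 predicate,
        // without touching any file paths yet.
        Set<String> wanted = new HashSet<>(Arrays.asList("1991", "1992"));
        List<Path> survivingDirs = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root, Files::isDirectory)) {
            for (Path d : dirs) {
                if (wanted.contains(d.getFileName().toString())) {
                    survivingDirs.add(d);
                }
            }
        }

        // Phase 2: expand file paths only under the surviving directories,
        // so memory use scales with the pruned selection, not the whole table.
        List<Path> files = new ArrayList<>();
        for (Path d : survivingDirs) {
            try (DirectoryStream<Path> fs = Files.newDirectoryStream(d)) {
                for (Path f : fs) files.add(f);
            }
        }
        System.out.println(survivingDirs.size() + " dirs, " + files.size() + " files");
    }
}
```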





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3735) Directory pruning is not happening when number of files is larger than 64k

2015-09-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-3735:
--
Assignee: Aman Sinha  (was: Jinfeng Ni)

> Directory pruning is not happening when number of files is larger than 64k
> --
>
> Key: DRILL-3735
> URL: https://issues.apache.org/jira/browse/DRILL-3735
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.1.0
>Reporter: Hao Zhu
>Assignee: Aman Sinha
>
> When the number of files is larger than the 64k limit, directory pruning does 
> not happen. 
> We need to increase this limit further to handle most use cases.
> My proposal is to separate the code for directory pruning from partition 
> pruning. 
> Say a parent directory contains 100 directories and 1 million files.
> If we only query files from one directory, we should first read the 100 
> directories and narrow down to the relevant one; then read the file paths 
> in that directory into memory and do the rest of the work.
> The current behavior is that Drill first reads all the file paths of those 1 
> million files into memory, and then does directory pruning or partition 
> pruning. This is neither performance efficient nor memory efficient, and it 
> does not scale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727516#comment-14727516
 ] 

ASF GitHub Bot commented on DRILL-3661:
---

Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/119#discussion_r38547252
  
--- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -66,17 +69,18 @@
   private boolean afterFirstBatch = false;
 
   /**
-   * Whether the next call to this.next() should just return {@code true} rather
-   * than calling nextRowInternally() to try to advance to the next
-   * record.
+   * Whether the next call to {@code this.}{@link #next()} should just return
+   * {@code true} rather than calling {@link #nextRowInternally()} to try to
+   * advance to the next record.
    * <p>
-   *   Currently, can be true only for first call to next().
+   *   Currently, can be true only for first call to {@link #next()}.
    * </p>
    * <p>
-   *   (Relates to loadInitialSchema()'s calling nextRowInternally()
+   *   (Relates to {@link #loadInitialSchema()}'s calling nextRowInternally()
--- End diff --

why not mark nextRowInternally() as {@link} too ?


> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Deneche A. Hakim
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3707) Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup

2015-09-02 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim reassigned DRILL-3707:
---

Assignee: Deneche A. Hakim  (was: Aman Sinha)

> Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup
> ---
>
> Key: DRILL-3707
> URL: https://issues.apache.org/jira/browse/DRILL-3707
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3535) Drop table support

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727514#comment-14727514
 ] 

ASF GitHub Bot commented on DRILL-3535:
---

Github user mehant commented on a diff in the pull request:

https://github.com/apache/drill/pull/140#discussion_r38546769
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java ---
@@ -104,6 +107,9 @@ public WorkspaceSchemaFactory(DrillConfig drillConfig, FileSystemPlugin plugin,
       final FormatMatcher fallbackMatcher = new BasicFormatMatcher(formatPlugin,
           ImmutableList.of(Pattern.compile(".*")), ImmutableList.of());
       fileMatchers.add(fallbackMatcher);
+      dropFileMatchers = fileMatchers.subList(0, fileMatchers.size() - 1);
--- End diff --

The matchers are the same except for the pattern associated with 
defaultInputFormat. Because that pattern is .*, we don't want to match and 
delete on it.
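The effect of excluding the trailing catch-all matcher can be illustrated with plain regex `Pattern`s standing in for `FormatMatcher`s; this is a simplification, since Drill's matchers are not bare patterns:

```java
import java.util.*;
import java.util.regex.Pattern;

public class FallbackMatcherExclusion {
    public static void main(String[] args) {
        // Real format matchers, plus a catch-all fallback appended last.
        List<Pattern> fileMatchers = new ArrayList<>();
        fileMatchers.add(Pattern.compile(".*\\.parquet"));
        fileMatchers.add(Pattern.compile(".*\\.json"));
        fileMatchers.add(Pattern.compile(".*")); // fallback: matches everything

        // For drop-table checks, take a subList view that excludes the
        // fallback, so an unknown file format does not silently "match"
        // and get deleted.
        List<Pattern> dropFileMatchers = fileMatchers.subList(0, fileMatchers.size() - 1);

        String unknown = "data.xyz";
        boolean matchedByAll = fileMatchers.stream().anyMatch(p -> p.matcher(unknown).matches());
        boolean matchedByDrop = dropFileMatchers.stream().anyMatch(p -> p.matcher(unknown).matches());
        System.out.println(matchedByAll + " " + matchedByDrop); // prints "true false"
    }
}
```

Note that `subList` is a view over the same backing list, so the drop-table matcher list stays in sync with the registered matchers without copying.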


> Drop table support
> --
>
> Key: DRILL-3535
> URL: https://issues.apache.org/jira/browse/DRILL-3535
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>
> Umbrella JIRA to track support for "Drop table" feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3535) Drop table support

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727612#comment-14727612
 ] 

ASF GitHub Bot commented on DRILL-3535:
---

Github user mehant commented on a diff in the pull request:

https://github.com/apache/drill/pull/140#discussion_r38555789
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java ---
@@ -321,8 +327,101 @@ public DrillTable create(String key) {
       return null;
     }
 
+    private FormatMatcher findMatcher(FileStatus file) {
+      FormatMatcher matcher = null;
+      try {
+        for (FormatMatcher m : dropFileMatchers) {
+          if (m.isFileReadable(fs, file)) {
+            return m;
+          }
+        }
+      } catch (IOException e) {
+        logger.debug("Failed to find format matcher for file: %s", file, e);
+      }
+      return matcher;
+    }
+
 @Override
 public void destroy(DrillTable value) {
 }
+
+    /**
+     * Check if the table contains homogeneous files that can be read by Drill,
+     * e.g. parquet, json, csv, etc. However, if it contains more than one of
+     * these formats, or a totally different file format that Drill cannot
+     * understand, then we will raise an exception.
+     * @param key
+     * @return
+     * @throws IOException
+     */
+    private boolean isHomogeneous(String key) throws IOException {
+      FileSelection fileSelection = FileSelection.create(fs, config.getLocation(), key);
+
+      if (fileSelection == null) {
+        throw UserException
+            .validationError()
+            .message(String.format("Table [%s] not found", key))
+            .build(logger);
+      }
+
+      FormatMatcher matcher = null;
+      Queue<FileStatus> listOfFiles = new LinkedList<>();
+      listOfFiles.addAll(fileSelection.getFileStatusList(fs));
+
+      while (!listOfFiles.isEmpty()) {
+        FileStatus currentFile = listOfFiles.poll();
+        if (currentFile.isDirectory()) {
+          listOfFiles.addAll(fs.list(true, currentFile.getPath()));
+        } else {
+          if (matcher != null) {
+            if (!matcher.isFileReadable(fs, currentFile)) {
+              return false;
+            }
+          } else {
+            matcher = findMatcher(currentFile);
+            // Did not match any of the file patterns, exit
+            if (matcher == null) {
+              return false;
+            }
+          }
+        }
+      }
+      return true;
+    }
+
+    /**
+     * We check if the table contains homogeneous file formats that Drill can
+     * read. Once the checks are performed, we rename the file to start with
+     * an "_". After the rename we issue a recursive delete of the directory.
+     * @param table - Path of table to be dropped
+     */
+    @Override
+    public void dropTable(String table) {
+
+      String[] pathSplit = table.split(Path.SEPARATOR);
+      String dirName = DrillFileSystem.HIDDEN_FILE_PREFIX + pathSplit[pathSplit.length - 1];
+      int lastSlashIndex = table.lastIndexOf(Path.SEPARATOR);
+
+      if (lastSlashIndex != -1) {
+        dirName = table.substring(0, lastSlashIndex + 1) + dirName;
+      }
+
+      DrillFileSystem fs = getFS();
+      String defaultLocation = getDefaultLocation();
+      try {
+        if (!isHomogeneous(table)) {
+          throw UserException
+              .validationError()
+              .message("Table contains different file formats. \n" +
+                  "Drop Table is only supported for directories that contain homogeneous file formats consumable by Drill")
+              .build(logger);
+        }
+        fs.rename(new Path(defaultLocation, table), new Path(defaultLocation, dirName));
--- End diff --

I will add a unique identifier in addition to the underscore prefix.

I modified the exception handling below to separate out AccessControlException 
and the generic IOException, to handle permission errors and other errors 
respectively. In the case of a race between concurrent renames, the file 
system would throw a FileNotFoundException, which would be caught as an 
IOException and reported to the user.
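The rename-with-unique-suffix-then-recursive-delete sequence can be sketched with `java.nio.file`; the hidden-name scheme below is illustrative, not Drill's exact one:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.UUID;

public class HiddenRenameDrop {
    public static void main(String[] args) throws IOException {
        // A throwaway "workspace" with one table directory containing a file.
        Path workspace = Files.createTempDirectory("workspace");
        Path table = Files.createDirectories(workspace.resolve("lineitem_x"));
        Files.createFile(table.resolve("0_0_0.parquet"));

        // Rename to a hidden name with a unique suffix. With a unique suffix,
        // two concurrent drops of the same table collide on the (atomic)
        // rename rather than racing each other during the delete.
        String hidden = "_" + table.getFileName() + "_" + UUID.randomUUID();
        Path renamed = Files.move(table, workspace.resolve(hidden));

        // Recursive delete of the renamed directory: deepest paths first.
        try (java.util.stream.Stream<Path> paths = Files.walk(renamed)) {
            paths.sorted(java.util.Comparator.reverseOrder())
                 .forEach(p -> {
                     try {
                         Files.delete(p);
                     } catch (IOException e) {
                         throw new RuntimeException(e);
                     }
                 });
        }
        System.out.println(Files.exists(table) || Files.exists(renamed)); // prints "false"
    }
}
```

The rename-first design also means a reader that races the drop sees either the intact table or nothing, never a half-deleted directory under the original name.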


> Drop table support
> --
>
> Key: DRILL-3535
> URL: https://issues.apache.org/jira/browse/DRILL-3535
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>
> Umbrella JIRA to track support for "Drop table" feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727524#comment-14727524
 ] 

ASF GitHub Bot commented on DRILL-3661:
---

Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/119#issuecomment-137138070
  
+1 LGTM. Can you please rebase on top of master ? thx


> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Deneche A. Hakim
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727651#comment-14727651
 ] 

ASF GitHub Bot commented on DRILL-3661:
---

Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/119#discussion_r38558416
  
--- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -66,17 +69,18 @@
   private boolean afterFirstBatch = false;
 
   /**
-   * Whether the next call to this.next() should just return {@code true} rather
-   * than calling nextRowInternally() to try to advance to the next
-   * record.
+   * Whether the next call to {@code this.}{@link #next()} should just return
+   * {@code true} rather than calling {@link #nextRowInternally()} to try to
+   * advance to the next record.
    * <p>
-   *   Currently, can be true only for first call to next().
+   *   Currently, can be true only for first call to {@link #next()}.
    * </p>
    * <p>
-   *   (Relates to loadInitialSchema()'s calling nextRowInternally()
+   *   (Relates to {@link #loadInitialSchema()}'s calling nextRowInternally()
--- End diff --

Frequently, for second and later references in the same paragraph or 
adjacent paragraphs, I avoid making those later references links, since having 
more links can make the text less readable.

However, since here I already have multiple links to next(), I did make 
that nextRowInternally() a link (and added a missing {@code...} for 
"Statement.execute...(...)").


> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3707) Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727562#comment-14727562
 ] 

ASF GitHub Bot commented on DRILL-3707:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/130


> Fix for DRILL-3616 can cause a NullPointerException in ExternalSort cleanup
> ---
>
> Key: DRILL-3707
> URL: https://issues.apache.org/jira/browse/DRILL-3707
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Aman Sinha
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3731) Partition Pruning not taking place when we have case statement within the filter

2015-09-02 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727642#comment-14727642
 ] 

Jinfeng Ni commented on DRILL-3731:
---

I understand that partition pruning should work for this filter. But why can't 
the user simplify the filter containing the case expression, for now:

{code}
where case when dir0=1991 then null else 1 end is null or dir0=1992;

=> 

where dir0=1991 or dir0=1992;
{code}

If there is a way to simplify the filter and make partition pruning work, I 
would suggest we do so. 
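The suggested rewrite can be checked mechanically: under SQL three-valued logic, the CASE form evaluates to NULL (so `IS NULL` is true) exactly when dir0 is 1991, and the OR adds 1992. A small sketch modeling SQL NULL as a Java `null`:

```java
public class CaseFilterEquivalence {
    // Original filter:
    //   CASE WHEN dir0=1991 THEN NULL ELSE 1 END IS NULL OR dir0=1992
    static boolean original(Integer dir0) {
        // In SQL, a comparison against NULL is unknown, which behaves as
        // false both in a WHEN condition and in a WHERE filter.
        boolean when = dir0 != null && dir0 == 1991;
        Integer caseResult = when ? null : 1;
        boolean left = (caseResult == null);
        boolean right = dir0 != null && dir0 == 1992;
        return left || right;
    }

    // Simplified filter: dir0=1991 OR dir0=1992
    static boolean simplified(Integer dir0) {
        return dir0 != null && (dir0 == 1991 || dir0 == 1992);
    }

    public static void main(String[] args) {
        // Sample values, including NULL and non-matching years.
        Integer[] samples = {null, 1990, 1991, 1992, 1997};
        boolean equivalent = true;
        for (Integer d : samples) {
            equivalent &= (original(d) == simplified(d));
        }
        System.out.println(equivalent); // prints "true"
    }
}
```

Note the filters agree even for a NULL dir0: the CASE takes the ELSE branch (yielding 1, not NULL), and the simplified form's comparison is unknown, so both reject the row.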




> Partition Pruning not taking place when we have case statement within the 
> filter
> 
>
> Key: DRILL-3731
> URL: https://issues.apache.org/jira/browse/DRILL-3731
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: lineitempart.tgz
>
>
> Commit # 8f33c3d55228a98ea2a309490ecbfb7a180f012c
> Partition Pruning is not taking place in the below query
> {code}
> explain plan for select count(*) from 
> `/drill/testdata/partition_pruning/dfs/lineitempart` where case when 
> dir0=1991 then null else 1 end is null or dir0=1992;
> 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-03  Project($f0=[0])
> 00-04SelectionVectorRemover
> 00-05  Filter(condition=[OR(IS NULL(CASE(=($0, 1991), null, 1)), 
> =($0, 1992))])
> 00-06Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/partition_pruning/dfs/lineitempart, 
> numFiles=7, columns=[`dir0`], 
> files=[maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1991/lineitemaa.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1992/lineitemab.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1996/lineitemaf.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1994/lineitemad.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1993/lineitemac.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1997/lineitemag.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1995/lineitemae.tbl]]])
> {code}
> The planner should have pruned all partitions except 1991 & 1992
> I attached the dataset. Let me know if you need anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727655#comment-14727655
 ] 

ASF GitHub Bot commented on DRILL-3661:
---

Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/119#issuecomment-137176204
  
Rebased.  Edited doc. comment formatting.


> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3731) Partition Pruning not taking place when we have case statement within the filter

2015-09-02 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu updated DRILL-3731:
-
Assignee: Sean Hsuan-Yi Chu  (was: Jinfeng Ni)

> Partition Pruning not taking place when we have case statement within the 
> filter
> 
>
> Key: DRILL-3731
> URL: https://issues.apache.org/jira/browse/DRILL-3731
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
> Attachments: lineitempart.tgz
>
>
> Commit # 8f33c3d55228a98ea2a309490ecbfb7a180f012c
> Partition Pruning is not taking place in the below query
> {code}
> explain plan for select count(*) from 
> `/drill/testdata/partition_pruning/dfs/lineitempart` where case when 
> dir0=1991 then null else 1 end is null or dir0=1992;
> 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-03  Project($f0=[0])
> 00-04SelectionVectorRemover
> 00-05  Filter(condition=[OR(IS NULL(CASE(=($0, 1991), null, 1)), 
> =($0, 1992))])
> 00-06Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/partition_pruning/dfs/lineitempart, 
> numFiles=7, columns=[`dir0`], 
> files=[maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1991/lineitemaa.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1992/lineitemab.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1996/lineitemaf.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1994/lineitemad.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1993/lineitemac.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1997/lineitemag.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1995/lineitemae.tbl]]])
> {code}
> The planner should have pruned all partitions except 1991 & 1992
> I attached the dataset. Let me know if you need anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-2864) Unable to cast string literal with the valid value in ISO 8601 format to interval

2015-09-02 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman closed DRILL-2864.
---

> Unable to cast string literal with the valid value in ISO 8601 format to 
> interval
> -
>
> Key: DRILL-2864
> URL: https://issues.apache.org/jira/browse/DRILL-2864
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select cast('P1D' as interval day) from t1;
> Query failed: PARSE ERROR: From line 1, column 8 to line 1, column 34: Cast 
> function cannot convert value of type CHAR(3) to type INTERVAL DAY
> [744f1f35-f8c5-46ba-80f9-0efd87036903 on atsqa4-134.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> Workaround: cast to varchar.
> {code}
> 0: jdbc:drill:schema=dfs> select cast(cast('P1D' as varchar(3)) as interval 
> day) from t1;
> ++
> |   EXPR$0   |
> ++
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> | P1D|
> ++
> 10 rows selected (0.191 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-3661:

Assignee: Daniel Barclay (Drill)  (was: Deneche A. Hakim)

> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3731) Partition Pruning not taking place when we have case statement within the filter

2015-09-02 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727626#comment-14727626
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3731:
--

[~jni] Can I work on this one ? 

> Partition Pruning not taking place when we have case statement within the 
> filter
> 
>
> Key: DRILL-3731
> URL: https://issues.apache.org/jira/browse/DRILL-3731
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Rahul Challapalli
>Assignee: Jinfeng Ni
> Fix For: 1.2.0
>
> Attachments: lineitempart.tgz
>
>
> Commit # 8f33c3d55228a98ea2a309490ecbfb7a180f012c
> Partition Pruning is not taking place in the below query
> {code}
> explain plan for select count(*) from 
> `/drill/testdata/partition_pruning/dfs/lineitempart` where case when 
> dir0=1991 then null else 1 end is null or dir0=1992;
> 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-03  Project($f0=[0])
> 00-04SelectionVectorRemover
> 00-05  Filter(condition=[OR(IS NULL(CASE(=($0, 1991), null, 1)), 
> =($0, 1992))])
> 00-06Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/partition_pruning/dfs/lineitempart, 
> numFiles=7, columns=[`dir0`], 
> files=[maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1991/lineitemaa.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1992/lineitemab.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1996/lineitemaf.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1994/lineitemad.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1993/lineitemac.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1997/lineitemag.tbl,
>  
> maprfs:/drill/testdata/partition_pruning/dfs/lineitempart/1995/lineitemae.tbl]]])
> {code}
> The planner should have pruned all partitions except 1991 & 1992
> I attached the dataset. Let me know if you need anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3535) Drop table support

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727527#comment-14727527
 ] 

ASF GitHub Bot commented on DRILL-3535:
---

Github user mehant commented on a diff in the pull request:

https://github.com/apache/drill/pull/140#discussion_r38548103
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java ---
@@ -321,8 +327,101 @@ public DrillTable create(String key) {
       return null;
     }
 
+    private FormatMatcher findMatcher(FileStatus file) {
+      FormatMatcher matcher = null;
+      try {
+        for (FormatMatcher m : dropFileMatchers) {
+          if (m.isFileReadable(fs, file)) {
+            return m;
+          }
+        }
+      } catch (IOException e) {
+        logger.debug("Failed to find format matcher for file: %s", file, e);
+      }
+      return matcher;
+    }
+
 @Override
 public void destroy(DrillTable value) {
 }
+
+    /**
+     * Check if the table contains homogeneous files that can be read by Drill,
+     * e.g. parquet, json, csv, etc. However, if it contains more than one of
+     * these formats, or a totally different file format that Drill cannot
+     * understand, then we will raise an exception.
+     * @param key
+     * @return
+     * @throws IOException
+     */
+    private boolean isHomogeneous(String key) throws IOException {
--- End diff --

The only reason was to avoid the performance penalty these checks would add to 
reads, by simply going ahead optimistically and failing later if we hit 
different formats. 


> Drop table support
> --
>
> Key: DRILL-3535
> URL: https://issues.apache.org/jira/browse/DRILL-3535
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>
> Umbrella JIRA to track support for "Drop table" feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3691) CTAS Memory Leak : IllegalStateException

2015-09-02 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728330#comment-14728330
 ] 

Victoria Markman commented on DRILL-3691:
-

Hakim,

I think DRILL-3673, DRILL-3684 and DRILL-3691 (this one) are all the same issue 
that was fixed with your commit 
(git.commit.id=b0e9e7c085a3106a612e400cd1732b6ec6267268)

You can resolve this one as well.
Vicky.


Verified that this case works with:

#Generated by Git-Commit-Id-Plugin
#Tue Sep 01 18:18:19 UTC 2015
git.commit.id.abbrev=b0e9e7c
git.commit.user.email=adene...@gmail.com
git.commit.message.full=DRILL-3684\: CTAS \: Memory Leak when using CTAS with 
tpch sf100\n\nThis closes \#141\n
git.commit.id=b0e9e7c085a3106a612e400cd1732b6ec6267268

Repro ran on a single node with 8GB direct memory with assertions disabled.

{code}
0: jdbc:drill:schema=dfs> select * from sys.options where status like 
'%CHANGED%';
+---+--+--+--+--+-+---++
|   name|   kind   |   type   |  status  | num_val  
| string_val  | bool_val  | float_val  |
+---+--+--+--+--+-+---++
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM   | CHANGED  | null 
| null| true  | null   |
| store.parquet.block-size  | LONG | SYSTEM   | CHANGED  | 204800   
| null| null  | null   |
| store.parquet.block-size  | LONG | SESSION  | CHANGED  | 103  
| null| null  | null   |
+---+--+--+--+--+-+---++
3 rows selected (0.596 seconds)
0: jdbc:drill:schema=dfs> create table lineitem_x as select * from lineitem;
+---++
| Fragment  | Number of records written  |
+---++
| 1_15  | 9021944|
| 1_18  | 19397748   |
| 1_21  | 19372179   |
| 1_22  | 24339845   |
| 1_19  | 24254278   |
| 1_20  | 24477920   |
| 1_16  | 24285361   |
| 1_17  | 24552607   |
| 1_8   | 25169766   |
| 1_12  | 25424111   |
| 1_6   | 25430604   |
| 1_1   | 25485077   |
| 1_5   | 25704889   |
| 1_11  | 28519960   |
| 1_3   | 30363826   |
| 1_2   | 30318160   |
| 1_0   | 30572103   |
| 1_13  | 30425261   |
| 1_9   | 30364369   |
| 1_7   | 30603123   |
| 1_10  | 30379358   |
| 1_4   | 30587344   |
| 1_14  | 30988069   |
+---++
23 rows selected (1888.63 seconds)
0: jdbc:drill:schema=dfs> select count(*) from lineitem_x;
++
|   EXPR$0   |
++
| 600037902  |
++
1 row selected (231.722 seconds)
0: jdbc:drill:schema=dfs> select count(*) from lineitem;
++
|   EXPR$0   |
++
| 600037902  |
++
1 row selected (1.245 seconds)
{code}

> CTAS Memory Leak : IllegalStateException
> 
>
> Key: DRILL-3691
> URL: https://issues.apache.org/jira/browse/DRILL-3691
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Deneche A. Hakim
> Attachments: error.log
>
>
> git.commit.id.abbrev=55dfd0e
> The below CTAS statement fails with a memory leak. The query runs on top of 
> Tpch SF100 data.
> {code}
> create table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Failure while 
> closing accountor.  Expected private and shared pools to be set to initial 
> values.  However, one or more were not.  Stats are
> zoneinitallocated   delta 
> private 100 100 0 
> shared  00  9998410176  589824.
> Fragment 1:19
> [Error Id: ba8fedf2-be40-4488-af2e-b6034527c943 on qa-node191.qa.lab:31010]
> Aborting command set because "force" is false and command failed: "create 
> table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;"
> {code}
> I attached the log file. I am not uploading the data as it is too large



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3736) Documentation for partition is misleading/wrong syntax

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728343#comment-14728343
 ] 

ASF GitHub Bot commented on DRILL-3736:
---

Github user kristinehahn commented on the pull request:

https://github.com/apache/drill/pull/142#issuecomment-137291323
  
Thanks, I also fixed this in my repo 13 days ago 
https://github.com/kristinehahn/drill/tree/gh-pages/_docs/sql-reference/sql-commands,
 but haven't been able to get it committed. 


> Documentation for partition is misleading/wrong syntax
> --
>
> Key: DRILL-3736
> URL: https://issues.apache.org/jira/browse/DRILL-3736
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
> Environment: Any
>Reporter: Edmon Begoli
>Assignee: Bridget Bevens
>Priority: Minor
>  Labels: documentation
> Fix For: 1.2.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Examples on the web site use appropriate syntax (PARTITION BY)
> but the syntax definition uses PARTITION_BY.
> https://drill.apache.org/docs/partition-by-clause/
> It should all be PARTITION BY.





[jira] [Updated] (DRILL-3257) TPCDS query 74 results in a StackOverflowError on Scale Factor 1

2015-09-02 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-3257:
---
Assignee: Sean Hsuan-Yi Chu  (was: Sudheesh Katkam)

> TPCDS query 74 results in a StackOverflowError on Scale Factor 1
> --
>
> Key: DRILL-3257
> URL: https://issues.apache.org/jira/browse/DRILL-3257
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=5f26b8b
> Query :
> {code}
> WITH year_total 
>  AS (SELECT c_customer_id    customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ss_net_paid) year_total, 
> 's'  sale_type 
>  FROM   customer, 
> store_sales, 
> date_dim 
>  WHERE  c_customer_sk = ss_customer_sk 
> AND ss_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year 
>  UNION ALL 
>  SELECT c_customer_id    customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ws_net_paid) year_total, 
> 'w'  sale_type 
>  FROM   customer, 
> web_sales, 
> date_dim 
>  WHERE  c_customer_sk = ws_bill_customer_sk 
> AND ws_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year) 
> SELECT t_s_secyear.customer_id, 
>t_s_secyear.customer_first_name, 
>t_s_secyear.customer_last_name 
> FROM   year_total t_s_firstyear, 
>year_total t_s_secyear, 
>year_total t_w_firstyear, 
>year_total t_w_secyear 
> WHERE  t_s_secyear.customer_id = t_s_firstyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_secyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_firstyear.customer_id 
>AND t_s_firstyear.sale_type = 's' 
>AND t_w_firstyear.sale_type = 'w' 
>AND t_s_secyear.sale_type = 's' 
>AND t_w_secyear.sale_type = 'w' 
>AND t_s_firstyear.year1 = 1999 
>AND t_s_secyear.year1 = 1999 + 1 
>AND t_w_firstyear.year1 = 1999 
>AND t_w_secyear.year1 = 1999 + 1 
>AND t_s_firstyear.year_total > 0 
>AND t_w_firstyear.year_total > 0 
>AND CASE 
>  WHEN t_w_firstyear.year_total > 0 THEN t_w_secyear.year_total / 
> t_w_firstyear.year_total 
>  ELSE NULL 
>END > CASE 
>WHEN t_s_firstyear.year_total > 0 THEN 
>t_s_secyear.year_total / 
>t_s_firstyear.year_total 
>ELSE NULL 
>  END 
> ORDER  BY 1, 
>   2, 
>   3
> LIMIT 100;
> {code}
> The above query never returns. I attached the log file.
> Since the data is 1GB I cannot attach it here. Kindly reach out to me if you 
> want more information.





[jira] [Commented] (DRILL-3715) Enable selection vector sv2 and sv4 in hash aggregator

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727792#comment-14727792
 ] 

ASF GitHub Bot commented on DRILL-3715:
---

Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/136#issuecomment-137202183
  
I got that backwards; it looks like the existing test should be disabling 
hash aggregate and verifying that streaming agg is planned instead. To be very 
thorough you might want to include a plan check in your test, but hash-based 
operations are currently the default.
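The restore-the-default advice in this thread (the session is shared across tests) boils down to a save/flip/finally-restore pattern. A minimal self-contained sketch, assuming nothing about Drill's test harness: a plain map stands in for the shared session option store, and `planner.enable_hashagg` is assumed to be the relevant option name.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionOptionPattern {
    // Stand-in for the shared, mutable session option store.
    static final Map<String, Boolean> session = new HashMap<>();

    public static void main(String[] args) {
        session.put("planner.enable_hashagg", true); // assumed default

        boolean saved = session.get("planner.enable_hashagg");
        try {
            // Flip the option so the planner would pick streaming agg...
            session.put("planner.enable_hashagg", false);
            // ...run the query and check the plan here...
        } finally {
            // Restore even if the check throws -- the session is shared.
            session.put("planner.enable_hashagg", saved);
        }
        System.out.println("restored=" + session.get("planner.enable_hashagg"));
    }
}
```

The `finally` block is the point: a failing assertion must not leave the flipped option behind for the next test.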


> Enable selection vector sv2 and sv4 in hash aggregator
> --
>
> Key: DRILL-3715
> URL: https://issues.apache.org/jira/browse/DRILL-3715
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: amit hadke
>Assignee: amit hadke
>Priority: Minor
>
> HashAggregator already can read sv2 and sv4 vectors. Enable support for all 
> of them.





[jira] [Commented] (DRILL-3734) CTAS with partition by clause causes NPE

2015-09-02 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727980#comment-14727980
 ] 

Victoria Markman commented on DRILL-3734:
-

Query without CTAS successfully completes.


{code}
...
...
+---------+---------+----------+--------+----------+--------+
| EXPR$0  | EXPR$1  |  EXPR$2  | EXPR$3 |  EXPR$4  | EXPR$5 |
+---------+---------+----------+--------+----------+--------+
| 102926  | 822725  | 1257431  | 4937   | 2451891  | 765    |
| 60911   | 822725  | 1257431  | 4937   | 2451891  | 463    |
| 194089  | 822725  | 1257431  | 4937   | 2451891  | 861    |
| 92047   | 822725  | 1257431  | 4937   | 2451891  | 323    |
| 82381   | null    | null     | 4937   | null     | 723    |
| 52087   | 822725  | 1257431  | 4937   | 2451891  | 95     |
| 40010   | 822725  | 1257431  | 4937   | 2451891  | 773    |
| 150413  | 822725  | 1257431  | 4937   | 2451891  | 29     |
| 186139  | 822725  | 1257431  | 4937   | 2451891  | 15     |
| 168037  | 822725  | 1257431  | 4937   | 2451891  | 312    |
| 17318   | 822725  | 1257431  | 4937   | 2451891  | 693    |
| 10847   | 822725  | 1257431  | 4937   | 2451891  | 99     |
| 167083  | 822725  | 1257431  | 4937   | 2451891  | 470    |
| 99985   | 334569  | 1450664  | 1159   | 2451107  | 657    |
| 108658  | 334569  | 1450664  | 1159   | 2451107  | 617    |
| 186997  | 334569  | 1450664  | 1159   | 2451107  | 481    |
| 18964   | 334569  | 1450664  | 1159   | 2451107  | 319    |
| 124741  | 334569  | 1450664  | 1159   | 2451107  | 443    |
| 8852    | null    | 1450664  | null   | 2451107  | null   |
| 202244  | 334569  | 1450664  | 1159   | 2451107  | 341    |
| 148988  | 334569  | 1450664  | 1159   | 2451107  | 868    |
| 84748   | 334569  | 1450664  | 1159   | 2451107  | 613    |
| 99790   | 334569  | 1450664  | 1159   | 2451107  | 244    |
| 4801    | 482814  | 1447684  | 554    | 2452624  | 79     |
| 147258  | 482814  | 1447684  | 554    | 2452624  | 687    |
+---------+---------+----------+--------+----------+--------+
287,997,024 rows selected (6313.766 seconds)
{code}


> CTAS with partition by clause causes NPE 
> -
>
> Key: DRILL-3734
> URL: https://issues.apache.org/jira/browse/DRILL-3734
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Chris Westin
>
> To reproduce:
> * Single node drillbit 
> * Enough  disk space for sort spilling (partition by clause)
> * TPCDS SF 100 text file (store_sales.dat)
> * I had alter session set `store.parquet.block-size`  = 204800, reproduced 
> this case with store.parquet.block-size = 134217728, so I don't think it 
> matters. 
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_4(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . > select
> . . . . . . . . . . . . > case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . > case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . > case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . > case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . > case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . > case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . > from
> . . . . . . . . . . . . >  `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:20
> [Error Id: 12bb6de2-5f57-4009-9556-8803c767abb3 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> drillbit.log (full log is attached)
> {code}
> 2015-09-01 23:02:19,662 [2a19d429-9c75-c95c-32bc-c00f78f80ca7:frag:1:20] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException
> Fragment 1:20
> [Error Id: 12bb6de2-5f57-4009-9556-8803c767abb3 on atsqa4-133.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> Fragment 1:20
> [Error Id: 12bb6de2-5f57-4009-9556-8803c767abb3 on atsqa4-133.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323)
>  

[jira] [Created] (DRILL-3736) Documentation for partition is misleading/wrong syntax

2015-09-02 Thread Edmon Begoli (JIRA)
Edmon Begoli created DRILL-3736:
---

 Summary: Documentation for partition is misleading/wrong syntax
 Key: DRILL-3736
 URL: https://issues.apache.org/jira/browse/DRILL-3736
 Project: Apache Drill
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.1.0
 Environment: Any
Reporter: Edmon Begoli
Assignee: Bridget Bevens
Priority: Minor
 Fix For: 1.2.0


Examples on the web site use appropriate syntax (PARTITION BY)
but the syntax definition uses PARTITION_BY.

https://drill.apache.org/docs/partition-by-clause/

It should all be PARTITION BY.





[jira] [Commented] (DRILL-3736) Documentation for partition is misleading/wrong syntax

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728286#comment-14728286
 ] 

ASF GitHub Bot commented on DRILL-3736:
---

GitHub user ebegoli opened a pull request:

https://github.com/apache/drill/pull/142

Issue DRILL-3736 - fixed syntax of partition by.

It was incorrectly listed as PARTITION_BY. Changed to PARTITION BY.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ebegoli/drill gh-pages

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/142.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #142


commit cc010fd64e3911e39b91f2e7f42f0d6aaf2e336b
Author: Edmon Begoli 
Date:   2015-09-03T00:25:26Z

Issue DRILL-3736 - fixed syntax of partition by. 

It was incorrectly listed as PARTITION_BY. Changed to PARTITION BY.




> Documentation for partition is misleading/wrong syntax
> --
>
> Key: DRILL-3736
> URL: https://issues.apache.org/jira/browse/DRILL-3736
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
> Environment: Any
>Reporter: Edmon Begoli
>Assignee: Bridget Bevens
>Priority: Minor
>  Labels: documentation
> Fix For: 1.2.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Examples on the web site use appropriate syntax (PARTITION BY)
> but the syntax definition uses PARTITION_BY.
> https://drill.apache.org/docs/partition-by-clause/
> It should all be PARTITION BY.





[jira] [Resolved] (DRILL-3691) CTAS Memory Leak : IllegalStateException

2015-09-02 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim resolved DRILL-3691.
-
Resolution: Fixed

Fixed by DRILL-3684

> CTAS Memory Leak : IllegalStateException
> --
>
> Key: DRILL-3691
> URL: https://issues.apache.org/jira/browse/DRILL-3691
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Deneche A. Hakim
> Attachments: error.log
>
>
> git.commit.id.abbrev=55dfd0e
> The below CTAS statement fails with a memory leak. The query runs on top of 
> Tpch SF100 data.
> {code}
> create table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Failure while 
> closing accountor.  Expected private and shared pools to be set to initial 
> values.  However, one or more were not.  Stats are
> zone    init    allocated   delta 
> private 100 100 0 
> shared  00  9998410176  589824.
> Fragment 1:19
> [Error Id: ba8fedf2-be40-4488-af2e-b6034527c943 on qa-node191.qa.lab:31010]
> Aborting command set because "force" is false and command failed: "create 
> table lineitem as select * from dfs.`/drill/testdata/tpch100/lineitem`;"
> {code}
> I attached the log file. I am not uploading the data as it is too large





[jira] [Resolved] (DRILL-3492) Add support for encoding of Drill data types into byte ordered format

2015-09-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-3492.
---
Resolution: Fixed

Fixed in commit: 95623912ebf348962fe8a8846c5f47c5fdcf2f78

> Add support for encoding of Drill data types into byte ordered format
> -
>
> Key: DRILL-3492
> URL: https://issues.apache.org/jira/browse/DRILL-3492
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Smidth Panchamia
>Assignee: Smidth Panchamia
> Fix For: 1.2.0
>
> Attachments: 
> 0001-DRILL-3492-Add-support-for-encoding-decoding-of-to-f.patch, 
> 0001-DRILL-3492-Add-support-for-encoding-decoding-of-to-f.patch, 
> 0001-DRILL-3492-merged.patch
>
>
> The following JIRA added this functionality in HBase: 
> https://issues.apache.org/jira/browse/HBASE-8201
> We need to port this functionality in Drill so as to allow filtering and 
> pruning of rows during scans.





[jira] [Commented] (DRILL-3257) TPCDS query 74 results in a StackOverflowError on Scale Factor 1

2015-09-02 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728380#comment-14728380
 ] 

Aman Sinha commented on DRILL-3257:
---

[~seanhychu] I think it would be worthwhile trying to reproduce this issue on 
Calcite.  Note that this query has a single UNION ALL in the WITH clause, but 
the WITH clause is referenced 4 times in the FROM clause, so in effect there 
are 4 UNION ALLs.  Each side of the union all does an aggregation with 
group-by.  If you need help with the repro, let me know.   

> TPCDS query 74 results in a StackOverflowError on Scale Factor 1
> --
>
> Key: DRILL-3257
> URL: https://issues.apache.org/jira/browse/DRILL-3257
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=5f26b8b
> Query :
> {code}
> WITH year_total 
>  AS (SELECT c_customer_id    customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ss_net_paid) year_total, 
> 's'  sale_type 
>  FROM   customer, 
> store_sales, 
> date_dim 
>  WHERE  c_customer_sk = ss_customer_sk 
> AND ss_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year 
>  UNION ALL 
>  SELECT c_customer_id    customer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ws_net_paid) year_total, 
> 'w'  sale_type 
>  FROM   customer, 
> web_sales, 
> date_dim 
>  WHERE  c_customer_sk = ws_bill_customer_sk 
> AND ws_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year) 
> SELECT t_s_secyear.customer_id, 
>t_s_secyear.customer_first_name, 
>t_s_secyear.customer_last_name 
> FROM   year_total t_s_firstyear, 
>year_total t_s_secyear, 
>year_total t_w_firstyear, 
>year_total t_w_secyear 
> WHERE  t_s_secyear.customer_id = t_s_firstyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_secyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_firstyear.customer_id 
>AND t_s_firstyear.sale_type = 's' 
>AND t_w_firstyear.sale_type = 'w' 
>AND t_s_secyear.sale_type = 's' 
>AND t_w_secyear.sale_type = 'w' 
>AND t_s_firstyear.year1 = 1999 
>AND t_s_secyear.year1 = 1999 + 1 
>AND t_w_firstyear.year1 = 1999 
>AND t_w_secyear.year1 = 1999 + 1 
>AND t_s_firstyear.year_total > 0 
>AND t_w_firstyear.year_total > 0 
>AND CASE 
>  WHEN t_w_firstyear.year_total > 0 THEN t_w_secyear.year_total / 
> t_w_firstyear.year_total 
>  ELSE NULL 
>END > CASE 
>WHEN t_s_firstyear.year_total > 0 THEN 
>t_s_secyear.year_total / 
>t_s_firstyear.year_total 
>ELSE NULL 
>  END 
> ORDER  BY 1, 
>   2, 
>   3
> LIMIT 100;
> {code}
> The above query never returns. I attached the log file.
> Since the data is 1GB I cannot attach it here. Kindly reach out to me if you 
> want more information.





[jira] [Updated] (DRILL-3729) Remove info message about invalid argument during query planning where filter is a directory

2015-09-02 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3729:

Assignee: Mehant Baid

> Remove info message about invalid argument during query planning where filter 
> is a directory
> 
>
> Key: DRILL-3729
> URL: https://issues.apache.org/jira/browse/DRILL-3729
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Victoria Markman
>Assignee: Mehant Baid
>
> I hit this INFO/exception during query planning. 
> It sounds like that we are treating directory (2015) as if it was a parquet 
> file.  It would be nice to get rid of an exception in the log file - it is 
> misleading.
> {code}
> explain plan for select l_orderkey from dfs.`/drill/testdata/adt/lineitem` l1 
>  where l1.dir0 = '2015'
> {code}
> {code}
> 2015-08-31 23:36:15,714 [2a1b1b0f-a703-2f9b-8697-8997d5ca00b1:foreman] INFO  
> o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for 
> Parquet metadata file.
> java.io.IOException: Open failed for file: /drill/testdata/adt/lineitem/2015, 
> error: Invalid argument (22)
> at com.mapr.fs.MapRClientImpl.open(MapRClientImpl.java:212) 
> ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr]
> at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:853) 
> ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr]
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:800) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:128)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:139)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:108)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:226)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:206)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:291)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:118)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:241)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.dfs.FileSystemSchemaFactory$FileSystemSchema.getTable(FileSystemSchemaFactory.java:117)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:83)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:116)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:99)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:70)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
>  [calcite-core-1.4.0-drill-r0.jar:1.4.0-drill-r0]
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:876)
>  
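The stack trace above shows the magic-string matcher calling `DrillFileSystem.open` on the directory `2015`, which is what raises the logged `Invalid argument` error. A hedged, self-contained sketch of the kind of guard the report asks for — not Drill's actual code; plain `java.io` stands in for the filesystem API, and `PAR1` is the Parquet file magic.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

public class MagicStringSketch {
    static final byte[] PARQUET_MAGIC = {'P', 'A', 'R', '1'};

    static boolean matches(File path) throws IOException {
        if (path.isDirectory()) {
            // Don't try to open directories for magic bytes -- doing so is
            // what produces the misleading "Open failed ... Invalid argument"
            // INFO entry during planning.
            return false;
        }
        try (FileInputStream in = new FileInputStream(path)) {
            byte[] head = new byte[PARQUET_MAGIC.length];
            int n = in.read(head);
            return n == PARQUET_MAGIC.length && Arrays.equals(head, PARQUET_MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("2015").toFile();
        System.out.println("dir matches: " + matches(dir));   // false, no exception
        File f = File.createTempFile("data", ".parquet");
        Files.write(f.toPath(), "PAR1...".getBytes());
        System.out.println("file matches: " + matches(f));    // true
    }
}
```

With the directory check in place, partition directories like `2015` are skipped silently instead of being logged as open failures.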

[jira] [Commented] (DRILL-3736) Documentation for partition is misleading/wrong syntax

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728346#comment-14728346
 ] 

ASF GitHub Bot commented on DRILL-3736:
---

Github user ebegoli commented on the pull request:

https://github.com/apache/drill/pull/142#issuecomment-137291949
  
I have a pull request in place, so just merge it :-)

On Wednesday, September 2, 2015, Kristine Hahn wrote:

> Thanks, I also fixed this in my repo 13 days ago
> 
https://github.com/kristinehahn/drill/tree/gh-pages/_docs/sql-reference/sql-commands,
> but haven't been able to get it committed.
>
> —
> Reply to this email directly or view it on GitHub.
>



> Documentation for partition is misleading/wrong syntax
> --
>
> Key: DRILL-3736
> URL: https://issues.apache.org/jira/browse/DRILL-3736
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.0
> Environment: Any
>Reporter: Edmon Begoli
>Assignee: Bridget Bevens
>Priority: Minor
>  Labels: documentation
> Fix For: 1.2.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Examples on the web site use appropriate syntax (PARTITION BY)
> but the syntax definition uses PARTITION_BY.
> https://drill.apache.org/docs/partition-by-clause/
> It should all be PARTITION BY.





[jira] [Commented] (DRILL-3535) Drop table support

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728397#comment-14728397
 ] 

ASF GitHub Bot commented on DRILL-3535:
---

Github user mehant commented on the pull request:

https://github.com/apache/drill/pull/140#issuecomment-137302059
  
@jacques-n uploaded patch with your review comments. Please take a look.


> Drop table support
> --
>
> Key: DRILL-3535
> URL: https://issues.apache.org/jira/browse/DRILL-3535
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>
> Umbrella JIRA to track support for "Drop table" feature.





[jira] [Commented] (DRILL-3535) Drop table support

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728396#comment-14728396
 ] 

ASF GitHub Bot commented on DRILL-3535:
---

Github user mehant commented on a diff in the pull request:

https://github.com/apache/drill/pull/140#discussion_r38607640
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/BasicFormatMatcher.java
 ---
@@ -72,7 +72,7 @@ public boolean supportDirectoryReads() {
 
   @Override
   public FormatSelection isReadable(DrillFileSystem fs, FileSelection 
selection) throws IOException {
-if (isReadable(fs, selection.getFirstPath(fs))) {
+if (isFileReadable(fs, selection.getFirstPath(fs))) {
--- End diff --

You are right, ParquetFormatMatcher overrides isReadable to first check if 
the directory contains the metadata file, if not it delegates to the base class 
method to check if a particular file is readable or not. 

The existing naming is a bit confusing. In BasicFormatMatcher there are two 
methods with the same name:

1. FormatSelection isReadable(DrillFileSystem fs, FileSelection selection) 
-> operates on directory level
2. boolean isReadable(DrillFileSystem fs, FileStatus status) -> operates on 
a single file level.

I've renamed the second to be called isFileReadable and exposed it in the 
abstract class so drop can invoke it on every file. 
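The renaming described above can be illustrated with a toy matcher (hypothetical classes, not Drill's): the selection-level `isReadable` delegates to the file-level check, now unambiguously named `isFileReadable`, and a Parquet-like subclass overrides only the selection-level method to check for a metadata file first.

```java
public class FormatMatcherSketch {
    static class Base {
        // Selection (directory) level -- formerly one of two isReadable overloads.
        boolean isReadable(String[] selection) {
            return isFileReadable(selection[0]);
        }
        // File level -- renamed from isReadable to isFileReadable so callers
        // (e.g. a drop-table path checking every file) can invoke it directly.
        boolean isFileReadable(String file) {
            return file.endsWith(".parquet");
        }
    }

    static class ParquetLike extends Base {
        @Override
        boolean isReadable(String[] selection) {
            // First check for a directory-level metadata file...
            for (String f : selection) {
                if (f.endsWith(".drill.parquet_metadata")) {
                    return true;
                }
            }
            // ...else delegate to the base class per-file check.
            return super.isReadable(selection);
        }
    }

    public static void main(String[] args) {
        Base m = new ParquetLike();
        System.out.println(m.isReadable(new String[]{"part-0.parquet"})); // true
    }
}
```

Distinct names make the two granularities explicit, so an override of the selection-level method can no longer be confused with the per-file check it delegates to.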





> Drop table support
> --
>
> Key: DRILL-3535
> URL: https://issues.apache.org/jira/browse/DRILL-3535
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Mehant Baid
>Assignee: Mehant Baid
>
> Umbrella JIRA to track support for "Drop table" feature.





[jira] [Commented] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727829#comment-14727829
 ] 

ASF GitHub Bot commented on DRILL-3661:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/119


> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>






[jira] [Resolved] (DRILL-3661) Add/edit various JDBC Javadoc.

2015-09-02 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3661.
---
Resolution: Fixed

> Add/edit various JDBC Javadoc.
> --
>
> Key: DRILL-3661
> URL: https://issues.apache.org/jira/browse/DRILL-3661
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>






[jira] [Commented] (DRILL-1942) Improve off-heap memory usage tracking

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727747#comment-14727747
 ] 

ASF GitHub Bot commented on DRILL-1942:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/105#discussion_r38564284
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java
 ---
@@ -0,0 +1,199 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import java.io.IOException;
+import java.util.Random;
+import java.util.Set;
+import java.util.concurrent.Semaphore;
+
+import org.apache.drill.QueryTestUtil;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.util.TestTools;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState;
+import org.apache.drill.exec.rpc.user.UserResultsListener;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TestRule;
+
+import com.google.common.collect.Sets;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+/*
+ * Note that the real interest here is that the drillbit doesn't become
+ * unstable from running a lot of queries concurrently -- it's not about
+ * any particular order of execution. We ignore the results.
+ */
+public class TestTpchDistributedConcurrent extends BaseTestQuery {
+  /*
+   * Longer timeout than usual.
+   *
+   * If the test does fail due to a timeout, see the comment in
+   * ChainingResultListener.queryCompleted() before assuming this
+   * needs to be adjusted.
+   */
+  @Rule public final TestRule TIMEOUT = TestTools.getTimeoutRule(12);
+
+  /*
+   * Valid test names taken from TestTpchDistributed. Fuller path prefixes are
+   * used so that tests may also be taken from other locations -- more variety
+   * is better as far as this test goes.
+   */
+  private final static String queryFile[] = {
+"queries/tpch/01.sql",
+"queries/tpch/03.sql",
+"queries/tpch/04.sql",
+"queries/tpch/05.sql",
+"queries/tpch/06.sql",
+"queries/tpch/07.sql",
+"queries/tpch/08.sql",
+"queries/tpch/09.sql",
+"queries/tpch/10.sql",
+"queries/tpch/11.sql",
+"queries/tpch/12.sql",
+"queries/tpch/13.sql",
+"queries/tpch/14.sql",
+// "queries/tpch/15.sql", this creates a view
+"queries/tpch/16.sql",
+"queries/tpch/18.sql",
+"queries/tpch/19_1.sql",
+"queries/tpch/20.sql",
+  };
+
+  private final static int TOTAL_QUERIES = 115;
+  private final static int CONCURRENT_QUERIES = 15;
+
+  private final static Random random = new Random(0xdeadbeef); // Use the same seed each time.
+  private final static String alterSession = "alter session set `planner.slice_target` = 10";
+
+  private int remainingQueries = TOTAL_QUERIES - CONCURRENT_QUERIES;
+  private final Semaphore completionSemaphore = new Semaphore(0);
+  private final Semaphore submissionSemaphore = new Semaphore(0);
+  private final Set<UserResultsListener> listeners = Sets.newIdentityHashSet();
+
+  private void submitRandomQuery() {
+final String filename = queryFile[random.nextInt(queryFile.length)];
+final String query;
+try {
+      query = QueryTestUtil.normalizeQuery(getFile(filename)).replace(';', ' ');
+} catch(IOException e) {
+  throw new RuntimeException("Caught exception", e);
+}
+final UserResultsListener listener = new ChainingSilentListener(query);
+client.runQuery(UserBitShared.QueryType.SQL, query, listener);
+synchronized(listeners) {
+  listeners.add(listener);
+}
+  }
+
+  private class 
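The throttling pattern in the snippet above -- prime a fixed number of in-flight queries, then submit one more each time a completion releases a permit -- can be sketched in isolation with plain java.util.concurrent. This is a hypothetical stand-alone sketch (class and constant names are invented), not the Drill test itself:

```java
import java.util.concurrent.Semaphore;

public class BoundedSubmission {
  static final int TOTAL = 20;
  static final int CONCURRENT = 5;
  static final Semaphore completions = new Semaphore(0);

  public static void main(String[] args) throws InterruptedException {
    int remaining = TOTAL - CONCURRENT;
    // Prime the pipeline with the maximum number of in-flight "queries".
    for (int i = 0; i < CONCURRENT; i++) {
      submit();
    }
    // Each completion frees a slot; refill until all queries are submitted.
    for (int done = 0; done < TOTAL; done++) {
      completions.acquire();
      if (remaining > 0) {
        remaining--;
        submit();
      }
    }
    System.out.println("all " + TOTAL + " finished"); // prints: all 20 finished
  }

  // Stand-in for an asynchronous query submission; a real results listener
  // would release the permit from its completion callback.
  static void submit() {
    new Thread(completions::release).start();
  }
}
```

The semaphore starting at zero permits is what makes the main thread block until a completion arrives, which keeps the in-flight count constant without any polling.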

[jira] [Closed] (DRILL-3673) Memory leak in parquet writer on CTAS

2015-09-02 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman closed DRILL-3673.
---

> Memory leak in parquet writer on CTAS
> -
>
> Key: DRILL-3673
> URL: https://issues.apache.org/jira/browse/DRILL-3673
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 1_rows.dat, ctas.sh
>
>
> First CTAS executes successfully, second runs out of memory.
> If I change storage.format to 'csv' this problem goes away.
> {code}
> 0: jdbc:drill:schema=dfs> create table lineitem as select
> . . . . . . . . . . . . > cast(columns[0] as int) l_orderkey,
> . . . . . . . . . . . . > cast(columns[1] as int) l_partkey,
> . . . . . . . . . . . . > cast(columns[2] as int) l_suppkey,
> . . . . . . . . . . . . > cast(columns[3] as int) l_linenumber,
> . . . . . . . . . . . . > cast(columns[4] as double) l_quantity,
> . . . . . . . . . . . . > cast(columns[5] as double) l_extendedprice,
> . . . . . . . . . . . . > cast(columns[6] as double) l_discount,
> . . . . . . . . . . . . > cast(columns[7] as double) l_tax,
> . . . . . . . . . . . . > cast(columns[8] as varchar(200)) l_returnflag,
> . . . . . . . . . . . . > cast(columns[9] as varchar(200)) l_linestatus,
> . . . . . . . . . . . . > cast(columns[10] as date) l_shipdate,
> . . . . . . . . . . . . > cast(columns[11] as date) l_commitdate,
> . . . . . . . . . . . . > cast(columns[12] as date) l_receiptdate,
> . . . . . . . . . . . . > cast(columns[13] as varchar(200)) 
> l_shipinstruct,
> . . . . . . . . . . . . > cast(columns[14] as varchar(200)) l_shipmode,
> . . . . . . . . . . . . > cast(columns[15] as varchar(200)) l_comment
> . . . . . . . . . . . . > from `lineitem.dat`;
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 1_9   | 2084034|
> | 1_18  | 2083936|
> | 1_7   | 2083619|
> | 1_6   | 2083933|
> | 1_8   | 2084177|
> | 1_21  | 2084148|
> | 1_17  | 2084039|
> | 1_16  | 2083863|
> | 1_13  | 2083740|
> | 1_20  | 2083774|
> | 1_22  | 2083954|
> | 1_10  | 2083929|
> | 1_19  | 2083804|
> | 1_11  | 2084107|
> | 1_12  | 2083968|
> | 1_14  | 2084002|
> | 1_15  | 2083988|
> | 1_5   | 3633178|
> | 1_1   | 4184330|
> | 1_3   | 4184246|
> | 1_0   | 4192872|
> | 1_2   | 4184342|
> | 1_4   | 4180069|
> +---++
> 23 rows selected (89.147 seconds)
> 0: jdbc:drill:schema=dfs> select * from sys.memory;
> +++---+-+-+-+-+
> |  hostname  | user_port  | heap_current  |  heap_max   | 
> direct_current  | jvm_direct_current  | direct_max  |
> +++---+-+-+-+-+
> | atsqa4-133.qa.lab  | 31010  | 305725032 | 4294967296  | 9799113 
> | 5570050038  | 8589934592  |
> +++---+-+-+-+-+
> 1 row selected (0.225 seconds)
> *
> *** Delete line item file ***
> *
> 0: jdbc:drill:schema=dfs> create table lineitem as select
> . . . . . . . . . . . . > cast(columns[0] as int) l_orderkey,
> . . . . . . . . . . . . > cast(columns[1] as int) l_partkey,
> . . . . . . . . . . . . > cast(columns[2] as int) l_suppkey,
> . . . . . . . . . . . . > cast(columns[3] as int) l_linenumber,
> . . . . . . . . . . . . > cast(columns[4] as double) l_quantity,
> . . . . . . . . . . . . > cast(columns[5] as double) l_extendedprice,
> . . . . . . . . . . . . > cast(columns[6] as double) l_discount,
> . . . . . . . . . . . . > cast(columns[7] as double) l_tax,
> . . . . . . . . . . . . > cast(columns[8] as varchar(200)) l_returnflag,
> . . . . . . . . . . . . > cast(columns[9] as varchar(200)) l_linestatus,
> . . . . . . . . . . . . > cast(columns[10] 

[jira] [Comment Edited] (DRILL-3257) TPCDS query 74 results in a StackOverflowError on Scale Factor 1

2015-09-02 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728193#comment-14728193
 ] 

Sudheesh Katkam edited comment on DRILL-3257 at 9/2/15 11:18 PM:
-

This is a planning issue. We introduced pushing project past set operations in 
this 
[commit|https://github.com/apache/drill/commit/bca20655283d351d5f5c4090e9047419ff22c75e],
 and somehow this causes Calcite to loop infinitely \(?\) during plan cost 
estimation.

Other issues I noticed while debugging this issue, re Foreman node:
* Sometimes the Drillbit crashes.
* Sometimes the Drillbit hangs while shutting down, making it unresponsive.
* Other sqlline connections to the Drillbit cannot be made, with the Drillbit 
using a lot of CPU but essentially doing nothing.


was (Author: sudheeshkatkam):
This is a planning issue. We introduced pushing project past set operations in 
this 
[commit|https://github.com/apache/drill/commit/bca20655283d351d5f5c4090e9047419ff22c75e],
 and somehow this causes Calcite to loop infinitely (?) during plan cost 
estimation.

Other issues I noticed while debugging this issue, re Foreman node:
* Sometimes the Drillbit crashes.
* Sometimes the Drillbit hangs while shutting down, making it unresponsive.
* Other sqlline connections to the Drillbit cannot be made, with the Drillbit 
using a lot of CPU but essentially doing nothing.

> TPCDS query 74 results in a StackOverflowError on Scale Factor 1
> 
>
> Key: DRILL-3257
> URL: https://issues.apache.org/jira/browse/DRILL-3257
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=5f26b8b
> Query :
> {code}
> WITH year_total 
>  AS (SELECT c_customer_idcustomer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ss_net_paid) year_total, 
> 's'  sale_type 
>  FROM   customer, 
> store_sales, 
> date_dim 
>  WHERE  c_customer_sk = ss_customer_sk 
> AND ss_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year 
>  UNION ALL 
>  SELECT c_customer_idcustomer_id, 
> c_first_name customer_first_name, 
> c_last_name  customer_last_name, 
> d_year   AS year1, 
> Sum(ws_net_paid) year_total, 
> 'w'  sale_type 
>  FROM   customer, 
> web_sales, 
> date_dim 
>  WHERE  c_customer_sk = ws_bill_customer_sk 
> AND ws_sold_date_sk = d_date_sk 
> AND d_year IN ( 1999, 1999 + 1 ) 
>  GROUP  BY c_customer_id, 
>c_first_name, 
>c_last_name, 
>d_year) 
> SELECT t_s_secyear.customer_id, 
>t_s_secyear.customer_first_name, 
>t_s_secyear.customer_last_name 
> FROM   year_total t_s_firstyear, 
>year_total t_s_secyear, 
>year_total t_w_firstyear, 
>year_total t_w_secyear 
> WHERE  t_s_secyear.customer_id = t_s_firstyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_secyear.customer_id 
>AND t_s_firstyear.customer_id = t_w_firstyear.customer_id 
>AND t_s_firstyear.sale_type = 's' 
>AND t_w_firstyear.sale_type = 'w' 
>AND t_s_secyear.sale_type = 's' 
>AND t_w_secyear.sale_type = 'w' 
>AND t_s_firstyear.year1 = 1999 
>AND t_s_secyear.year1 = 1999 + 1 
>AND t_w_firstyear.year1 = 1999 
>AND t_w_secyear.year1 = 1999 + 1 
>AND t_s_firstyear.year_total > 0 
>AND t_w_firstyear.year_total > 0 
>AND CASE 
>  WHEN t_w_firstyear.year_total > 0 THEN t_w_secyear.year_total / 
> t_w_firstyear.year_total 
>  ELSE NULL 
>END > CASE 
>WHEN t_s_firstyear.year_total > 0 THEN 
>t_s_secyear.year_total / 
>t_s_firstyear.year_total 
>ELSE NULL 
>  END 
> ORDER  BY 1, 
>   2, 
>   3
> LIMIT 100;
> {code}
> The above query never returns. I attached the log file.
> Since the data is 1GB I cannot attach it here. Kindly reach out to me if you 
> want more information.




[jira] [Commented] (DRILL-3497) Throw UserException#validationError for errors when modifying options

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728044#comment-14728044
 ] 

ASF GitHub Bot commented on DRILL-3497:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/98#discussion_r38586355
  
--- Diff: 
common/src/main/java/org/apache/drill/common/map/CaseInsensitiveMap.java ---
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.common.map;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * A special type of {@link Map} with {@link String}s as keys, and the case of a key is ignored for operations involving
+ * keys like {@link #put}, {@link #get}, etc. The keys are stored and retrieved in lower case. Use the static methods to
+ * create instances of this class (e.g. {@link #newConcurrentMap}).
+ *
+ * @param <VALUE> the type of values to be stored in the map
+ */
+public class CaseInsensitiveMap<VALUE> implements Map<String, VALUE> {
+
+  /**
+   * Returns a new instance of {@link java.util.concurrent.ConcurrentMap} with key case-insensitivity. See
+   * {@link java.util.concurrent.ConcurrentMap}.
+   *
+   * @param <VALUE> type of values to be stored in the map
+   * @return key case-insensitive concurrent map
+   */
+  public static <VALUE> CaseInsensitiveMap<VALUE> newConcurrentMap() {
+    return new CaseInsensitiveMap<>(Maps.newConcurrentMap());
+  }
+
+  /**
+   * Returns a new instance of {@link java.util.HashMap} with key case-insensitivity. See {@link java.util.HashMap}.
+   *
+   * @param <VALUE> type of values to be stored in the map
+   * @return key case-insensitive hash map
+   */
+  public static <VALUE> CaseInsensitiveMap<VALUE> newHashMap() {
+    return new CaseInsensitiveMap<>(Maps.newHashMap());
+  }
+
+  /**
+   * Returns a new instance of {@link ImmutableMap} with key case-insensitivity. This map is built from the given
+   * map. See {@link ImmutableMap}.
+   *
+   * @param map map to copy from
+   * @param <VALUE> type of values to be stored in the map
+   * @return key case-insensitive immutable map
+   */
+  public static <VALUE> CaseInsensitiveMap<VALUE> newImmutableMap(final Map<String, VALUE> map) {
+    final ImmutableMap.Builder<String, VALUE> builder = ImmutableMap.builder();
+    for (final Map.Entry<String, VALUE> entry : map.entrySet()) {
+      builder.put(entry.getKey().toLowerCase(), entry.getValue());
+    }
+    return new CaseInsensitiveMap<>(builder.build());
+  }
+
+  private final Map<String, VALUE> underlyingMap;
+
+  protected CaseInsensitiveMap(final Map<String, VALUE> underlyingMap) {
+    this.underlyingMap = underlyingMap;
--- End diff --

Would it be worth enforcing that the map being wrapped is empty? As the 
newImmutableMap() method makes a copy of the map, properly importing the keys, 
someone may get the wrong impression that this constructor would remove and 
re-insert the values currently in the map so that they remain accessible. 
Right now it is possible to wrap a map whose key names contain uppercase 
characters, and those entries would not be retrievable through this interface.
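The pitfall described above can be reproduced with a minimal stand-in. This is a hypothetical simplified wrapper using only java.util (class and method names are invented; it mimics only the lower-casing lookup behavior, not the actual Drill class):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified wrapper: lookups lower-case the key, but entries
// that were already in the wrapped map keep their original (mixed) case and
// therefore become unreachable through this interface.
class LowercasingWrapper {
  private final Map<String, String> underlying;

  // No defensive re-keying: pre-existing entries are stored as-is.
  LowercasingWrapper(Map<String, String> underlying) {
    this.underlying = underlying;
  }

  void put(String key, String value) {
    underlying.put(key.toLowerCase(), value);
  }

  String get(String key) {
    return underlying.get(key.toLowerCase());
  }

  public static void main(String[] args) {
    Map<String, String> prePopulated = new HashMap<>();
    prePopulated.put("Foo", "bar");           // inserted before wrapping
    LowercasingWrapper map = new LowercasingWrapper(prePopulated);
    System.out.println(map.get("Foo"));       // prints: null -- entry is unreachable
    map.put("Foo", "bar");                    // re-inserted through the wrapper
    System.out.println(map.get("FOO"));       // prints: bar
  }
}
```

Requiring the wrapped map to be empty (or copying it, as the immutable factory does) would make this failure mode impossible by construction.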


> Throw UserException#validationError for errors when modifying options
> -
>
> Key: DRILL-3497
> URL: https://issues.apache.org/jira/browse/DRILL-3497
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Sudheesh Katkam
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3598) Use a factory to create the root allocator

2015-09-02 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728036#comment-14728036
 ] 

Jason Altekruse commented on DRILL-3598:


I forgot to make my comment on the PR again, and comments from a commit within 
the PR are not propagating here.

+1 on the patch

> Use a factory to create the root allocator
> --
>
> Key: DRILL-3598
> URL: https://issues.apache.org/jira/browse/DRILL-3598
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 1.1.0
>Reporter: Chris Westin
>Assignee: Jason Altekruse
>
> Use a factory instead of the constructor for the top-level direct memory 
> allocator so that we can replace which allocator gets instantiated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1805) view not found if view file directory contains child with colon in simple name

2015-09-02 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1805:
--
Assignee: (was: Daniel Barclay (Drill))

> view not found if view file directory contains child with colon in simple name
> --
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method testDRILL_811View in Drill's 
> ./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java fails 
> if /tmp contains a file named like "aptitude-root.1528:JIsVaZ".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> (org.apache.hadoop.fs.Globber's glob() and org.apache.hadoop.fs.Path's 
> Path(Path,String)) mixes up relative file pathname strings and relative 
> URI-style Path strings.
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3673) Memory leak in parquet writer on CTAS

2015-09-02 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728169#comment-14728169
 ] 

Victoria Markman commented on DRILL-3673:
-

I can't reproduce this bug with the latest rpm. I had 10 successful iterations 
and the memory leak is gone. I suspect that the fix for DRILL-3684 fixed this 
issue as well.

Verification is done with this revision of drill:
#Tue Sep 01 18:18:19 UTC 2015
git.commit.id.abbrev=b0e9e7c
git.commit.user.email=adene...@gmail.com
git.commit.message.full=DRILL-3684\: CTAS \: Memory Leak when using CTAS with 
tpch sf100\n\nThis closes \#141\n


> Memory leak in parquet writer on CTAS
> -
>
> Key: DRILL-3673
> URL: https://issues.apache.org/jira/browse/DRILL-3673
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 1_rows.dat, ctas.sh
>
>

[jira] [Commented] (DRILL-3497) Throw UserException#validationError for errors when modifying options

2015-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14728165#comment-14728165
 ] 

ASF GitHub Bot commented on DRILL-3497:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/98#discussion_r38594193
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/FallbackOptionManager.java
 ---
@@ -35,61 +42,65 @@ public FallbackOptionManager(OptionManager fallback) {
 
   @Override
  public Iterator<OptionValue> iterator() {
-return Iterables.concat(fallback, optionIterable()).iterator();
+return Iterables.concat(fallback, getLocalOptions()).iterator();
   }
 
   @Override
-  public OptionValue getOption(String name) {
-final OptionValue opt = getLocalOption(name);
-if(opt == null && fallback != null){
+  public OptionValue getOption(final String name) {
+final OptionValue value = getLocalOption(name);
+if (value == null && fallback != null) {
--- End diff --

This null check on the fallback manager seems odd to me. If something 
doesn't have a fallback, why would they create one of these? Should a non-null 
fallback be enforced in the constructor?
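Enforcing a non-null fallback in the constructor, as suggested above, can be sketched with java.util.Objects. This is a hypothetical simplified sketch (stub class names are invented, not the real Drill API):

```java
import java.util.Objects;

// Stand-in for the fallback option manager being delegated to.
class OptionManagerStub {
  String getOption(String name) {
    return "default-" + name;
  }
}

// Fail fast at construction time so getOption() needs no per-call null check.
class FallbackOptionManagerSketch {
  private final OptionManagerStub fallback;

  FallbackOptionManagerSketch(OptionManagerStub fallback) {
    // Reject a null fallback immediately instead of deferring the failure.
    this.fallback = Objects.requireNonNull(fallback, "fallback is required");
  }

  String getOption(String name) {
    // fallback is guaranteed non-null here.
    return fallback.getOption(name);
  }

  public static void main(String[] args) {
    FallbackOptionManagerSketch ok =
        new FallbackOptionManagerSketch(new OptionManagerStub());
    System.out.println(ok.getOption("slice_target")); // prints: default-slice_target
    try {
      new FallbackOptionManagerSketch(null);
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage()); // prints: rejected: fallback is required
    }
  }
}
```

Moving the invariant into the constructor turns a silent lookup miss into an immediate, well-located failure.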


> Throw UserException#validationError for errors when modifying options
> -
>
> Key: DRILL-3497
> URL: https://issues.apache.org/jira/browse/DRILL-3497
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Sudheesh Katkam
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3673) Memory leak in parquet writer on CTAS

2015-09-02 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-3673.
-
Resolution: Fixed

> Memory leak in parquet writer on CTAS
> -
>
> Key: DRILL-3673
> URL: https://issues.apache.org/jira/browse/DRILL-3673
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Writer
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.2.0
>
> Attachments: 1_rows.dat, ctas.sh
>
>

[jira] [Updated] (DRILL-3257) TPCDS query 74 results in a StackOverflowError on Scale Factor 1

2015-09-02 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam updated DRILL-3257:
---
Component/s: (was: Execution - Flow)
 Query Planning & Optimization


> TPCDS query 74 results in a StackOverflowError on Scale Factor 1
> 
>
> Key: DRILL-3257
> URL: https://issues.apache.org/jira/browse/DRILL-3257
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Rahul Challapalli
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
> Attachments: error.log
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)