[jira] [Commented] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235100#comment-16235100
 ] 

Paul Rogers commented on DRILL-5822:


[~vitalii], thanks for the explanation. I was hoping for a bit more of a 
conceptual overview: what is our overall approach to handling column order? The 
answer referred to specific implementations. Without a design, we just 
ping-pong back and forth among various code solutions.

In Drill, column order does not matter. At least, that is what I've been told 
by the veterans. That is, (a, b) and (b, a) are the same schema. Code 
generation uses names to find columns, not column indexes as in most DB systems.

Given this, we need a policy. One policy would be that we preserve project list 
ordering created by the planner. That works, except in the case of {{SELECT 
*}}. The standard for SQL is that a {{SELECT *}} query preserves the column 
order in the table. Fair enough.

But, in a distributed system, each table may have a different column order; 
especially in files such as JSON that use key/value pairs. So, there is no 
"right" order. Then what do we do?

We can make up an order (sort columns alphabetically as the prior code did.) 
But, this will be surprising to users if their CSV file, say, has ID, Address, 
Block and we produce an output of Address, Block, ID.

Another rule might be to preserve column order where possible, but when a 
conflict occurs, choose a "first" batch and coerce others to match that. If the 
merging receiver (which sees n batches in no real order) gets batch 3 first, 
then 3 becomes the template and other batches are coerced to match.

If the Sort, say, gets batches sequentially, then the first one is the template 
and others are coerced to match.

Makes sense? Good. Now, what about RecordBatchLoader? By itself, it can't do 
the job. It needs help.

On the first batch, it can save an ordering. On the second batch, it can:

* Pick out columns that match its existing schema.
* If a prior column does not appear in the new batch, it can fill in nulls (as 
long as the prior column was nullable or an array).
* If the prior column was required, or a new column appears, a hard schema 
change must occur.

The result is that the batch loader absorbs trivial schema changes. I call this 
"schema smoothing." But, it alerts the surrounding operator to larger issues.

Now, what about the merging receiver? The algorithm might be this:

* Start with the first batch. This "primes" the batch loader.
* Visit the second batch. The batch loader "smooths" the schema as described 
above.
* Continue with the third, and so on.
* If, for any batch, a hard schema change occurs, we fail the query.

(Note that we could actually handle the schema change as described below, but 
we'd end up with a very large number of very small batches and a very large 
number of schema changes. Until Drill has a design for schema change, there is 
really no point in adding this complexity.)
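Continuing the sketch above (same made-up, map-based stand-in for a batch), the 
merging receiver loop would be roughly:

{code}
import java.util.List;
import java.util.Map;

// Illustrative only: drive the SmoothingLoader over incoming batches and fail
// the query on the first hard schema change.
class MergingReceiverSketch {
  private final SmoothingLoader loader = new SmoothingLoader();

  void receiveAll(List<Map<String, Boolean>> batches) {
    for (Map<String, Boolean> batch : batches) {
      if (loader.load(batch)) {
        throw new IllegalStateException("Hard schema change between incoming batches");
      }
      // ... merge the (now schema-compatible) batch ...
    }
  }
}
{code}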

If we do the above, then we don't need the check in the new code that compares 
all batches up front. Instead, we do the comparison one by one as we convert 
them from wire to in-memory format using the record batch loader.

OK, that's the merging receiver. What about the union receiver? It can just 
send a schema change event each time the schema changes. This could be noisy; 
we might get a schema change on every batch. If we have three senders, X, Y and 
Z, and each has a different schema, then we would get a stream something like 
X1, Y1, Z1, X2, Y2, Z2 and we'd have a schema change with each one. Oh well. 
But, if the batches simply differ in column order, the schema "smoothing" 
described above will kick in and we get no schema changes.

This same logic can be applied to each operator where we might have a problem.

Note that the schema smoothing algorithm described above is not just a theory. 
It is actually implemented, tested, and working in the "batch size control" 
project. Here we are just reimplementing it in multiple places due to the 
extraordinary delays in getting large code changes approved in Drill.

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 
> doesn't preserve column order
> ---
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.12.0
>
>
> Column ordering is not preserved for a star query with sorting when it is 
> planned into multiple fragments.
> Repro steps:
> 1) {code}alter session set `planner.slice_target`=1;{code}
> 2) ORDER BY clause in the query.
> Scena

[jira] [Commented] (DRILL-5921) Counters metrics should be listed in table

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235096#comment-16235096
 ] 

ASF GitHub Bot commented on DRILL-5921:
---

GitHub user prasadns14 opened a pull request:

https://github.com/apache/drill/pull/1020

DRILL-5921: Display counter metrics in table

Listed the counter metrics in a table
@arina-ielchiieva please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasadns14/drill DRILL-5921

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1020.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1020


commit 3f55847fe93d3296740901f717b4e0cf50736311
Author: Prasad Nagaraj Subramanya 
Date:   2017-11-02T01:59:38Z

DRILL-5921: Display counter metrics in table




> Counters metrics should be listed in table
> --
>
> Key: DRILL-5921
> URL: https://issues.apache.org/jira/browse/DRILL-5921
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
>Priority: Minor
> Fix For: 1.12.0
>
>
> Counter metrics are currently displayed as a JSON string in the Drill UI. They 
> should be listed in a table similar to other metrics.





[jira] [Created] (DRILL-5921) Counters metrics should be listed in table

2017-11-01 Thread Prasad Nagaraj Subramanya (JIRA)
Prasad Nagaraj Subramanya created DRILL-5921:


 Summary: Counters metrics should be listed in table
 Key: DRILL-5921
 URL: https://issues.apache.org/jira/browse/DRILL-5921
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - HTTP
Affects Versions: 1.11.0
Reporter: Prasad Nagaraj Subramanya
Assignee: Prasad Nagaraj Subramanya
Priority: Minor
 Fix For: 1.12.0


Counter metrics are currently displayed as a JSON string in the Drill UI. They 
should be listed in a table similar to other metrics.





[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235043#comment-16235043
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/984
  
@paul-rogers This is ready for another round of review. I have also 
responded to / addressed your comments.



> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>






[jira] [Commented] (DRILL-5832) Migrate OperatorFixture to use SystemOptionManager rather than mock

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234990#comment-16234990
 ] 

ASF GitHub Bot commented on DRILL-5832:
---

Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/970
  
@paul-rogers Will this be going in anytime soon? I'd like to fix the 
annoying

```
test(String.format("alter session..."))
```

statements in the tests after this goes in.


> Migrate OperatorFixture to use SystemOptionManager rather than mock
> ---
>
> Key: DRILL-5832
> URL: https://issues.apache.org/jira/browse/DRILL-5832
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.12.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.12.0
>
>
> The {{OperatorFixture}} provides structure for testing individual operators 
> and other "sub-operator" bits of code. To do that, the framework provides 
> mock network-free and server-free versions of the fragment context and 
> operator context.
> As part of the mock, the {{OperatorFixture}} provides a mock version of the 
> system option manager that provides a simple test-only implementation of an 
> option set.
> With the recent major changes to the system option manager, this mock 
> implementation has drifted out of sync with the system option manager. Rather 
> than upgrading the mock implementation, this ticket asks to use the system 
> option manager directly -- but configured for no ZK or file persistence of 
> options.
> The key reason for this change is that the system option manager has 
> implemented a sophisticated way to handle option defaults; it is better to 
> leverage that than to provide a mock implementation.





[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234814#comment-16234814
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/984#discussion_r148396417
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/TableFileBuilder.java ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.test;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+
+import java.io.BufferedOutputStream;
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+public class TableFileBuilder
+{
+  private final List columnNames;
+  private final List columnFormatters;
+  private final List rows = Lists.newArrayList();
+  private final String rowString;
+
+  public TableFileBuilder(List columnNames, List 
columnFormatters) {
--- End diff --

I've switched this around a bit. The general flow is now the following:

  1. A **RowSet** is constructed with a **RowSetBuilder**
  2. A **JsonFileBuilder** is created and configured to use the **RowSet** 
that was just created.
  3. The **JsonFileBuilder** reads the **RowSet** and writes it out to a 
file in a json format.



> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>






[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234805#comment-16234805
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/984#discussion_r148395420
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestCorruptParquetDateCorrection.java
 ---
@@ -377,21 +386,21 @@ public void 
testReadOldMetadataCacheFileOverrideCorrection() throws Exception {
 
   @Test
   public void testReadNewMetadataCacheFileOverOldAndNewFiles() throws 
Exception {
-String table = format("dfs.`%s`", new 
Path(getDfsTestTmpSchemaLocation(), 
MIXED_CORRUPTED_AND_CORRECT_PARTITIONED_FOLDER));
-copyMetaDataCacheToTempReplacingInternalPaths(
+File meta = dirTestWatcher.copyResourceToRoot(
 
"parquet/4203_corrupt_dates/mixed_version_partitioned_metadata.requires_replace.txt",
-MIXED_CORRUPTED_AND_CORRECT_PARTITIONED_FOLDER, 
Metadata.METADATA_FILENAME);
+Paths.get(MIXED_CORRUPTED_AND_CORRECT_PARTITIONED_FOLDER, 
Metadata.METADATA_FILENAME).toString());
--- End diff --

I've cleaned this up a bit. Of the two patterns for building paths and 
files I prefer pattern **A** because it is cleaner.

**Pattern A:**
```
myPath.resolve("subDir1")
  .resolve("subDir2")
  .toFile();
```

**Pattern B:**
```
new File(new File(myFile, "subDir1"), "subDir2")
```


> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>






[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234797#comment-16234797
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/984#discussion_r148394268
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/xsort/TestExternalSort.java
 ---
@@ -138,34 +141,34 @@ public void testNewColumnsManaged() throws Exception {
 testNewColumns(false);
   }
 
-
   @Test
   public void testNewColumnsLegacy() throws Exception {
 testNewColumns(true);
   }
 
   private void testNewColumns(boolean testLegacy) throws Exception {
 final int record_count = 1;
-String dfs_temp = getDfsTestTmpSchemaLocation();
-System.out.println(dfs_temp);
-File table_dir = new File(dfs_temp, "newColumns");
-table_dir.mkdir();
-try (BufferedOutputStream os = new BufferedOutputStream(new 
FileOutputStream(new File(table_dir, "a.json")))) {
-  String format = "{ a : %d, b : %d }%n";
-  for (int i = 0; i <= record_count; i += 2) {
-os.write(String.format(format, i, i).getBytes());
-  }
+final String tableDirName = "newColumns";
+
+TableFileBuilder tableA = new TableFileBuilder(Lists.newArrayList("a", 
"b"), Lists.newArrayList("%d", "%d"));
--- End diff --

I've made the JsonFileBuilder, which takes a RowSet and writes it out to a 
file as json.


> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>






[jira] [Commented] (DRILL-5783) Make code generation in the TopN operator more modular and test it

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234796#comment-16234796
 ] 

ASF GitHub Bot commented on DRILL-5783:
---

Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/984#discussion_r148393893
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/BatchUtils.java ---
@@ -0,0 +1,280 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.test;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector4;
+import org.apache.drill.exec.vector.ValueVector;
+import org.junit.Assert;
+
+import java.io.UnsupportedEncodingException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Map;
+
+public class BatchUtils {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(BatchUtils.class);
+
+  public static Map> 
containerToObjects(VectorContainer vectorContainer) {
--- End diff --

Removed BatchUtils


> Make code generation in the TopN operator more modular and test it
> --
>
> Key: DRILL-5783
> URL: https://issues.apache.org/jira/browse/DRILL-5783
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>






[jira] [Commented] (DRILL-5248) You tried to write a VarChar type when you are

2017-11-01 Thread Yun Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234788#comment-16234788
 ] 

Yun Liu commented on DRILL-5248:


I am experiencing the same issue. While the JSON file works on some machines, it 
doesn't on others. Do you have a solution for this?

ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT * 
FROM `dfs`.`Inputs`.`./AAD-Compliance.json` LIMIT 100
[30027]Query execution error. Details:[ 
DATA_READ ERROR: You tried to write a VarChar type when you are using a 
ValueWriter of type NullableBitWriterImpl.

Line  1
Column  175720569
Field  critical
Fragment 0:0

[Error Id: da60b1c8-775f-463b-88e2-f25a6f3bbff1 on server]
]

>  You tried to write a VarChar type when you are 
> 
>
> Key: DRILL-5248
> URL: https://issues.apache.org/jira/browse/DRILL-5248
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Storage - MongoDB
>Reporter: Prasengupta
>Priority: Blocker
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> We have a weird problem while executing the query. Below is the exception we 
> are facing. The weird part is that we can execute the same query against the 
> dev Mongo database; we only face the issue when we configure the production 
> Mongo DB. Could you please help us with this immediately?
> Query : 
> SELECT q.productDetails.quantity from mongo.LAM.`Quotes` as q
> DATA_READ ERROR: You tried to write a VarChar type when you are using a 
> ValueWriter of type NullableFloat8WriterImpl.
> Line  1
> Column  208
> Field  quantity
> Fragment 0:0
> [Error Id: e13ab75b-b915-410b-8af8-50cd0bcacebe on FNB-127:31010]
> class java.sql.SQLException
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:232)
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:275)
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1943)
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:76)
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:465)
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:169)
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109)
> oadd.org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:121)
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
> org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:322)
> org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:408)
> org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
> org.apache.zeppelin.scheduler.Job.run(Job.java:176)
> org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (DRILL-5797) Use more often the new parquet reader

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234749#comment-16234749
 ] 

ASF GitHub Bot commented on DRILL-5797:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/976
  
Drill follows SQL rules and is case insensitive. If case sensitivity has 
snuck in somewhere (perhaps due to the use of `equals()` rather than 
`equalsIgnoreCase()` or the use of a case-sensitive map), then we should fix 
that.
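A minimal illustration of the distinction (the names here are made up, not 
taken from the PR):

```java
class ColumnNames {
  // Case-insensitive column-name matching, per Drill's SQL semantics.
  static boolean same(String projected, String actual) {
    return projected.equalsIgnoreCase(actual);   // equals() would be case sensitive
  }
}
```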

Note also that column aliases should not be visible to the Parquet reader.


> Use more often the new parquet reader
> -
>
> Key: DRILL-5797
> URL: https://issues.apache.org/jira/browse/DRILL-5797
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Damien Profeta
>Priority: Major
> Fix For: 1.12.0
>
>
> The choice between the regular Parquet reader and the optimized one is based 
> on what types of columns are in the file. But the columns that are actually 
> read by the query are not taken into account. We can increase somewhat the 
> cases where the optimized reader is used by checking whether the projected 
> columns are simple or not.
> This is an optimization while waiting for the fast Parquet reader to handle 
> complex structures.





[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries

2017-11-01 Thread Prasad Nagaraj Subramanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Nagaraj Subramanya updated DRILL-5909:
-
Fix Version/s: 1.12.0

> need new JMX metrics for (FAILED and CANCELED) queries
> --
>
> Key: DRILL-5909
> URL: https://issues.apache.org/jira/browse/DRILL-5909
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Monitoring
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Khurram Faraaz
>Assignee: Prasad Nagaraj Subramanya
>Priority: Major
> Fix For: 1.12.0
>
>
> we have these JMX metrics today
> {noformat}
> drill.queries.running
> drill.queries.completed
> {noformat}
> we need these new JMX metrics
> {noformat}
> drill.queries.failed
> drill.queries.canceled
> {noformat}





[jira] [Commented] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234733#comment-16234733
 ] 

ASF GitHub Bot commented on DRILL-5909:
---

GitHub user prasadns14 opened a pull request:

https://github.com/apache/drill/pull/1019

DRILL-5909: Added new Counter metrics

Added new metrics to capture succeeded, failed and canceled queries count.

@paul-rogers please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasadns14/drill DRILL-5909

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1019


commit a05094a7d190f81cfe7b5222a48dce8707467313
Author: Prasad Nagaraj Subramanya 
Date:   2017-11-01T20:49:43Z

DRILL-5909: Added new Counter metrics




> need new JMX metrics for (FAILED and CANCELED) queries
> --
>
> Key: DRILL-5909
> URL: https://issues.apache.org/jira/browse/DRILL-5909
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Monitoring
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Khurram Faraaz
>Assignee: Prasad Nagaraj Subramanya
>Priority: Major
>
> we have these JMX metrics today
> {noformat}
> drill.queries.running
> drill.queries.completed
> {noformat}
> we need these new JMX metrics
> {noformat}
> drill.queries.failed
> drill.queries.canceled
> {noformat}





[jira] [Commented] (DRILL-5889) sqlline loses RPC connection

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234636#comment-16234636
 ] 

ASF GitHub Bot commented on DRILL-5889:
---

Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/1015
  
Padma, DRILL-5889 is the wrong JIRA (sqlline loses RPC..); I think your 
JIRA is 5899


Regards,

Salim





> sqlline loses RPC connection
> 
>
> Key: DRILL-5889
> URL: https://issues.apache.org/jira/browse/DRILL-5889
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Pritesh Maker
>Priority: Major
> Attachments: 26183ef9-44b2-ef32-adf8-cc2b5ba9f9c0.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> alter session set `planner.memory.max_query_memory_per_node` = 10737418240;
> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` 
> group by no_nulls_col, nulls_col;
> {noformat}
> Error is:
> {noformat}
> 0: jdbc:drill:drillbit=10.10.100.190> select count(*), max(`filename`) from 
> dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
> Error: CONNECTION ERROR: Connection /10.10.100.190:45776 <--> 
> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
> [Error Id: db4aea70-11e6-4e63-b0cc-13cdba0ee87a ] (state=,code=0)
> {noformat}
> From drillbit.log:
> 2017-10-18 14:04:23,044 [UserServer-1] INFO  
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.100.190:31010 <--> 
> /10.10.100.190:45776 (user server) timed out.  Timeout was set to 30 seconds. 
> Closing connection.
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0], EXPR$1=[$1])
> 00-02UnionExchange
> 01-01  Project(EXPR$0=[$2], EXPR$1=[$3])
> 01-02HashAgg(group=[{0, 1}], EXPR$0=[$SUM0($2)], EXPR$1=[MAX($3)])
> 01-03  Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], 
> EXPR$1=[$3])
> 01-04HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
> 02-01  UnorderedMuxExchange
> 03-01Project(no_nulls_col=[$0], nulls_col=[$1], 
> EXPR$0=[$2], EXPR$1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 
> hash32AsDouble($0, 1301011))])
> 03-02  HashAgg(group=[{0, 1}], EXPR$0=[COUNT()], 
> EXPR$1=[MAX($2)])
> 03-03Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/hash-agg/data1]], 
> selectionRoot=maprfs:/drill/testdata/hash-agg/data1, numFiles=1, 
> usedMetadataFile=false, columns=[`no_nulls_col`, `nulls_col`, `filename`]]])
> {noformat}





[jira] [Commented] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234545#comment-16234545
 ] 

Paul Rogers commented on DRILL-5920:


Turns out that this problem is due to a misunderstanding of how SQL works.

Consider the original unit test query:

{code}
select max(columns[1]) as col1
from cp.`textinput/input1.csv`
where col1 is not null
{code}

The query is incorrectly using a column alias in the {{WHERE}} clause.

To quote from a Google search:

bq. Standard SQL disallows references to column aliases in a WHERE clause. This 
restriction is imposed because when the WHERE clause is evaluated, the 
column value may not yet have been determined. column_alias can be used in an 
ORDER BY clause, but it cannot be used in a WHERE, GROUP BY, or HAVING clause

Thanks to [~amansinha100] for pointing out this fact.
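For reference, one standards-conforming rewrite filters on the underlying 
column rather than the alias (or uses {{HAVING}} if the intent is to filter on 
the aggregate):

{code}
select max(columns[1]) as col1
from cp.`textinput/input1.csv`
where columns[1] is not null
{code}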

> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (and, 
> specifically, element 1.) {{col1}} is an alias that should never have been 
> pushed down to the data source because the data source does not know about 
> aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int column for {{col1}} which 
> is, presumably, replaced by a later projection operator.
> Because this behavior is wrong, we must think through all the possible failure 
> cases and how to handle them in this incorrect design. What if the alias 
> matches an (expensive) table column? What if the alias is the same as some 
> base column in the same query?
> {code}
> SELECT a as b, b as c FROM ...
> {code}
> Incorrect name handling may work in many cases, but it does lead to problems 
> because the behavior is not following the accepted SQL standards.





[jira] [Resolved] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5920.

Resolution: Not A Bug

> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (and, 
> specifically, element 1.) {{col1}} is an alias that should never have been 
> pushed down to the data source because the data source does not know about 
> aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int column for {{col1}} which 
> is, presumably, replaced by a later projection operator.
> Because this behavior is wrong, we must think through all the possible failure 
> cases and how to handle them in this incorrect design. What if the alias 
> matches an (expensive) table column? What if the alias is the same as some 
> base column in the same query?
> {code}
> SELECT a as b, b as c FROM ...
> {code}
> Incorrect name handling may work in many cases, but it does lead to problems 
> because the behavior is not following the accepted SQL standards.





[jira] [Commented] (DRILL-5889) sqlline loses RPC connection

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234509#comment-16234509
 ] 

ASF GitHub Bot commented on DRILL-5889:
---

Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1015
  
@sachouche @paul-rogers can you please review ?


> sqlline loses RPC connection
> 
>
> Key: DRILL-5889
> URL: https://issues.apache.org/jira/browse/DRILL-5889
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Pritesh Maker
>Priority: Major
> Attachments: 26183ef9-44b2-ef32-adf8-cc2b5ba9f9c0.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> alter session set `planner.memory.max_query_memory_per_node` = 10737418240;
> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` 
> group by no_nulls_col, nulls_col;
> {noformat}
> Error is:
> {noformat}
> 0: jdbc:drill:drillbit=10.10.100.190> select count(*), max(`filename`) from 
> dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
> Error: CONNECTION ERROR: Connection /10.10.100.190:45776 <--> 
> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
> [Error Id: db4aea70-11e6-4e63-b0cc-13cdba0ee87a ] (state=,code=0)
> {noformat}
> From drillbit.log:
> 2017-10-18 14:04:23,044 [UserServer-1] INFO  
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.100.190:31010 <--> 
> /10.10.100.190:45776 (user server) timed out.  Timeout was set to 30 seconds. 
> Closing connection.
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0], EXPR$1=[$1])
> 00-02UnionExchange
> 01-01  Project(EXPR$0=[$2], EXPR$1=[$3])
> 01-02HashAgg(group=[{0, 1}], EXPR$0=[$SUM0($2)], EXPR$1=[MAX($3)])
> 01-03  Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], 
> EXPR$1=[$3])
> 01-04HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
> 02-01  UnorderedMuxExchange
> 03-01Project(no_nulls_col=[$0], nulls_col=[$1], 
> EXPR$0=[$2], EXPR$1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 
> hash32AsDouble($0, 1301011))])
> 03-02  HashAgg(group=[{0, 1}], EXPR$0=[COUNT()], 
> EXPR$1=[MAX($2)])
> 03-03Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/hash-agg/data1]], 
> selectionRoot=maprfs:/drill/testdata/hash-agg/data1, numFiles=1, 
> usedMetadataFile=false, columns=[`no_nulls_col`, `nulls_col`, `filename`]]])
> {noformat}





[jira] [Commented] (DRILL-5797) Use more often the new parquet reader

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234473#comment-16234473
 ] 

ASF GitHub Bot commented on DRILL-5797:
---

Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/976
  
Looking at the stack trace:
- The code is definitely initializing a column of type REPEATABLE
- The Fast Reader didn't expect this scenario, so it used a default 
container (NullableVarBinary) for variable-length binary data types

Why is this happening?
- The code in ReadState::buildReader() processes all selected columns
- This information is obtained from the ParquetSchema
- Looking at the code, this seems to be a case-sensitivity issue
- The ParquetSchema is case-insensitive whereas the Parquet GroupType is not
- Damien added a catch handler (column not found) to handle use cases where 
we are projecting non-existent columns
- This basically leads to an unforeseen use case:
- Assume column XYZ is complex
- The user uses an alias (xyz)
- The new code will let this column pass and treat it as simple
- The ParquetSchema, being case insensitive, will process this column
- and thus the exception in the test suite

Suggested fix (a sketch follows below):
- Create a map keyed by lower-cased column name and register all current 
row-group columns
- Use this map to locate a selected column's type
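
A sketch of that idea (illustrative class and method names, not the actual 
reader code):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Register the row group's columns keyed by lower-cased name, then resolve
// projected columns through the map so lookups are case insensitive.
class RowGroupColumnIndex {
  private final Map<String, String> byLowerName = new HashMap<>();

  void register(String rowGroupColumnName) {
    byLowerName.put(rowGroupColumnName.toLowerCase(Locale.ROOT), rowGroupColumnName);
  }

  /** Returns the row-group column matching the projected name, or null if absent. */
  String resolve(String projectedName) {
    return byLowerName.get(projectedName.toLowerCase(Locale.ROOT));
  }
}
```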



> Use more often the new parquet reader
> -
>
> Key: DRILL-5797
> URL: https://issues.apache.org/jira/browse/DRILL-5797
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Damien Profeta
>Priority: Major
> Fix For: 1.12.0
>
>
> The choice between the regular Parquet reader and the optimized one is based 
> on what types of columns are in the file. But the columns that are actually 
> read by the query are not taken into account. We can increase somewhat the 
> cases where the optimized reader is used by checking whether the projected 
> columns are simple or not.
> This is an optimization while waiting for the fast Parquet reader to handle 
> complex structures.





[jira] [Commented] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234465#comment-16234465
 ] 

Paul Rogers commented on DRILL-5920:


Detailed analysis.

Working back through the code, the mystery deepens. 

{code}
  public static ProjectPushInfo getColumns(RelDataType rowType, List 
projects) {
final List fieldNames = rowType.getFieldNames();
{code}

Where:

{code}
fieldNames = [*, col1, columns]
{code}

Which, frankly, makes no sense. How can we have all columns (“*”), a physical 
column (“columns”) and an alias for that column “col1” in the same name space? 
Isn’t it true that, in SQL, an alias hides the original name?

Further back, the RelDataTypeHolder holds a list of RelDataTypeFieldImpl 
objects as:

{code}
[#0: * ANY, #1: col1 ANY, #2: columns ANY]
{code}

There is no data in each object to differentiate the different categories of 
fields. The …FieldImpl class seems to be a Calcite class, so this is about as 
far as I got.

Taking a step back, it seems clear that any table can support the wildcard. 
There is a clear difference between base table columns and aliases. We may know 
nothing about the referenced table columns, but we can tell, from syntax, which 
name is an alias and which is a table column.

Working way back, here is the text of the {{SqlNode}} in {{getQueryPlan()}}:

{code}
SELECT MAX(`textinput/input1.csv`.`columns`[1]) AS `col1`
FROM `cp`.`textinput/input1.csv` AS `textinput/input1.csv`
WHERE `textinput/input1.csv`.`col1` IS NOT NULL
{code}

So, the parser understood the query correctly.

Looking at other bits, on this line:

{code}
final ConvertedRelNode convertedRelNode = validateAndConvert(sqlNode);
{code}

The {{EasyGroupScan}} has:

{code}
EasyGroupScan [selectionRoot=classpath:/textinput/input1.csv, numFiles=1, 
columns=[`*`], files=[classpath:/textinput/input1.csv]]
{code}

I suppose this means that the table does not know its columns? So we provide 
the wildcard instead?

The field list seems to be built up here:

{code}
public class RelDataTypeHolder {
  public RelDataTypeField getField(RelDataTypeFactory typeFactory, String 
fieldName) {
{code}

First we get “*”, then “col1” (many times), and finally “columns” (six times). 
Each time we squirrel away the name in our field list, resulting in the bogus 
list shown earlier. Here, I wonder:

* Why do we save the wildcard? That tells us nothing.
* Why is “col1”, which is an alias, resolved against the base table?

Let's look at "col1". This comes to us via:

{code}
public class SqlValidatorUtil {
  public RelDataTypeField field(RelDataType rowType, String alias) {
return SqlValidatorUtil.lookupField(caseSensitive, elideRecord, rowType,
alias);
  }
{code}

Where alias = “col1”

The above calls this:

{code}
public abstract class ListScope extends DelegatingScope {
...
  public Pair
  findQualifyingTableName(final String columnName, SqlNode ctx) {
int count = 0;
Pair tableName = null;
for (Pair child : children) {
  final RelDataType rowType = child.right.getRowType();
  if (validator.catalogReader.field(rowType, columnName) != null) {
tableName = child;
count++;
  }
}
{code}

Which has this namespace:

{code}
[]
{code}

Because we always resolve the column (making up columns as needed), we never 
return null.

All of this is deep in a stack of DrillValidator sql converters where I’m 
getting a bit lost.

Shouldn’t names first be resolved in the SELECT statement’s name space of 
aliases, and only then in the table’s name space? Otherwise, if we go the other 
way around (table first, then local name space), we’ll always resolve to 
columns, even when the name is an alias, resulting in the original bug I’m 
facing.

> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (and, 
> specifically, element 1.) {{col1}} is an alias that should never have been 
> pushed down to the data source because the data source does not know about 
> aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int

[jira] [Assigned] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries

2017-11-01 Thread Prasad Nagaraj Subramanya (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Nagaraj Subramanya reassigned DRILL-5909:


Assignee: Prasad Nagaraj Subramanya

> need new JMX metrics for (FAILED and CANCELED) queries
> --
>
> Key: DRILL-5909
> URL: https://issues.apache.org/jira/browse/DRILL-5909
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Monitoring
>Affects Versions: 1.11.0, 1.12.0
>Reporter: Khurram Faraaz
>Assignee: Prasad Nagaraj Subramanya
>Priority: Major
>
> we have these JMX metrics today
> {noformat}
> drill.queries.running
> drill.queries.completed
> {noformat}
> we need these new JMX metrics
> {noformat}
> drill.queries.failed
> drill.queries.canceled
> {noformat}





[jira] [Comment Edited] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234417#comment-16234417
 ] 

Paul Rogers edited comment on DRILL-5920 at 11/1/17 5:28 PM:
-

Here are some questions.

I would like to know, is it expected that a column alias should appear in the 
project list given to a scan operator? If we expect this, how do we 
differentiate between the actual table column and the alias?

Or, do we just go ahead and create a null int column (or actual data column if 
“col1” did happen to be a real column), throw it away later, and replace it 
with the real data?

How would we handle the perfectly fine, albeit confusing, query of:

{code}
SELECT col1 AS col2, col2 AS col1 FROM ...
{code}

The problem I'm currently facing is that the text reader wants to allow only 
certain kinds of projection:

* {{SELECT *}}
* {{SELECT columns}}
* {{SELECT a, b, c}}
* {{SELECT}}

The last is an empty select list as is generated by a {{SELECT COUNT\(*)}} 
query.

But, because aliases are pushed to the data source, the picture becomes very 
muddy:

* {{SELECT *}}
* {{SELECT columns, alias}}
* {{SELECT a, b, c, alias}}
* {{SELECT alias}}

Where {{alias}} is an alias column.

That is, the meaning of a {{COUNT\(*)}} query changes. Extra columns are 
created at other times. In the wildcard case, the alias column is not created, 
creating two distinct situations that downstream operators must handle 
(materialized aliases and non-materialized aliases.)

For all these reasons, the current behavior seems like a bug, not a feature. 
But, this can be open to debate.

For now, I will alter the text readers to support the {{SELECT columns, alias}} 
case and accept the incorrect behavior as the desired behavior.


was (Author: paul.rogers):
Here are some questions.

I would like to know, is it expected that a column alias should appear in the 
project list given to a scan operator? If we expect this, how do we 
differentiate between the actual table column and the alias?

Or, do we just go ahead and create a null int column (or actual data column if 
“col1” did happen to be a real column), throw it away later, and replace it 
with the real data?

How would we handle the perfectly fine, albeit confusing, query of:

{code}
SELECT col1 AS col2, col2 AS col1 FROM ...
{code}

The problem I'm currently facing is that the text reader wants to allow only 
certain kinds of projection:

* {{SELECT *}}
* {{SELECT columns}}
* {{SELECT a, b, c}}
* {{SELECT}}

The last is an empty select list as is generated by a {{SELECT COUNT(*)}} query.

But, because aliases are pushed to the data source, the picture becomes very 
muddy:

* {{SELECT *}}
* {{SELECT columns, alias}}
* {{SELECT a, b, c, alias}}
* {{SELECT alias}}

Where {{alias}} is an alias column.

That is, the meaning of a {{COUNT(*)}} query changes. Extra columns are created 
at other times. In the wildcard case, the alias column is not created, creating 
two distinct situations that downstream operators must handle (materialized 
aliases and non-materialized aliases.)

For all these reasons, the current behavior seems like a bug, not a feature. 
But, this can be open to debate.

For now, I will alter the text readers to support the {{SELECT columns, alias}} 
case and accept the incorrect behavior as the desired behavior.

> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (and, 
> specifically, element 1.) {{col1}} is an alias that should never have been 
> pushed down to the data source because the data source does not know about 
> aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int column for {{col1}} which 
> is, presumably, replaced by a later projection operator.
> Because this behavior is wrong, we must think through all the possible failure 
> cases and how to handle them in this incorrect design. What if the alias 
> matches an (expensive) table column? What if the alias is the same as some 
> base column in the same query?
> {code}
> SELECT a as b, b as c FROM ...
> {code}
>

[jira] [Comment Edited] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234417#comment-16234417
 ] 

Paul Rogers edited comment on DRILL-5920 at 11/1/17 5:28 PM:
-

Here are some questions.

I would like to know, is it expected that a column alias should appear in the 
project list given to a scan operator? If we expect this, how do we 
differentiate between the actual table column and the alias?

Or, do we just go ahead and create a null int column (or actual data column if 
“col1” did happen to be a real column), throw it away later, and replace it 
with the real data?

How would we handle the perfectly fine, albeit confusing, query of:

{code}
SELECT col1 AS col2, col2 AS col1 FROM ...
{code}

The problem I'm currently facing is that the text reader wants to allow only 
certain kinds of projection:

* {{SELECT *}}
* {{SELECT columns}}
* {{SELECT a, b, c}}
* {{SELECT}}

The last is an empty select list as is generated by a {{SELECT COUNT(*)}} query.

But, because aliases are pushed to the data source, the picture becomes very 
muddy:

* {{SELECT *}}
* {{SELECT columns, alias}}
* {{SELECT a, b, c, alias}}
* {{SELECT alias}}

Where {{alias}} is an alias column.

That is, the meaning of a {{COUNT(*)}} query changes. Extra columns are created 
at other times. In the wildcard case, the alias column is not created, creating 
two distinct situations that downstream operators must handle (materialized 
aliases and non-materialized aliases.)

For all these reasons, the current behavior seems like a bug, not a feature. 
But, this can be open to debate.

For now, I will alter the text readers to support the {{SELECT columns, alias}} 
case and accept the incorrect behavior as the desired behavior.


was (Author: paul.rogers):
Here are some questions.

I would like to know, is it expected that a column alias should appear in the 
project list given to a scan operator? If we expect this, how do we 
differentiate between the actual table column and the alias?

Or, do we just go ahead and create a null int column (or actual data column if 
“col1” did happen to be a real column), throw it away later, and replace it 
with the real data?

How would we handle the perfectly fine, albeit confusing, query of:

{code}
SELECT col1 AS col2, col2 AS col1 FROM ...
{code}

The problem I'm currently facing is that the text reader wants to allow only 
certain kinds of projection:

* SELECT *
* SELECT columns
* SELECT a, b, c
* SELECT

The last is an empty select list as is generated by a SELECT COUNT(*) query.

But, because aliases are pushed to the data source, the picture becomes very 
muddy:

* SELECT *
* SELECT columns, alias
* SELECT a, b, c, alias
* SELECT alias

That is, the meaning of a COUNT(*) query changes. Extra columns are created at 
other times. In the wildcard case, the alias column is not created, creating 
two distinct situations that downstream operators must handle (materialized 
aliases and non-materialized aliases.)

For all these reasons, the current behavior seems like a bug, not a feature. 
But, this can be open to debate.


> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (and, 
> specifically, element 1.) {{col1}} is an alias that should never have been 
> pushed down to the data source because the data source does not know about 
> aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int column for {{col1}} which 
> is, presumably, replaced by a later projection operator.
> Because this behavior is wrong, we must think through all the possible failure 
> cases and how to handle them in this incorrect design. What if the alias 
> matches an (expensive) table column? What if the alias is the same as some 
> base column in the same query?
> {code}
> SELECT a as b, b as c FROM ...
> {code}
> Incorrect name handling may work in many cases, but it does lead to problems 
> because the behavior is not following the accepted SQL standards.





[jira] [Commented] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234417#comment-16234417
 ] 

Paul Rogers commented on DRILL-5920:


Here are some questions.

I would like to know, is it expected that a column alias should appear in the 
project list given to a scan operator? If we expect this, how do we 
differentiate between the actual table column and the alias?

Or, do we just go ahead and create a null int column (or actual data column if 
“col1” did happen to be a real column), throw it away later, and replace it 
with the real data?

How would we handle the perfectly fine, albeit confusing, query of:

{code}
SELECT col1 AS col2, col2 AS col1 FROM ...
{code}

The problem I'm currently facing is that the text reader wants to allow only 
certain kinds of projection:

* SELECT *
* SELECT columns
* SELECT a, b, c
* SELECT

The last is an empty select list as is generated by a SELECT COUNT(*) query.

But, because aliases are pushed to the data source, the picture becomes very 
muddy:

* SELECT *
* SELECT columns, alias
* SELECT a, b, c, alias
* SELECT alias

That is, the meaning of a COUNT(*) query changes, and extra columns are created 
in the other cases. In the wildcard case the alias column is not created at 
all, so downstream operators must handle two distinct situations: materialized 
aliases and non-materialized aliases.
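
To make that muddiness concrete with this issue's own query, a hedged sketch of 
observed versus expected projection (the "expected" line is my reading of the 
desired fix, not current behavior):

{code}
SELECT MAX(columns[1]) AS col1
FROM cp.`textinput/input1.csv`
WHERE col1 IS NOT NULL;
-- observed scan projection (per this issue): [`col1`, `columns`[1]]
-- expected scan projection:                  [`columns`[1]]
{code}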

For all these reasons, the current behavior seems like a bug, not a feature. 
But, this can be open to debate.


> Drill incorrectly projects column aliases to scan operator
> --
>
> Key: DRILL-5920
> URL: https://issues.apache.org/jira/browse/DRILL-5920
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
>
> The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs 
> this query:
> {code}
> select max(columns[1]) as col1
> from cp.`textinput/input1.csv`
> where col1 is not null
> {code}
> The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:
> {noformat}
> [`col1`, `columns`[1]]
> {noformat}
> This is clearly wrong. The actual table column is {{columns}} (specifically, 
> element 1). {{col1}} is an alias that should never have been pushed down to 
> the data source, because the data source does not know about aliases.
> Further, the projection list makes no distinction between the "real" and 
> "alias" columns, so, to the data source, both look like real table columns.
> The current workaround is to create a nullable int column for {{col1}} which 
> is, presumably, replaced by a later projection operator.
> Because this behavior is wrong, we must think through all the possible 
> failure cases and how to handle them in this incorrect design. What if the 
> alias matches an (expensive) table column? What if the alias is the same as 
> some base column in the same query?
> {code}
> SELECT a as b, b as c FROM ...
> {code}
> Incorrect name handling may work in many cases, but it leads to problems 
> because the behavior does not follow accepted SQL standards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5920) Drill incorrectly projects column aliases to scan operator

2017-11-01 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5920:
--

 Summary: Drill incorrectly projects column aliases to scan operator
 Key: DRILL-5920
 URL: https://issues.apache.org/jira/browse/DRILL-5920
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Major


The {{TestNewTextReader.ensureColumnNameDisplayedinError}} unit test runs this 
query:
{code}
select max(columns[1]) as col1
from cp.`textinput/input1.csv`
where col1 is not null
{code}
The following appears in the {{SubScan}} for the {{TextFormatPlugin}}:

{noformat}
[`col1`, `columns`[1]]
{noformat}

This is clearly wrong. The actual table column is {{columns}} (specifically, 
element 1). {{col1}} is an alias that should never have been pushed down to 
the data source, because the data source does not know about aliases.

Further, the projection list makes no distinction between the "real" and 
"alias" columns, so, to the data source, both look like real table columns.

The current workaround is to create a nullable int column for {{col1}} which 
is, presumably, replaced by a later projection operator.

Because this behavior is wrong, we must think through all the possible failure 
cases and how to handle them in this incorrect design. What if the alias 
matches an (expensive) table column? What if the alias is the same as some base 
column in the same query?

{code}
SELECT a as b, b as c FROM ...
{code}

Incorrect name handling may work in many cases, but it leads to problems 
because the behavior does not follow accepted SQL standards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5892) Distinguish between states for query statistics exposed via JMX

2017-11-01 Thread Josiah Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234136#comment-16234136
 ] 

Josiah Yan commented on DRILL-5892:
---

[~kkhatua] could I take a shot at this, if no one has started work on it? It 
looks quite doable.

> Distinguish between states for query statistics exposed via JMX
> ---
>
> Key: DRILL-5892
> URL: https://issues.apache.org/jira/browse/DRILL-5892
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.11.0
>Reporter: Kunal Khatua
>Priority: Major
> Fix For: Future
>
>
> Currently, the JMX metrics exposed 
> {code:java}
> metrics:name=drill.queries.completed
> metrics:name=drill.queries.running
> metrics:name=drill.queries.enqueued
> {code}
> The completed queries, however, do not distinguish between the outcomes of 
> the completed queries.
> The proposal is to also provide success, failed, cancelled and timeout (yet 
> to implement) states.
> {code:xml}
>   "metrics:name=drill.queries" : {
> "completed" : {
>   "successful": INTEGER,
>   "failed": INTEGER,
>   "cancelled": INTEGER,
>   "timeout": INTEGER
> },
> "running" : INTEGER,
> "enqueued" : {
>   "small" : INTEGER,
>   "large" : INTEGER
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5892) Distinguish between states for query statistics exposed via JMX

2017-11-01 Thread Josiah Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234111#comment-16234111
 ] 

Josiah Yan commented on DRILL-5892:
---

Digging into 
{{/drill/exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java}}, 
I'm not sure whether {{timeout}} refers to a failure due to a timeout in a 
query queue or to a timeout in a Drillbit RPC.

> Distinguish between states for query statistics exposed via JMX
> ---
>
> Key: DRILL-5892
> URL: https://issues.apache.org/jira/browse/DRILL-5892
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.11.0
>Reporter: Kunal Khatua
>Priority: Major
> Fix For: Future
>
>
> Currently, the JMX metrics exposed 
> {code:java}
> metrics:name=drill.queries.completed
> metrics:name=drill.queries.running
> metrics:name=drill.queries.enqueued
> {code}
> The completed queries, however, do not distinguish between the outcomes of 
> the completed queries.
> The proposal is to also provide success, failed, cancelled and timeout (yet 
> to implement) states.
> {code:xml}
>   "metrics:name=drill.queries" : {
> "completed" : {
>   "successful": INTEGER,
>   "failed": INTEGER,
>   "cancelled": INTEGER,
>   "timeout": INTEGER
> },
> "running" : INTEGER,
> "enqueued" : {
>   "small" : INTEGER,
>   "large" : INTEGER
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5919:

Priority: Major  (was: Normal)

> Add session option to allow json reader/writer to work with NaN,INF
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Major
> Fix For: Future
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working 
> session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5919:

Fix Version/s: (was: 1.12.0)
   Future

> Add session option to allow json reader/writer to work with NaN,INF
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Minor
> Fix For: Future
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working 
> session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5919:

Priority: Normal  (was: Minor)

> Add session option to allow json reader/writer to work with NaN,INF
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Normal
> Fix For: Future
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working 
> session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-2744) Provide error message when trying to query MapR-DB or HBase tables with insufficient priviliges

2017-11-01 Thread tooptoop4 (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234004#comment-16234004
 ] 

tooptoop4 commented on DRILL-2744:
--

anyone looking at this?

> Provide error message when trying to query MapR-DB or HBase tables with 
> insufficient priviliges
> ---
>
> Key: DRILL-2744
> URL: https://issues.apache.org/jira/browse/DRILL-2744
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - HBase
>Affects Versions: 0.8.0
>Reporter: Andries Engelbrecht
>Priority: Major
>  Labels: security
> Fix For: Future
>
>
> When MapR-DB tables are created with different privileges, Drill returns no 
> results for tables with insufficient privileges. The proposal is to return an 
> error so the user is aware of the issue, instead of simply returning no data. 
> This can be a serious issue for complex queries that join data across 
> multiple data sources.
> Two tables were created: one as the mapr user and the other as root.
> lr. 1 mapr mapr 2 Apr  9 17:51 customers -> 
> mapr::table::2057.45.1574734
> lr. 1 root root 2 Apr 10 00:21 test -> mapr::table::2057.48.1574740
> hbase(main):005:0> get "test", "r1"
> COLUMNCELL
>  col1:timestamp=1428625497000, value=a
>  col2:timestamp=1428625506268, value=b
> 2 row(s) in 0.0380 seconds
> 0: jdbc:drill:zk=drilldemo:5181> show tables;
> +---+-+
> | TABLE_SCHEMA  | TABLE_NAME  |
> +---+-+
> | maprdb        | test        |
> | maprdb        | customers   |
> +---+-+
> 2 rows selected (0.098 seconds)
> Querying the test table simply returns no results instead of an error.
> 0: jdbc:drill:zk=drilldemo:5181> select * from test;
> +--+
> |  |
> +--+
> +--+
> No rows selected (0.059 seconds)
> The customers table does return data, due to sufficient privileges.
> 0: jdbc:drill:zk=drilldemo:5181> select * from customers limit 1;
> +++++
> |  row_key   |  address   |  loyalty   |  personal  |
> +++++
> | [B@6e22c013 | {"state":"InZhIg=="} | 
> {"agg_rev":"MTk3","membership":"InNpbHZlciI="} | 
> {"age":"IjE1LTIwIg==","gender":"IkZFTUFMRSI=","name":"IkNvcnJpbmUgTWVjaGFtIg=="}
>  |
> +++++
> 1 row selected (0.236 seconds)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Volodymyr Tkach (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Tkach updated DRILL-5919:
---
Affects Version/s: (was: 1.12.0)
   1.11.0

> Add session option to allow json reader/writer to work with NaN,INF
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Minor
> Fix For: 1.12.0
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working 
> session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Volodymyr Tkach (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Tkach updated DRILL-5919:
---
Fix Version/s: 1.12.0

> Add session option to allow json reader/writer to work with NaN,INF
> ---
>
> Key: DRILL-5919
> URL: https://issues.apache.org/jira/browse/DRILL-5919
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Minor
> Fix For: 1.12.0
>
>
> Add session options to allow Drill to work with non-standard JSON number 
> literals such as NaN, Infinity and -Infinity. By default these options will 
> be switched off; the user will be able to toggle them during a working 
> session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF

2017-11-01 Thread Volodymyr Tkach (JIRA)
Volodymyr Tkach created DRILL-5919:
--

 Summary: Add session option to allow json reader/writer to work 
with NaN,INF
 Key: DRILL-5919
 URL: https://issues.apache.org/jira/browse/DRILL-5919
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - JSON
Affects Versions: 1.12.0
Reporter: Volodymyr Tkach
Assignee: Volodymyr Tkach
Priority: Minor


Add session options to allow Drill to work with non-standard JSON number 
literals such as NaN, Infinity and -Infinity. By default these options will be 
switched off; the user will be able to toggle them during a working session.
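
For reference, these literals are not valid in strict JSON, so the reader has 
to be told explicitly to accept them. A hedged sketch of the input and a 
session toggle follows; the option name below is a placeholder, since the real 
name would be defined by this change:

{code}
-- sample record with non-standard number literals (invalid in strict JSON):
--   {"x": NaN, "y": Infinity, "z": -Infinity}
-- hypothetical session toggle:
ALTER SESSION SET `store.json.reader.non_numeric_numbers` = true;
{code}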



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5645) negation of expression causes null pointer exception

2017-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233945#comment-16233945
 ] 

ASF GitHub Bot commented on DRILL-5645:
---

Github user josiahyan commented on the issue:

https://github.com/apache/drill/pull/892
  
Yay! First commit!

Thanks for the review!


> negation of expression causes null pointer exception
> 
>
> Key: DRILL-5645
> URL: https://issues.apache.org/jira/browse/DRILL-5645
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.10.0
> Environment: Drill 1.10
>Reporter: N Campbell
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.12.0
>
>
> Following statement will fail when the expression is negated
> select -(2 * 2) from ( values ( 1 ) ) T ( C1 )
> Error: SYSTEM ERROR: NullPointerException



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)