[jira] [Commented] (DRILL-4143) REFRESH TABLE METADATA - Permission Issues with metadata files

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294439#comment-15294439
 ] 

ASF GitHub Bot commented on DRILL-4143:
---

Github user chunhui-shi closed the pull request at:

https://github.com/apache/drill/pull/470


> REFRESH TABLE METADATA - Permission Issues with metadata files
> --
>
>                 Key: DRILL-4143
>                 URL: https://issues.apache.org/jira/browse/DRILL-4143
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: John Omernik
>            Assignee: Chunhui Shi
>              Labels: Metadata, Parquet, Permissions
>             Fix For: Future
>
>
> Summary of Refresh Metadata Issues confirmed by two different users on Drill 
> User Mailing list. (Title: REFRESH TABLE METADATA - Access Denied)
> This issue pertains to table METADATA and revolves around user 
> authentication. 
> Basically, when the drill bits are running as one user, and the data is owned 
> by another user, there can be access denied issues on subsequent queries 
> after issuing a REFRESH TABLE METADATA command. 
> To troubleshoot what is actually happening, I turned on MapR Auditing (a 
> handy feature) and ran a query that gives me access denied (select count(1) 
> from testtable). Per MapR, the user I am logged in as (dataowner) is trying 
> to do a create operation on the .drill.parquet_metadata file, and it is 
> failing with status: 17. Per Keys at 
> MapR, "status 17 means errno 17 which means EEXIST. Looks like Drill is 
> trying to create a file that already exists." This seems to indicate that 
> drill is perhaps trying to create the .drill.parquet_metadata on each select 
> as the dataowner user, but the permissions (as seen below) don't allow it. 
> Here are the steps to reproduce:
> Enable Authentication. 
> Run all drill bits in the cluster as "drillbituser", then have the files 
> owned by "dataowner". Note the root of the table permissions are drwxrwxr-x 
> but as Drill loads each partition it loads them as drwxr-xr-x (all with 
> dataowner:dataowner ownership). That may be something too, the default 
> permissions when creating a table?  Another note, in my setup, drillbituser 
> is in the group for dataowner.  Thus, they should always have read access. 
> # Authenticated as dataowner (this should have full permissions to all the 
> data)
> Enter username for jdbc:drill:zk=zknode1:5181: dataowner
> Enter password for jdbc:drill:zk=zknode1:5181: **
> 0: jdbc:drill:zk=zknode1> use dfs.dev;
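> +-------+--------------------------------------+
> |  ok   |               summary                |
> +-------+--------------------------------------+
> | true  | Default schema changed to [dfs.dev]  |
> +-------+--------------------------------------+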
> 1 row selected (0.307 seconds)
> # The query works fine with no table metadata
> 0: jdbc:drill:zk=zknode1> select count(1) from `testtable`;
> +-----------+
> |  EXPR$0   |
> +-----------+
> | 24565203  |
> +-----------+
> 1 row selected (3.392 seconds)
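> # Refresh of metadata works with no errors
> 0: jdbc:drill:zk=zknode1> refresh table metadata `testtable`;
> +-------+-----------------------------------------------------+
> |  ok   |                       summary                       |
> +-------+-----------------------------------------------------+
> | true  | Successfully updated metadata for table testtable.  |
> +-------+-----------------------------------------------------+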
> 1 row selected (5.767 seconds)
>  
> # Running the same query again now returns an access denied error. 
> 0: jdbc:drill:zk=zknode1> select count(1) from `testtable`;
> Error: SYSTEM ERROR: IOException: 2127.7646.2950962 
> /data/dev/testtable/2015-11-12/.drill.parquet_metadata (Permission denied)
>  
>  
> [Error Id: 7bfce2e7-f78d-4fba-b047-f4c85b471de4 on node1:31010] 
> (state=,code=0)
>  
>  
> # Note how all the files are owned by the drillbituser. Per discussion on 
> list, this is normal 
>  
> $ find ./ -type f -name ".drill.parquet_metadata" -exec ls -ls {} \;
> 726 -rwxr-xr-x 1 drillbituser drillbituser 742837 Nov 30 14:27 
> ./2015-11-12/.drill.parquet_metadata
> 583 -rwxr-xr-x 1 drillbituser drillbituser 596146 Nov 30 14:27 
> ./2015-11-29/.drill.parquet_metadata
> 756 -rwxr-xr-x 1 drillbituser drillbituser 773811 Nov 30 14:27 
> ./2015-11-11/.drill.parquet_metadata
> 763 -rwxr-xr-x 1 drillbituser drillbituser 780829 Nov 30 14:27 
> ./2015-11-04/.drill.parquet_metadata
> 632 -rwxr-xr-x 1 drillbituser drillbituser 646851 Nov 30 14:27 
> ./2015-11-08/.drill.parquet_metadata
> 845 -rwxr-xr-x 1 drillbituser drillbituser 864421 Nov 30 14:27 
> ./2015-11-05/.drill.parquet_metadata
> 771 -rwxr-xr-x 1 drillbituser 
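
The listing above can be read against the usual POSIX owner/group/other rule. The following is a minimal sketch of that rule only, not Drill or MapR code; the class name, the helper `canWrite`, and the assumption that dataowner is not a member of the drillbituser group are all hypothetical (the report only states that drillbituser is in dataowner's group).

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not Drill code: applies the POSIX owner/group/other
// rule to decide whether a user may write to a file with the given mode.
class MetadataPermissionCheck {

  static boolean canWrite(String user, Set<String> userGroups,
                          String fileOwner, String fileGroup, String mode) {
    // Parse a 9-character mode string such as "rwxr-xr-x".
    Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
    if (user.equals(fileOwner)) {
      return perms.contains(PosixFilePermission.OWNER_WRITE);
    }
    if (userGroups.contains(fileGroup)) {
      return perms.contains(PosixFilePermission.GROUP_WRITE);
    }
    return perms.contains(PosixFilePermission.OTHERS_WRITE);
  }

  public static void main(String[] args) {
    // Mode and ownership as shown by ls for .drill.parquet_metadata.
    String mode = "rwxr-xr-x";
    // The drillbit user owns the file, so it can rewrite it.
    System.out.println(canWrite("drillbituser",
        Set.of("drillbituser", "dataowner"), "drillbituser", "drillbituser", mode));
    // Assumption: dataowner is not in group drillbituser, so "other" bits apply.
    System.out.println(canWrite("dataowner",
        Set.of("dataowner"), "drillbituser", "drillbituser", mode));
  }
}
```

Under these assumptions the drillbit user can rewrite the metadata file while dataowner cannot, which would be consistent with a create or overwrite attempt by dataowner being rejected.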

[jira] [Commented] (DRILL-4143) REFRESH TABLE METADATA - Permission Issues with metadata files

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294362#comment-15294362
 ] 

ASF GitHub Bot commented on DRILL-4143:
---

Github user amansinha100 commented on the pull request:

https://github.com/apache/drill/pull/470#issuecomment-220730748
  
Committed in 3d92d2829



[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294360#comment-15294360
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 closed the pull request at:

https://github.com/apache/drill/pull/504


> CONVERT_FROM()  json format fails if 0 rows are received from upstream 
> operator
> ---
>
>                 Key: DRILL-4679
>                 URL: https://issues.apache.org/jira/browse/DRILL-4679
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.6.0
>            Reporter: Aman Sinha
>            Assignee: Jinfeng Ni
>
> CONVERT_FROM() json format fails as below if the underlying Filter produces 0 
> rows: 
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'json') as x 
> from cp.`tpch/region.parquet` where r_regionkey = ;
> Error: SYSTEM ERROR: IllegalStateException: next() returned NONE without 
> first returning OK_NEW_SCHEMA [#16, ProjectRecordBatch]
> Fragment 0:0
> {noformat}
> If the conversion is applied as UTF8 format,  the same query succeeds: 
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'utf8') as x 
> from cp.`tpch/region.parquet` where r_regionkey = ;
> +----+
> | x  |
> +----+
> +----+
> No rows selected (0.241 seconds)
> {noformat}
> The reason for this is the special handling in the ProjectRecordBatch for 
> JSON.  The output schema is not known for this until the run time and the 
> ComplexWriter in the Project relies on seeing the input data to determine the 
> output schema - this could be a MapVector or ListVector etc.  
> If the input data has 0 rows due to a filter condition, we should at least 
> produce a default output schema, e.g. an empty MapVector?  Need to decide a 
> good default.  Note that the CONVERT_FROM(x, 'json') could occur on 2 
> branches of a UNION-ALL and if one input is empty while the other side is 
> not, it may still cause incompatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294359#comment-15294359
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 commented on the pull request:

https://github.com/apache/drill/pull/504#issuecomment-220730151
  
Committed in 3d92d2829.  




[jira] [Updated] (DRILL-4571) Add link to local Drill logs from the web UI

2016-05-20 Thread Krystal (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krystal updated DRILL-4571:
---
Attachment: drillbit_ui.log
drillbit_download.log.gz

Log files

> Add link to local Drill logs from the web UI
> 
>
>                 Key: DRILL-4571
>                 URL: https://issues.apache.org/jira/browse/DRILL-4571
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.6.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>              Labels: doc-impacting
>             Fix For: 1.7.0
>
> Attachments: display_log.JPG, drillbit_download.log.gz, 
> drillbit_ui.log, log_list.JPG
>
>
> Currently we have a link to the profile from the web UI.
> It would be handy for users to have a link to the local logs as well.





[jira] [Commented] (DRILL-4571) Add link to local Drill logs from the web UI

2016-05-20 Thread Krystal (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294169#comment-15294169
 ] 

Krystal commented on DRILL-4571:


Hi Arina,

In verifying this feature, I see some issues as described below:
1. When the log files are downloaded from the UI, the name of the downloaded 
file is "download".  We should save the file with the same name as the log file 
(i.e. drillbit.log).
2. In the Chrome browser, the content of the drillbit.queries.json file is 
displayed all on one line, making it very hard to read.
3. The last 1 lines of the log file displayed in the web UI do not match 
the log file itself.  For your reference, I downloaded the full log from the 
web UI (this matches the actual log file).  I also copied the content of the 
same log file as shown in the web UI.  Doing a diff between the 2 files shows 
that many lines were skipped in the web UI.  Attached are these 2 log files for 
your reference.

I have 3 drillbits running.  git.commit.id.abbrev=09b2627




[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294142#comment-15294142
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/504#discussion_r64103528
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
@@ -146,6 +159,27 @@ protected IterOutcome doWork() {
       if (next == IterOutcome.OUT_OF_MEMORY) {
         outOfMemory = true;
         return next;
+      } else if (next == IterOutcome.NONE) {
+        // since this is first batch and we already got a NONE, need to set up the schema
+
+        //allocate vv in the allocationVectors.
+        for (final ValueVector v : this.allocationVectors) {
--- End diff --

Updated PR to address review comment regarding doAlloc(). 




[jira] [Commented] (DRILL-4607) Add a split function that allows to separate string by a delimiter

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294126#comment-15294126
 ] 

ASF GitHub Bot commented on DRILL-4607:
---

GitHub user aaas24 opened a pull request:

https://github.com/apache/drill/pull/506

DRILL-4607: Add a split function that allows to separate string by a …

…delimiter

This patch adds a split function that takes a string and a delimiter.

Addressed the review comments from Sudheesh.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaas24/drill DRILL-4607

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/506.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #506


commit c3bdde43f79c58e94e2fcdf32e782209486e8cca
Author: Alicia Alvarez 
Date:   2016-04-15T18:07:47Z

DRILL-4607: Add a split function that allows to separate string by a 
delimiter




> Add a split function that allows to separate string by a delimiter
> --
>
>                 Key: DRILL-4607
>                 URL: https://issues.apache.org/jira/browse/DRILL-4607
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Functions - Drill
>    Affects Versions: 1.6.0
>            Reporter: Alicia Alvarez
>
> Ex: Let's say I have records in a CSV file with the following schema
> {noformat}
> user_name, friend_list_separated_by_a_delimiter,other_fields
> ali,sam;adi;tom,45,...
> {noformat}
> I want to run a query which returns the friend list files as a repeated value.
> {noformat}
> select user_name, split(friend_list, ';') friends from userdata;
> {noformat}
> This should return the records in the following format
> {noformat}
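> -----------------------------------
> | user_name |       friends       |
> -----------------------------------
> |    ali    |   [sam, adi, tom]   |
> -----------------------------------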
> {noformat}
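
The behaviour requested above can be illustrated outside Drill with plain Java. This is only a sketch of the intended semantics, not the code in the pull request; the class name and the helper `split` are hypothetical, and keeping trailing empty strings is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the requested split semantics; not the Drill UDF.
class SplitSketch {

  static List<String> split(String input, String delimiter) {
    // Pattern.quote makes the delimiter literal rather than a regex;
    // the limit of -1 keeps trailing empty strings (an assumption).
    return Arrays.asList(input.split(Pattern.quote(delimiter), -1));
  }

  public static void main(String[] args) {
    // The friend list from the example record.
    System.out.println(split("sam;adi;tom", ";")); // prints [sam, adi, tom]
  }
}
```

Quoting the delimiter matters for characters such as `|` or `.`, which would otherwise be interpreted as regular-expression syntax by `String.split`.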





[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294108#comment-15294108
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/504#discussion_r64101122
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
@@ -136,6 +145,10 @@ public VectorContainer getOutgoingContainer() {
 
   @Override
   protected IterOutcome doWork() {
+    if (wasNone) {
+      return IterOutcome.NONE;
+    }
+
     int incomingRecordCount = incoming.getRecordCount();
 
     if (first && incomingRecordCount == 0) {
--- End diff --

Actually, if the first batch was non-empty, the new changes wouldn't apply 
because of the following check: 
  if (first && incomingRecordCount == 0) { ... }
Then if the next incoming  batch is empty, it should continue to work since 
we have already produced the schema from the first batch.  On the other hand if 
the first batch is empty and we see a NONE iterator outcome, we want to make 
sure that a schema is produced but at the same time not call next() since a 
NONE outcome has already been seen. 
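
The control flow described in this comment can be modelled in isolation. The following is a heavily simplified sketch, not the actual ProjectRecordBatch: the class, the three-value `Outcome` enum, the queue standing in for the upstream operator, and the `upstreamCalls` counter are all assumptions made for illustration. Only the names `first` and `wasNone` come from the diff.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Simplified model of "emit a schema once, then NONE, without calling
// upstream again". Not Drill code; all types here are hypothetical.
class EmptyFirstBatchSketch {
  enum Outcome { OK_NEW_SCHEMA, OK, NONE }

  private final Deque<Outcome> upstream; // stand-in for the incoming operator
  private boolean first = true;
  private boolean wasNone = false;
  int upstreamCalls = 0;                 // counts how often upstream was asked

  EmptyFirstBatchSketch(List<Outcome> upstreamOutcomes) {
    this.upstream = new ArrayDeque<>(upstreamOutcomes);
  }

  Outcome next() {
    if (wasNone) {
      // Upstream already reported NONE; do not call it a second time.
      return Outcome.NONE;
    }
    upstreamCalls++;
    Outcome in = upstream.isEmpty() ? Outcome.NONE : upstream.poll();
    if (in == Outcome.NONE) {
      wasNone = true;
    }
    if (first) {
      first = false;
      // Always announce a schema first, even if the input was empty.
      return Outcome.OK_NEW_SCHEMA;
    }
    return in;
  }

  public static void main(String[] args) {
    EmptyFirstBatchSketch project = new EmptyFirstBatchSketch(List.of());
    System.out.println(project.next()); // OK_NEW_SCHEMA
    System.out.println(project.next()); // NONE
    System.out.println(project.upstreamCalls); // 1
  }
}
```

In this model a downstream consumer never sees NONE before OK_NEW_SCHEMA, which is the condition the IllegalStateException in the issue description complains about.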




[jira] [Commented] (DRILL-4607) Add a split function that allows to separate string by a delimiter

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294103#comment-15294103
 ] 

ASF GitHub Bot commented on DRILL-4607:
---

Github user aaas24 closed the pull request at:

https://github.com/apache/drill/pull/481




[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293978#comment-15293978
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/504#discussion_r64091394
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
@@ -146,6 +159,27 @@ protected IterOutcome doWork() {
       if (next == IterOutcome.OUT_OF_MEMORY) {
         outOfMemory = true;
         return next;
+      } else if (next == IterOutcome.NONE) {
+        // since this is first batch and we already got a NONE, need to set up the schema
+
+        //allocate vv in the allocationVectors.
+        for (final ValueVector v : this.allocationVectors) {
--- End diff --

The doAlloc() was calling incoming.getRecordCount() which would fail for 
empty batches, so I did not use it but I can modify doAlloc to take the count 
parameter and have everyone call the modified version.  




[jira] [Closed] (DRILL-4523) Disallow using loopback address in distributed mode

2016-05-20 Thread Krystal (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krystal closed DRILL-4523.
--

git.commit.id.abbrev=09b2627

Verified that bug is fixed.

> Disallow using loopback address in distributed mode
> ---
>
>                 Key: DRILL-4523
>                 URL: https://issues.apache.org/jira/browse/DRILL-4523
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Server
>    Affects Versions: 1.6.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>             Fix For: 1.7.0
>
>
> If we enable debug for org.apache.drill.exec.coord.zk in logback.xml, we only 
> get the hostname and ports information. For example:
> {code}
> 2015-11-04 19:47:02,927 [ServiceCache-0] DEBUG 
> o.a.d.e.c.zk.ZKClusterCoordinator - Cache changed, updating.
> 2015-11-04 19:47:02,932 [ServiceCache-0] DEBUG 
> o.a.d.e.c.zk.ZKClusterCoordinator - Active drillbit set changed.  Now 
> includes 2 total bits.  New active drillbits:
>  h3.poc.com:31010:31011:31012
>  h2.poc.com:31010:31011:31012
> {code}
> We need to know the IP address of each hostname to do further troubleshooting.
> If any drillbit registers itself as "localhost.localdomain" in 
> ZooKeeper, we will never know where it comes from. Enabling IP address 
> tracking can help in this case.
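
Both the IP logging requested in the description and the loopback check named in the issue title can be illustrated with the standard `java.net.InetAddress` API. This is a minimal sketch, not the committed fix; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not Drill code: resolves a host name and reports
// whether it maps to a loopback address.
class LoopbackCheckSketch {

  static InetAddress resolve(String host) {
    try {
      return InetAddress.getByName(host);
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Cannot resolve host: " + host, e);
    }
  }

  static boolean isLoopback(String host) {
    return resolve(host).isLoopbackAddress();
  }

  // Produces a log-friendly "host -> ip" string, flagging loopback addresses.
  static String describe(String host) {
    InetAddress addr = resolve(host);
    String suffix = addr.isLoopbackAddress() ? " (loopback)" : "";
    return host + " -> " + addr.getHostAddress() + suffix;
  }

  public static void main(String[] args) {
    System.out.println(describe("127.0.0.1")); // 127.0.0.1 -> 127.0.0.1 (loopback)
  }
}
```

A drillbit in distributed mode could run a check like `isLoopback` on its advertised host name at startup and refuse to register if it returns true; whether the actual fix works this way is not stated in this thread.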





[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293598#comment-15293598
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user jinfengni commented on the pull request:

https://github.com/apache/drill/pull/504#issuecomment-220645472
  
LGTM.

+1






[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293596#comment-15293596
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/504#discussion_r64064412
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java
 ---
@@ -136,6 +145,10 @@ public VectorContainer getOutgoingContainer() {
 
   @Override
   protected IterOutcome doWork() {
+if (wasNone) {
+  return IterOutcome.NONE;
+}
+
 int incomingRecordCount = incoming.getRecordCount();
 
 if (first && incomingRecordCount == 0) {
--- End diff --

The new logic handles the case of Project's first outgoing batch. I am not 
sure whether Drill works properly when the first batch carries data and 
builds the schema, but the next incoming batch is empty. We may treat that 
as a separate issue for further investigation. 





[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293579#comment-15293579
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/504#discussion_r64063077
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java
 ---
@@ -146,6 +159,27 @@ protected IterOutcome doWork() {
   if (next == IterOutcome.OUT_OF_MEMORY) {
 outOfMemory = true;
 return next;
+  } else if (next == IterOutcome.NONE) {
+// since this is first batch and we already got a NONE, need to set up the schema
+
+//allocate vv in the allocationVectors.
+for (final ValueVector v : this.allocationVectors) {
--- End diff --

The allocation logic could use the existing doAlloc() method.





[jira] [Commented] (DRILL-4689) Need to support conversion from TIMESTAMP type to TIME type

2016-05-20 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293474#comment-15293474
 ] 

Julian Hyde commented on DRILL-4689:


I support converting TIMESTAMP to TIME, but a TIME literal needs to be in the 
correct format, regardless of what PostgreSQL does.

> Need to support conversion from TIMESTAMP type to TIME type
> ---
>
> Key: DRILL-4689
> URL: https://issues.apache.org/jira/browse/DRILL-4689
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Affects Versions: 1.7.0
> Environment: CentOS cluster
> Reporter: Khurram Faraaz
>
> According to the ISO/IEC 9075-2 standard, conversion from TIMESTAMP to TIME 
> is allowed and supported.
> This does not seem to work on Drill 1.7.0.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> values(TIME '2050-2-3 10:11:12.1000');
> Error: PARSE ERROR: Illegal TIME literal '2050-2-3 10:11:12.1000': not in 
> format 'HH:mm:ss'
> SQL Query values(TIME '2050-2-3 10:11:12.1000')
>^
> [Error Id: 77168fe0-760f-4384-a7c6-682241675348 on centos-03.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp> values(cast('2050-2-3 10:11:12.1000' as time));
> Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: "2050-2-3 
> 10:11:12.1000" is malformed at "50-2-3 10:11:12.1000"
> Fragment 0:0
> [Error Id: 5168dfe6-b5e5-4ce0-8570-02ea74da6367 on centos-03.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> The above two expressions are supported on Postgres 9.3
> {noformat}
> postgres=# values(TIME '2050-2-3 10:11:12.1000');
>   column1   
> ------------
>  10:11:12.1
> (1 row)
> postgres=# values(cast('2050-2-3 10:11:12.1000' as time));
>   column1   
> ------------
>  10:11:12.1
> (1 row)
> postgres=# 
> {noformat}





[jira] [Created] (DRILL-4689) Need to support conversion from TIMESTAMP type to TIME type

2016-05-20 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4689:
-

 Summary: Need to support conversion from TIMESTAMP type to TIME 
type
 Key: DRILL-4689
 URL: https://issues.apache.org/jira/browse/DRILL-4689
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 1.7.0
 Environment: CentOS cluster
Reporter: Khurram Faraaz


According to the ISO/IEC 9075-2 standard, conversion from TIMESTAMP to TIME is 
allowed and supported.
This does not seem to work on Drill 1.7.0.

{noformat}
0: jdbc:drill:schema=dfs.tmp> values(TIME '2050-2-3 10:11:12.1000');
Error: PARSE ERROR: Illegal TIME literal '2050-2-3 10:11:12.1000': not in 
format 'HH:mm:ss'

SQL Query values(TIME '2050-2-3 10:11:12.1000')
   ^


[Error Id: 77168fe0-760f-4384-a7c6-682241675348 on centos-03.qa.lab:31010] 
(state=,code=0)
0: jdbc:drill:schema=dfs.tmp> values(cast('2050-2-3 10:11:12.1000' as time));
Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: "2050-2-3 
10:11:12.1000" is malformed at "50-2-3 10:11:12.1000"

Fragment 0:0

[Error Id: 5168dfe6-b5e5-4ce0-8570-02ea74da6367 on centos-03.qa.lab:31010] 
(state=,code=0)
0: jdbc:drill:schema=dfs.tmp>
{noformat}

The above two expressions are supported on Postgres 9.3
{noformat}
postgres=# values(TIME '2050-2-3 10:11:12.1000');
  column1   
------------
 10:11:12.1
(1 row)

postgres=# values(cast('2050-2-3 10:11:12.1000' as time));
  column1   
------------
 10:11:12.1
(1 row)

postgres=# 
{noformat}
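
For reference, the conversion being requested amounts to parsing the timestamp 
text and keeping only its time-of-day part. The following is a minimal sketch 
in plain java.time, not Drill's or Calcite's implementation; the class name, 
the method name and the lenient pattern (single-digit month and day, optional 
fractional seconds) are assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class TimestampToTimeSketch {
    // Accepts inputs such as "2050-2-3 10:11:12.1000": one- or two-digit
    // month/day/hour/minute/second and an optional fraction of 1 to 9 digits.
    private static final DateTimeFormatter LENIENT_TIMESTAMP =
        new DateTimeFormatterBuilder()
            .appendPattern("uuuu-M-d H:m:s")
            .optionalStart()
            .appendFraction(ChronoField.NANO_OF_SECOND, 1, 9, true)
            .optionalEnd()
            .toFormatter();

    /** Parses a timestamp string and drops the date part (hypothetical helper). */
    static LocalTime timestampToTime(String text) {
        return LocalDateTime.parse(text, LENIENT_TIMESTAMP).toLocalTime();
    }

    public static void main(String[] args) {
        System.out.println(timestampToTime("2050-2-3 10:11:12.1000"));
    }
}
```

Whether a TIME literal should accept such input at all is a separate question; 
the sketch only covers the cast-style conversion of a timestamp value.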





[jira] [Updated] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-4679:
--
Assignee: Jinfeng Ni  (was: Aman Sinha)



[jira] [Commented] (DRILL-4679) CONVERT_FROM() json format fails if 0 rows are received from upstream operator

2016-05-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292784#comment-15292784
 ] 

ASF GitHub Bot commented on DRILL-4679:
---

Github user amansinha100 commented on the pull request:

https://github.com/apache/drill/pull/504#issuecomment-220525522
  
@jinfengni could you please review?


> CONVERT_FROM()  json format fails if 0 rows are received from upstream 
> operator
> ---
>
> Key: DRILL-4679
> URL: https://issues.apache.org/jira/browse/DRILL-4679
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.6.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
>
> CONVERT_FROM() json format fails as below if the underlying Filter produces 0 
> rows: 
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'json') as x 
> from cp.`tpch/region.parquet` where r_regionkey = ;
> Error: SYSTEM ERROR: IllegalStateException: next() returned NONE without 
> first returning OK_NEW_SCHEMA [#16, ProjectRecordBatch]
> Fragment 0:0
> {noformat}
> If the conversion is applied as UTF8 format,  the same query succeeds: 
> {noformat}
> 0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'utf8') as x 
> from cp.`tpch/region.parquet` where r_regionkey = ;
> +----+
> | x  |
> +----+
> +----+
> No rows selected (0.241 seconds)
> {noformat}
> The cause is the special handling of JSON in ProjectRecordBatch.  The output 
> schema is not known until run time, and the ComplexWriter in the Project 
> relies on seeing the input data to determine the output schema, which could 
> be a MapVector, a ListVector, etc.  
> If the input has 0 rows because of a filter condition, we should at least 
> produce a default output schema, e.g. an empty MapVector.  We need to decide 
> on a good default.  Note that CONVERT_FROM(x, 'json') could occur on both 
> branches of a UNION ALL, and if one input is empty while the other is not, 
> the two output schemas may still be incompatible.


