[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905520#comment-16905520
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


Yes, it should work; the result will be in the same column.

> Strange Behavior for UDFs with ComplexWriter Output
> ---
>
> Key: DRILL-7345
> URL: https://issues.apache.org/jira/browse/DRILL-7345
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Priority: Minor
>
> I wrote some UDFs recently and noticed some strange behavior when debugging 
> them. 
> This behavior only occurs when there is a ComplexWriter as output.  
> Basically, if the input to the UDF is nullable, Drill doesn't recognize the 
> UDF at all.  I've found that the only way to get Drill to recognize UDFs that 
> have a ComplexWriter as output is to:
> * Use a non-nullable holder as input
> * Remove the null-handling setting completely from the function parameters.
> This approach has a drawback: if the function receives a null value, 
> it will throw an error and halt execution.  My preference would be to allow 
> null handling, but I've not figured out how to make that happen.
> Note: this behavior ONLY occurs when using a ComplexWriter as output.  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905514#comment-16905514
 ] 

Charles Givre commented on DRILL-7345:
--

So, let's say we have a UDF called foo(x) which returns a list, and our data 
looks like this: [2,4,5,null,8]

Are you saying that for this to work, I'd have to create an additional UDF with 
nullable input that returns an empty list or something like that?  Would that 
return an additional column, or would the result be in the same column?
Thanks,
-- C









[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905511#comment-16905511
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


{quote}
If I were to create an additional UDF which perhaps accepts a NullableVarChar 
as an input parameter, and returns null, wouldn't that cause Drill to either 
add extra columns or otherwise cause problems?
{quote}
You can create an additional UDF which accepts a NullableVarChar as an input 
parameter, and Drill will choose which of the two should be used. Many built-in 
UDFs with internal null handling use this approach.






[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905462#comment-16905462
 ] 

Charles Givre commented on DRILL-7345:
--

Hi Volodymyr, 
I've been working on a bunch of UDFs, but let's take a simple one as an example: 
`parse_user_agent`.  This function takes a user agent string as an argument and 
returns a map of its various fields, such as browser name, version, OS, etc. 
The issue arises when there are blank or null rows in the data.  If that 
happens, the function errors out. I would prefer to include null handling so 
that if the function encounters an empty row, it simply returns an empty list 
or map, but right now that doesn't seem feasible.  Here is the UDF: 
https://github.com/apache/drill/pull/1840/files 
I have a few others as well, but this one is basically done.

If you add any null-handling instruction to the function header (either 
NULL_IF_NULL or INTERNAL), the function will not be recognized.  If you set the 
input parameter to NullableVarChar, you get an error about Drill not finding 
the function.
If I were to create an additional UDF which perhaps accepts a NullableVarChar 
as an input parameter and returns null, wouldn't that cause Drill to either 
add extra columns or otherwise cause problems?
-- C
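The graceful behavior requested here, returning an empty map for blank or null input, can be sketched in plain Java. This is only an analogy, not the Drill UDF API and not the actual `parse_user_agent` implementation; the field name below is illustrative:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Plain-Java model (not the actual Drill API) of the null handling the
// comment asks for: a parse_user_agent-style function that returns an
// empty map for null or blank input instead of raising an error.
class UserAgentAnalogy {
    static Map<String, String> parseUserAgent(String ua) {
        if (ua == null || ua.trim().isEmpty()) {
            // Graceful path: no error, the row simply yields an empty map.
            return Collections.emptyMap();
        }
        Map<String, String> fields = new HashMap<>();
        // A real implementation would delegate to a user-agent parser
        // library; here we just record the raw string for illustration.
        fields.put("raw_user_agent", ua);
        return fields;
    }
}
```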











[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905443#comment-16905443
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


Thanks, [~IhorHuzenko], for pointing to the Javadocs and for identifying the 
Jira ticket where this limitation was added.

[~cgivre], as pointed out in DRILL-6810, Drill currently does not support NULL 
values for lists/maps, so it is incorrect to allow NULL_IF_NULL 
NullHandling for functions with ComplexWriter output.
But you can create two UDF implementations, one that accepts nullable values 
and another that accepts required values, and then handle nulls inside the UDF 
however you choose: set default values, return empty lists, etc.

In the Jira description, you wrote:
{quote}
if the input to the UDF is nullable, Drill doesn't recognize the UDF at all
{quote}

Does this mean that the UDF wasn't used even for values with REQUIRED data 
mode? If so, it may be a different issue, but we would need to see the UDF 
implementation to find the root cause.
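The two-implementation pattern described above can be modeled in plain Java. This is an analogy only, not the actual Drill UDF API (real implementations use @FunctionTemplate annotations, holder classes, and a ComplexWriter); the names below are illustrative:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Analogy for registering two variants of the same function: Drill's
// function resolver picks the variant matching the argument's data mode
// (REQUIRED vs OPTIONAL), and both write into the same output column.
class SplitUdfAnalogy {
    // "Required" variant: assumes the input is never null.
    static List<String> splitRequired(String value) {
        return Arrays.asList(value.split(" "));
    }

    // "Nullable" variant: handles null internally, here by
    // returning an empty list instead of failing.
    static List<String> splitNullable(String value) {
        if (value == null) {
            return Collections.emptyList();
        }
        return splitRequired(value);
    }
}
```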






[jira] [Commented] (DRILL-6961) Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : Maybe you have incorrect connection params or db unavailable now (timeout)

2019-08-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905409#comment-16905409
 ] 

ASF GitHub Bot commented on DRILL-6961:
---

asfgit commented on pull request #1833: DRILL-6961: Handle exceptions during 
queries to information_schema
URL: https://github.com/apache/drill/pull/1833
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : 
> Maybe you have incorrect connection params or db unavailable now (timeout)
> -
>
> Key: DRILL-6961
> URL: https://issues.apache.org/jira/browse/DRILL-6961
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Information Schema
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Trying to query Drill's information_schema.views table returns an error. 
> Disabling the openTSDB plugin resolves the problem.
> Drill 1.13.0
> Failing query :
> {noformat}
> SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, VIEW_DEFINITION FROM 
> INFORMATION_SCHEMA.`VIEWS` where VIEW_DEFINITION not like 'kraken';
> {noformat}
> Stack Trace from drillbit.log
> {noformat}
> 2019-01-07 15:36:21,975 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 23cc39aa-2618-e9f0-e77e-4fafa6edc314: SELECT TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME, VIEW_DEFINITION FROM INFORMATION_SCHEMA.`VIEWS` where 
> VIEW_DEFINITION not like 'kraken'
> 2019-01-07 15:36:35,221 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:frag:0:0] INFO 
> o.a.d.e.s.o.c.services.ServiceImpl - User Error Occurred: Cannot connect to 
> the db. Maybe you have incorrect connection params or db unavailable now 
> (timeout)
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Cannot 
> connect to the db. Maybe you have incorrect connection params or db 
> unavailable now
> [Error Id: f8b4c074-ba62-4691-b142-a8ea6e4f6b2a ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getTableNames(ServiceImpl.java:107)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getAllMetricNames(ServiceImpl.java:70)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.schema.OpenTSDBSchemaFactory$OpenTSDBSchema.getTableNames(OpenTSDBSchemaFactory.java:78)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.calcite.jdbc.SimpleCalciteSchema.addImplicitTableToBuilder(SimpleCalciteSchema.java:106)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema.getTableNames(CalciteSchema.java:318) 
> [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:587)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:548)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.visitTables(InfoSchemaRecordGenerator.java:227)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:216)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:209)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:196)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader(InfoSchemaTableType.java:58)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:34)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:30)
>  

[jira] [Commented] (DRILL-4517) Reading emtpy Parquet file failes with java.lang.IllegalArgumentException

2019-08-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905410#comment-16905410
 ] 

ASF GitHub Bot commented on DRILL-4517:
---

asfgit commented on pull request #1839: DRILL-4517: Support reading empty 
Parquet files
URL: https://github.com/apache/drill/pull/1839
 
 
   
 



> Reading emtpy Parquet file failes with java.lang.IllegalArgumentException
> -
>
> Key: DRILL-4517
> URL: https://issues.apache.org/jira/browse/DRILL-4517
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server
>Reporter: Tobias
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.17.0
>
> Attachments: empty.parquet, no_rows.parquet
>
>
> When querying a Parquet file that has a schema but no rows, the Drill server 
> will fail with the error below.
> This looks similar to DRILL-3557.
> {noformat}
> {{ParquetMetaData{FileMetaData{schema: message TRANSACTION_REPORT {
>   required int64 MEMBER_ACCOUNT_ID;
>   required int64 TIMESTAMP_IN_HOUR;
>   optional int64 APPLICATION_ID;
> }
> , metadata: {}}}, blocks: []}
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:707)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:105)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:68)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractGroupScan.accept(AbstractGroupScan.java:60)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:102)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject(AbstractPhysicalVisitor.java:77)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Project.accept(Project.java:51) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:82)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen(AbstractPhysicalVisitor.java:195)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Screen.accept(Screen.java:97) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:355)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:518) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:405) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:926) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> {noformat}





[jira] [Commented] (DRILL-7341) Vector reAlloc may fails after exchange.

2019-08-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905407#comment-16905407
 ] 

ASF GitHub Bot commented on DRILL-7341:
---

asfgit commented on pull request #1838: DRILL-7341: Vector reAlloc may fails 
after exchange
URL: https://github.com/apache/drill/pull/1838
 
 
   
 



> Vector reAlloc may fails after exchange.
> 
>
> Key: DRILL-7341
> URL: https://issues.apache.org/jira/browse/DRILL-7341
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Oleg Zinoviev
>Assignee: Oleg Zinoviev
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
> Attachments: stacktrace.log
>
>
> There are several methods that modify the BaseDataValueVector#data field. 
> Some of them, such as BaseDataValueVector#exchange, do not update 
> allocationSizeInBytes. 
> Therefore, if BaseDataValueVector#exchange was executed for vectors of 
> different sizes, *ValueVector#reAlloc may create a buffer of insufficient size.
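A minimal plain-Java model of this interaction may help. These are toy classes, not Drill's actual BaseDataValueVector, but they mirror the described bug: exchange() swaps buffers without swapping the cached allocation size, so a later reAlloc() can produce a buffer smaller than the data it previously held.

```java
import java.util.Arrays;

// Toy model (not Drill's real classes) of DRILL-7341.
class ToyVector {
    byte[] data;
    int allocationSizeInBytes;   // cached target size used by reAlloc()

    ToyVector(int size) {
        data = new byte[size];
        allocationSizeInBytes = size;
    }

    // Mirrors BaseDataValueVector#exchange: swaps the data buffers only.
    void exchange(ToyVector other) {
        byte[] tmp = data;
        data = other.data;
        other.data = tmp;
        // BUG (as described above): allocationSizeInBytes is not swapped.
    }

    // Mirrors *ValueVector#reAlloc: doubles the cached size and
    // reallocates to that size, truncating if the buffer was larger.
    void reAlloc() {
        allocationSizeInBytes *= 2;
        data = Arrays.copyOf(data, allocationSizeInBytes);
    }
}
```

After exchanging an 8-byte vector with a 64-byte one, reAlloc() on the small vector yields only 16 bytes, insufficient for the 64 bytes it was holding.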





[jira] [Commented] (DRILL-7338) REST API calls to Drill fail due to insufficient heap memory

2019-08-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905408#comment-16905408
 ] 

ASF GitHub Bot commented on DRILL-7338:
---

asfgit commented on pull request #1837: DRILL-7338: REST API calls to Drill 
fail due to insufficient heap memory
URL: https://github.com/apache/drill/pull/1837
 
 
   
 



> REST API calls to Drill fail due to insufficient heap memory
> 
>
> Key: DRILL-7338
> URL: https://issues.apache.org/jira/browse/DRILL-7338
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.15.0
>Reporter: Aditya Allamraju
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.17.0
>
>
> Drill queries issued via REST API calls have started failing (error below) 
> after recent changes.
> {code:java}
> RESOURCE ERROR: There is not enough heap memory to run this query using the 
> web interface.
> Please try a query with fewer columns or with a filter or limit condition to 
> limit the data returned.
> You can also try an ODBC/JDBC client.{code}
> They were running fine earlier, as the ResultSet returned was just a few rows. 
> These queries now fail even for very small result sets (< 10 rows).
> Investigating the issue revealed that a check was introduced to limit heap 
> usage.
> Looking at the wrapper code from 
> *_exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryWrapper.java_*
> that throws this error, I see certain issues. We use a threshold of *85%* of 
> heap usage before throwing that warning and aborting the 
> query.
>  
> {code:java}
> public class QueryWrapper {
>   private static final org.slf4j.Logger logger = 
> org.slf4j.LoggerFactory.getLogger(QueryWrapper.class);
>   // Heap usage threshold/trigger to provide resiliency on web server for 
> queries submitted via HTTP
>   private static final double HEAP_MEMORY_FAILURE_THRESHOLD = 0.85;
> ...
>   private static MemoryMXBean memMXBean = ManagementFactory.getMemoryMXBean();
> ...
>   // Wait until the query execution is complete or there is error submitting 
> the query
> logger.debug("Wait until the query execution is complete or there is 
> error submitting the query");
> do {
>   try {
> isComplete = webUserConnection.await(TimeUnit.SECONDS.toMillis(1)); 
> //periodically timeout 1 sec to check heap
>   } catch (InterruptedException e) {}
>   usagePercent = getHeapUsage();
>   if (usagePercent >  HEAP_MEMORY_FAILURE_THRESHOLD) {
> nearlyOutOfHeapSpace = true;
>   }
> } while (!isComplete && !nearlyOutOfHeapSpace);
> {code}
> By using the above check, we unintentionally invited all the issues that come 
> with Java's heap usage: the JVM tries to make maximum use of the heap until a 
> minor or major GC kicks in, i.e. GC runs only once there is no more space left 
> in the heap (eden or young gen).
> The workarounds I can think of to resolve this issue are:
>  # Remove this check altogether, so we can see why the heap is filling up.
>  # Advise users to stop using REST for querying data (we did this 
> already). *But not all users will be happy with this suggestion.* There 
> could be a few dynamic applications (dashboards, monitoring, etc.).
>  # Make the threshold high enough that GC kicks in soon enough.
> If none of the above, we have to tune the drillbit heap sizes. A quick fix 
> would be to increase the threshold from 85% to 100% (option 3 above).
> *For documentation*
> New Drill configuration property - 
> drill.exec.http.memory.heap.failure.threshold
>  
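The check discussed above boils down to polling the JVM's MemoryMXBean once a second and comparing heap usage against the threshold. A standalone sketch of that calculation (not Drill's actual QueryWrapper code) looks like this:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Standalone illustration of the heap check: compare the fraction of
// heap currently in use against a configurable threshold, as
// QueryWrapper does with HEAP_MEMORY_FAILURE_THRESHOLD = 0.85.
class HeapCheck {
    static final double THRESHOLD = 0.85;

    // Fraction of the maximum heap currently in use (0.0 .. 1.0).
    // Note: MemoryUsage.getMax() is always defined for the heap.
    static double heapUsageFraction() {
        MemoryMXBean memBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memBean.getHeapMemoryUsage();
        return (double) heap.getUsed() / heap.getMax();
    }

    static boolean nearlyOutOfHeap() {
        return heapUsageFraction() > THRESHOLD;
    }
}
```

Because the JVM deliberately lets the heap fill before collecting, a snapshot of this fraction can legitimately exceed 85% right before a GC, which is exactly the false-positive failure mode the ticket describes.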





[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905389#comment-16905389
 ] 

Charles Givre commented on DRILL-7345:
--

Hi [~IhorHuzenko], 
I took a look at the javadoc; I think that is a new behavior, and an annoying 
one.  

In practice, this means that if you have a UDF that returns a complex field, 
you have no way to deal with empty or null rows.  Is there an approach you 
could suggest for dealing with this situation?






[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Igor Guzenko (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905362#comment-16905362
 ] 

Igor Guzenko commented on DRILL-7345:
-

Hi [~cgivre], could you please check that the issue is not caused by the changes 
which were added as part of DRILL-6810? 
[Here|https://github.com/apache/drill/blob/85c77134d5d1bb9f96a5417036cccfb263ae8ae7/exec/java-exec/src/main/java/org/apache/drill/exec/expr/annotations/FunctionTemplate.java#L150]
 the javadoc describes some limitations related to ComplexWriter output. 






[jira] [Created] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)
Charles Givre created DRILL-7345:


 Summary: Strange Behavior for UDFs with ComplexWriter Output
 Key: DRILL-7345
 URL: https://issues.apache.org/jira/browse/DRILL-7345
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Charles Givre


I wrote some UDFs recently and noticed some strange behavior when debugging 
them. 
This behavior only occurs when there is a ComplexWriter as output.  

Basically, if the input to the UDF is nullable, Drill doesn't recognize the UDF 
at all.  I've found that the only way to get Drill to recognize UDFs that have 
a ComplexWriter as output is to:
* Use a non-nullable holder as input
* Remove the null-handling setting completely from the function parameters.

This approach has a drawback: if the function receives a null value, it 
will throw an error and halt execution.  My preference would be to allow null 
handling, but I've not figured out how to make that happen.

Note: this behavior ONLY occurs when using a ComplexWriter as output.  





[jira] [Commented] (DRILL-7050) RexNode convert exception in subquery

2019-08-12 Thread benj (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905219#comment-16905219
 ] 

benj commented on DRILL-7050:
-

Just to illustrate, with a real case (simplified here), the problem pointed out 
by [~le.louch]:

It's possible to do
{code:sql}
SELECT u FROM
(SELECT split(r,' ') AS r FROM
 (SELECT 'unnest is useful' AS r)) AS x
,LATERAL(SELECT $unnest AS u FROM unnest(x.r))
=>
+--------+
| u      |
+--------+
| unnest |
| is     |
| useful |
+--------+
{code}
but not possible to do
{code:sql}
SELECT t,
(SELECT count(*) FROM
 (SELECT split(r,' ') AS r FROM
  (SELECT sub.t AS r)) AS x
 ,LATERAL(SELECT $unnest AS u FROM unnest(x.r))
 /* WHERE ... */) t2
FROM
(SELECT 'unnest is useful' AS t) sub
=>
Error: SYSTEM ERROR: UnsupportedOperationException: Adding Implicit RowID 
column is not supported for ValuesPrel operator
{code}

> RexNode convert exception in subquery
> -
>
> Key: DRILL-7050
> URL: https://issues.apache.org/jira/browse/DRILL-7050
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Oleg Zinoviev
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> If the query contains a subquery whose filters are associated with the main 
> query, an error occurs: *PLAN ERROR: Cannot convert RexNode to equivalent 
> Drill expression. RexNode Class: org.apache.calcite.rex.RexCorrelVariable*
> Steps to reproduce:
> 1) Create source table (or view, doesn't matter)
> {code:sql}
> create table dfs.root.source as  (
> select 1 as id union all select 2 as id
> )
> {code}
> 2) Execute query
> {code:sql}
> select t1.id,
>   (select count(t2.id) 
>   from dfs.root.source t2 where t2.id = t1.id)
> from  dfs.root.source t1
> {code}
> Reason: 
> Method 
> {code:java}org.apache.calcite.sql2rel.SqlToRelConverter.Blackboard.lookupExp{code}
> calls {code:java}RexBuilder.makeCorrel{code} in some cases





[jira] [Commented] (DRILL-7336) `cast_empty_string_to_null` option doesn't work when text file has > 1 column

2019-08-12 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905057#comment-16905057
 ] 

Bohdan Kazydub commented on DRILL-7336:
---

This option works when casting the empty string to some other type, e.g. 
{{CAST(columns[0] as INT)}}. The option's description is wrong.

> `cast_empty_string_to_null` option doesn't work when text file has > 1 column
> -
>
> Key: DRILL-7336
> URL: https://issues.apache.org/jira/browse/DRILL-7336
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Denys Ordynskiy
>Priority: Major
>
> *Description:*
> 1 - create 2 nullable csv files with 1 and 2 columns:
> _one_col.csv_
> {noformat}
> 1
> 2
> 4
> {noformat}
> _two_col.csv_
> {noformat}
> 1,1
> 2,
> ,3
> 4,4
> {noformat}
> 2 - enable option:
> {noformat}
> alter system set `drill.exec.functions.cast_empty_string_to_null`=true;
> {noformat}
> 3 - query file with 1 column:
> {noformat}
> select columns[0] from dfs.tmp.`one_col.csv`;
> {noformat}
> | EXPR$0  |
> | 1   |
> | 2   |
> | null|
> | 4   |
> 4 - query file with 2 columns:
> {noformat}
> select columns[0] from dfs.tmp.`two_col.csv`;
> {noformat}
> *Expected result:*
> Table with NULL in the 3rd row:
> | EXPR$0  |
> | 1   |
> | 2   |
> | null|
> | 4   |
> *Actual result:*
> {color:#d04437}Drill returns an empty string in the 3rd row:{color}
> | EXPR$0  |
> | 1   |
> | 2   |
> | |
> | 4   |





[jira] [Commented] (DRILL-7342) Drill replacing spaces with underlines in the column names of text files with headers

2019-08-12 Thread benj (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905017#comment-16905017
 ] 

benj commented on DRILL-7342:
-

To avoid other surprises of this type, please see the list of rules that Drill 
applies to CSV column headers, documented in DRILL-7001.
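As a stopgap until DRILL-7342 is resolved (my sketch, not an official recommendation), the underscored name that Drill actually stores can be aliased back to the original header:

{code:sql}
-- Drill rewrites the header `Full Name` to `Full_Name`; querying the
-- rewritten name and aliasing it restores the original output header.
select `Full_Name` as `Full Name` from dfs.tmp.`csv table with spaces`;
{code}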

> Drill replacing spaces with underlines in the column names of text files with 
> headers
> -
>
> Key: DRILL-7342
> URL: https://issues.apache.org/jira/browse/DRILL-7342
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Denys Ordynskiy
>Priority: Major
>
> Drill doesn't allow querying csvh columns with spaces.
>  *Description:*
>  Update Drill ctas format option to generate text file with header:
> {noformat}
> set `store.format` = 'csvh';
> {noformat}
> Create table with column names having spaces:
> {noformat}
> create table dfs.tmp.`csv table with spaces` (`Full Name`) as select 'James 
> Bond' from (values(1));
> {noformat}
> Drill wrote column name with space:
> {noformat}
> hadoop fs -cat '/tmp/csv table with spaces/0_0_0.csvh'
> {noformat}
> |Full Name|
> |James Bond|
> Try to query this table without column name:
> {noformat}
> select * from dfs.tmp.`csv table with spaces`;
> {noformat}
> |{color:#ff}*Full_Name*{color}|
> |James Bond|
> {color:#ff}*Drill replaced the space with an underscore.*{color}
>  Try to select `Full Name` column with space:
> {noformat}
> select `Full Name` from dfs.tmp.`csv table with spaces`;
> {noformat}
> Drill returned an empty value:
> |Full Name|
> | |
> When I changed the space to an underscore, the query returned the data:
> {noformat}
> select `Full_Name` from dfs.tmp.`csv table with spaces`;
> {noformat}
> |Full_Name|
> |James Bond|
> Drill can create csvh text files with spaces in the column names, but it's 
> impossible to query the data using the original column name.
> *Expected result*
>  Spaces should be valid characters in column names.





[jira] [Updated] (DRILL-4517) Reading empty Parquet file fails with java.lang.IllegalArgumentException

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-4517:
---
Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Reading empty Parquet file fails with java.lang.IllegalArgumentException
> -
>
> Key: DRILL-4517
> URL: https://issues.apache.org/jira/browse/DRILL-4517
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server
>Reporter: Tobias
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.17.0
>
> Attachments: empty.parquet, no_rows.parquet
>
>
> When querying a Parquet file that has a schema but no rows, the Drill server 
> fails with the error below.
> This looks similar to DRILL-3557.
> {noformat}
> ParquetMetaData{FileMetaData{schema: message TRANSACTION_REPORT {
>   required int64 MEMBER_ACCOUNT_ID;
>   required int64 TIMESTAMP_IN_HOUR;
>   optional int64 APPLICATION_ID;
> }
> , metadata: {}}, blocks: []}
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:707)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:105)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:68)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractGroupScan.accept(AbstractGroupScan.java:60)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:102)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject(AbstractPhysicalVisitor.java:77)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Project.accept(Project.java:51) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:82)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen(AbstractPhysicalVisitor.java:195)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Screen.accept(Screen.java:97) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:355)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:518) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:405) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:926) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> {noformat}
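For reference, a sketch of the behavior one would expect once empty Parquet files are supported (the file name is taken from the attachments above; the workspace path is assumed):

{code:sql}
-- Querying the schema-only file should plan successfully and return
-- zero rows instead of failing with IllegalArgumentException.
select MEMBER_ACCOUNT_ID, TIMESTAMP_IN_HOUR, APPLICATION_ID
from dfs.tmp.`no_rows.parquet`;
{code}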





[jira] [Commented] (DRILL-4517) Reading empty Parquet file fails with java.lang.IllegalArgumentException

2019-08-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904929#comment-16904929
 ] 

ASF GitHub Bot commented on DRILL-4517:
---

arina-ielchiieva commented on issue #1839: DRILL-4517: Support reading empty 
Parquet files
URL: https://github.com/apache/drill/pull/1839#issuecomment-520323682
 
 
   @vvysotskyi thanks for the code review, addressed code review comments.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Reading empty Parquet file fails with java.lang.IllegalArgumentException
> -
>
> Key: DRILL-4517
> URL: https://issues.apache.org/jira/browse/DRILL-4517
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server
>Reporter: Tobias
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
> Attachments: empty.parquet, no_rows.parquet
>
>
> When querying a Parquet file that has a schema but no rows, the Drill server 
> fails with the error below.
> This looks similar to DRILL-3557.
> {noformat}
> ParquetMetaData{FileMetaData{schema: message TRANSACTION_REPORT {
>   required int64 MEMBER_ACCOUNT_ID;
>   required int64 TIMESTAMP_IN_HOUR;
>   optional int64 APPLICATION_ID;
> }
> , metadata: {}}, blocks: []}
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:707)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:105)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:68)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractGroupScan.accept(AbstractGroupScan.java:60)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:102)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject(AbstractPhysicalVisitor.java:77)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Project.accept(Project.java:51) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:82)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen(AbstractPhysicalVisitor.java:195)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Screen.accept(Screen.java:97) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:355)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:518) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:405) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:926) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> {noformat}


