[jira] [Commented] (DRILL-4287) Do lazy reading of parquet metadata cache file

2016-02-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128754#comment-15128754
 ] 

ASF GitHub Bot commented on DRILL-4287:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/345#discussion_r51612851
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java ---
@@ -338,8 +354,14 @@ private boolean hasSingleValue(ColumnMetadata columnChunkMetaData) {
     return (columnChunkMetaData != null) && (columnChunkMetaData.hasSingleValue());
   }
 
+  @Override
--- End diff --

Right. 


> Do lazy reading of parquet metadata cache file
> --
>
> Key: DRILL-4287
> URL: https://issues.apache.org/jira/browse/DRILL-4287
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.4.0
>Reporter: Aman Sinha
>Assignee: Jinfeng Ni
>
> Currently, the parquet metadata cache file is read eagerly during creation of 
> the DrillTable (as part of ParquetFormatMatcher.isReadable()).  This is not 
> desirable from a performance standpoint, since there are scenarios where we want 
> to do some up-front optimizations first - e.g. directory-based partition pruning 
> (see DRILL-2517) or a potential limit 0 optimization - and in such 
> situations it is better to read the metadata cache file lazily.
> This is a placeholder to perform such delayed reading, since it is needed for 
> the aforementioned optimizations.
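The delayed reading described above can be sketched as a lazy holder: the cache file path is recorded cheaply when the table is created, and the expensive parse happens only on first use. The class and method names below are illustrative only, not Drill's actual ParquetGroupScan/metadata API.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of lazy metadata-cache loading. Names are
// hypothetical, not Drill's real API.
class LazyMetadataCache {
    private final String cachePath;
    private final AtomicReference<String> metadata = new AtomicReference<>();

    LazyMetadataCache(String cachePath) {
        this.cachePath = cachePath;  // recorded eagerly; cheap
    }

    // Parse the cache file only when planning actually needs it,
    // e.g. after directory-based partition pruning has already run.
    String getMetadata() {
        String m = metadata.get();
        if (m == null) {
            m = readCacheFile();              // expensive I/O, now deferred
            metadata.compareAndSet(null, m);  // first reader wins
            m = metadata.get();
        }
        return m;
    }

    private String readCacheFile() {
        return "parsed:" + cachePath;  // stand-in for real JSON parsing
    }
}
```

With this shape, a planning rule that prunes all files away never pays the cost of reading the cache file at all.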



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128818#comment-15128818
 ] 

Stefán Baxter commented on DRILL-4339:
--

Hi,

As part of answering this request, I cloned the project again and built it 
from scratch.

I did this because I had local, Lucene-related work that I thought could not 
possibly affect Avro (it is strictly contained to the Lucene reader).

Long story short: this ticket can be closed as invalid, and I apologize for the 
inconvenience.

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Reopened] (DRILL-4313) C++ client - Improve method of drillbit selection from cluster

2016-02-02 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra reopened DRILL-4313:
--
  Assignee: Parth Chandra

With Tableau, even with pooling enabled, this fix does not appear to improve 
the distribution of the queries to different drillbits. The fix does improve 
the distribution of queries in the test program, but not when used by Tableau.
The only solution might be for the client library to implement a pool of 
connections and manage the distribution of queries under the covers.


> C++ client - Improve method of drillbit selection from cluster
> --
>
> Key: DRILL-4313
> URL: https://issues.apache.org/jira/browse/DRILL-4313
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Parth Chandra
>Assignee: Parth Chandra
> Fix For: 1.5.0
>
>
> The current C++ client handles multiple parallel queries over the same 
> connection, but that creates a bottleneck as the queries get sent to the same 
> drillbit.
> The client can manage this more effectively by choosing from a configurable 
> pool of connections and round robin queries to them.
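The proposed fix can be sketched as a small connection pool with round-robin assignment (shown in Java for illustration; the actual change is in the C++ client, and these names are hypothetical):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of client-side round-robin drillbit selection: each query takes
// the next connection in turn, spreading load across drillbits instead of
// funneling every query to one. Names are illustrative, not the real API.
class DrillbitConnectionPool {
    private final List<String> endpoints;      // stand-ins for live connections
    private final AtomicInteger next = new AtomicInteger();

    DrillbitConnectionPool(List<String> endpoints) {
        this.endpoints = endpoints;
    }

    // Thread-safe rotation; floorMod keeps the index valid after
    // the counter wraps around Integer.MAX_VALUE.
    String acquire() {
        int i = Math.floorMod(next.getAndIncrement(), endpoints.size());
        return endpoints.get(i);
    }
}
```

Managing the rotation inside the client library is what lets tools like Tableau, which reuse a single handle, still fan out across the cluster.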





[jira] [Closed] (DRILL-4343) UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported See Apache Drill JIRA: DRILL-3188

2016-02-02 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-4343.
-
Resolution: Invalid

Captured wrong stack.

> UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
> supported  See Apache Drill JIRA: DRILL-3188
> 
>
> Key: DRILL-4343
> URL: https://issues.apache.org/jira/browse/DRILL-4343
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.5.0
>Reporter: Chun Chang
>
> 1.5.0-SNAPSHOT 1b96174b1e5bafb13a873dd79f03467802d7c929
> Running negative test cases automation:
> ./run.sh -s Functional -g smoke,regression,negative -n 10 -d 
> Got this error. It's a random failure, since it did not appear in repeated runs. 
> It's interesting that the error message refers to DRILL-3188, which is already 
> fixed.
> {noformat}
> [#184] Query failed: 
> oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
> supported 
> See Apache Drill JIRA: DRILL-3188
> [Error Id: 53ff1611-736f-4b85-9c11-421125b69711 on atsqa4-195.qa.lab:31010]
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
>   at 
> oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
>   at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   at 

[jira] [Commented] (DRILL-4343) UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported See Apache Drill JIRA: DRILL-3188

2016-02-02 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128885#comment-15128885
 ] 

Khurram Faraaz commented on DRILL-4343:
---

Since your run includes negative suites, you may want to check whether it was a 
negative test that failed and led to this exception.

> UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
> supported  See Apache Drill JIRA: DRILL-3188
> 
>
> Key: DRILL-4343
> URL: https://issues.apache.org/jira/browse/DRILL-4343
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.5.0
>Reporter: Chun Chang
>
> 1.5.0-SNAPSHOT 1b96174b1e5bafb13a873dd79f03467802d7c929
> Running negative test cases automation:
> ./run.sh -s Functional -g smoke,regression,negative -n 10 -d 
> Got this error. It's a random failure, since it did not appear in repeated runs. 
> It's interesting that the error message refers to DRILL-3188, which is already 
> fixed.

[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128941#comment-15128941
 ] 

Jinfeng Ni commented on DRILL-4339:
---

If the storage plugin has a subclass of AbstractRecordReader, I think the 
plugin has to be recompiled.

Are we supposed to keep this no-recompile-required guarantee for storage 
plugins with each new release? 

Actually, I just realized this method's signature does not have to change at 
all (maybe I had to change it because of other code that is not included in 
the patch, and I forgot to change it back). I can change it back to the 
original signature of setColumns().
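The NoSuchMethodError in this ticket is a link-time failure: the plugin's bytecode references the exact descriptor setColumns(Ljava/util/Collection;)V it was compiled against, so changing that method's signature in a new release breaks already-built plugins until they are recompiled. A minimal sketch of the compatibility rule, with hypothetical class names standing in for Drill's reader hierarchy:

```java
import java.util.Collection;

// Hypothetical sketch: as long as the externally visible descriptor of
// setColumns stays byte-for-byte identical, subclasses compiled against an
// older release still link. Changing the parameter or return type would
// make the JVM fail descriptor lookup with NoSuchMethodError at runtime.
abstract class AbstractReaderSketch {
    protected Collection<String> columns;

    // Descriptor: setColumns(Ljava/util/Collection;)V -- must not change
    // between releases if binary compatibility is to be preserved.
    protected void setColumns(Collection<String> projected) {
        this.columns = transformColumns(projected);
    }

    // New releases are free to change internals here; only the public
    // descriptor above is baked into dependent plugins' bytecode.
    protected Collection<String> transformColumns(Collection<String> projected) {
        return projected;
    }
}

class AvroLikeReader extends AbstractReaderSketch {
    Collection<String> current() { return columns; }
}
```

This is why the comments above distinguish between reverting the signature (restores linkage) and changing the implementation body (safe without a recompile, as long as the behavior contract holds).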

 

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128954#comment-15128954
 ] 

Jinfeng Ni commented on DRILL-4339:
---

On second thought, it seems easy to change the function setColumns' signature 
back. That will avoid the recompile. However, without a recompile, I'm not 
sure whether your custom storage plugin will work, since the implementation 
of this method actually changed in 1.5.0.



> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Created] (DRILL-4343) UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported See Apache Drill JIRA: DRILL-3188

2016-02-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4343:
-

 Summary: UNSUPPORTED_OPERATION ERROR: This type of window frame is 
currently not supported  See Apache Drill JIRA: DRILL-3188
 Key: DRILL-4343
 URL: https://issues.apache.org/jira/browse/DRILL-4343
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.5.0
Reporter: Chun Chang


1.5.0-SNAPSHOT  1b96174b1e5bafb13a873dd79f03467802d7c929

Running negative test cases automation:
./run.sh -s Functional -g smoke,regression,negative -n 10 -d 

Got this error. It's a random failure, since it did not appear in repeated runs. 
It's interesting that the error message refers to DRILL-3188, which is already 
fixed.

{noformat}
[#184] Query failed: 
oadd.org.apache.drill.common.exceptions.UserRemoteException: 
UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
supported 
See Apache Drill JIRA: DRILL-3188


[Error Id: 53ff1611-736f-4b85-9c11-421125b69711 on atsqa4-195.qa.lab:31010]
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
at 
oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
at 
oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
at 
oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:744)
{noformat}






[jira] [Updated] (DRILL-4342) Drill fails to read a date column from hive generated parquet

2016-02-02 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-4342:
-
Attachment: fewtypes_null.parquet

> Drill fails to read a date column from hive generated parquet
> -
>
> Key: DRILL-4342
> URL: https://issues.apache.org/jira/browse/DRILL-4342
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
> Attachments: fewtypes_null.parquet
>
>
> git.commit.id.abbrev=576271d
> Below is the Hive DDL (using Hive 1.2, which supports date in parquet):
> {code}
> create external table hive1dot2_fewtypes_null (
>   int_col int,
>   bigint_col bigint,
>   date_col date,
>   time_col string,
>   timestamp_col timestamp,
>   interval_col string,
>   varchar_col string,
>   float_col float,
>   double_col double,
>   bool_col boolean
> )
> stored as parquet
> location '/drill/testdata/hive_storage/hive1dot2_fewtypes_null';
> {code}
> Query using the hive storage plugin
> {code}
> select date_col from hive.hive1dot2_fewtypes_null;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 1996-01-29  |
> | 1996-03-01  |
> | 1996-03-02  |
> | 1997-02-28  |
> | null        |
> | 1997-03-01  |
> | 1997-03-02  |
> | 2000-04-01  |
> | 2000-04-03  |
> | 2038-04-08  |
> | 2039-04-09  |
> | 2040-04-10  |
> | null        |
> | 1999-02-08  |
> | 1999-03-08  |
> | 1999-01-18  |
> | 2003-01-02  |
> | null        |
> +-------------+
> {code}
> Below is the output when reading the same file through the dfs parquet reader.
> {code}
> 0: jdbc:drill:zk=10.10.10.41:5181> select date_col from 
> dfs.`/drill/testdata/hive_storage/hive1dot2_fewtypes_null`;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 369-02-09   |
> | 369-03-12   |
> | 369-03-13   |
> | 368-03-11   |
> | null        |
> | 368-03-12   |
> | 368-03-13   |
> | 365-04-12   |
> | 365-04-14   |
> | 327-04-19   |
> | 326-04-20   |
> | 325-04-21   |
> | null        |
> | 366-02-19   |
> | 366-03-19   |
> | 366-01-29   |
> | 362-01-13   |
> | null        |
> +-------------+
> {code}
> I attached the parquet file generated from Hive. Let me know if anything else 
> is needed to reproduce this issue.
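Parquet stores a DATE as an int32 count of days since the Unix epoch (1970-01-01), and every corrupted row above is shifted by roughly the same amount, which is consistent with one of the two readers applying a wrong constant offset to that day count. A small sketch of the plain decoding, using only the standard library (the class name is mine, not Drill's):

```java
import java.time.LocalDate;

// Parquet DATE logical type: an int32 holding days since 1970-01-01.
// Decoding is a direct epoch-day conversion; adding any extra constant
// offset shifts every date uniformly, matching the symptom in this ticket.
class ParquetDateCodec {
    static LocalDate decode(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch);
    }

    static int encode(LocalDate date) {
        return (int) date.toEpochDay();
    }
}
```

Comparing what each reader produces for the stored int32 of a known row (e.g. 1996-01-29) would pinpoint which side applies the spurious offset.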





[jira] [Created] (DRILL-4344) oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException

2016-02-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4344:
-

 Summary: 
oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
NullPointerException
 Key: DRILL-4344
 URL: https://issues.apache.org/jira/browse/DRILL-4344
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - RPC
Reporter: Chun Chang


1.5.0-SNAPSHOT  1b96174b1e5bafb13a873dd79f03467802d7c929
Running negative test cases automation:
./run.sh -s Functional -g smoke,regression,negative -n 10 -d
Got this error. It's a random failure, since it did not appear in repeated runs.

{noformat}
Execution Failures:
/root/drillAutomation/framework-master/framework/resources/Functional/tpcds/variants/json/q4_1.sql
Query: 
--/* q4 tpcds */

SELECT A.SS_CUSTOMER_SK,
   B.D_DATE_SK,
   B.D_YEAR,
   B.D_MOY,
   max(A.price) as price,
   max(A.cost) as cost
FROM
( SELECT
  S.SS_CUSTOMER_SK,
  S.SS_SOLD_DATE_SK,
  max(S.SS_LIST_PRICE) as price,
  max(S.SS_WHOLESALE_COST) as cost
FROM store_sales S
WHERE S.SS_QUANTITY  > 2
GROUP BY S.SS_CUSTOMER_SK,
 S.SS_SOLD_DATE_SK

) a
JOIN
  date_dim b
ON a.SS_SOLD_DATE_SK = b.D_DATE_SK
WHERE b.d_qoy = 2
  AND b.d_dow = 1
  and b.d_year IN (1990, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909,
                   1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 1919,
                   1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929,
                   1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1939,
                   1940, 1941, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1960,
                   1970, 1980, 2001, 2002, 2011, 2012, 2013, 2014)
GROUP BY A.SS_CUSTOMER_SK,
B.D_DATE_SK,
B.D_YEAR,
B.D_MOY
ORDER BY B.D_DATE_SK, A.SS_CUSTOMER_SK, B.D_YEAR, B.D_MOY
Failed with exception
java.sql.SQLException: SYSTEM ERROR: NullPointerException


[Error Id: b96a07a1-4307-41fd-be84-544b0c4176ae on atsqa4-193.qa.lab:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
at 
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1923)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:73)
at 
oadd.net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
at 
oadd.net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
at 
oadd.net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:78)
at 
org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112)
at 
org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:165)
at 
org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:93)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException


[Error Id: b96a07a1-4307-41fd-be84-544b0c4176ae on atsqa4-193.qa.lab:31010]
    at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
    at oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
    at oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
    at oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
    at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
    at oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
    at oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
    at oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
    at oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
    at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
    at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
    at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
    at 

[jira] [Updated] (DRILL-4313) C++ client - Improve method of drillbit selection from cluster

2016-02-02 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-4313:
-
Fix Version/s: (was: 1.5.0)
   1.6.0

> C++ client - Improve method of drillbit selection from cluster
> --
>
> Key: DRILL-4313
> URL: https://issues.apache.org/jira/browse/DRILL-4313
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Parth Chandra
>Assignee: Parth Chandra
> Fix For: 1.6.0
>
>
> The current C++ client handles multiple parallel queries over the same 
> connection, but that creates a bottleneck as the queries get sent to the same 
> drillbit.
> The client can manage this more effectively by choosing from a configurable 
> pool of connections and round robin queries to them.
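The proposed round-robin selection can be sketched as follows; this is a hypothetical illustration in Java (the actual client is C++), and DrillbitPool is an invented name, not part of the client API:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of round-robin selection over a pool of drillbit
 *  endpoints. DrillbitPool is an illustrative name, not the client's API. */
class DrillbitPool {
    private final List<String> endpoints;            // e.g. "host:31010" strings
    private final AtomicInteger cursor = new AtomicInteger();

    DrillbitPool(List<String> endpoints) {
        this.endpoints = endpoints;
    }

    /** Pick the next endpoint; parallel queries spread evenly over the pool. */
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows
        int i = Math.floorMod(cursor.getAndIncrement(), endpoints.size());
        return endpoints.get(i);
    }
}
```

Each query submission would ask the pool for an endpoint instead of reusing the single connection, removing the single-drillbit bottleneck described above.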



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4343) UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported See Apache Drill JIRA: DRILL-3188

2016-02-02 Thread Chun Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128898#comment-15128898
 ] 

Chun Chang commented on DRILL-4343:
---

That's right. The real error is actually a NPE. Will close this one and file 
another bug.

> UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
> supported  See Apache Drill JIRA: DRILL-3188
> 
>
> Key: DRILL-4343
> URL: https://issues.apache.org/jira/browse/DRILL-4343
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.5.0
>Reporter: Chun Chang
>
> 1.5.0-SNAPSHOT  | 1b96174b1e5bafb13a873dd79f03467802d7c929
> Running negative test cases automation:
> ./run.sh -s Functional -g smoke,regression,negative -n 10 -d 
> Got this error. It's a random failure, since it did not appear in repeated runs. 
> It's interesting that the error message refers to DRILL-3188, which is already 
> fixed.
> {noformat}
> [#184] Query failed: 
> oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
> supported 
> See Apache Drill JIRA: DRILL-3188
> [Error Id: 53ff1611-736f-4b85-9c11-421125b69711 on atsqa4-195.qa.lab:31010]
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
>   at 
> oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
>   at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
>   at 
> oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
>   at 
> oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> 

[jira] [Commented] (DRILL-4287) Do lazy reading of parquet metadata cache file

2016-02-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128750#comment-15128750
 ] 

ASF GitHub Bot commented on DRILL-4287:
---

Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/345#discussion_r51612792
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSelection.java 
---
@@ -118,13 +133,34 @@ public boolean apply(@Nullable FileStatus status) {
   }
 }));
 
-return create(nonDirectories, null, selectionRoot);
+final FileSelection fileSel = create(nonDirectories, null, selectionRoot);
--- End diff --

Agreed that it's the existing create() call. It seems fine to keep it as is.  

This new FileSelection is special, since it has already excluded directories by 
going through the list of file statuses recursively. In that sense, its 
hasDirectories = false, checkedForDirectories = true, and the list of files is 
already available from walking the file status list. If a FileSelection always 
has to call containsDirectory() or getFile() eventually, then it might be 
better to store these values when they are available. But I guess it's a minor 
thing to consider. 
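The bookkeeping described here — remembering checkedForDirectories and hasDirectories once the recursive scan has run — can be sketched as a simple memoized check. This is illustrative only; the trailing-slash test stands in for a real FileStatus.isDirectory() call, and the class is not Drill's actual FileSelection:

```java
import java.util.List;

/** Illustrative sketch: memoize the result of the directory scan so that
 *  containsDirectory() pays the scan cost at most once. */
class LazySelection {
    private final List<String> paths;
    private boolean checkedForDirectories;  // has the scan run yet?
    private boolean hasDirectories;         // cached answer once it has

    LazySelection(List<String> paths) { this.paths = paths; }

    boolean containsDirectory() {
        if (!checkedForDirectories) {       // first call performs the scan
            hasDirectories = paths.stream().anyMatch(p -> p.endsWith("/"));
            checkedForDirectories = true;   // later calls reuse the flag
        }
        return hasDirectories;
    }
}
```

A selection built from an already-expanded file list could simply be constructed with both flags pre-set, which is the optimization suggested in the comment.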



> Do lazy reading of parquet metadata cache file
> --
>
> Key: DRILL-4287
> URL: https://issues.apache.org/jira/browse/DRILL-4287
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.4.0
>Reporter: Aman Sinha
>Assignee: Jinfeng Ni
>
> Currently, the parquet metadata cache file is read eagerly during creation of 
> the DrillTable (as part of ParquetFormatMatcher.isReadable()).  This is not 
> desirable from performance standpoint since there are scenarios where we want 
> to do some up-front optimizations - e.g. directory-based partition pruning 
> (see DRILL-2517) or potential limit 0 optimization etc. - and in such 
> situations it is better to do lazy reading of the metadata cache file.   
> This is a placeholder to perform such delayed reading since it is needed for 
> the aforementioned optimizations.  
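The lazy reading described above amounts to memoizing the cache-file read: defer the I/O until a consumer first asks for the metadata, rather than paying it in ParquetFormatMatcher.isReadable(). A minimal sketch of the pattern — class names and the Supplier-based wiring are illustrative, not Drill's actual implementation:

```java
import java.util.function.Supplier;

/** Sketch of lazy metadata loading: the (expensive) cache-file read runs
 *  only when the value is first requested, not at DrillTable creation. */
class LazyMetadata<T> {
    private final Supplier<T> reader;   // e.g. () -> parseCacheFile(path)
    private T value;
    private boolean loaded;

    LazyMetadata(Supplier<T> reader) { this.reader = reader; }

    synchronized T get() {
        if (!loaded) {          // first caller pays the I/O cost
            value = reader.get();
            loaded = true;
        }
        return value;           // later callers get the cached result
    }
}
```

Planner phases that never touch the metadata (e.g. directory-based partition pruning or a limit-0 shortcut) then avoid the read entirely.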



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4287) Do lazy reading of parquet metadata cache file

2016-02-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128751#comment-15128751
 ] 

ASF GitHub Bot commented on DRILL-4287:
---

Github user jinfengni commented on the pull request:

https://github.com/apache/drill/pull/345#issuecomment-178746517
  
Overall, looks good to me. 

+1



> Do lazy reading of parquet metadata cache file
> --
>
> Key: DRILL-4287
> URL: https://issues.apache.org/jira/browse/DRILL-4287
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.4.0
>Reporter: Aman Sinha
>Assignee: Jinfeng Ni
>
> Currently, the parquet metadata cache file is read eagerly during creation of 
> the DrillTable (as part of ParquetFormatMatcher.isReadable()).  This is not 
> desirable from performance standpoint since there are scenarios where we want 
> to do some up-front optimizations - e.g. directory-based partition pruning 
> (see DRILL-2517) or potential limit 0 optimization etc. - and in such 
> situations it is better to do lazy reading of the metadata cache file.   
> This is a placeholder to perform such delayed reading since it is needed for 
> the aforementioned optimizations.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-4128) null pointer at org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4128.

Resolution: Fixed
  Assignee: Jason Altekruse

Fixed in 1b96174b1e5bafb13a873dd79f03467802d7c929

> null pointer at 
> org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)
> -
>
> Key: DRILL-4128
> URL: https://issues.apache.org/jira/browse/DRILL-4128
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Devender Yadav 
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.5.0
>
>
> While fetching data from ResultSet in JDBC. I got the null pointer. Details - 
> java.lang.NullPointerException
> at 
> org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)
> at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getString(BoundCheckingAccessor.java:124)
> at 
> org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getString(TypeConvertingSqlAccessor.java:649)
> at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getString(AvaticaDrillSqlAccessor.java:95)
> at 
> net.hydromatic.avatica.AvaticaResultSet.getString(AvaticaResultSet.java:205)
> at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getString(DrillResultSetImpl.java:182)
> The method below throws a null pointer because getObject(rowOffset) 
> returns null for null values, and null.toString() throws a NullPointerException.
>  @Override
>   public String getString(int rowOffset) throws InvalidAccessException{
> return getObject(rowOffset).toString();
>   }
> It should be like:
>  @Override
>   public String getString(int rowOffset) throws InvalidAccessException{
> return getObject(rowOffset)==null? null:getObject(rowOffset).toString();
>   }
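One detail of the proposed fix: it calls getObject(rowOffset) twice per invocation. A variant that fetches the value once behaves identically for this bug while avoiding the repeated call. A minimal self-contained sketch — SafeAccessor and its backing array are stand-ins, not Drill's actual accessor classes:

```java
/** Minimal stand-in for the accessor in the report; the Object[] row is a
 *  placeholder for Drill's real vector-backed storage. */
class SafeAccessor {
    private final Object[] row;

    SafeAccessor(Object[] row) { this.row = row; }

    Object getObject(int rowOffset) { return row[rowOffset]; }

    /** Null-safe variant of the proposed fix: fetch the value once, then
     *  return null for SQL NULL instead of dereferencing it. */
    String getString(int rowOffset) {
        Object v = getObject(rowOffset);
        return v == null ? null : v.toString();
    }
}
```

Returning null for a SQL NULL also matches what JDBC's ResultSet.getString contract expects of the driver.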



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128993#comment-15128993
 ] 

Stefán Baxter commented on DRILL-4339:
--

Thank you, I will look for the mismatch in my code base.




> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4340) Tableau complains about the ODBC driver capabilities

2016-02-02 Thread Tomer Shiran (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129106#comment-15129106
 ] 

Tomer Shiran commented on DRILL-4340:
-

Where are you seeing these warnings? I've used Tableau with Drill and didn't 
see these.

> Tableau complains about the ODBC driver capabilities
> 
>
> Key: DRILL-4340
> URL: https://issues.apache.org/jira/browse/DRILL-4340
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Oscar Morante
>
> I'm testing Drill with Tableau via ODBC and when it connects it complains 
> about some missing features:
> {code}
> This ODBC driver does not support important capabilities used by Tableau.
> This unsupported function is required for relative date filters: The date 
> part named 'week' for the date function: DATETRUNC(date_part, date, 
> [start_of_week])
> -
> Tableau identified the following warnings for the ODBC data source named 
> 'test (s3.views.test)':
> This aggregation is unsupported: Attribute
> This aggregation is unsupported: Std. Dev
> This aggregation is unsupported: Std. Dev (Pop.)
> This aggregation is unsupported: Trunc Week Number
> This aggregation is unsupported: Variance
> This aggregation is unsupported: Variance (Pop.)
> This function is unsupported: % with parameter types 'integer, integer'
> This function is unsupported: ABS(number) with parameter types 'float'
> This function is unsupported: ABS(number) with parameter types 'integer'
> This function is unsupported: ACOS(number) with parameter types 'float'
> This function is unsupported: ASIN(number) with parameter types 'float'
> This function is unsupported: ATAN(number) with parameter types 'float'
> This function is unsupported: ATAN2(y number, x number) with parameter types 
> 'float, float'
> This function is unsupported: COS(angle) with parameter types 'float'
> This function is unsupported: COT(angle) with parameter types 'float'
> This function is unsupported: DATEPART_DAYOFWEEK_INTERNAL with parameter 
> types 'date'
> This function is unsupported: DATEPART_WEEK_INTERNAL with parameter types 
> 'date'
> This function is unsupported: DATETIME with parameter types 'integer'
> This function is unsupported: DEGREES(number) with parameter types 'float'
> This function is unsupported: EXP(number) with parameter types 'float'
> This function is unsupported: LN with parameter types 'float'
> This function is unsupported: LN(number) with parameter types 'float'
> This function is unsupported: LOG with parameter types 'float'
> This function is unsupported: LOG(number, [base]) with parameter types 'float'
> This function is unsupported: PI()
> This function is unsupported: POWER with parameter types 'float, integer'
> This function is unsupported: POWER with parameter types 'integer, integer'
> This function is unsupported: POWER(number,power) with parameter types 
> 'float, integer'
> This function is unsupported: POWER(number,power) with parameter types 
> 'integer, integer'
> This function is unsupported: RADIANS(number) with parameter types 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 
> 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 
> 'float, integer'
> This function is unsupported: SIGN(number) with parameter types 'float'
> This function is unsupported: SIN(angle) with parameter types 'float'
> This function is unsupported: SQRT(number) with parameter types 'float'
> This function is unsupported: SQUARE with parameter types 'float'
> This function is unsupported: TAN(angle) with parameter types 'float'
> This function is unsupported: The date part named 'week' for the date 
> function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: DATEPART(date_part, date, 

[jira] [Commented] (DRILL-4161) Make Hive Metastore client caching user configurable.

2016-02-02 Thread Bridget Bevens (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129181#comment-15129181
 ] 

Bridget Bevens commented on DRILL-4161:
---

Added the following doc that describes Drill-Hive metadata caching and 
configuration steps:
https://drill.apache.org/docs/hive-metadata-caching/ 

> Make Hive Metastore client caching user configurable.
> -
>
> Key: DRILL-4161
> URL: https://issues.apache.org/jira/browse/DRILL-4161
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jinfeng Ni
>  Labels: documentation
> Fix For: 1.5.0
>
>
> Drill uses a LoadingCache in its Hive metastore client in order to avoid the 
> long access time to the Hive metastore server. However, there is a tradeoff 
> between the risk of serving stale data and the cache hit rate. 
> For instance, DRILL-3893 changed the cache invalidation policy to "1 minute 
> after last write" to reduce the chance of hitting stale data. However, it also 
> means that cached data is only valid for 1 minute after loading/writing.
> It's desirable to let users configure the caching policy to match their 
> individual use cases. In particular, we should probably allow users to 
> specify:
> 1) the cache invalidation policy: expire after last access, or expire after 
> last write.
> 2) the cache TTL.
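Guava's LoadingCache builder exposes exactly these two knobs (expireAfterWrite, expireAfterAccess, plus a TTL duration). To illustrate the behavioral difference between the two invalidation policies without the Guava dependency, here is a toy sketch — illustrative only, not Drill's code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Toy TTL cache showing the two policies from the issue. */
class TtlCache<K, V> {
    enum Policy { AFTER_WRITE, AFTER_ACCESS }

    private static final class Entry<V> { V value; long deadline; }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final Function<K, V> loader;
    private final long ttlMillis;
    private final Policy policy;
    private final LongSupplier clock;   // injected so callers control time

    TtlCache(Function<K, V> loader, long ttlMillis, Policy policy, LongSupplier clock) {
        this.loader = loader; this.ttlMillis = ttlMillis;
        this.policy = policy; this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> e = map.get(key);
        if (e == null || now >= e.deadline) {        // miss or expired: reload
            e = new Entry<>();
            e.value = loader.apply(key);
            e.deadline = now + ttlMillis;
            map.put(key, e);
        } else if (policy == Policy.AFTER_ACCESS) {  // reads push the deadline out
            e.deadline = now + ttlMillis;
        }
        return e.value;
    }
}
```

Under AFTER_ACCESS, frequently read metadata stays cached indefinitely (higher hit rate, more staleness risk); under AFTER_WRITE, every entry is reloaded at most TTL after it was cached — the tradeoff the issue asks to make user-configurable.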



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4274) ExternalSort doesn't always handle low memory condition well, failing execution instead of spilling in some cases

2016-02-02 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129066#comment-15129066
 ] 

Victoria Markman commented on DRILL-4274:
-

These 4 tests have been temporarily disabled (commit id: 
01a4a303770d7ffa564e0c1f0497049a4721bd9c); they need to be re-enabled when the 
bug is fixed.
{code}
/root/drillAutomation/framework/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q174.q
/root/drillAutomation/framework/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q209.q
/root/drillAutomation/framework/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q207.q
/root/drillAutomation/framework/framework/resources/Advanced/data-shapes/wide-columns/5000/10rows/parquet/q213.q
{code}

> ExternalSort doesn't always handle low memory condition well, failing 
> execution instead of spilling in some cases
> -
>
> Key: DRILL-4274
> URL: https://issues.apache.org/jira/browse/DRILL-4274
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Jacques Nadeau
>Assignee: Jacques Nadeau
>Priority: Blocker
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4128) null pointer at org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-4128:
---
Fix Version/s: (was: Future)
   1.5.0

> null pointer at 
> org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)
> -
>
> Key: DRILL-4128
> URL: https://issues.apache.org/jira/browse/DRILL-4128
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0
>Reporter: Devender Yadav 
>Priority: Blocker
> Fix For: 1.5.0
>
>
> While fetching data from ResultSet in JDBC. I got the null pointer. Details - 
> java.lang.NullPointerException
> at 
> org.apache.drill.exec.vector.accessor.AbstractSqlAccessor.getString(AbstractSqlAccessor.java:101)
> at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getString(BoundCheckingAccessor.java:124)
> at 
> org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getString(TypeConvertingSqlAccessor.java:649)
> at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getString(AvaticaDrillSqlAccessor.java:95)
> at 
> net.hydromatic.avatica.AvaticaResultSet.getString(AvaticaResultSet.java:205)
> at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getString(DrillResultSetImpl.java:182)
> The method below throws a null pointer because getObject(rowOffset) 
> returns null for null values, and null.toString() throws a NullPointerException.
>  @Override
>   public String getString(int rowOffset) throws InvalidAccessException{
> return getObject(rowOffset).toString();
>   }
> It should be like:
>  @Override
>   public String getString(int rowOffset) throws InvalidAccessException{
> return getObject(rowOffset)==null? null:getObject(rowOffset).toString();
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4196) some TPCDS queries return wrong result when hash join is disabled

2016-02-02 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-4196:

Fix Version/s: 1.5.0

> some TPCDS queries return wrong result when hash join is disabled
> -
>
> Key: DRILL-4196
> URL: https://issues.apache.org/jira/browse/DRILL-4196
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Victoria Markman
>Assignee: amit hadke
> Fix For: 1.5.0
>
> Attachments: 1.5.0-amit-branch_tpcds_sf1.txt, query40.tar, query52.tar
>
>
> With hash join disabled, query52.sql and query40.sql returned incorrect 
> results with 1.4.0:
> {noformat}
> | version         | commit_id                                 | commit_message                                                      | commit_time                | build_email  | build_time                 |
> | 1.4.0-SNAPSHOT  | b9068117177c3b47025f52c00f67938e0c3e4732  | DRILL-4165 Add a precondition for size of merge join record batch.  | 08.12.2015 @ 01:25:34 UTC  | Unknown      | 08.12.2015 @ 03:36:25 UTC  |
> 1 row selected (2.13 seconds)
> {noformat}
> Setup and options are the same as in DRILL-4190
> See attached queries (.sql), expected result (.e_tsv) and actual output (.out)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4340) Tableau complains about the ODBC driver capabilities

2016-02-02 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129330#comment-15129330
 ] 

Jacques Nadeau commented on DRILL-4340:
---

Also, did you install the TDC file that comes with the Drill ODBC driver? 
Sometimes you need to manually install it.

> Tableau complains about the ODBC driver capabilities
> 
>
> Key: DRILL-4340
> URL: https://issues.apache.org/jira/browse/DRILL-4340
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Oscar Morante
>
> I'm testing Drill with Tableau via ODBC and when it connects it complains 
> about some missing features:
> {code}
> This ODBC driver does not support important capabilities used by Tableau.
> This unsupported function is required for relative date filters: The date 
> part named 'week' for the date function: DATETRUNC(date_part, date, 
> [start_of_week])
> -
> Tableau identified the following warnings for the ODBC data source named 
> 'test (s3.views.test)':
> This aggregation is unsupported: Attribute
> This aggregation is unsupported: Std. Dev
> This aggregation is unsupported: Std. Dev (Pop.)
> This aggregation is unsupported: Trunc Week Number
> This aggregation is unsupported: Variance
> This aggregation is unsupported: Variance (Pop.)
> This function is unsupported: % with parameter types 'integer, integer'
> This function is unsupported: ABS(number) with parameter types 'float'
> This function is unsupported: ABS(number) with parameter types 'integer'
> This function is unsupported: ACOS(number) with parameter types 'float'
> This function is unsupported: ASIN(number) with parameter types 'float'
> This function is unsupported: ATAN(number) with parameter types 'float'
> This function is unsupported: ATAN2(y number, x number) with parameter types 
> 'float, float'
> This function is unsupported: COS(angle) with parameter types 'float'
> This function is unsupported: COT(angle) with parameter types 'float'
> This function is unsupported: DATEPART_DAYOFWEEK_INTERNAL with parameter 
> types 'date'
> This function is unsupported: DATEPART_WEEK_INTERNAL with parameter types 
> 'date'
> This function is unsupported: DATETIME with parameter types 'integer'
> This function is unsupported: DEGREES(number) with parameter types 'float'
> This function is unsupported: EXP(number) with parameter types 'float'
> This function is unsupported: LN with parameter types 'float'
> This function is unsupported: LN(number) with parameter types 'float'
> This function is unsupported: LOG with parameter types 'float'
> This function is unsupported: LOG(number, [base]) with parameter types 'float'
> This function is unsupported: PI()
> This function is unsupported: POWER with parameter types 'float, integer'
> This function is unsupported: POWER with parameter types 'integer, integer'
> This function is unsupported: POWER(number,power) with parameter types 
> 'float, integer'
> This function is unsupported: POWER(number,power) with parameter types 
> 'integer, integer'
> This function is unsupported: RADIANS(number) with parameter types 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 
> 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 
> 'float, integer'
> This function is unsupported: SIGN(number) with parameter types 'float'
> This function is unsupported: SIN(angle) with parameter types 'float'
> This function is unsupported: SQRT(number) with parameter types 'float'
> This function is unsupported: SQUARE with parameter types 'float'
> This function is unsupported: TAN(angle) with parameter types 'float'
> This function is unsupported: The date part named 'week' for the date 
> function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date 
> function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date 
> function: 

[jira] [Closed] (DRILL-4180) IllegalArgumentException while reading JSON files

2016-02-02 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-4180.
-

1.5.0-SNAPSHOT  | 9ff947288f3214fe8e525e001d89a4f91b8b0728

{noformat}
0: jdbc:drill:schema=dfs.drillTestDir> select jsonFieldMapLevel1_aaa from 
`drill-4180`;
++
| jsonFieldMapLevel1_aaa |
++
| 

[jira] [Commented] (DRILL-4132) Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution

2016-02-02 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129403#comment-15129403
 ] 

Yuliya Feldman commented on DRILL-4132:
---

I have written a small design doc based on the discussions we had on this JIRA: 
[Alternative way to handle Drill Simple Queries | 
https://docs.google.com/document/d/1IUYLvxB1DCuT6595cUh_jySzMsLpsfyU25jETMg3tGE/edit#
 ]

> Ability to submit simple type of physical plan directly to EndPoint DrillBit 
> for execution
> --
>
> Key: DRILL-4132
> URL: https://issues.apache.org/jira/browse/DRILL-4132
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Flow, Execution - RPC, Query Planning & 
> Optimization
>Reporter: Yuliya Feldman
>Assignee: Yuliya Feldman
>
> Today Drill query execution is optimistic and stateful (at least due to data 
> exchanges): if any stage of query execution fails, the whole query fails. 
> If the query is just a simple scan, filter pushdown, and project, with no data 
> exchange between DrillBits, there is no need to fail the whole query when one 
> DrillBit fails, as the minor fragments running on that DrillBit can be rerun 
> on another DrillBit. There are probably multiple ways to achieve this. This 
> JIRA is to open discussion on: 
> 1. agreement that we need to support the above use case 
> 2. the means of achieving it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
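The failover idea in the DRILL-4132 description — rerun a minor fragment on another DrillBit when the plan has no inter-node exchanges — can be sketched generically. This is a minimal sketch with hypothetical names (`run_with_failover`, `execute`), not Drill's actual RPC or fragment API:

```python
# Hypothetical sketch (not Drill's API): rerun a failed minor fragment on
# another endpoint when the plan has no data exchanges between DrillBits.
def run_with_failover(fragment, endpoints, execute):
    """Try each endpoint in turn; return the first successful result.

    `execute(fragment, endpoint)` is assumed to raise on node failure.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return execute(fragment, endpoint)
        except RuntimeError as e:   # stand-in for a node-failure error
            last_error = e          # remember the failure, try the next DrillBit
    raise RuntimeError(f"all endpoints failed: {last_error}")
```

This only works for exchange-free plans, which is exactly the restriction the JIRA proposes: with exchanges, other fragments hold state tied to the failed node, so a local retry is not sufficient.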


[jira] [Reopened] (DRILL-4032) Drill unable to parse json files with schema changes

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4032:


> Drill unable to parse json files with schema changes
> 
>
> Key: DRILL-4032
> URL: https://issues.apache.org/jira/browse/DRILL-4032
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - JSON
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.4.0
>
>
> git.commit.id.abbrev=bb69f22
> {code}
> select d.col2.col3  from reg1 d;
> Error: DATA_READ ERROR: Error parsing JSON - index: 0, length: 4 (expected: 
> range(0, 0))
> File  /drill/testdata/reg1/a.json
> Record  2
> Fragment 0:0
> {code}
> The folder reg1 contains 2 files
> File 1 : a.json
> {code}
> {"col1": "val1","col2": null}
> {"col1": "val1","col2": {"col3":"abc", "col4":"xyz"}}
> {code}
> File 2 : b.json
> {code}
> {"col1": "val1","col2": null}
> {"col1": "val1","col2": null}
> {code}
> Exception from the log file :
> {code}
> [Error Id: a7e3c716-838d-4f8f-9361-3727b98f04cd ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:165)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:205)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:183) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:119)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:113)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:103)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:130)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:156)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:119)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> [na:1.7.0_71]
> at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_71]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
>  [hadoop-common-2.7.0-mapr-1506.jar:na]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.IndexOutOfBoundsException: index: 0, length: 4 
> (expected: range(0, 0))
> at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189) 
> 
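Outside Drill, the schema conflict in the files above can be illustrated with a small sketch: `col2` is first seen as null and later as a map, so a reader that fixes the column's type from the first record (or first file) mis-reads later maps. The helper names below are illustrative only, not Drill's schema-inference code:

```python
import json

def infer_type(value):
    """Crude per-value type tag, analogous to a reader's schema inference."""
    if value is None:
        return "null"
    if isinstance(value, dict):
        return "map"
    return "scalar"

def merge_types(a, b):
    """'null' is compatible with anything; otherwise types must agree."""
    if a == "null":
        return b
    if b == "null" or a == b:
        return a
    raise ValueError(f"schema change: {a} vs {b}")

# The two files from the report: a.json has null then a map, b.json all nulls.
a_json = ['{"col1": "val1","col2": null}',
          '{"col1": "val1","col2": {"col3":"abc", "col4":"xyz"}}']
b_json = ['{"col1": "val1","col2": null}',
          '{"col1": "val1","col2": null}']

merged = "null"
for line in a_json + b_json:
    merged = merge_types(merged, infer_type(json.loads(line)["col2"]))
# merged ends up as "map": nulls must be promoted, not locked in as the type.
```

A reader that instead commits to the first observed type would attempt map-style access on a column materialized with zero-length buffers, consistent with the `range(0, 0)` bounds error in the stack trace.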

[jira] [Reopened] (DRILL-4048) Parquet reader corrupts dictionary encoded binary columns

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4048:


> Parquet reader corrupts dictionary encoded binary columns
> -
>
> Key: DRILL-4048
> URL: https://issues.apache.org/jira/browse/DRILL-4048
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.4.0
>
> Attachments: lineitem_dic_enc.parquet
>
>
> git.commit.id.abbrev=04c01bd
> The below query returns corrupted data (not even showing up here) for binary 
> columns
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   |  |  | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PE  | T   | 
> egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> The same query from an older build (git.commit.id.abbrev=839f8da)
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   | N | O | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PERSON  | TRUCK 
>   | egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> Below is the output of the parquet-meta command for this dataset
> {code}
> creator: parquet-mr 
> file schema: root 
> ---
> l_orderkey:  REQUIRED INT32 R:0 D:0
> l_partkey:   REQUIRED INT32 R:0 D:0
> l_suppkey:   REQUIRED INT32 R:0 D:0
> l_linenumber:REQUIRED INT32 R:0 D:0
> l_quantity:  REQUIRED DOUBLE R:0 D:0
> l_extendedprice: REQUIRED DOUBLE R:0 D:0
> l_discount:  REQUIRED DOUBLE R:0 D:0
> l_tax:   REQUIRED DOUBLE R:0 D:0
> l_returnflag:REQUIRED BINARY O:UTF8 R:0 D:0
> l_linestatus:REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipdate:  REQUIRED INT32 O:DATE R:0 D:0
> l_commitdate:REQUIRED INT32 O:DATE R:0 D:0
> l_receiptdate:   REQUIRED INT32 O:DATE R:0 D:0
> l_shipinstruct:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipmode:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_comment:   REQUIRED BINARY O:UTF8 R:0 D:0
> row group 1: RC:60175 TS:3049610 
> 
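The symptom in DRILL-4048 — `DELIVER IN PERSON` becoming `DELIVER IN PE`, `TRUCK` becoming `T` — is what truncated dictionary materialization looks like. The sketch below is illustrative only (not Drill's Parquet reader): dictionary encoding stores each distinct value once plus per-row indices, and a reader that applies a wrong value length while materializing silently truncates strings:

```python
# Illustrative only (not Drill's reader). Dictionary-encoded binary columns
# store distinct values once; rows reference them by index. If the reader
# uses a stale/wrong length when copying an entry out, the value is silently
# truncated rather than failing loudly.
def decode(dictionary, indices, length_of=None):
    out = []
    for i in indices:
        value = dictionary[i]
        # length_of is a hypothetical hook standing in for the buggy path
        n = length_of(value) if length_of else len(value)
        out.append(value[:n])
    return out

dictionary = ["DELIVER IN PERSON", "TRUCK", "N", "O"]
correct = decode(dictionary, [0, 1])
corrupt = decode(dictionary, [0, 1], length_of=lambda v: max(len(v) - 4, 1))
```

The corrupted output reproduces the report's shape: long values lose their tails, and short values collapse to a single character.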

[jira] [Resolved] (DRILL-4032) Drill unable to parse json files with schema changes

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4032.

   Resolution: Fixed
Fix Version/s: 1.4.0

> Drill unable to parse json files with schema changes
> 
>
> Key: DRILL-4032
> URL: https://issues.apache.org/jira/browse/DRILL-4032
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - JSON
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.4.0
>
>

[jira] [Closed] (DRILL-4032) Drill unable to parse json files with schema changes

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse closed DRILL-4032.
--

> Drill unable to parse json files with schema changes
> 
>
> Key: DRILL-4032
> URL: https://issues.apache.org/jira/browse/DRILL-4032
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - JSON
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.4.0
>
>

[jira] [Resolved] (DRILL-4243) CTAS with partition by, results in Out Of Memory

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4243.

   Resolution: Fixed
Fix Version/s: 1.5.0

> CTAS with partition by, results in Out Of Memory
> 
>
> Key: DRILL-4243
> URL: https://issues.apache.org/jira/browse/DRILL-4243
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.5.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
> Fix For: 1.5.0
>
>
> CTAS with PARTITION BY results in Out Of Memory. The error appears to come 
> from ExternalSortBatch.
> Details of the Drill build are:
> {noformat}
> version  commit_id  commit_message  commit_time  build_email  build_time
> 1.5.0-SNAPSHOT  e4372f224a4b474494388356355a53808092a67a  DRILL-4242: Updates 
> to storage-mongo  03.01.2016 @ 15:31:13 PST  Unknown  04.01.2016 @ 01:02:29 PST
>  create table `tpch_single_partition/lineitem` partition by (l_moddate) as 
> select l.*, l_shipdate - extract(day from l_shipdate) + 1 l_moddate from 
> cp.`tpch/lineitem.parquet` l;
> Error: RESOURCE ERROR: One or more nodes ran out of memory while 
> executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010] 
> (state=,code=0)
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1923)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:73)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:746)
>   at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>   at sqlline.Commands.run(Commands.java:1304)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>   at sqlline.SqlLine.dispatch(SqlLine.java:742)
>   at sqlline.SqlLine.initArgs(SqlLine.java:553)
>   at sqlline.SqlLine.begin(SqlLine.java:596)
>   at sqlline.SqlLine.start(SqlLine.java:375)
>   at sqlline.SqlLine.main(SqlLine.java:268)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE 
> ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
>   at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
>   at 
> org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
>   at 
> org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
>   at 
> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> 
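The PARTITION BY clause in the failing CTAS forces rows to be grouped by partition key before writing, so that each output file can be written and closed in one pass; the report suggests the sort doing that grouping (ExternalSortBatch) is where memory ran out. A minimal sketch of the write pattern, with hypothetical names and an in-memory sort standing in for the real operator:

```python
from itertools import groupby

# Illustrative sketch of a PARTITION BY writer: sort rows by the partition
# key, then emit one "file" per key. The sort buffers the input, so memory
# use grows with data size unless the operator can spill to disk.
def write_partitioned(rows, key):
    rows = sorted(rows, key=key)          # the memory-hungry step
    files = {}
    for k, group in groupby(rows, key=key):
        files[k] = list(group)            # stand-in for one parquet file
    return files

rows = [{"l_moddate": "1996-03-01", "qty": 17},
        {"l_moddate": "1996-02-01", "qty": 3},
        {"l_moddate": "1996-03-01", "qty": 8}]
files = write_partitioned(rows, key=lambda r: r["l_moddate"])
```

Note `groupby` only groups adjacent equal keys, which is precisely why the sort must come first; skipping it would scatter one partition across many files.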

[jira] [Closed] (DRILL-4243) CTAS with partition by, results in Out Of Memory

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse closed DRILL-4243.
--

> CTAS with partition by, results in Out Of Memory
> 
>
> Key: DRILL-4243
> URL: https://issues.apache.org/jira/browse/DRILL-4243
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.5.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
> Fix For: 1.5.0
>
>

[jira] [Reopened] (DRILL-4243) CTAS with partition by, results in Out Of Memory

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4243:


> CTAS with partition by, results in Out Of Memory
> 
>
> Key: DRILL-4243
> URL: https://issues.apache.org/jira/browse/DRILL-4243
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.5.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
> Fix For: 1.5.0
>
>

[jira] [Reopened] (DRILL-4205) Simple query hit IndexOutOfBoundException

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4205:


>  Simple query hit IndexOutOfBoundException
> --
>
> Key: DRILL-4205
> URL: https://issues.apache.org/jira/browse/DRILL-4205
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.4.0
>Reporter: Dechang Gu
>Assignee: Dechang Gu
> Fix For: 1.5.0
>
>
> The following query failed due to IOB:
> 0: jdbc:drill:schema=wf_pigprq100> select * from 
> `store_sales/part-m-00073.parquet`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: srcIndex: 1048587
> Fragment 0:0
> [Error Id: ad8d2bc0-259f-483c-9024-93865963541e on ucs-node4.perf.lab:31010]
>   (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet 
> record reader.
> Message: 
> Hadoop path: /tpcdsPigParq/SF100/store_sales/part-m-00073.parquet
> Total records read: 135280
> Mock records read: 0
> Records to read: 1424
> Row group index: 0
> Records in row group: 3775712
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message pig_schema {
>   optional int64 ss_sold_date_sk;
>   optional int64 ss_sold_time_sk;
>   optional int64 ss_item_sk;
>   optional int64 ss_customer_sk;
>   optional int64 ss_cdemo_sk;
>   optional int64 ss_hdemo_sk;
>   optional int64 ss_addr_sk;
>   optional int64 ss_store_sk;
>   optional int64 ss_promo_sk;
>   optional int64 ss_ticket_number;
>   optional int64 ss_quantity;
>   optional double ss_wholesale_cost;
>   optional double ss_list_price;
>   optional double ss_sales_price;
>   optional double ss_ext_discount_amt;
>   optional double ss_ext_sales_price;
>   optional double ss_ext_wholesale_cost;
>   optional double ss_ext_list_price;
>   optional double ss_ext_tax;
>   optional double ss_coupon_amt;
>   optional double ss_net_paid;
>   optional double ss_net_paid_inc_tax;
>   optional double ss_net_profit;
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-4205) Simple query hit IndexOutOfBoundException

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse closed DRILL-4205.
--

>  Simple query hit IndexOutOfBoundException
> --
>
> Key: DRILL-4205
> URL: https://issues.apache.org/jira/browse/DRILL-4205
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.4.0
>Reporter: Dechang Gu
>Assignee: Dechang Gu
> Fix For: 1.5.0
>
>
> The following query failed due to IOB:
> 0: jdbc:drill:schema=wf_pigprq100> select * from 
> `store_sales/part-m-00073.parquet`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: srcIndex: 1048587
> Fragment 0:0
> [Error Id: ad8d2bc0-259f-483c-9024-93865963541e on ucs-node4.perf.lab:31010]
>   (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet 
> record reader.
> Message: 
> Hadoop path: /tpcdsPigParq/SF100/store_sales/part-m-00073.parquet
> Total records read: 135280
> Mock records read: 0
> Records to read: 1424
> Row group index: 0
> Records in row group: 3775712
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message pig_schema {
>   optional int64 ss_sold_date_sk;
>   optional int64 ss_sold_time_sk;
>   optional int64 ss_item_sk;
>   optional int64 ss_customer_sk;
>   optional int64 ss_cdemo_sk;
>   optional int64 ss_hdemo_sk;
>   optional int64 ss_addr_sk;
>   optional int64 ss_store_sk;
>   optional int64 ss_promo_sk;
>   optional int64 ss_ticket_number;
>   optional int64 ss_quantity;
>   optional double ss_wholesale_cost;
>   optional double ss_list_price;
>   optional double ss_sales_price;
>   optional double ss_ext_discount_amt;
>   optional double ss_ext_sales_price;
>   optional double ss_ext_wholesale_cost;
>   optional double ss_ext_list_price;
>   optional double ss_ext_tax;
>   optional double ss_coupon_amt;
>   optional double ss_net_paid;
>   optional double ss_net_paid_inc_tax;
>   optional double ss_net_profit;
> }





[jira] [Updated] (DRILL-4121) External Sort may not spill if above a receiver

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-4121:
---
Fix Version/s: 1.5.0

> External Sort may not spill if above a receiver
> ---
>
> Key: DRILL-4121
> URL: https://issues.apache.org/jira/browse/DRILL-4121
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
> Fix For: 1.5.0
>
>
> If external sort is above a receiver, all received batches will contain non 
> root buffers. Sort operator doesn't account for non root buffers when 
> estimating how much memory and if it needs to spill to disk. This may delay 
> the spill and cause the corresponding Drillbit to use large amounts of memory
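The accounting gap described above can be illustrated with a minimal sketch. This is hypothetical code, not Drill's actual ExternalSortBatch logic: it only shows how summing root buffers alone makes received batches look free, so the spill threshold is crossed far too late.

```java
// Hypothetical sketch of the memory-estimation bug described above.
// Batches received over the network hold their memory in non-root
// (shared) buffers; if the estimate only sums root buffers, the sort
// believes it is using far less memory than it really is.
public class SpillEstimateDemo {
    static long estimate(long[] rootBytes, long[] nonRootBytes, boolean countNonRoot) {
        long total = 0;
        for (long b : rootBytes) total += b;
        if (countNonRoot) {
            for (long b : nonRootBytes) total += b; // the missing term
        }
        return total;
    }

    static boolean shouldSpill(long estimatedBytes, long memoryLimit) {
        return estimatedBytes >= memoryLimit;
    }

    public static void main(String[] args) {
        long[] root = {1_000_000, 1_000_000};
        long[] nonRoot = {8_000_000, 8_000_000}; // received batches
        long limit = 10_000_000;
        // Buggy accounting: 2 MB < 10 MB, so no spill despite 18 MB in use.
        System.out.println(shouldSpill(estimate(root, nonRoot, false), limit)); // false
        // Counting non-root buffers triggers the spill as expected.
        System.out.println(shouldSpill(estimate(root, nonRoot, true), limit));  // true
    }
}
```

Under this sketch, the fix amounts to including the non-root buffer sizes in the estimate before the spill decision.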





[jira] [Reopened] (DRILL-4163) Support schema changes for MergeJoin operator.

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4163:

  Assignee: Jason Altekruse  (was: amit hadke)

> Support schema changes for MergeJoin operator.
> --
>
> Key: DRILL-4163
> URL: https://issues.apache.org/jira/browse/DRILL-4163
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: amit hadke
>Assignee: Jason Altekruse
>
> Since the external sort operator supports schema changes, allow the use of union 
> types in merge join to support schema changes.
> For now, we assume that merge join always works on record batches from the sort 
> operator. Thus merging schemas and promoting to union vectors is already 
> taken care of by the sort operator.
> Test Cases:
> 1) Only one side changes schema (join on union type and primitive type)
> 2) Both sides change schema on all columns.
> 3) Join between numeric types and string types.
> 4) Missing columns - each batch has different columns. 





[jira] [Updated] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-4339:
---
Fix Version/s: (was: 1.6.0)
   1.5.0

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129373#comment-15129373
 ] 

Jinfeng Ni commented on DRILL-4339:
---

[~sphillips], I have a patch that reverts the setColumns signature change 
[1]. I ran the pre-commit and unit tests, and did not see any issues.

If you prefer to keep the old setColumns signature, I can merge this patch. As I 
said earlier, the change to the setColumns signature was not necessary. 

As a side note, we probably need a cleaner definition of the public API, so that 
people would be more cautious about touching this kind of code in future releases. 

[1] 
https://github.com/jinfengni/incubator-drill/commit/6a36a704bc139aa05deb3919e792f21b3fcd7794
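The NoSuchMethodError in this regression is the classic binary-compatibility failure: the JVM links each call site by its exact method descriptor, so code compiled against one parameter type fails at runtime if the class now declares a different (even broader) type. A small sketch with hypothetical class names, using reflection to show the exact-descriptor lookup that link-time resolution performs:

```java
import java.util.Collection;
import java.util.List;

// Sketch of why changing setColumns(List) to setColumns(Collection)
// breaks callers compiled against the old signature: method resolution
// matches the exact descriptor, so the old call site finds no method
// and the JVM raises NoSuchMethodError -- even though List IS-A Collection.
public class SignatureDemo {
    static class NewReader {
        void setColumns(Collection<String> cols) { /* no-op */ }
    }

    static boolean hasMethod(Class<?> c, Class<?> paramType) {
        try {
            c.getDeclaredMethod("setColumns", paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // surfaces as NoSuchMethodError at link time
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(NewReader.class, Collection.class)); // true
        System.out.println(hasMethod(NewReader.class, List.class));       // false
    }
}
```

Recompiling the caller against the new signature (or reverting the signature, as the patch does) resolves the mismatch.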


> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4337) Drill fails to read INT96 fields from hive generated parquet files

2016-02-02 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129409#comment-15129409
 ] 

Rahul Challapalli commented on DRILL-4337:
--

I am also seeing this error when reading a Hive table through Drill's native 
parquet reader.

> Drill fails to read INT96 fields from hive generated parquet files
> --
>
> Key: DRILL-4337
> URL: https://issues.apache.org/jira/browse/DRILL-4337
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Rahul Challapalli
>Priority: Critical
> Attachments: hive1_fewtypes_null.parquet
>
>
> git.commit.id.abbrev=576271d
> Cluster : 2 nodes running MaprFS 4.1
> The data file used in the below table is generated from hive. Below is output 
> from running the same query multiple times. 
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select timestamp_col from 
> hive1_fewtypes_null;
> Error: SYSTEM ERROR: NegativeArraySizeException
> Fragment 0:0
> [Error Id: 5517e983-ccae-4c96-b09c-30f331919e56 on qa-node191.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=10.10.100.190:5181> select timestamp_col from 
> hive1_fewtypes_null;
> Error: SYSTEM ERROR: IllegalArgumentException: Reading past RLE/BitPacking 
> stream.
> Fragment 0:0
> [Error Id: 94ed5996-d2ac-438d-b460-c2d2e41bdcc3 on qa-node191.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=10.10.100.190:5181> select timestamp_col from 
> hive1_fewtypes_null;
> Error: SYSTEM ERROR: ArrayIndexOutOfBoundsException: 0
> Fragment 0:0
> [Error Id: 41dca093-571e-49e5-a2ab-fd69210b143d on qa-node191.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=10.10.100.190:5181> select timestamp_col from 
> hive1_fewtypes_null;
> ++
> | timestamp_col  |
> ++
> | null   |
> | [B@7c766115|
> | [B@3fdfe989|
> | null   |
> | [B@55d4222 |
> | [B@2da0c8ee|
> | [B@16e798a9|
> | [B@3ed78afe|
> | [B@38e649ed|
> | [B@16ff83ca|
> | [B@61254e91|
> | [B@5849436a|
> | [B@31e9116e|
> | [B@3c77665b|
> | [B@42e0ff60|
> | [B@419e19ed|
> | [B@72b83842|
> | [B@1c75afe5|
> | [B@726ef1fb|
> | [B@51d0d06e|
> | [B@64240fb8|
> ++
> {code}
> Attached the log, hive ddl used to generate the parquet file and the parquet 
> file itself
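For context on what the reader has to handle here: Hive writes timestamps as the legacy Parquet INT96 type, 12 little-endian bytes holding nanoseconds-of-day followed by a Julian day number. The sketch below shows only that standard decode (it is not Drill's actual reader code; the class name is hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of the conventional Hive/Impala INT96 timestamp layout:
// bytes 0-7  = nanoseconds within the day (little-endian long)
// bytes 8-11 = Julian day number (little-endian int)
public class Int96Demo {
    static final long JULIAN_EPOCH_DAY = 2440588L; // Julian day of 1970-01-01
    static final long MILLIS_PER_DAY = 86_400_000L;

    static long toEpochMillis(byte[] int96) {
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        long julianDay = buf.getInt() & 0xFFFFFFFFL; // treat as unsigned
        return (julianDay - JULIAN_EPOCH_DAY) * MILLIS_PER_DAY
                + nanosOfDay / 1_000_000L;
    }

    public static void main(String[] args) {
        // Midnight on Julian day 2440588 is the Unix epoch.
        byte[] epoch = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(0L).putInt(2440588).array();
        System.out.println(toEpochMillis(epoch)); // 0 -> 1970-01-01T00:00:00Z
    }
}
```

A reader that instead treats the 12-byte value as plain binary would surface raw byte arrays, consistent with the `[B@...` values shown in the query output above.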





[jira] [Commented] (DRILL-4323) Hive Native Reader : A simple count(*) throws Incoming batch has an empty schema error

2016-02-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129411#comment-15129411
 ] 

ASF GitHub Bot commented on DRILL-4323:
---

Github user hsuanyi commented on the pull request:

https://github.com/apache/drill/pull/349#issuecomment-178900844
  
Rahul already verified it and this patch passed all the requested tests.


> Hive Native Reader : A simple count(*) throws Incoming batch has an empty 
> schema error
> --
>
> Key: DRILL-4323
> URL: https://issues.apache.org/jira/browse/DRILL-4323
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.5.0
>Reporter: Rahul Challapalli
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Attachments: error.log
>
>
> git.commit.id.abbrev=3d0b4b0
> A simple count(*) query does not work when hive native reader is enabled
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from customer;
> +-+
> | EXPR$0  |
> +-+
> | 10  |
> +-+
> 1 row selected (3.074 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set 
> `store.hive.optimize_scan_with_native_readers` = true;
> +---++
> |  ok   |summary |
> +---++
> | true  | store.hive.optimize_scan_with_native_readers updated.  |
> +---++
> 1 row selected (0.2 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from customer;
> Error: SYSTEM ERROR: IllegalStateException: Incoming batch [#1341, 
> ProjectRecordBatch] has an empty schema. This is not allowed.
> Fragment 0:0
> [Error Id: 4c867440-0fd3-4eda-922f-0f5eadcb1463 on qa-node191.qa.lab:31010] 
> (state=,code=0)
> {code}
> Hive DDL for the table :
> {code}
> create table customer
> (
> c_customer_sk int,
> c_customer_id string,
> c_current_cdemo_sk int,
> c_current_hdemo_sk int,
> c_current_addr_sk int,
> c_first_shipto_date_sk int,
> c_first_sales_date_sk int,
> c_salutation string,
> c_first_name string,
> c_last_name string,
> c_preferred_cust_flag string,
> c_birth_day int,
> c_birth_month int,
> c_birth_year int,
> c_birth_country string,
> c_login string,
> c_email_address string,
> c_last_review_date string
> )
> STORED AS PARQUET
> LOCATION '/drill/testdata/customer'
> {code}
> Attached the log file with the stacktrace





[jira] [Created] (DRILL-4345) Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file

2016-02-02 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-4345:


 Summary: Hive Native Reader reporting wrong results for timestamp 
column in hive generated parquet file
 Key: DRILL-4345
 URL: https://issues.apache.org/jira/browse/DRILL-4345
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive, Storage - Parquet
Reporter: Rahul Challapalli
Priority: Critical


git.commit.id.abbrev=1b96174

Below you can see different results returned from the hive plugin and the native 
reader for the same table.

{code}
0: jdbc:drill:zk=10.10.100.190:5181> use hive;
+---+---+
|  ok   |  summary  |
+---+---+
| true  | Default schema changed to [hive]  |
+---+---+
1 row selected (0.415 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from 
hive1_fewtypes_null_parquet;
+--++
| int_col  | timestamp_col  |
+--++
| 1| null   |
| null | 1997-01-02 00:00:00.0  |
| 3| null   |
| 4| null   |
| 5| 1997-02-10 17:32:00.0  |
| 6| 1997-02-11 17:32:01.0  |
| 7| 1997-02-12 17:32:01.0  |
| 8| 1997-02-13 17:32:01.0  |
| 9| null   |
| 10   | 1997-02-15 17:32:01.0  |
| null | 1997-02-16 17:32:01.0  |
| 12   | 1897-02-18 17:32:01.0  |
| 13   | 2002-02-14 17:32:01.0  |
| 14   | 1991-02-10 17:32:01.0  |
| 15   | 1900-02-16 17:32:01.0  |
| 16   | null   |
| null | 1897-02-16 17:32:01.0  |
| 18   | 1997-02-16 17:32:01.0  |
| null | null   |
| 20   | 1996-02-28 17:32:01.0  |
| null | null   |
+--++
21 rows selected (0.368 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> alter session set 
`store.hive.optimize_scan_with_native_readers` = true;
+---++
|  ok   |summary |
+---++
| true  | store.hive.optimize_scan_with_native_readers updated.  |
+---++
1 row selected (0.213 seconds)
0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from 
hive1_fewtypes_null_parquet;
+--++
| int_col  | timestamp_col  |
+--++
| 1| null   |
| null | 1997-01-02 00:00:00.0  |
| 3| 1997-02-10 17:32:00.0  |
| 4| null   |
| 5| 1997-02-11 17:32:01.0  |
| 6| 1997-02-12 17:32:01.0  |
| 7| 1997-02-13 17:32:01.0  |
| 8| 1997-02-15 17:32:01.0  |
| 9| 1997-02-16 17:32:01.0  |
| 10   | 1900-02-16 17:32:01.0  |
| null | 1897-02-16 17:32:01.0  |
| 12   | 1997-02-16 17:32:01.0  |
| 13   | 1996-02-28 17:32:01.0  |
| 14   | 1997-01-02 00:00:00.0  |
| 15   | 1997-01-02 00:00:00.0  |
| 16   | 1997-01-02 00:00:00.0  |
| null | 1997-01-02 00:00:00.0  |
| 18   | 1997-01-02 00:00:00.0  |
| null | 1997-01-02 00:00:00.0  |
| 20   | 1997-01-02 00:00:00.0  |
| null | 1997-01-02 00:00:00.0  |
+--++
21 rows selected (0.352 seconds)
{code}

DDL for hive table :
{code}
create external table hive1_fewtypes_null_parquet (
  int_col int,
  bigint_col bigint,
  date_col string,
  time_col string,
  timestamp_col timestamp,
  interval_col string,
  varchar_col string,
  float_col float,
  double_col double,
  bool_col boolean
)
stored as parquet
location '/drill/testdata/hive_storage/hive1_fewtypes_null';
{code}

Attached the underlying parquet file





[jira] [Resolved] (DRILL-4048) Parquet reader corrupts dictionary encoded binary columns

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4048.

   Resolution: Fixed
Fix Version/s: 1.4.0

> Parquet reader corrupts dictionary encoded binary columns
> -
>
> Key: DRILL-4048
> URL: https://issues.apache.org/jira/browse/DRILL-4048
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.4.0
>
> Attachments: lineitem_dic_enc.parquet
>
>
> git.commit.id.abbrev=04c01bd
> The below query returns corrupted data (not even showing up here) for binary 
> columns
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   |  |  | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PE  | T   | 
> egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> The same query from an older build (git.commit.id.abbrev=839f8da)
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   | N | O | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PERSON  | TRUCK 
>   | egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> Below is the output of the parquet-meta command for this dataset
> {code}
> creator: parquet-mr 
> file schema: root 
> ---
> l_orderkey:  REQUIRED INT32 R:0 D:0
> l_partkey:   REQUIRED INT32 R:0 D:0
> l_suppkey:   REQUIRED INT32 R:0 D:0
> l_linenumber:REQUIRED INT32 R:0 D:0
> l_quantity:  REQUIRED DOUBLE R:0 D:0
> l_extendedprice: REQUIRED DOUBLE R:0 D:0
> l_discount:  REQUIRED DOUBLE R:0 D:0
> l_tax:   REQUIRED DOUBLE R:0 D:0
> l_returnflag:REQUIRED BINARY O:UTF8 R:0 D:0
> l_linestatus:REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipdate:  REQUIRED INT32 O:DATE R:0 D:0
> l_commitdate:REQUIRED INT32 O:DATE R:0 D:0
> l_receiptdate:   REQUIRED INT32 O:DATE R:0 D:0
> l_shipinstruct:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipmode:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_comment:   REQUIRED BINARY O:UTF8 R:0 D:0
> row group 1: RC:60175 TS:3049610 
> 

[jira] [Closed] (DRILL-4048) Parquet reader corrupts dictionary encoded binary columns

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse closed DRILL-4048.
--

> Parquet reader corrupts dictionary encoded binary columns
> -
>
> Key: DRILL-4048
> URL: https://issues.apache.org/jira/browse/DRILL-4048
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.4.0
>
> Attachments: lineitem_dic_enc.parquet
>
>
> git.commit.id.abbrev=04c01bd
> The below query returns corrupted data (not even showing up here) for binary 
> columns
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   |  |  | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PE  | T   | 
> egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> The same query from an older build (git.commit.id.abbrev=839f8da)
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   | N | O | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PERSON  | TRUCK 
>   | egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> Below is the output of the parquet-meta command for this dataset
> {code}
> creator: parquet-mr 
> file schema: root 
> ---
> l_orderkey:  REQUIRED INT32 R:0 D:0
> l_partkey:   REQUIRED INT32 R:0 D:0
> l_suppkey:   REQUIRED INT32 R:0 D:0
> l_linenumber:REQUIRED INT32 R:0 D:0
> l_quantity:  REQUIRED DOUBLE R:0 D:0
> l_extendedprice: REQUIRED DOUBLE R:0 D:0
> l_discount:  REQUIRED DOUBLE R:0 D:0
> l_tax:   REQUIRED DOUBLE R:0 D:0
> l_returnflag:REQUIRED BINARY O:UTF8 R:0 D:0
> l_linestatus:REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipdate:  REQUIRED INT32 O:DATE R:0 D:0
> l_commitdate:REQUIRED INT32 O:DATE R:0 D:0
> l_receiptdate:   REQUIRED INT32 O:DATE R:0 D:0
> l_shipinstruct:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipmode:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_comment:   REQUIRED BINARY O:UTF8 R:0 D:0
> row group 1: RC:60175 TS:3049610 
> 

[jira] [Closed] (DRILL-4192) Dir0 and Dir1 from drill-1.4 are messed up

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse closed DRILL-4192.
--

> Dir0 and Dir1 from drill-1.4 are messed up
> --
>
> Key: DRILL-4192
> URL: https://issues.apache.org/jira/browse/DRILL-4192
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.4.0
>Reporter: Krystal
>Assignee: Aman Sinha
>Priority: Blocker
> Fix For: 1.5.0
>
>
> I have the following directories:
> /drill/testdata/temp1/abc/dt=2014-12-30/lineitem.parquet
> /drill/testdata/temp1/abc/dt=2014-12-31/lineitem.parquet
> The following queries returned incorrect data.
> select dir0,dir1 from dfs.`/drill/testdata/temp1` limit 2;
> ++---+
> |  dir0  | dir1  |
> ++---+
> | dt=2014-12-30  | null  |
> | dt=2014-12-30  | null  |
> ++---+
> select dir0 from dfs.`/drill/testdata/temp1` limit 2;
> ++
> |  dir0  |
> ++
> | dt=2014-12-31  |
> | dt=2014-12-31  |
> ++
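For reference, the expected behavior is that dirN columns come from the file's path relative to the queried root: with the directories above, dir0 should be `abc` and dir1 should be `dt=...`. A small illustrative sketch of that convention (not Drill's actual implementation; names are hypothetical):

```java
import java.util.Arrays;

// Sketch of how dir0/dir1 partition columns are conventionally derived:
// strip the queried root from the file path, split the remainder on '/',
// and each intermediate directory becomes dirN (the file name is dropped).
public class DirColumnsDemo {
    static String[] dirColumns(String root, String filePath) {
        String rel = filePath.substring(root.length() + 1); // drop "<root>/"
        String[] parts = rel.split("/");
        return Arrays.copyOf(parts, parts.length - 1);      // drop file name
    }

    public static void main(String[] args) {
        String[] dirs = dirColumns("/drill/testdata/temp1",
                "/drill/testdata/temp1/abc/dt=2014-12-30/lineitem.parquet");
        System.out.println(Arrays.toString(dirs)); // [abc, dt=2014-12-30]
    }
}
```

The bug report shows `dt=...` surfacing as dir0 with dir1 null, i.e. the directory levels shifted by one relative to this expected mapping.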





[jira] [Resolved] (DRILL-4205) Simple query hit IndexOutOfBoundException

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4205.

   Resolution: Fixed
Fix Version/s: 1.5.0

>  Simple query hit IndexOutOfBoundException
> --
>
> Key: DRILL-4205
> URL: https://issues.apache.org/jira/browse/DRILL-4205
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.4.0
>Reporter: Dechang Gu
>Assignee: Dechang Gu
> Fix For: 1.5.0
>
>
> The following query failed due to IOB:
> 0: jdbc:drill:schema=wf_pigprq100> select * from 
> `store_sales/part-m-00073.parquet`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: srcIndex: 1048587
> Fragment 0:0
> [Error Id: ad8d2bc0-259f-483c-9024-93865963541e on ucs-node4.perf.lab:31010]
>   (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet 
> record reader.
> Message: 
> Hadoop path: /tpcdsPigParq/SF100/store_sales/part-m-00073.parquet
> Total records read: 135280
> Mock records read: 0
> Records to read: 1424
> Row group index: 0
> Records in row group: 3775712
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message pig_schema {
>   optional int64 ss_sold_date_sk;
>   optional int64 ss_sold_time_sk;
>   optional int64 ss_item_sk;
>   optional int64 ss_customer_sk;
>   optional int64 ss_cdemo_sk;
>   optional int64 ss_hdemo_sk;
>   optional int64 ss_addr_sk;
>   optional int64 ss_store_sk;
>   optional int64 ss_promo_sk;
>   optional int64 ss_ticket_number;
>   optional int64 ss_quantity;
>   optional double ss_wholesale_cost;
>   optional double ss_list_price;
>   optional double ss_sales_price;
>   optional double ss_ext_discount_amt;
>   optional double ss_ext_sales_price;
>   optional double ss_ext_wholesale_cost;
>   optional double ss_ext_list_price;
>   optional double ss_ext_tax;
>   optional double ss_coupon_amt;
>   optional double ss_net_paid;
>   optional double ss_net_paid_inc_tax;
>   optional double ss_net_profit;
> }





[jira] [Reopened] (DRILL-4192) Dir0 and Dir1 from drill-1.4 are messed up

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse reopened DRILL-4192:


> Dir0 and Dir1 from drill-1.4 are messed up
> --
>
> Key: DRILL-4192
> URL: https://issues.apache.org/jira/browse/DRILL-4192
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.4.0
>Reporter: Krystal
>Assignee: Aman Sinha
>Priority: Blocker
> Fix For: 1.5.0
>
>
> I have the following directories:
> /drill/testdata/temp1/abc/dt=2014-12-30/lineitem.parquet
> /drill/testdata/temp1/abc/dt=2014-12-31/lineitem.parquet
> The following queries returned incorrect data.
> select dir0,dir1 from dfs.`/drill/testdata/temp1` limit 2;
> ++---+
> |  dir0  | dir1  |
> ++---+
> | dt=2014-12-30  | null  |
> | dt=2014-12-30  | null  |
> ++---+
> select dir0 from dfs.`/drill/testdata/temp1` limit 2;
> ++
> |  dir0  |
> ++
> | dt=2014-12-31  |
> | dt=2014-12-31  |
> ++





[jira] [Resolved] (DRILL-4192) Dir0 and Dir1 from drill-1.4 are messed up

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4192.

   Resolution: Fixed
Fix Version/s: 1.5.0

> Dir0 and Dir1 from drill-1.4 are messed up
> --
>
> Key: DRILL-4192
> URL: https://issues.apache.org/jira/browse/DRILL-4192
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.4.0
>Reporter: Krystal
>Assignee: Aman Sinha
>Priority: Blocker
> Fix For: 1.5.0
>
>
> I have the following directories:
> /drill/testdata/temp1/abc/dt=2014-12-30/lineitem.parquet
> /drill/testdata/temp1/abc/dt=2014-12-31/lineitem.parquet
> The following queries returned incorrect data.
> select dir0,dir1 from dfs.`/drill/testdata/temp1` limit 2;
> +----------------+-------+
> |      dir0      | dir1  |
> +----------------+-------+
> | dt=2014-12-30  | null  |
> | dt=2014-12-30  | null  |
> +----------------+-------+
> select dir0 from dfs.`/drill/testdata/temp1` limit 2;
> +----------------+
> |      dir0      |
> +----------------+
> | dt=2014-12-31  |
> | dt=2014-12-31  |
> +----------------+





[jira] [Resolved] (DRILL-4328) Fix for backward compatibility regression caused by DRILL-4198

2016-02-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-4328.
---
Resolution: Fixed

Fixed in commit: 03197d0f2c665b7671b366332e1b4ebc2f271bd9

> Fix for backward compatibility regression caused by DRILL-4198
> --
>
> Key: DRILL-4328
> URL: https://issues.apache.org/jira/browse/DRILL-4328
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.5.0
>
>
> Revert updates made to StoragePlugin interface in DRILL-4198. Instead add the 
> new methods to AbstractStoragePlugin. 





[jira] [Updated] (DRILL-4328) Fix for backward compatibility regression caused by DRILL-4198

2016-02-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-4328:
--
Fix Version/s: 1.5.0

> Fix for backward compatibility regression caused by DRILL-4198
> --
>
> Key: DRILL-4328
> URL: https://issues.apache.org/jira/browse/DRILL-4328
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.5.0
>
>
> Revert updates made to StoragePlugin interface in DRILL-4198. Instead add the 
> new methods to AbstractStoragePlugin. 





[jira] [Updated] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni updated DRILL-4339:
--
Fix Version/s: (was: 1.5.0)
   1.6.0

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.6.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Resolved] (DRILL-4279) Improve performance for skipAll query against Text/JSON/Parquet table

2016-02-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-4279.
---
   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in commit: 9ff947288f3214fe8e525e001d89a4f91b8b0728

> Improve performance for skipAll query against Text/JSON/Parquet table
> -
>
> Key: DRILL-4279
> URL: https://issues.apache.org/jira/browse/DRILL-4279
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
> Fix For: 1.6.0
>
>
> When a query does not specify any specific column to be returned from SCAN, for 
> instance,
> {code}
> Q1:  select count(*) from T1;
> Q2:  select 1 + 100 from T1;
> Q3:  select  1.0 + random() from T1; 
> {code}
> Drill's planner would use a ColumnList with * column, plus a SKIP_ALL mode. 
> However, the MODE is not serialized / deserialized. This leads to two 
> problems.
> 1).  The EXPLAIN plan is confusing, since there is no way to distinguish a 
> "SELECT * " query from this SKIP_ALL mode. 
> For instance, 
> {code}
> explain plan for select count(*) from dfs.`/Users/jni/work/data/yelp/t1`;
> 00-03  Project($f0=[0])
> 00-04Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/Users/jni/work/data/yelp/t1, numFiles=2, columns=[`*`], 
> files= ... 
> {code} 
> 2) If the query is to be executed distributed / parallel, the missing 
> serialization of the mode would mean some Fragments are fetching all the columns, 
> while other Fragments are skipping all the columns. That will cause execution 
> errors.
> For instance, by changing slice_target to enforce the query to be executed in 
> multiple fragments, it will hit execution error. 
> {code}
> select count(*) from dfs.`/Users/jni/work/data/yelp/t1`;
> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
> Error parsing JSON - You tried to start when you are using a ValueWriter of 
> type NullableBitWriterImpl.
> {code}
> Directory "t1" just contains two yelp JSON files. 
> Ideally, I think when no columns are required from SCAN, the explain plan 
> should show an empty column list. The MODE of SKIP_ALL together with the star 
> (*) column seems to be confusing and error prone. 





[jira] [Commented] (DRILL-4342) Drill fails to read a date column from hive generated parquet

2016-02-02 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128933#comment-15128933
 ] 

Rahul Challapalli commented on DRILL-4342:
--

Looks like a possible duplicate of DRILL-4203

> Drill fails to read a date column from hive generated parquet
> -
>
> Key: DRILL-4342
> URL: https://issues.apache.org/jira/browse/DRILL-4342
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
> Attachments: fewtypes_null.parquet
>
>
> git.commit.id.abbrev=576271d
> Below is the hive ddl (using hive 1.2 which supports date in parquet)
> {code}
> create external table hive1dot2_fewtypes_null (
>   int_col int,
>   bigint_col bigint,
>   date_col date,
>   time_col string,
>   timestamp_col timestamp,
>   interval_col string,
>   varchar_col string,
>   float_col float,
>   double_col double,
>   bool_col boolean
> )
> stored as parquet
> location '/drill/testdata/hive_storage/hive1dot2_fewtypes_null';
> {code}
> Query using the hive storage plugin
> {code}
> select date_col from hive.hive1dot2_fewtypes_null;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 1996-01-29  |
> | 1996-03-01  |
> | 1996-03-02  |
> | 1997-02-28  |
> | null        |
> | 1997-03-01  |
> | 1997-03-02  |
> | 2000-04-01  |
> | 2000-04-03  |
> | 2038-04-08  |
> | 2039-04-09  |
> | 2040-04-10  |
> | null        |
> | 1999-02-08  |
> | 1999-03-08  |
> | 1999-01-18  |
> | 2003-01-02  |
> | null        |
> +-------------+
> {code}
> Below is the output when reading through the dfs parquet reader. 
> {code}
> 0: jdbc:drill:zk=10.10.10.41:5181> select date_col from 
> dfs.`/drill/testdata/hive_storage/hive1dot2_fewtypes_null`;
> +-------------+
> |  date_col   |
> +-------------+
> | null        |
> | null        |
> | null        |
> | 369-02-09   |
> | 369-03-12   |
> | 369-03-13   |
> | 368-03-11   |
> | null        |
> | 368-03-12   |
> | 368-03-13   |
> | 365-04-12   |
> | 365-04-14   |
> | 327-04-19   |
> | 326-04-20   |
> | 325-04-21   |
> | null        |
> | 366-02-19   |
> | 366-03-19   |
> | 366-01-29   |
> | 362-01-13   |
> | null        |
> +-------------+
> {code}
> I attached the parquet file generated from hive. Let me know if anything else 
> is needed for reproducing this issue.





[jira] [Created] (DRILL-4341) Fails to parse string literals containing escaped quotes

2016-02-02 Thread Oscar Morante (JIRA)
Oscar Morante created DRILL-4341:


 Summary: Fails to parse string literals containing escaped quotes
 Key: DRILL-4341
 URL: https://issues.apache.org/jira/browse/DRILL-4341
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.5.0
Reporter: Oscar Morante


In some situations, Drill fails to parse strings with escaped single quotes.  
Just changing the last line of this query makes the problem appear and 
disappear (without touching the string it's complaining about):

{code:sql}
select
  'yyyy-MM-dd''T''HH:mm:ss.SSS''Z'''
from
  s3; --> FAILS
  --s3.`2015`; --> FAILS
  --s3.`2015/11`; --> FAILS
  --s3.`2015/11/3`; --> WORKS
  --s3.`2015/11/3/10-1072045-1612661.json.gz`; --> WORKS
  --(select * from s3 limit 1); --> WORKS
  --(select * from s3.`2015` limit 1); --> WORKS
  --(select * from s3.`2015/11` limit 1); --> WORKS
{code}

This is the error when it fails:

{code}
ExampleExceptionFormatter: exception message was: SYSTEM ERROR: 
ExpressionParsingException: Expression has syntax error! line 1:12:missing EOF 
at 'T'

Fragment 1:1
{code}

This is very important when dealing with JSON files containing dates encoded as 
ISO-8601 strings.  My current workaround is something like this:

{code:sql}
select to_timestamp(regexp_replace(`timestamp`, '[TZ]', ''),
'yyyy-MM-ddHH:mm:ss.SSS')
from s3;
{code}
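For reference, the doubled single quotes in the literal above are ordinary SQL escaping, and the pattern itself is a standard Joda/java.time-style format. A minimal Java sketch (illustrative only, not Drill code) showing both the escaping and that the pattern parses ISO-8601 timestamps:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampPatternDemo {
    public static void main(String[] args) {
        // The format pattern as the parser sees it, after SQL
        // single-quote escaping ('' inside a literal becomes ').
        String pattern = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

        // Doubling each quote and wrapping yields the SQL literal body.
        String sqlLiteral = "'" + pattern.replace("'", "''") + "'";
        System.out.println(sqlLiteral);
        // -> 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z'''

        // The pattern parses ISO-8601 style timestamps (Z as a literal).
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        LocalDateTime ts = LocalDateTime.parse("2015-11-03T10:15:30.123Z", fmt);
        System.out.println(ts); // 2015-11-03T10:15:30.123
    }
}
```

So the literal in the failing query is well-formed SQL; the parse error appears to depend on the table expression, not the string itself.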





[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128835#comment-15128835
 ] 

Jinfeng Ni commented on DRILL-4339:
---

Jason's explanation makes sense. The setColumns method's definition changed in 
1.5.0. The error message seems to indicate that a wrong version is loaded. 

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Created] (DRILL-4342) Drill fails to read a date column from hive generated parquet

2016-02-02 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-4342:


 Summary: Drill fails to read a date column from hive generated 
parquet
 Key: DRILL-4342
 URL: https://issues.apache.org/jira/browse/DRILL-4342
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive, Storage - Parquet
Reporter: Rahul Challapalli


git.commit.id.abbrev=576271d

Below is the hive ddl (using hive 1.2 which supports date in parquet)
{code}
create external table hive1dot2_fewtypes_null (
  int_col int,
  bigint_col bigint,
  date_col date,
  time_col string,
  timestamp_col timestamp,
  interval_col string,
  varchar_col string,
  float_col float,
  double_col double,
  bool_col boolean
)
stored as parquet
location '/drill/testdata/hive_storage/hive1dot2_fewtypes_null';
{code}

Query using the hive storage plugin
{code}
select date_col from hive.hive1dot2_fewtypes_null;
+-------------+
|  date_col   |
+-------------+
| null        |
| null        |
| null        |
| 1996-01-29  |
| 1996-03-01  |
| 1996-03-02  |
| 1997-02-28  |
| null        |
| 1997-03-01  |
| 1997-03-02  |
| 2000-04-01  |
| 2000-04-03  |
| 2038-04-08  |
| 2039-04-09  |
| 2040-04-10  |
| null        |
| 1999-02-08  |
| 1999-03-08  |
| 1999-01-18  |
| 2003-01-02  |
| null        |
+-------------+
{code}

Below is the output when reading through the dfs parquet reader. 
{code}
0: jdbc:drill:zk=10.10.10.41:5181> select date_col from 
dfs.`/drill/testdata/hive_storage/hive1dot2_fewtypes_null`;
+-------------+
|  date_col   |
+-------------+
| null        |
| null        |
| null        |
| 369-02-09   |
| 369-03-12   |
| 369-03-13   |
| 368-03-11   |
| null        |
| 368-03-12   |
| 368-03-13   |
| 365-04-12   |
| 365-04-14   |
| 327-04-19   |
| 326-04-20   |
| 325-04-21   |
| null        |
| 366-02-19   |
| 366-03-19   |
| 366-01-29   |
| 362-01-13   |
| null        |
+-------------+
{code}

I attached the parquet file generated from hive. Let me know if anything else 
is needed for reproducing this issue.





[jira] [Created] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread JIRA
Stefán Baxter created DRILL-4339:


 Summary: Avro Reader can not read records - Regression
 Key: DRILL-4339
 URL: https://issues.apache.org/jira/browse/DRILL-4339
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Affects Versions: 1.5.0
Reporter: Stefán Baxter
Priority: Blocker
 Fix For: 1.5.0


Simple reading of Avro records no longer works

0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
at 
org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
at 
org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
at 
org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
at 
org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
at 
org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
at 
org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

We have been using the Avro reader for a while and this looks like a regression.





[jira] [Commented] (DRILL-2669) Error happening without limit clause and works with limit clause

2016-02-02 Thread Oscar Morante (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128097#comment-15128097
 ] 

Oscar Morante commented on DRILL-2669:
--

I've been doing some more testing and I think the problem is that in some 
situations it fails to parse strings with escaped single quotes.  Just changing 
the last line of this query makes the problem appear and disappear (without 
touching the string it's complaining about):

{code:sql}
select
  'yyyy-MM-dd''T''HH:mm:ss.SSS''Z'''
from
  raw; --> FAILS
  --raw.`*`; --> FAILS
  --raw.`2015`; --> FAILS
  --raw.`2015/11`; --> FAILS
  --raw.`2015/11/3`; --> WORKS
  --raw.`2015/11/3/10-1072045-1612661.json.gz`; --> WORKS
  --(select * from raw limit 1); --> WORKS
  --(select * from raw.`*` limit 1); --> WORKS
  --(select * from raw.`2015` limit 1); --> WORKS
  --(select * from raw.`2015/11` limit 1); --> WORKS
{code}

This query gives a simpler error:

{code}
ExampleExceptionFormatter: exception message was: SYSTEM ERROR: 
ExpressionParsingException: Expression has syntax error! line 1:12:missing EOF 
at 'T'

Fragment 1:1
{code}

> Error happening without limit clause and works with limit clause
> 
>
> Key: DRILL-2669
> URL: https://issues.apache.org/jira/browse/DRILL-2669
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.8.0
> Environment: mapr sandbox 4.0.2
>Reporter: Sudhakar Thota
> Fix For: Future
>
>
> Perhaps this could be a bug. I get the same results.
> But the plan is very different, the UnionExchange is set up immediately after 
> the scan operation in successful case( Case 1 ), where as UnionExchange is 
> happening after scan>project (Case -2).
> Case -1.Successful case.
> {code}
> 0: jdbc:drill:> explain plan for select to_timestamp(t.t, 
> 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from 
> dfs.sthota_prq.`/tstamp_test/*.parquet` limit 13015351) t;
> --+
> text  json
> --+
> 00-00 Screen
> 00-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'), 
> 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')])
> 00-02 SelectionVectorRemover
> 00-03 Limit(fetch=[13015351])
> 00-04 UnionExchange
> 01-01 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet],
>  ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet],
>  ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]],
>  selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test, 
> numFiles=3, columns=[`*`]]])
> {
> "head" :
> Unknown macro: { "version" }
> ,
> {code}
> Case -2. Unsuccessful case:
> {code}
> 0: jdbc:drill:> explain plan for select to_timestamp(t.t, 
> 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from 
> dfs.sthota_prq.`/tstamp_test/*.parquet` ) t;
> --+
> text  json
> --+
> 00-00 Screen
> 00-01 UnionExchange
> 01-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'), 
> 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')])
> 01-02 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet],
>  ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet],
>  ReadEntryWithPath 
> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]],
>  selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test, 
> numFiles=3, columns=[`*`]]])
> {
> "head" :
> Unknown macro: { "version" }
> ,
> {code}
> {code}
> 0: jdbc:drill:> select to_timestamp(t.t, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''') 
> FROM (select * from dfs.sthota_prq.`/tstamp_test/*.parquet` limit 10) t;
> 
> EXPR$0
> 
> 2015-01-27 13:43:53.0
> 2015-01-27 13:43:49.0
> 2015-01-27 13:43:47.0
> 2015-01-27 13:43:47.0
> 2015-01-27 13:43:47.0
> 2015-01-27 13:43:45.0
> 2015-01-27 13:43:43.0
> 2015-01-27 13:43:43.0
> 2015-01-27 13:43:43.0
> 2015-01-27 13:43:39.0
> 
> 10 rows selected (1.127 seconds)
> {code}
> {code}
> 0: jdbc:drill:> select to_timestamp(t.t, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''') 
> FROM (select * from dfs.sthota_prq.`/tstamp_test/*.parquet`) t;
> {code}
> {code}
> 0: jdbc:drill:> select to_timestamp(t.t, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''') 
> FROM (select * from dfs.sthota_prq.`/tstamp_test/*.parquet`) t;
> Query failed: RemoteRpcException: Failure while trying to start remote 
> fragment, Expression has syntax error! line 1:30:mismatched input 'T' 
> expecting CParen [ ab817e5a-9b74-47dd-b3c6-3bbf025c7de9 on maprdemo:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}




[jira] [Commented] (DRILL-2556) support delimiters of multiple characters

2016-02-02 Thread Celso Mosquera (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128257#comment-15128257
 ] 

Celso Mosquera commented on DRILL-2556:
---

This would be nice to have.

> support delimiters of multiple characters
> -
>
> Key: DRILL-2556
> URL: https://issues.apache.org/jira/browse/DRILL-2556
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 0.7.0
>Reporter: Vince Gonzalez
> Fix For: Future
>
>
> Support for multi-character delimiters would be nice to have. Here's a sample 
> data set: https://gist.github.com/vicenteg/111372a42b0989d43dca





[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128576#comment-15128576
 ] 

Deneche A. Hakim commented on DRILL-4339:
-

[~acmeguy] you confirmed the same query was running fine on 1.4.0, right?

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jason Altekruse (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128584#comment-15128584
 ] 

Jason Altekruse commented on DRILL-4339:


The signature on this method changed in a recent commit (the setColumns method 
used to take a Collection of column paths; it now takes a List). Is there a 
chance you have 1.4 JARs on your classpath that may be causing the wrong 
version to be loaded?

[~adeneche] He mentioned on the list that it was running in 1.4.
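
The failure mode described here can be illustrated outside Drill: JVM method resolution is exact on parameter types, so code compiled against the old setColumns(Collection) signature cannot bind to a class that now only declares setColumns(List). A minimal sketch (the Reader class below is hypothetical, standing in for AvroRecordReader):

```java
import java.util.Collection;
import java.util.List;

public class SignatureLookupDemo {
    // Hypothetical reader mirroring the 1.5.0 shape: setColumns takes a List.
    static class Reader {
        public void setColumns(List<String> columns) { }
    }

    public static void main(String[] args) throws Exception {
        // Resolving against the new signature succeeds.
        Reader.class.getMethod("setColumns", List.class);
        System.out.println("setColumns(List) resolved");

        // A caller compiled against the old 1.4 signature asks for
        // setColumns(Collection) and fails to resolve -- the runtime
        // analogue is the NoSuchMethodError in the stack trace above.
        try {
            Reader.class.getMethod("setColumns", Collection.class);
            System.out.println("setColumns(Collection) resolved");
        } catch (NoSuchMethodException e) {
            System.out.println("no such method: " + e.getMessage());
        }
    }
}
```

This is why stale 1.4 JARs on the classpath would produce exactly this error even though the 1.5.0 sources compile cleanly.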

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4338) Concurrent query remains in CANCELLATION_REQUESTED state

2016-02-02 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129934#comment-15129934
 ] 

Khurram Faraaz commented on DRILL-4338:
---

Likewise, while the concurrent set of queries is executing, if the 
Drillbit on the Foreman node is killed (kill -9 PID), the Java program that was 
running hangs without terminating. Here is the stack for the hung Java 
process.
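
For context, the parked threads in the dump below sit in DrillResultSetImpl's timed poll on a LinkedBlockingDeque; with the Foreman dead nothing ever enqueues a batch, and the consumer loop keeps re-polling. A minimal sketch (plain Java, not Drill code) of why a timed poll alone does not end the wait without an explicit give-up condition:

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

public class PollTimeoutDemo {
    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingDeque<String> batches = new LinkedBlockingDeque<>();

        // With no producer (the killed Foreman), a timed poll returns null
        // after the timeout instead of blocking forever.
        String batch = batches.poll(50, TimeUnit.MILLISECONDS);
        System.out.println(batch); // null

        // But a loop that just re-polls on null hangs indefinitely; the
        // consumer needs a termination condition (connection death, retry
        // budget) in addition to the per-call timeout.
        int emptyPolls = 0;
        while (batches.poll(10, TimeUnit.MILLISECONDS) == null && emptyPolls < 3) {
            emptyPolls++;
        }
        System.out.println("gave up after " + emptyPolls + " empty polls");
    }
}
```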

{noformat}
 [root@centos-01 logs]# ps -eaf | grep Concurren
root 17772  1827  0 06:04 pts/900:00:00 grep Concurren
root 31081  2411  2 05:50 pts/700:00:18 java ConcurrencyTest
[root@centos-01 logs]# jstack 31081
2016-02-03 06:05:26
Full thread dump OpenJDK 64-Bit Server VM (25.65-b01 mixed mode):

"Attach Listener" #35 daemon prio=9 os_prio=0 tid=0x7ff3f0001000 nid=0x488b 
waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" #34 prio=5 os_prio=0 tid=0x7ff4a8011000 nid=0x796a waiting 
on condition [0x]
   java.lang.Thread.State: RUNNABLE

"pool-1-thread-12" #33 prio=5 os_prio=0 tid=0x7ff4a8e54000 nid=0x79c3 
waiting on condition [0x7ff4765f3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0006d0718178> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at 
java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
at 
java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl$ResultsListener.getNext(DrillResultSetImpl.java:2097)
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:175)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
at 
net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
at ConcurrencyTest.executeQuery(ConcurrencyTest.java:55)
at ConcurrencyTest.SELECTData(ConcurrencyTest.java:37)
at ConcurrencyTest.run(ConcurrencyTest.java:23)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-11" #32 prio=5 os_prio=0 tid=0x7ff4a8e52800 nid=0x79c2 
waiting on condition [0x7ff4766f4000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0006d0720178> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at 
java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
at 
java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl$ResultsListener.getNext(DrillResultSetImpl.java:2097)
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:175)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
at 
net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
at ConcurrencyTest.executeQuery(ConcurrencyTest.java:55)
at ConcurrencyTest.SELECTData(ConcurrencyTest.java:37)
at ConcurrencyTest.run(ConcurrencyTest.java:23)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

"pool-1-thread-10" #31 prio=5 os_prio=0 tid=0x7ff4a8e51000 nid=0x79c1 
waiting on condition [0x7ff4767f5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0006d0728178> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 

[jira] [Commented] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Steven Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129381#comment-15129381
 ] 

Steven Phillips commented on DRILL-4339:


I personally don't have much of a problem with having to recompile my code; I was 
just wondering whether this would create problems for others.

If reverting the signature change could avoid a few headaches, and there is 
very little cost in making the change, I say we go ahead and merge it.

+1
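The binary-compatibility hazard being discussed can be sketched in isolation. The class below is a hypothetical stand-in (not Drill's actual reader): after the signature changed from `setColumns(Collection)` to `setColumns(List)`, the JVM can no longer resolve the old descriptor, which is exactly why callers compiled against the old signature fail with `NoSuchMethodError` at runtime instead of at compile time.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a reader whose setColumns signature changed
// from Collection to List between releases.
class ReaderV2 {
    public void setColumns(List<String> columns) { /* no-op for the sketch */ }
}

public class SignatureCheck {
    // Returns true if the class still exposes the old Collection-based signature.
    static boolean hasOldSignature() {
        try {
            ReaderV2.class.getMethod("setColumns", Collection.class);
            return true;
        } catch (NoSuchMethodException e) {
            // A call site compiled against setColumns(Collection) resolves by
            // exact descriptor, so at runtime it would raise NoSuchMethodError,
            // matching the stack trace in this issue.
            return false;
        }
    }

    public static void main(String[] args) {
        if (hasOldSignature()) {
            throw new AssertionError("expected the old Collection signature to be gone");
        }
        System.out.println("setColumns(Collection) no longer resolves; recompilation required");
    }
}
```

Recompiling against the new class fixes the caller, which is why the breakage only shows up for code built against the old jar.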

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> 0: jdbc:drill:zk=local> select * from dfs.asa.`/`;
> Exception in thread "drill-executor-2" java.lang.NoSuchMethodError: 
> org.apache.drill.exec.store.avro.AvroRecordReader.setColumns(Ljava/util/Collection;)V
>   at 
> org.apache.drill.exec.store.avro.AvroRecordReader.<init>(AvroRecordReader.java:99)
>   at 
> org.apache.drill.exec.store.avro.AvroFormatPlugin.getRecordReader(AvroFormatPlugin.java:73)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:172)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35)
>   at 
> org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:147)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:127)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:170)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:101)
>   at 
> org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> We have been using the Avro reader for a while and this looks like a 
> regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4345) Hive Native Reader reporting wrong results for timestamp column in hive generated parquet file

2016-02-02 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-4345:
-
Attachment: hive1_fewtypes_null.parquet

> Hive Native Reader reporting wrong results for timestamp column in hive 
> generated parquet file
> --
>
> Key: DRILL-4345
> URL: https://issues.apache.org/jira/browse/DRILL-4345
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Priority: Critical
> Attachments: hive1_fewtypes_null.parquet
>
>
> git.commit.id.abbrev=1b96174
> Below you can see different results returned from hive plugin and native 
> reader for the same table.
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> use hive;
> +---+---+
> |  ok   |  summary  |
> +---+---+
> | true  | Default schema changed to [hive]  |
> +---+---+
> 1 row selected (0.415 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from 
> hive1_fewtypes_null_parquet;
> +--++
> | int_col  | timestamp_col  |
> +--++
> | 1| null   |
> | null | 1997-01-02 00:00:00.0  |
> | 3| null   |
> | 4| null   |
> | 5| 1997-02-10 17:32:00.0  |
> | 6| 1997-02-11 17:32:01.0  |
> | 7| 1997-02-12 17:32:01.0  |
> | 8| 1997-02-13 17:32:01.0  |
> | 9| null   |
> | 10   | 1997-02-15 17:32:01.0  |
> | null | 1997-02-16 17:32:01.0  |
> | 12   | 1897-02-18 17:32:01.0  |
> | 13   | 2002-02-14 17:32:01.0  |
> | 14   | 1991-02-10 17:32:01.0  |
> | 15   | 1900-02-16 17:32:01.0  |
> | 16   | null   |
> | null | 1897-02-16 17:32:01.0  |
> | 18   | 1997-02-16 17:32:01.0  |
> | null | null   |
> | 20   | 1996-02-28 17:32:01.0  |
> | null | null   |
> +--++
> 21 rows selected (0.368 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> alter session set 
> `store.hive.optimize_scan_with_native_readers` = true;
> +---++
> |  ok   |summary |
> +---++
> | true  | store.hive.optimize_scan_with_native_readers updated.  |
> +---++
> 1 row selected (0.213 seconds)
> 0: jdbc:drill:zk=10.10.100.190:5181> select int_col, timestamp_col from 
> hive1_fewtypes_null_parquet;
> +--++
> | int_col  | timestamp_col  |
> +--++
> | 1| null   |
> | null | 1997-01-02 00:00:00.0  |
> | 3| 1997-02-10 17:32:00.0  |
> | 4| null   |
> | 5| 1997-02-11 17:32:01.0  |
> | 6| 1997-02-12 17:32:01.0  |
> | 7| 1997-02-13 17:32:01.0  |
> | 8| 1997-02-15 17:32:01.0  |
> | 9| 1997-02-16 17:32:01.0  |
> | 10   | 1900-02-16 17:32:01.0  |
> | null | 1897-02-16 17:32:01.0  |
> | 12   | 1997-02-16 17:32:01.0  |
> | 13   | 1996-02-28 17:32:01.0  |
> | 14   | 1997-01-02 00:00:00.0  |
> | 15   | 1997-01-02 00:00:00.0  |
> | 16   | 1997-01-02 00:00:00.0  |
> | null | 1997-01-02 00:00:00.0  |
> | 18   | 1997-01-02 00:00:00.0  |
> | null | 1997-01-02 00:00:00.0  |
> | 20   | 1997-01-02 00:00:00.0  |
> | null | 1997-01-02 00:00:00.0  |
> +--++
> 21 rows selected (0.352 seconds)
> {code}
> DDL for hive table :
> {code}
> create external table hive1_fewtypes_null_parquet (
>   int_col int,
>   bigint_col bigint,
>   date_col string,
>   time_col string,
>   timestamp_col timestamp,
>   interval_col string,
>   varchar_col string,
>   float_col float,
>   double_col double,
>   bool_col boolean
> )
> stored as parquet
> location '/drill/testdata/hive_storage/hive1_fewtypes_null';
> {code}
> Attached the underlying parquet file





[jira] [Resolved] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jinfeng Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jinfeng Ni resolved DRILL-4339.
---
Resolution: Fixed

Fixed in commit: 6a36a704bc139aa05deb3919e792f21b3fcd7794

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Simple reading of Avro records no longer works
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Resolved] (DRILL-4163) Support schema changes for MergeJoin operator.

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4163.

   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in cc9175c13270660ffd9ec2ddcbc70780dd72dada

> Support schema changes for MergeJoin operator.
> --
>
> Key: DRILL-4163
> URL: https://issues.apache.org/jira/browse/DRILL-4163
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: amit hadke
>Assignee: Jason Altekruse
> Fix For: 1.5.0
>
>
> Since external sort operator supports schema changes, allow use of union 
> types in merge join to support for schema changes.
> For now, we assume that merge join always works on record batches from sort 
> operator. Thus merging schemas and promoting to union vectors is already 
> taken care by sort operator.
> Test Cases:
> 1) Only one side changes schema (join on union type and primitive type)
> 2) Both sides change schema on all columns.
> 3) Join between numeric types and string types.
> 4) Missing columns - each batch has different columns. 





[jira] [Assigned] (DRILL-4122) Create unit test suite for checking quality of hashing for hash based operators

2016-02-02 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam reassigned DRILL-4122:
--

Assignee: Sudheesh Katkam

> Create unit test suite for checking quality of hashing for hash based 
> operators
> ---
>
> Key: DRILL-4122
> URL: https://issues.apache.org/jira/browse/DRILL-4122
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.3.0
>Reporter: Aman Sinha
>Assignee: Sudheesh Katkam
>
> We have encountered substantial skew in the hash based operators (hash 
> distribution, hash aggregation, hash join) for certain data sets.  Two such 
> issues are DRILL-2803, DRILL-4119.   
> It would be very useful to have a unit test suite to test the quality of 
> hashing.  
> The number of combinations is large: num_data_types x nullability x 
> num_hash_function_types (32bit, 64bit, AsDouble variations). Plus, the nature 
> of the data itself.   We would have to be judicious about picking a 
> reasonable subset of this space.   We should also look at open source test 
> suites in this area. 
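A minimal version of the kind of check such a suite would run can be sketched as follows. The mixing constant, the stand-in hash function, and the 1.5x skew threshold are all illustrative assumptions, not Drill's actual 32-bit/64-bit/AsDouble hash implementations or its acceptance criteria.

```java
// Sketch of a hash-quality check: hash a batch of keys into buckets and
// compare the fullest bucket against the expected uniform load.
public class HashSkewCheck {
    // Assumed stand-in hash: multiply by an odd 64-bit constant, fold to int.
    static int bucket(long key, int numBuckets) {
        int h = Long.hashCode(key * 0x9E3779B97F4A7C15L);
        return Math.floorMod(h, numBuckets);
    }

    // Returns (max bucket count) / (expected count per bucket);
    // a value near 1.0 indicates an even distribution.
    static double skew(int numKeys, int numBuckets) {
        int[] counts = new int[numBuckets];
        for (long k = 0; k < numKeys; k++) {
            counts[bucket(k, numBuckets)]++;
        }
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        return max / ((double) numKeys / numBuckets);
    }

    public static void main(String[] args) {
        double s = skew(100_000, 64);
        if (s > 1.5) {
            throw new AssertionError("skew too high: " + s);
        }
        System.out.println("skew factor: " + s);
    }
}
```

A real suite would repeat this over the data types, nullability, and hash-function variations listed above, and over adversarial key patterns (sequential, clustered, low-entropy) rather than a single sequential run.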





[jira] [Updated] (DRILL-4339) Avro Reader can not read records - Regression

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-4339:
---
Fix Version/s: (was: 1.5.0)
   1.6.0

> Avro Reader can not read records - Regression
> -
>
> Key: DRILL-4339
> URL: https://issues.apache.org/jira/browse/DRILL-4339
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 1.5.0
>Reporter: Stefán Baxter
>Priority: Blocker
> Fix For: 1.6.0
>
>
> Simple reading of Avro records no longer works
> We have been using the Avro reader for a while and this looks like a 
> regression.





[jira] [Commented] (DRILL-4333) tests in Drill2489CallsAfterCloseThrowExceptionsTest fail in Java 8

2016-02-02 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128285#comment-15128285
 ] 

Jacques Nadeau commented on DRILL-4333:
---

Whoever picks this up needs to fix the test. Overriding the new methods would 
likely break Java 7 compatibility.

> tests in Drill2489CallsAfterCloseThrowExceptionsTest fail in Java 8
> ---
>
> Key: DRILL-4333
> URL: https://issues.apache.org/jira/browse/DRILL-4333
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
> Fix For: Future
>
>
> The following tests fail in Java 8:
> {noformat}
> Drill2489CallsAfterCloseThrowExceptionsTest.testClosedPlainStatementMethodsThrowRight
> Drill2489CallsAfterCloseThrowExceptionsTest.testclosedPreparedStmtOfOpenConnMethodsThrowRight
> Drill2489CallsAfterCloseThrowExceptionsTest.testClosedResultSetMethodsThrowRight1
> Drill2489CallsAfterCloseThrowExceptionsTest.testClosedResultSetMethodsThrowRight2
> Drill2489CallsAfterCloseThrowExceptionsTest.testClosedDatabaseMetaDataMethodsThrowRight
> Drill2769UnsupportedReportsUseSqlExceptionTest.testPreparedStatementMethodsThrowRight
> Drill2769UnsupportedReportsUseSqlExceptionTest.testPlainStatementMethodsThrowRight
> {noformat}
> Drill has special implementations of Statement, PreparedStatement, ResultSet 
> and DatabaseMetaData that override all parent methods to make sure they 
> throw a proper exception if the statement has already been closed. 
> These tests use reflection to make sure all methods behave correctly, but 
> Java 8 added more methods that need to be properly overridden.
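The reflection-based approach these tests use can be sketched as below. This is a simplified illustration, not Drill's actual test code: a dynamic proxy stands in for an already-closed Statement and must throw SQLException from every method, and reflection enumerates the interface's methods so that methods newly added in Java 8 are covered automatically. The real tests also synthesize arguments for non-nullary methods; this sketch only exercises zero-argument ones.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.SQLException;
import java.sql.Statement;

public class ClosedStatementCheck {
    // A proxy simulating a closed Statement: every call throws SQLException.
    static Statement closedStatement() {
        InvocationHandler alwaysThrows = (proxy, method, args) -> {
            throw new SQLException(method.getName() + ": statement is closed");
        };
        return (Statement) Proxy.newProxyInstance(
            Statement.class.getClassLoader(),
            new Class<?>[]{Statement.class},
            alwaysThrows);
    }

    // Counts zero-arg Statement methods that fail to throw SQLException.
    // Enumerating via reflection means newly added interface methods
    // (e.g. in a newer JDK) are checked without changing the test.
    static int countViolations() {
        Statement s = closedStatement();
        int violations = 0;
        for (Method m : Statement.class.getMethods()) {
            if (m.getParameterCount() != 0) continue; // keep the sketch simple
            try {
                m.invoke(s);
                violations++; // returned normally: wrong for a closed statement
            } catch (InvocationTargetException e) {
                if (!(e.getCause() instanceof SQLException)) violations++;
            } catch (IllegalAccessException e) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        if (countViolations() != 0) {
            throw new AssertionError("some methods did not throw SQLException");
        }
        System.out.println("all zero-arg Statement methods threw SQLException");
    }
}
```

This also shows why the failures appeared only on Java 8: the reflective enumeration picks up the interface's new methods, and any implementation that does not override them falls through to behavior the test rejects.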


