Re: Review Request 30965: Follow up on DRILL-133 (LocalExchange) to save CPU cycles on hash generation when using in HashToLocalExchange

2015-02-26 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30965/
---

(Updated Feb. 26, 2015, 12:34 a.m.)


Review request for drill, Jacques Nadeau, Jinfeng Ni, Steven Phillips, and 
Venki Korukanti.


Changes
---

Addressing review comments after code review with Jacques and Venki


Bugs: DRILL-2209
https://issues.apache.org/jira/browse/DRILL-2209


Repository: drill-git


Description
---

Insert a Project operator that adds a new column "EXPRHASH" holding the hash 
expression for the fields that are used by HashToRandomExchange.
Remove Project operator after HashToRandomExchange (or Demux), since it will 
create problems with field ordering in HashJoin.

Tie this to MuxExchange: if MuxExchange is enabled, the Project is inserted.
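
The idea of computing the hash once and carrying it along as a column can be 
sketched roughly as follows. This is only an illustration, not the patch 
itself: rows are modeled as plain maps, and the class and method names 
(ExprHashSketch, addHashColumn, partitionFor, dropHashColumn) are hypothetical. 
Only the column name "EXPRHASH" comes from the description above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; none of these names are Drill APIs.
final class ExprHashSketch {
    static final String HASH_COLUMN = "EXPRHASH";

    // Stand-in for the inserted Project: hash the key fields once and
    // store the result in an extra column.
    static Map<String, Object> addHashColumn(Map<String, Object> row, List<String> keys) {
        Object[] values = new Object[keys.size()];
        for (int i = 0; i < keys.size(); i++) {
            values[i] = row.get(keys.get(i));
        }
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.put(HASH_COLUMN, Arrays.hashCode(values));
        return out;
    }

    // Any later exchange can pick a partition from the stored hash
    // instead of re-evaluating the hash expression.
    static int partitionFor(Map<String, Object> row, int partitionCount) {
        int hash = (Integer) row.get(HASH_COLUMN);
        return Math.floorMod(hash, partitionCount);
    }

    // Stand-in for stripping the helper column again so that downstream
    // operators see the original field layout.
    static Map<String, Object> dropHashColumn(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.remove(HASH_COLUMN);
        return out;
    }
}
```

In this sketch both the local (mux) step and the hash partitioning step would 
call partitionFor on the same stored value, which is where the CPU saving is 
expected to come from.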


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/HashToRandomExchangePrel.java
 372c75d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PrelUtil.java
 1adc54f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/InsertLocalExchangeVisitor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/30965/diff/


Testing
---

Need to add unit tests. Tested live; ran Functional and TPCH tests.


Thanks,

Yuliya Feldman



OpenJDK’s java.util.Collections.sort() Bug

2015-02-26 Thread Yash Sharma
As pointed out on the Hadoop mailing list:

OpenJDK’s java.util.Collections.sort() is broken: the default TimSort
implementation can throw ArrayIndexOutOfBoundsException when the number
of elements is larger than 67108864.

I wonder whether we could have such a huge collection in Drill and hit this
bug?
We do have Collections.sort used in multiple places,
including DrillTextRecordReader, but do we need to consider a workaround for
this?

Thoughts?

Links:
http://envisage-project.eu/timsort-specification-and-verification/

https://bugs.openjdk.java.net/browse/JDK-8072909


Re: OpenJDK’s java.util.Collections.sort() Bug

2015-02-26 Thread Steven Phillips
It looks like we are using the method in 5 different places in Drill. We
are using it to sort lists of: files, drillbit endpoints, workunits, operator
profiles, and columnIds.

I can't imagine we are ever going to need to sort millions of those. So
probably no need to worry about this bug.

But we should keep it in mind for any future code that might want to use it.
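
One way to keep it in mind for future code is a small guard around the sort
call. The following is only a sketch: the class GuardedSort and its methods
are hypothetical names, and the threshold is simply the element count quoted
in this thread. It conservatively treats the boundary value itself as risky.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; not part of Drill or the JDK.
final class GuardedSort {
    // Element count quoted in the thread (2^26).
    static final int TIMSORT_RISK_THRESHOLD = 67_108_864;

    // True when a list is large enough that the reported bug could apply.
    static boolean isAtRisk(int size) {
        return size >= TIMSORT_RISK_THRESHOLD;
    }

    // Sorts small lists as usual and refuses lists in the risky range,
    // so the problem surfaces as a clear error instead of an
    // ArrayIndexOutOfBoundsException from inside the sort.
    static <T> void sortChecked(List<T> list, Comparator<? super T> comparator) {
        if (isAtRisk(list.size())) {
            throw new IllegalArgumentException(
                "List of size " + list.size() + " is too large to sort safely with Collections.sort");
        }
        Collections.sort(list, comparator);
    }
}
```

For the five call sites listed above this check would never fire; it only
matters if someone later uses Collections.sort on record-sized data.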

On Thu, Feb 26, 2015 at 1:00 AM, Yash Sharma  wrote:

> As pointed out on the Hadoop mailing list -
>
> The OpenJDK’s java.utils.Collection.sort() is broken - such that the
> default TimSort implementation would cause ArrayIndexOutOfBoundsException
> for number of elements larger than 67108864.
>
> I wonder if we can have such a huge collection in Drill and might hit this
> bug ?
> We do have Collections.sort used in multiple places
> including DrillTextRecordReader but do we need to consider workaround for
> this ?
>
> Thoughts ?
>
> Links:
> http://envisage-project.eu/timsort-specification-and-verification/
>
> https://bugs.openjdk.java.net/browse/JDK-8072909
>



-- 
 Steven Phillips
 Software Engineer

 mapr.com


Re: OpenJDK’s java.util.Collections.sort() Bug

2015-02-26 Thread Yash Sharma
Makes sense.
We just need to make sure that we don't use Collections.sort for sorting
actual data. As long as we don't, we should never hit this bug.

On Thu, Feb 26, 2015 at 4:28 PM, Steven Phillips 
wrote:

> It looks like we are using the method in 5 different places in drill. We
> are using to sort lists of: files, drillbit endpoints, workunits, operator
> profiles, and columnIds.
>
> I can't imagine we are ever going to need to sort millions of those. So
> probably no need to worry about this bug.
>
> But we should keep it in mind for any future code that might want to use
> it.
>
> On Thu, Feb 26, 2015 at 1:00 AM, Yash Sharma  wrote:
>
> > As pointed out on the Hadoop mailing list -
> >
> > The OpenJDK’s java.utils.Collection.sort() is broken - such that the
> > default TimSort implementation would cause ArrayIndexOutOfBoundsException
> > for number of elements larger than 67108864.
> >
> > I wonder if we can have such a huge collection in Drill and might hit
> this
> > bug ?
> > We do have Collections.sort used in multiple places
> > including DrillTextRecordReader but do we need to consider workaround for
> > this ?
> >
> > Thoughts ?
> >
> > Links:
> > http://envisage-project.eu/timsort-specification-and-verification/
> >
> > https://bugs.openjdk.java.net/browse/JDK-8072909
> >
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>


[jira] [Resolved] (DRILL-2238) Unsupported join on OR yields rough CannotPlanException (vs. cleaner unsupported-feature message)

2015-02-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-2238.
--
Resolution: Fixed

> Unsupported join on OR  yields rough CannotPlanException (vs. cleaner 
> unsupported-feature message)
> --
>
> Key: DRILL-2238
> URL: https://issues.apache.org/jira/browse/DRILL-2238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, SQL Parser
>Reporter: Daniel Barclay (Drill)
>Assignee: Sean Hsuan-Yi Chu
>Priority: Minor
> Fix For: 0.9.0
>
>
> The unsupported combination of joining using a condition with an OR yields an 
> internal CannotPlanException (with a dump of the plan):
> org.apache.drill.exec.rpc.RpcException: RelOptPlanner.CannotPlanException: 
> Node [rel#4217:Subset#3.LOGICAL.ANY([]).[]] could not be implemented; planner 
> state:
> Root: rel#4217:Subset#3.LOGICAL.ANY([]).[]
> Original rel:
> ...
> Eventually this should instead use some kind of unsupported-combination error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2155) Subquery in projection list that returns scalar result throws an exception

2015-02-26 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-2155.
--
Resolution: Fixed

> Subquery in projection list that returns scalar result throws an exception
> --
>
> Key: DRILL-2155
> URL: https://issues.apache.org/jira/browse/DRILL-2155
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.8.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 0.9.0
>
>
> Since there is no correlation, it shouldn't even plan ...
> I think we should throw the same error as in DRILL-1921 (cross join)
> {code}
> 0: jdbc:drill:schema=dfs> select c_integer, (select current_timestamp from t1 
> limit 1) from t1;
> Query failed: AssertionError: must call validate first
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs> select c_integer, (select current_timestamp from t1 
> limit 1) b from t1;
> Query failed: AssertionError: must call validate first
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> drillbit.log
> {code}
> 2015-02-03 22:33:16,008 [2b2eb354-2084-7da7-4e33-0debc7a90921:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: must call validate first
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:197) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.AssertionError: must call validate first
> at 
> org.eigenbase.sql.validate.IdentifierNamespace.resolve(IdentifierNamespace.java:139)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:1686)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:494)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2657)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQueryOrInList(SqlToRelConverter.java:1267)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertExists(SqlToRelConverter.java:1240)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.substituteSubquery(SqlToRelConverter.java:1003)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.replaceSubqueries(SqlToRelConverter.java:859)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3279)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:519)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:474)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2657)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.eigenbase.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:432)
>  ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> net.hydromatic.optiq.prepare.PlannerImpl.convert(PlannerImpl.java:186) 
> ~[optiq-core-0.9-drill-r18.jar:na]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:149)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:126)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:145)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>

[jira] [Resolved] (DRILL-1746) Query profile should contain node/drillbit information where query ran

2015-02-26 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau resolved DRILL-1746.
---
Resolution: Fixed

> Query profile should contain node/drillbit information where query ran
> -
>
> Key: DRILL-1746
> URL: https://issues.apache.org/jira/browse/DRILL-1746
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 0.6.0
>Reporter: Aman Sinha
>Assignee: Jacques Nadeau
>Priority: Minor
> Fix For: 0.8.0
>
>
> The query profile currently has information on the major and minor fragments. 
>  We should also include node/drillbit information on which the minor fragment 
> was executed.  This will help with diagnosability.  Field engineers have 
> asked for this feature. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30910: DRILL-2178: Update outgoing record batch size and allocation in PartitionSender

2015-02-26 Thread Jacques Nadeau

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30910/#review74281
---

Ship it!

- Jacques Nadeau


On Feb. 12, 2015, 9:52 p.m., Venki Korukanti wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30910/
> ---
> 
> (Updated Feb. 12, 2015, 9:52 p.m.)
> 
> 
> Review request for drill and Steven Phillips.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Please see the JIRA DRILL-2178 for details
> 
> 
> Diffs
> -
> 
>   exec/java-exec/src/main/codegen/templates/FixedValueVectors.java 52a3868 
>   exec/java-exec/src/main/codegen/templates/NullableValueVectors.java ba7c629 
>   exec/java-exec/src/main/codegen/templates/RepeatedValueVectors.java d39040e 
>   exec/java-exec/src/main/codegen/templates/VariableLengthVectors.java 
> aa5b702 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/materialize/QueryWritableBatch.java
>  cef4101 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
>  f09acaa 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
>  4292c09 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/record/FragmentWritableBatch.java
>  d122311 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/record/MaterializedField.java
>  bcc226f 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/BitVector.java 
> f6644bd 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ObjectVector.java 
> 3c15db3 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/RepeatedVector.java 
> b23ee02 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/ValueVector.java 
> df6a486 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/MapVector.java
>  c5dc5ba 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/RepeatedListVector.java
>  3078f4e 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/RepeatedMapVector.java
>  6dce363 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/record/vector/TestValueVector.java
>  2bed433 
> 
> Diff: https://reviews.apache.org/r/30910/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Venki Korukanti
> 
>



[jira] [Resolved] (DRILL-2315) Documentation update

2015-02-26 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore resolved DRILL-2315.
---
   Resolution: Fixed
Fix Version/s: 0.7.0

Resolved by git commit 
[d959a2|https://fisheye6.atlassian.com/changelog/incubator-drill?cs=d959a210053f02b5069f0a0cb9f0d34131640ffb]
 and SVN revision 
[1662344|https://svn.apache.org/viewvc?view=revision&revision=1662344].

> Documentation update
> 
>
> Key: DRILL-2315
> URL: https://issues.apache.org/jira/browse/DRILL-2315
> Project: Apache Drill
>  Issue Type: Task
>  Components: Documentation
>Reporter: Aditya Kishore
>Assignee: Kristine Hahn
> Fix For: 0.7.0
>
>
> Corrections to Confluence conversions; review request #31400 made through 
> Review Board.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30466: DRILL-133: LocalExchange (planning and parallelization)

2015-02-26 Thread Venki Korukanti

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30466/
---

(Updated Feb. 26, 2015, 6:15 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, and Steven Phillips.


Changes
---

Addressed review comments by Jacques and Yuliya.


Repository: drill-git


Description
---

In this patch LocalExchange contains only the multiplexing exchange. I am 
currently working on the demultiplexing exchange (there are a few failures in 
execution, which I am debugging). Sending this patch to get initial feedback.

Brief overview of changes:
1. Traverse the PRel tree after all optimizations. Whenever a 
HashToRandomExchangePrel is encountered, insert a MuxExchange before the 
HashToRandomExchangePrel and a DemuxExchange after it.
2. Parallelization changes: 
   i) Traverse the physical operator tree and divide it into Fragments.
   ii) Based on the affinity of the sending Exchange, set parallelization 
dependencies between fragments. 
   iii) Start parallelizing from the leaf fragments (fragments that have no 
other fragments depending on them for parallelization info). Stats collection 
includes collecting parallelization info (minWidth, maxWidth, affinityMap) and 
cost.
3. Change the Receiver to accept a set of (minorFragmentId, DrillbitEndpoint) 
pairs as the sender list. This also involved a few changes in DataCollector.
4. Change SingleSender to accept a custom minorFragmentId instead of the 
default minorFragmentId of 0.
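
Step 1 above can be pictured with a much simplified sketch. It assumes a flat 
list of operator names ordered from upstream to downstream as a stand-in for 
the PRel tree; the class and method names are hypothetical and the real 
visitor works on tree nodes, not strings.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration; operators are plain strings here.
final class LocalExchangeInsertionSketch {
    // Wraps every HashToRandomExchange with a MuxExchange on its input
    // side and a DemuxExchange on its output side.
    static List<String> insertLocalExchanges(List<String> pipeline) {
        List<String> out = new ArrayList<>();
        for (String op : pipeline) {
            if ("HashToRandomExchange".equals(op)) {
                out.add("MuxExchange");
                out.add(op);
                out.add("DemuxExchange");
            } else {
                out.add(op);
            }
        }
        return out;
    }
}
```

For example, the pipeline [Scan, HashToRandomExchange, HashJoin] would become 
[Scan, MuxExchange, HashToRandomExchange, DemuxExchange, HashJoin] under this 
sketch.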


Diffs (updated)
-

  contrib/storage-hbase/src/test/java/org/apache/drill/hbase/BaseHBaseTest.java 
1152b7b 
  
contrib/storage-hive/core/src/test/java/org/apache/drill/exec/hive/HiveTestBase.java
 7e3b6c8 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/EndpointAffinity.java
 df31f74 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/MinorFragmentEndpoint.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractExchange.java
 73280ea 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScan.java
 5d0d9bf 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractReceiver.java
 f621a26 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractSender.java
 53a0721 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/Exchange.java 
7be7f20 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalOperatorUtil.java
 dfcb113 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/Receiver.java 
0c67770 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/Sender.java 
bbd1b2c 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/AbstractDeMuxExchange.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/AbstractMuxExchange.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/BroadcastExchange.java
 73a1d20 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/BroadcastSender.java
 1827367 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashPartitionSender.java
 bdb1362 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashToMergeExchange.java
 f62d922 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/HashToRandomExchange.java
 fac374b 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/MergingReceiverPOP.java
 f5dca1a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/OrderedPartitionExchange.java
 8e1526a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/OrderedPartitionSender.java
 0a2b9be 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/RangeSender.java
 c8c8f43 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/Screen.java 
58c8e29 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/SingleMergeExchange.java
 2914112 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/SingleSender.java
 4a11a51 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/UnionExchange.java
 cfc21ac 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/UnorderedDeMuxExchange.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/UnorderedMuxExchange.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/UnorderedReceiver.java
 3a4dd0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SingleSenderCreator.java
 6db9f4a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/broadcastsender/BroadcastSenderRootExec.java
 22fa047 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/mergereceiver/MergingRecordBatch.java

[jira] [Created] (DRILL-2321) FlattenRecordBatch should transfer vectors honoring output field reference.

2015-02-26 Thread Hanifi Gunes (JIRA)
Hanifi Gunes created DRILL-2321:
---

 Summary: FlattenRecordBatch should transfer vectors honoring 
output field reference.
 Key: DRILL-2321
 URL: https://issues.apache.org/jira/browse/DRILL-2321
 Project: Apache Drill
  Issue Type: Bug
Reporter: Hanifi Gunes
Assignee: Hanifi Gunes


The current implementation of FlattenRecordBatch does not create transfer pairs 
with the target vector named with the desired field reference. This seems 
to work fine with repeated types, as the upstream project or ArrayPathSegment 
does the renaming implicitly. Overall, relying on upstream makes the code 
fragile. Also, DRILL-2150 requires this patch as it alters the way repeated 
vectors are modeled. This patch proposes to implement a less fragile transfer 
pair handling via explicit naming.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2322) CSV record reader should log which file and which record caused an error in the reader

2015-02-26 Thread Ramana Inukonda Nagaraj (JIRA)
Ramana Inukonda Nagaraj created DRILL-2322:
--

 Summary: CSV record reader should log which file and which record 
caused an error in the reader
 Key: DRILL-2322
 URL: https://issues.apache.org/jira/browse/DRILL-2322
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Text & CSV
Affects Versions: 0.8.0
Reporter: Ramana Inukonda Nagaraj
Assignee: Hanifi Gunes


I believe the title is self-explanatory.
If the text reader fails for any reason because of an offending record, Drill 
should log which file (if there are multiple files) and which line/record the 
error occurs at. This will improve debugging when dealing with large files or 
a large number of files.
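
A minimal sketch of the kind of error context being asked for, assuming a 
reader that processes one record per line. The class, the exception type and 
the toy parsing logic are all hypothetical and are not Drill's text reader.

```java
import java.util.List;

// Hypothetical illustration of attaching file and record position to errors.
final class RecordErrorContextSketch {
    // Hypothetical exception type carrying the file name and record number.
    static final class RecordParseException extends RuntimeException {
        RecordParseException(String file, long recordNumber, Throwable cause) {
            super("Error in file '" + file + "' at record " + recordNumber
                + ": " + cause.getMessage(), cause);
        }
    }

    // Toy "reader": sums the second comma-separated field of each line.
    // Any failure is rethrown with the file name and 1-based record number.
    static int sumSecondColumn(String fileName, List<String> lines) {
        long recordNumber = 0;
        int sum = 0;
        for (String line : lines) {
            recordNumber++;
            try {
                String[] fields = line.split(",");
                sum += Integer.parseInt(fields[1].trim());
            } catch (RuntimeException e) {
                throw new RecordParseException(fileName, recordNumber, e);
            }
        }
        return sum;
    }
}
```

The point of the sketch is only that the position is tracked alongside the 
read loop, so the message can name both the file and the offending record.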




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2323) CLONE - Parquet record reader should log which file and which record caused an error in the reader

2015-02-26 Thread Ramana Inukonda Nagaraj (JIRA)
Ramana Inukonda Nagaraj created DRILL-2323:
--

 Summary: CLONE - Parquet record reader should log which file and 
which record caused an error in the reader
 Key: DRILL-2323
 URL: https://issues.apache.org/jira/browse/DRILL-2323
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Text & CSV
Affects Versions: 0.8.0
Reporter: Ramana Inukonda Nagaraj
Assignee: Hanifi Gunes


I believe the title is self-explanatory.
If the text reader fails for any reason because of an offending record, Drill 
should log which file (if there are multiple files) and which line/record the 
error occurs at. This will improve debugging when dealing with large files or 
a large number of files.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30327: Drill-1545 v2 allowing custom extensions on json files

2015-02-26 Thread Jason Altekruse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30327/#review74349
---



exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONFormatPlugin.java


Looking at the docs, I think that Include.NON_DEFAULT only works on 
primitives. It doesn't specifically mention lists, but it does mention that it 
will always include a map. I think the safest way is just to create two 
separate methods: one specifically to check whether we have only the default 
member of the list, returning null in that case, which will only be used for 
Jackson (we will also have to turn off populating nulls; I think it is 
currently turned on), and a separate method that returns the list even if it 
is the default, for use in the other parts of the code.
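
Roughly what I have in mind, as a plain Java sketch without the Jackson 
annotations. The class name, the method names and the default value "json" 
are assumptions for illustration; in the real plugin config the first getter 
would presumably be the one exposed to Jackson and the second one hidden from 
it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative config class; names and the default value are assumptions.
final class JsonFormatConfigSketch {
    static final List<String> DEFAULT_EXTENSIONS = Collections.singletonList("json");

    private final List<String> extensions;

    JsonFormatConfigSketch(List<String> extensions) {
        this.extensions = (extensions == null || extensions.isEmpty())
            ? DEFAULT_EXTENSIONS
            : Collections.unmodifiableList(new ArrayList<>(extensions));
    }

    // Intended for the serializer only: returns null when the list holds
    // just the default, so a serializer that skips nulls omits the field.
    List<String> getExtensionsForSerialization() {
        return DEFAULT_EXTENSIONS.equals(extensions) ? null : extensions;
    }

    // Intended for the rest of the code: always returns the effective
    // list, including the default.
    List<String> getExtensions() {
        return extensions;
    }
}
```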


- Jason Altekruse


On Jan. 27, 2015, 9:56 p.m., Jason Altekruse wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30327/
> ---
> 
> (Updated Jan. 27, 2015, 9:56 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Bugs: DRILL-1545
> https://issues.apache.org/jira/browse/DRILL-1545
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Patch adds support for custom extensions on JSON files. There was a previous 
> review request for this issue, but there were some problems while testing the 
> patch in a shared cluster environment and with compressed JSON files. I was 
> not the owner of the previous review request so I could not upload this there.
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/BasicFormatMatcher.java
>  2ba2910 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONFormatPlugin.java
>  d41243d 
>   exec/java-exec/src/main/resources/bootstrap-storage-plugins.json 6bf1872 
> 
> Diff: https://reviews.apache.org/r/30327/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Altekruse
> 
>



[jira] [Created] (DRILL-2324) fix broken links in 0.7 docs

2015-02-26 Thread Kristine Hahn (JIRA)
Kristine Hahn created DRILL-2324:


 Summary: fix broken links in 0.7 docs
 Key: DRILL-2324
 URL: https://issues.apache.org/jira/browse/DRILL-2324
 Project: Apache Drill
  Issue Type: Task
  Components: Documentation
Affects Versions: 0.7.0
Reporter: Kristine Hahn
Assignee: Kristine Hahn
 Fix For: 0.7.0


When we moved files to the Drill site, links that worked on the personal 
staging site no longer work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2325) conf/drill-override-example.conf is outdated

2015-02-26 Thread Zhiyong Liu (JIRA)
Zhiyong Liu created DRILL-2325:
--

 Summary: conf/drill-override-example.conf is outdated
 Key: DRILL-2325
 URL: https://issues.apache.org/jira/browse/DRILL-2325
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Zhiyong Liu
Assignee: Steven Phillips


The conf/drill-override-example.conf file is outdated.  Properties have been 
added (e.g., compile), removed (e.g., cache.hazel.subnets) or otherwise 
modified.

The file is statically tracked in 
distribution/src/resources/drill-override-example.conf.  Ideally there should 
be a way to update the file programmatically when things change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31507: DRILL-2289: User email is still pointing to the old ( incubator.apache.org) should be u...@drill.apache.org

2015-02-26 Thread Bridget Bridget

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31507/
---

Review request for drill and Aditya Kishore.


Repository: drill-git


Description
---

URL : http://drill.apache.org/faq/
FAQ: "How can I ask questions and provide feedback?" lists the following
email address: drill-u...@incubator.apache.org. It should be u...@drill.apache.org.


Diffs
-

  faq.md 0c805cc 

Diff: https://reviews.apache.org/r/31507/diff/


Testing
---


Thanks,

Bridget Bridget



[jira] [Created] (DRILL-2326) scalar replacement fails in TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical()

2015-02-26 Thread Chris Westin (JIRA)
Chris Westin created DRILL-2326:
---

 Summary: scalar replacement fails in 
TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical()
 Key: DRILL-2326
 URL: https://issues.apache.org/jira/browse/DRILL-2326
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Codegen
Affects Versions: 0.8.0
Reporter: Chris Westin
Assignee: Chris Westin


For now, we've worked around this by using a retry strategy in ClassTransformer 
which will fall back to using the code without scalar replacement when the 
scalar replacement fails. This needs to be revisited to find out what is 
failing about this particular case.
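
The shape of the workaround is roughly the following. This is a generic 
sketch, not the actual ClassTransformer code; the class and method names are 
hypothetical, and the two suppliers stand in for "generate with scalar 
replacement" and "generate without scalar replacement".

```java
import java.util.function.Supplier;

// Generic fallback pattern; not Drill's ClassTransformer.
final class FallbackSketch {
    // Tries the preferred path first and falls back to the plain path
    // if the preferred one fails at runtime.
    static <T> T withFallback(Supplier<T> preferred, Supplier<T> fallback) {
        try {
            return preferred.get();
        } catch (RuntimeException e) {
            // A real implementation would presumably log the failure here
            // so that the underlying problem can still be investigated.
            return fallback.get();
        }
    }
}
```

The drawback, and the reason for this issue, is that such a fallback hides 
the original failure unless it is recorded somewhere.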



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30622: DRILL-1953: alter session set `store.json.all_text_mode` does not work as documented

2015-02-26 Thread Jason Altekruse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30622/#review74386
---

Ship it!

- Jason Altekruse


On Feb. 5, 2015, 8:40 p.m., abdelhakim deneche wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30622/
> ---
> 
> (Updated Feb. 5, 2015, 8:40 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Jason Altekruse.
> 
> 
> Bugs: DRILL-1953
> https://issues.apache.org/jira/browse/DRILL-1953
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Setting *store.json.all_text_mode* on the *SESSION* level doesn't have any 
> effect on *JSONRecordReader*
> 
> The same problem was found and fixed in *MongoRecordReader*
> 
> 
> Diffs
> -
> 
>   
> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
>  4b73600 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java
>  83a89df 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
>  0070d18 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
>  449f091 
>   exec/java-exec/src/test/resources/jsoninput/big_numeric.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30622/diff/
> 
> 
> Testing
> ---
> 
> added new unit test to test the fix
> all unit tests pass
> both *Functional* and *TPCH SF100* tests pass
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>



Re: Review Request 30313: Fixing Mongo join issue when * is selected

2015-02-26 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30313/#review74406
---



contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java


This does not seem quite right to me. What if multiple writes fail? Also 
note that this will change the order of records being written into vectors.



contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java


DRILL-1960 handles auto-reallocation of buffers. I don't think we will ever 
hit this check because of out-of-memory issues. We should remove this.


- Hanifi Gunes


On Jan. 27, 2015, 10:27 a.m., Kamesh B wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30313/
> ---
> 
> (Updated Jan. 27, 2015, 10:27 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Jinfeng Ni.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Fixing Mongo join issue when * is selected.
> 
> 
> Diffs
> -
> 
>   
> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
>  4b7360057e9d95fb0176b035469bca02128a0b34 
> 
> Diff: https://reviews.apache.org/r/30313/diff/
> 
> 
> Testing
> ---
> 
> Tested the patch by selecting * in a join operation. It is working fine. This 
> patch also fixes some of the issues caused by earlier commits.
> 
> 
> Thanks,
> 
> Kamesh B
> 
>



[jira] [Created] (DRILL-2327) Upper bound on join's row_count_estimate_factor should be increased to handle expanding joins

2015-02-26 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-2327:
-

 Summary: Upper bound on join's row_count_estimate_factor should be 
increased to handle expanding joins
 Key: DRILL-2327
 URL: https://issues.apache.org/jira/browse/DRILL-2327
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: Aman Sinha
Assignee: Aman Sinha


The current bounds for planner.join.row_count_estimate_factor are 0 to 100. 
The default value is 1.0.  This parameter determines the estimated output 
cardinality of a join.  For hugely expanding joins, this is inadequate and we 
need to allow a substantially larger upper bound. 
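
To see why a cap of 100 can be too small, here is a rough sketch. It assumes, 
purely for illustration, that the estimate is the factor multiplied by the 
larger input row count; that formula and all names below are assumptions, not 
something stated in this issue.

```java
// Illustrative only; the estimation formula is an assumption.
final class JoinEstimateSketch {
    static final double MIN_FACTOR = 0.0;
    static final double MAX_FACTOR = 100.0; // current upper bound from this issue

    // Estimated join output rows under the assumed formula.
    static double estimateJoinRows(double leftRows, double rightRows, double factor) {
        if (factor < MIN_FACTOR || factor > MAX_FACTOR) {
            throw new IllegalArgumentException("factor out of range: " + factor);
        }
        return factor * Math.max(leftRows, rightRows);
    }

    // Factor that would be needed to match an observed output row count.
    static double factorNeeded(double leftRows, double rightRows, double actualRows) {
        return actualRows / Math.max(leftRows, rightRows);
    }
}
```

Under that assumption, two 1,000-row inputs that expand to 1,000,000 output 
rows would need a factor of 1,000, which the current bound does not allow.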



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 31507: DRILL-2289: User email is still pointing to the old ( incubator.apache.org) should be u...@drill.apache.org

2015-02-26 Thread Aditya Kishore

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31507/#review74412
---

Ship it!


Ship It!

- Aditya Kishore


On Feb. 26, 2015, 3 p.m., Bridget Bridget wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31507/
> ---
> 
> (Updated Feb. 26, 2015, 3 p.m.)
> 
> 
> Review request for drill and Aditya Kishore.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> URL : http://drill.apache.org/faq/
> FAQ:"How can I ask questions and provide feedback?" has the following
> Email drill-u...@incubator.apache.org should be u...@drill.apache.org address
> 
> 
> Diffs
> -
> 
>   faq.md 0c805cc 
> 
> Diff: https://reviews.apache.org/r/31507/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bridget Bridget
> 
>



[jira] [Created] (DRILL-2328) Concat operator returns wrong result when one of the operands is NULL

2015-02-26 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-2328:
---

 Summary: Concat operator returns wrong result when one of the 
operands is NULL
 Key: DRILL-2328
 URL: https://issues.apache.org/jira/browse/DRILL-2328
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 0.8.0
Reporter: Victoria Markman
Assignee: Sean Hsuan-Yi Chu
Priority: Critical


Queries below should return NULL:

{code}
0: jdbc:drill:schema=dfs> select cast(null as varchar(10)) || '--' from t1;
+------------+
|   EXPR$0   |
+------------+
| --         |
| --         |
| --         |
| --         |
| --         |
| --         |
| --         |
| --         |
| --         |
| --         |
+------------+
10 rows selected (0.09 seconds)
0: jdbc:drill:schema=dfs> select a1 || '--' from t1 where a1 is null;
+------------+
|   EXPR$0   |
+------------+
| --         |
+------------+
1 row selected (0.105 seconds)
{code}

This looks harmless at first, but a very common pattern in many customer 
queries will be broken: grouping on an expression built with '||', as follows:

{code}
select
cast(extract(day from c_timestamp) as varchar(10)) || '-' || 
cast(extract(month from c_timestamp) as varchar(10)) || '-' || 
cast(extract(year from c_timestamp) as varchar(10)),
sum(c_integer)  as sum1
from
alltypes_with_nulls
group by
cast(extract(day from c_timestamp) as varchar(10)) || '-' || 
cast(extract(month from c_timestamp) as varchar(10)) || '-' || 
cast(extract(year from c_timestamp) as varchar(10))
order by
cast(extract(day from c_timestamp) as varchar(10)) || '-' || 
cast(extract(month from c_timestamp) as varchar(10)) || '-' || 
cast(extract(year from c_timestamp) as varchar(10))
;
{code}
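
For reference, the null-propagating behavior expected from '||' can be sketched in a few lines of Java. This is only an illustration of the expected semantics under the assumption that a SQL NULL is modeled as a Java null; the class and method names are hypothetical and this is not Drill's implementation.

```java
// Sketch of null-propagating concatenation, as expected from SQL '||'.
// NullConcat and concat are hypothetical names, not Drill code.
final class NullConcat {
    /** Returns null if either operand is null (SQL NULL), else the joined string. */
    static String concat(String left, String right) {
        if (left == null || right == null) {
            return null; // NULL || anything is NULL
        }
        return left + right;
    }

    public static void main(String[] args) {
        System.out.println(concat(null, "--")); // prints "null"
        System.out.println(concat("a1", "--")); // prints "a1--"
    }
}
```

With these semantics, rows whose timestamp is NULL would fall into a single NULL group instead of a '--' group.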






Re: Review Request 31459: Drill-2316

2015-02-26 Thread Kristine Hahn

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31459/
---

(Updated Feb. 27, 2015, 12:50 a.m.)


Review request for drill and Bridget Bridget.


Bugs: DRILL-2316
https://issues.apache.org/jira/browse/DRILL-2316


Repository: drill-git


Description
---

hive, parquet, json references, basics tutorial, misc updates


Diffs (updated)
-

  _docs/009-datasources.md PRE-CREATION 
  _docs/010-dev-custom-func.md PRE-CREATION 
  _docs/011-manage.md PRE-CREATION 
  _docs/012-develop.md PRE-CREATION 
  _docs/013-rn.md PRE-CREATION 
  _docs/014-contribute.md PRE-CREATION 
  _docs/015-sample-ds.md PRE-CREATION 
  _docs/016-design.md PRE-CREATION 
  _docs/018-progress.md PRE-CREATION 
  _docs/019-bylaws.md PRE-CREATION 
  _docs/connect/005-reg-hive.md 564bebc 
  _docs/connect/007-mongo-plugin.md fd5dba8 
  _docs/data-sources/001-hive-types.md PRE-CREATION 
  _docs/data-sources/002-hive-udf.md PRE-CREATION 
  _docs/data-sources/003-parquet-ref.md PRE-CREATION 
  _docs/data-sources/004-json-ref.md PRE-CREATION 
  _docs/dev-custom-fcn/002-dev-aggregate.md d1a3cfb 
  _docs/img/Untitled.png 7fea1e87cd7721dd501b05e76f73ab9fb1116916 
  _docs/img/json-workaround.png PRE-CREATION 
  _docs/install/001-drill-in-10.md 13d2410 
  _docs/interfaces/001-odbc-win.md 2f08af2 
  _docs/interfaces/odbc-win/003-connect-odbc-win.md 0d4cb8a 
  _docs/interfaces/odbc-win/004-tableau-examples.md d45f3f3 
  _docs/manage/002-start-stop.md 76a76f4 
  _docs/manage/003-ports.md df1d362 
  _docs/manage/conf/002-startup-opt.md 3434401 
  _docs/manage/conf/003-plan-exec.md ea67e2d 
  _docs/manage/conf/004-persist-conf.md b1deefa 
  _docs/query/001-get-started.md PRE-CREATION 
  _docs/query/001-query-fs.md ca488fb 
  _docs/query/002-query-fs.md PRE-CREATION 
  _docs/query/002-query-hbase.md d2a33d5 
  _docs/query/003-query-complex.md 537d7b4 
  _docs/query/003-query-hbase.md PRE-CREATION 
  _docs/query/004-query-complex.md PRE-CREATION 
  _docs/query/004-query-hive.md 903c7c6 
  _docs/query/005-query-hive.md PRE-CREATION 
  _docs/query/005-query-info-skema.md 1ad0008 
  _docs/query/006-query-info-skema.md PRE-CREATION 
  _docs/query/006-query-sys-tbl.md 9b853ec 
  _docs/query/007-query-sys-tbl.md PRE-CREATION 
  _docs/query/get-started/001-lesson1-connect.md PRE-CREATION 
  _docs/query/get-started/002-lesson2-download.md PRE-CREATION 
  _docs/query/get-started/003-lesson3-plugin.md PRE-CREATION 
  _docs/sql-ref/003-functions.md 4769257 
  _docs/sql-ref/005-cmd-summary.md 13d9515 
  _docs/sql-ref/006-reserved-wds.md a19f73c 
  _docs/sql-ref/data-types/001-date.md 6340e35 
  _docs/tutorial/005-lesson3.md f6c7ae4 

Diff: https://reviews.apache.org/r/31459/diff/


Testing
---


Thanks,

Kristine Hahn



[jira] [Created] (DRILL-2329) TPCDS query 95 and simplified variant fail to execute

2015-02-26 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-2329:
--

 Summary: TPCDS query 95 and simplified variant fail to execute
 Key: DRILL-2329
 URL: https://issues.apache.org/jira/browse/DRILL-2329
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 0.8.0
Reporter: Abhishek Girish
Assignee: Jinfeng Ni


TPCDS query 95 (attached) fails to validate. 

*A simplified variant of the query (may not have much semantics) that fails:*
{code:sql}
WITH abc AS
(
   SELECT *
   FROM   web_sales)
SELECT
 Count(DISTINCT ws_order_number) AS a1 ,
 Sum(ws_ext_ship_cost)   AS a2 
FROM web_sales ws 
WHERE ws.ws_ship_addr_sk = 1
AND  ws.ws_web_site_sk = web_site_sk
ORDER BY count(DISTINCT ws_order_number)
LIMIT 100;
{code}

*Error:*
Query failed: SqlValidatorException: Aggregate expression is illegal in ORDER 
BY clause of non-aggregating SELECT







[jira] [Resolved] (DRILL-2289) User email is still pointing to the old ( incubator.apache.org) should be u...@drill.apache.org.

2015-02-26 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-2289.
---
Resolution: Fixed

Changed email in FAQ on the Drill website and updated links in the Drill 
Confluence wiki.

> User email is still pointing to the old ( incubator.apache.org) should be 
> u...@drill.apache.org.
> 
>
> Key: DRILL-2289
> URL: https://issues.apache.org/jira/browse/DRILL-2289
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Brahma Reddy Battula
>Assignee: Bridget Bevens
>
> Not sure who to direct this to, but:
> 1) URL : http://drill.apache.org/faq/
> FAQ:"How can I ask questions and provide feedback?" has the following
> How can I ask questions and provide feedback?
> Please post your questions and feedback on  
> *{color:red}drill-u...@incubator.apache.org.{color}*  We are happy to have 
> you try out Drill and help with any questions!
> I feel, It should be  *{color:green}u...@drill.apache.org address{color}* 
> 2) URL : 
> https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+Contribution+Ideas
> Section : Fixing JIRAs
> This is a good place to begin if you are new to Drill. Feel free to pick 
> issues from the Drill JIRA list. When you pick an issue, assign it to 
> yourself, inform the team, and start fixing it.
> For any questions, seek help from the team by sending email to  
> *{color:red}drill-...@incubator.apache.org.{color}* 
> I feel, It should be  *{color:green}dev@drill.apache.org address{color}* 





[jira] [Created] (DRILL-2330) Add support for nested aggregate expressions

2015-02-26 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-2330:
--

 Summary: Add support for nested aggregate expressions
 Key: DRILL-2330
 URL: https://issues.apache.org/jira/browse/DRILL-2330
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Affects Versions: 0.8.0
Reporter: Abhishek Girish
Assignee: Jinfeng Ni
 Fix For: Future
 Attachments: drillbit.log

Aggregate expressions currently cannot be nested. 

*The following query fails to validate:*
{code:sql}
select avg(sum(i_item_sk)) from item;
{code}

Error:
Query failed: SqlValidatorException: Aggregate expressions cannot be nested

Log attached. 

Reference: TPCDS queries (20, 63, 98, ...) fail to execute.






Review Request 31519: fix broken links in 0.7 docs

2015-02-26 Thread Kristine Hahn

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31519/
---

Review request for drill and Bridget Bridget.


Bugs: DRILL-2324
https://issues.apache.org/jira/browse/DRILL-2324


Repository: drill-git


Description
---

fix broken links in 0.7 docs


Diffs
-

  _docs/001-arch.md 0905ad3 
  _docs/002-tutorial.md 14cae80 
  _docs/003-yelp.md b65359e 
  _docs/006-interfaces.md ce068a6 
  _docs/012-rn.md f369335 
  _docs/013-contribute.md 33db231 
  _docs/014-sample-ds.md 7212ea0 
  _docs/015-design.md 00b17e5 
  _docs/016-progress.md bf19a29 
  _docs/arch/arch-hilite/001-flexibility.md 0b5c5e3 
  _docs/connect/005-reg-hive.md 564bebc 
  _docs/connect/006-default-frmt.md 7dc55d5 
  _docs/connect/007-mongo-plugin.md fd5dba8 
  _docs/contribute/001-guidelines.md 686d972 
  _docs/dev-custom-fcn/001-dev-simple.md ebf3831 
  _docs/dev-custom-fcn/002-dev-aggregate.md d1a3cfb 
  _docs/develop/001-compile.md 2cf6ac9 
  _docs/develop/003-patch-tool.md 3ef3fe5 
  _docs/install/001-drill-in-10.md 13d2410 
  _docs/install/002-deploy.md eecd3bc 
  _docs/install/004-install-distributed.md d0f07aa 
  _docs/install/install-embedded/001-install-linux.md b7a0c85 
  _docs/install/install-embedded/002-install-mac.md a288b94 
  _docs/install/install-embedded/003-install-win.md 6680019 
  _docs/interfaces/001-odbc-win.md 2f08af2 
  _docs/interfaces/003-jdbc-squirrel.md 99eba80 
  _docs/interfaces/odbc-linux/001-install-odbc-linux.md 3ae1930 
  _docs/interfaces/odbc-linux/002-install-odbc-mac.md 65a35f3 
  _docs/interfaces/odbc-linux/003-odbc-connections-linux.md 11b660d 
  _docs/interfaces/odbc-linux/005-odbc-connect-str.md 595432b 
  _docs/interfaces/odbc-win/001-install-odbc-win.md 5bb6c8d 
  _docs/interfaces/odbc-win/002-conf-odbc-win.md 636bd9f 
  _docs/interfaces/odbc-win/003-connect-odbc-win.md 0d4cb8a 
  _docs/interfaces/odbc-win/004-tableau-examples.md d45f3f3 
  _docs/manage/conf/001-mem-alloc.md 8f98cfc 
  _docs/manage/conf/002-startup-opt.md 3434401 
  _docs/manage/conf/004-persist-conf.md b1deefa 
  _docs/query/001-query-fs.md ca488fb 
  _docs/query/002-query-hbase.md d2a33d5 
  _docs/query/004-query-hive.md 903c7c6 
  _docs/query/006-query-sys-tbl.md 9b853ec 
  _docs/query/query-fs/002-query-parquet.md cf19fcf 
  _docs/rn/004-0.6.0-rn.md f121ebe 
  _docs/sql-ref/001-data-types.md e425033 
  _docs/sql-ref/002-operators.md 79afc7d 
  _docs/sql-ref/003-functions.md 4769257 
  _docs/sql-ref/004-nest-functions.md 09fe91e 
  _docs/sql-ref/005-cmd-summary.md 13d9515 
  _docs/sql-ref/nested/001-flatten.md 2769000 
  _docs/sql-ref/nested/002-kvgen.md f619864 
  _docs/sql-ref/nested/003-repeated-cnt.md 2b332b3 
  _docs/tutorial/002-get2kno-sb.md 9b11b9d 
  _docs/tutorial/003-lesson1.md 119d67f 
  _docs/tutorial/004-lesson2.md 73c4329 
  _docs/tutorial/005-lesson3.md f6c7ae4 
  _docs/tutorial/006-summary.md 552d72f 
  _docs/tutorial/install-sandbox/001-install-mapr-vm.md 9c0e19e 
  _docs/tutorial/install-sandbox/002-install-mapr-vb.md 36b10c0 

Diff: https://reviews.apache.org/r/31519/diff/


Testing
---


Thanks,

Kristine Hahn



[jira] [Created] (DRILL-2331) fix broken links in 0.8/1.0 docs

2015-02-26 Thread Kristine Hahn (JIRA)
Kristine Hahn created DRILL-2331:


 Summary: fix broken links in 0.8/1.0 docs
 Key: DRILL-2331
 URL: https://issues.apache.org/jira/browse/DRILL-2331
 Project: Apache Drill
  Issue Type: Task
  Components: Documentation
Reporter: Kristine Hahn
Assignee: Kristine Hahn


When we moved files to the Drill site, links that worked on the personal 
staging site no longer work.





Re: Review Request 31519: fix broken links in 0.7 docs

2015-02-26 Thread Bridget Bridget

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31519/#review74421
---

Ship it!


looks good

- Bridget Bridget


On Feb. 27, 2015, 1:30 a.m., Kristine Hahn wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31519/
> ---
> 
> (Updated Feb. 27, 2015, 1:30 a.m.)
> 
> 
> Review request for drill and Bridget Bridget.
> 
> 
> Bugs: DRILL-2324
> https://issues.apache.org/jira/browse/DRILL-2324
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> fix broken links in 0.7 docs
> 
> 
> Diffs
> -
> 
>   _docs/001-arch.md 0905ad3 
>   _docs/002-tutorial.md 14cae80 
>   _docs/003-yelp.md b65359e 
>   _docs/006-interfaces.md ce068a6 
>   _docs/012-rn.md f369335 
>   _docs/013-contribute.md 33db231 
>   _docs/014-sample-ds.md 7212ea0 
>   _docs/015-design.md 00b17e5 
>   _docs/016-progress.md bf19a29 
>   _docs/arch/arch-hilite/001-flexibility.md 0b5c5e3 
>   _docs/connect/005-reg-hive.md 564bebc 
>   _docs/connect/006-default-frmt.md 7dc55d5 
>   _docs/connect/007-mongo-plugin.md fd5dba8 
>   _docs/contribute/001-guidelines.md 686d972 
>   _docs/dev-custom-fcn/001-dev-simple.md ebf3831 
>   _docs/dev-custom-fcn/002-dev-aggregate.md d1a3cfb 
>   _docs/develop/001-compile.md 2cf6ac9 
>   _docs/develop/003-patch-tool.md 3ef3fe5 
>   _docs/install/001-drill-in-10.md 13d2410 
>   _docs/install/002-deploy.md eecd3bc 
>   _docs/install/004-install-distributed.md d0f07aa 
>   _docs/install/install-embedded/001-install-linux.md b7a0c85 
>   _docs/install/install-embedded/002-install-mac.md a288b94 
>   _docs/install/install-embedded/003-install-win.md 6680019 
>   _docs/interfaces/001-odbc-win.md 2f08af2 
>   _docs/interfaces/003-jdbc-squirrel.md 99eba80 
>   _docs/interfaces/odbc-linux/001-install-odbc-linux.md 3ae1930 
>   _docs/interfaces/odbc-linux/002-install-odbc-mac.md 65a35f3 
>   _docs/interfaces/odbc-linux/003-odbc-connections-linux.md 11b660d 
>   _docs/interfaces/odbc-linux/005-odbc-connect-str.md 595432b 
>   _docs/interfaces/odbc-win/001-install-odbc-win.md 5bb6c8d 
>   _docs/interfaces/odbc-win/002-conf-odbc-win.md 636bd9f 
>   _docs/interfaces/odbc-win/003-connect-odbc-win.md 0d4cb8a 
>   _docs/interfaces/odbc-win/004-tableau-examples.md d45f3f3 
>   _docs/manage/conf/001-mem-alloc.md 8f98cfc 
>   _docs/manage/conf/002-startup-opt.md 3434401 
>   _docs/manage/conf/004-persist-conf.md b1deefa 
>   _docs/query/001-query-fs.md ca488fb 
>   _docs/query/002-query-hbase.md d2a33d5 
>   _docs/query/004-query-hive.md 903c7c6 
>   _docs/query/006-query-sys-tbl.md 9b853ec 
>   _docs/query/query-fs/002-query-parquet.md cf19fcf 
>   _docs/rn/004-0.6.0-rn.md f121ebe 
>   _docs/sql-ref/001-data-types.md e425033 
>   _docs/sql-ref/002-operators.md 79afc7d 
>   _docs/sql-ref/003-functions.md 4769257 
>   _docs/sql-ref/004-nest-functions.md 09fe91e 
>   _docs/sql-ref/005-cmd-summary.md 13d9515 
>   _docs/sql-ref/nested/001-flatten.md 2769000 
>   _docs/sql-ref/nested/002-kvgen.md f619864 
>   _docs/sql-ref/nested/003-repeated-cnt.md 2b332b3 
>   _docs/tutorial/002-get2kno-sb.md 9b11b9d 
>   _docs/tutorial/003-lesson1.md 119d67f 
>   _docs/tutorial/004-lesson2.md 73c4329 
>   _docs/tutorial/005-lesson3.md f6c7ae4 
>   _docs/tutorial/006-summary.md 552d72f 
>   _docs/tutorial/install-sandbox/001-install-mapr-vm.md 9c0e19e 
>   _docs/tutorial/install-sandbox/002-install-mapr-vb.md 36b10c0 
> 
> Diff: https://reviews.apache.org/r/31519/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kristine Hahn
> 
>



Re: Review Request 31519: fix broken links in 0.7 docs

2015-02-26 Thread Bridget Bridget

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31519/#review74422
---

Ship it!


Ship It!

- Bridget Bridget


On Feb. 27, 2015, 1:30 a.m., Kristine Hahn wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31519/
> ---
> 
> (Updated Feb. 27, 2015, 1:30 a.m.)
> 
> 
> Review request for drill and Bridget Bridget.
> 
> 
> Bugs: DRILL-2324
> https://issues.apache.org/jira/browse/DRILL-2324
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> fix broken links in 0.7 docs
> 
> 
> Diffs
> -
> 
>   _docs/001-arch.md 0905ad3 
>   _docs/002-tutorial.md 14cae80 
>   _docs/003-yelp.md b65359e 
>   _docs/006-interfaces.md ce068a6 
>   _docs/012-rn.md f369335 
>   _docs/013-contribute.md 33db231 
>   _docs/014-sample-ds.md 7212ea0 
>   _docs/015-design.md 00b17e5 
>   _docs/016-progress.md bf19a29 
>   _docs/arch/arch-hilite/001-flexibility.md 0b5c5e3 
>   _docs/connect/005-reg-hive.md 564bebc 
>   _docs/connect/006-default-frmt.md 7dc55d5 
>   _docs/connect/007-mongo-plugin.md fd5dba8 
>   _docs/contribute/001-guidelines.md 686d972 
>   _docs/dev-custom-fcn/001-dev-simple.md ebf3831 
>   _docs/dev-custom-fcn/002-dev-aggregate.md d1a3cfb 
>   _docs/develop/001-compile.md 2cf6ac9 
>   _docs/develop/003-patch-tool.md 3ef3fe5 
>   _docs/install/001-drill-in-10.md 13d2410 
>   _docs/install/002-deploy.md eecd3bc 
>   _docs/install/004-install-distributed.md d0f07aa 
>   _docs/install/install-embedded/001-install-linux.md b7a0c85 
>   _docs/install/install-embedded/002-install-mac.md a288b94 
>   _docs/install/install-embedded/003-install-win.md 6680019 
>   _docs/interfaces/001-odbc-win.md 2f08af2 
>   _docs/interfaces/003-jdbc-squirrel.md 99eba80 
>   _docs/interfaces/odbc-linux/001-install-odbc-linux.md 3ae1930 
>   _docs/interfaces/odbc-linux/002-install-odbc-mac.md 65a35f3 
>   _docs/interfaces/odbc-linux/003-odbc-connections-linux.md 11b660d 
>   _docs/interfaces/odbc-linux/005-odbc-connect-str.md 595432b 
>   _docs/interfaces/odbc-win/001-install-odbc-win.md 5bb6c8d 
>   _docs/interfaces/odbc-win/002-conf-odbc-win.md 636bd9f 
>   _docs/interfaces/odbc-win/003-connect-odbc-win.md 0d4cb8a 
>   _docs/interfaces/odbc-win/004-tableau-examples.md d45f3f3 
>   _docs/manage/conf/001-mem-alloc.md 8f98cfc 
>   _docs/manage/conf/002-startup-opt.md 3434401 
>   _docs/manage/conf/004-persist-conf.md b1deefa 
>   _docs/query/001-query-fs.md ca488fb 
>   _docs/query/002-query-hbase.md d2a33d5 
>   _docs/query/004-query-hive.md 903c7c6 
>   _docs/query/006-query-sys-tbl.md 9b853ec 
>   _docs/query/query-fs/002-query-parquet.md cf19fcf 
>   _docs/rn/004-0.6.0-rn.md f121ebe 
>   _docs/sql-ref/001-data-types.md e425033 
>   _docs/sql-ref/002-operators.md 79afc7d 
>   _docs/sql-ref/003-functions.md 4769257 
>   _docs/sql-ref/004-nest-functions.md 09fe91e 
>   _docs/sql-ref/005-cmd-summary.md 13d9515 
>   _docs/sql-ref/nested/001-flatten.md 2769000 
>   _docs/sql-ref/nested/002-kvgen.md f619864 
>   _docs/sql-ref/nested/003-repeated-cnt.md 2b332b3 
>   _docs/tutorial/002-get2kno-sb.md 9b11b9d 
>   _docs/tutorial/003-lesson1.md 119d67f 
>   _docs/tutorial/004-lesson2.md 73c4329 
>   _docs/tutorial/005-lesson3.md f6c7ae4 
>   _docs/tutorial/006-summary.md 552d72f 
>   _docs/tutorial/install-sandbox/001-install-mapr-vm.md 9c0e19e 
>   _docs/tutorial/install-sandbox/002-install-mapr-vb.md 36b10c0 
> 
> Diff: https://reviews.apache.org/r/31519/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kristine Hahn
> 
>



understanding groupCount & valueCount in repeated vectors

2015-02-26 Thread Hanifi Gunes
Hey everyone,

Scalar ValueVector (VV) types implement the getValueCount method, which returns
the number of "value"s stored in the vector. I would expect the same to be
true for RepeatedVVs as well. However, getValueCount on repeated types
reports the number of inner/sub-values stored, and a separate method
called groupCount reports the actual number of "value"s stored.

This becomes really confusing and somewhat inconsistent (especially for
RepeatedList), as one would expect #getValueCount to report the number
of values regardless of whether the stored value type is nested or flat.

As part of DRILL-2150, I am refactoring VVs so that getValueCount
universally returns the number of values stored. Alongside this, I plan to
introduce a new method getCellCount that reports the total number of
sub-values/cells stored in a repeated vector.
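
To make the distinction concrete, here is a minimal sketch, assuming a simplified offsets-based layout in which offsets[i] and offsets[i + 1] delimit the sub-values of value i. RepeatedIntModel and its fields are hypothetical and are not the actual Drill classes; only the two method names come from the proposal above.

```java
// Hypothetical model of a repeated vector; not the actual Drill class layout.
final class RepeatedIntModel {
    private final int[] offsets; // length = number of values + 1
    private final int[] cells;   // all sub-values, stored contiguously

    RepeatedIntModel(int[] offsets, int[] cells) {
        this.offsets = offsets;
        this.cells = cells;
    }

    /** Number of top-level values (what groupCount reports today). */
    int getValueCount() {
        return offsets.length - 1;
    }

    /** Total number of sub-values/cells across all top-level values. */
    int getCellCount() {
        return offsets[offsets.length - 1];
    }

    public static void main(String[] args) {
        // Three values: [1, 2], [], [3, 4, 5]
        RepeatedIntModel v = new RepeatedIntModel(
            new int[] {0, 2, 2, 5}, new int[] {1, 2, 3, 4, 5});
        System.out.println(v.getValueCount()); // prints 3
        System.out.println(v.getCellCount());  // prints 5
    }
}
```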

I'd like to probe if anyone has any concerns relating to this. Please let
me know.


Thanks.
-Hanifi


Review Request 31521: fix broken links in 0.8/1.0 docs

2015-02-26 Thread Kristine Hahn

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31521/
---

Review request for drill and Bridget Bridget.


Bugs: DRILL-2331
https://issues.apache.org/jira/browse/DRILL-2331


Repository: drill-git


Description
---

When we moved files to the Drill site, links that worked on the personal 
staging site no longer work.


Diffs
-

  _docs/001-arch.md 0905ad3 
  _docs/002-tutorial.md 14cae80 
  _docs/003-yelp.md b65359e 
  _docs/006-interfaces.md ce068a6 
  _docs/012-rn.md f369335 
  _docs/013-contribute.md 33db231 
  _docs/014-sample-ds.md 7212ea0 
  _docs/015-design.md 00b17e5 
  _docs/016-progress.md bf19a29 
  _docs/arch/arch-hilite/001-flexibility.md 0b5c5e3 
  _docs/connect/005-reg-hive.md 564bebc 
  _docs/connect/006-default-frmt.md 7dc55d5 
  _docs/connect/007-mongo-plugin.md fd5dba8 
  _docs/contribute/001-guidelines.md 686d972 
  _docs/dev-custom-fcn/001-dev-simple.md ebf3831 
  _docs/dev-custom-fcn/002-dev-aggregate.md d1a3cfb 
  _docs/develop/001-compile.md 2cf6ac9 
  _docs/develop/003-patch-tool.md 3ef3fe5 
  _docs/install/001-drill-in-10.md 13d2410 
  _docs/install/002-deploy.md eecd3bc 
  _docs/install/004-install-distributed.md d0f07aa 
  _docs/install/install-embedded/001-install-linux.md b7a0c85 
  _docs/install/install-embedded/002-install-mac.md a288b94 
  _docs/install/install-embedded/003-install-win.md 6680019 
  _docs/interfaces/001-odbc-win.md 2f08af2 
  _docs/interfaces/003-jdbc-squirrel.md 99eba80 
  _docs/interfaces/odbc-linux/001-install-odbc-linux.md 3ae1930 
  _docs/interfaces/odbc-linux/002-install-odbc-mac.md 65a35f3 
  _docs/interfaces/odbc-linux/003-odbc-connections-linux.md 11b660d 
  _docs/interfaces/odbc-linux/005-odbc-connect-str.md 595432b 
  _docs/interfaces/odbc-win/001-install-odbc-win.md 5bb6c8d 
  _docs/interfaces/odbc-win/002-conf-odbc-win.md 636bd9f 
  _docs/interfaces/odbc-win/003-connect-odbc-win.md 0d4cb8a 
  _docs/interfaces/odbc-win/004-tableau-examples.md d45f3f3 
  _docs/manage/conf/001-mem-alloc.md 8f98cfc 
  _docs/manage/conf/002-startup-opt.md 3434401 
  _docs/manage/conf/004-persist-conf.md b1deefa 
  _docs/query/001-query-fs.md ca488fb 
  _docs/query/002-query-hbase.md d2a33d5 
  _docs/query/004-query-hive.md 903c7c6 
  _docs/query/006-query-sys-tbl.md 9b853ec 
  _docs/query/query-fs/002-query-parquet.md cf19fcf 
  _docs/rn/004-0.6.0-rn.md f121ebe 
  _docs/sql-ref/001-data-types.md e425033 
  _docs/sql-ref/002-operators.md 79afc7d 
  _docs/sql-ref/003-functions.md 4769257 
  _docs/sql-ref/004-nest-functions.md 09fe91e 
  _docs/sql-ref/005-cmd-summary.md 13d9515 
  _docs/sql-ref/nested/001-flatten.md 2769000 
  _docs/sql-ref/nested/002-kvgen.md f619864 
  _docs/sql-ref/nested/003-repeated-cnt.md 2b332b3 
  _docs/tutorial/002-get2kno-sb.md 9b11b9d 
  _docs/tutorial/003-lesson1.md 119d67f 
  _docs/tutorial/004-lesson2.md 73c4329 
  _docs/tutorial/005-lesson3.md f6c7ae4 
  _docs/tutorial/006-summary.md 552d72f 
  _docs/tutorial/install-sandbox/001-install-mapr-vm.md 9c0e19e 
  _docs/tutorial/install-sandbox/002-install-mapr-vb.md 36b10c0 

Diff: https://reviews.apache.org/r/31521/diff/


Testing
---


Thanks,

Kristine Hahn



[jira] [Created] (DRILL-2332) Drill should be consistent with Implicit casting rules across data formats

2015-02-26 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-2332:
--

 Summary: Drill should be consistent with Implicit casting rules 
across data formats
 Key: DRILL-2332
 URL: https://issues.apache.org/jira/browse/DRILL-2332
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Reporter: Abhishek Girish
Assignee: Jinfeng Ni


Currently, the outcome of a query that filters a column by comparing it with a 
literal depends on the underlying data format. 

*Parquet*
{code:sql}
select * from date_dim where d_month_seq ='1193' limit 1;
[Succeeds]

select * from date_dim where d_date in ('1999-06-30') limit 1;
[Succeeds]
{code}

*View on top of text:*
{code:sql}
select * from date_dim where d_date in ('1999-06-30') limit 1;
Query failed: SqlValidatorException: Values passed to IN operator must have 
compatible types

Error: exception while executing query: Failure while executing query. 
(state=,code=0)

select * from date_dim where d_month_seq ='1193' limit 1;
Query failed: SqlValidatorException: Cannot apply '=' to arguments of type 
' = '. Supported form(s): ' = 
'

Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

I understand that in the case of View on Text, SQL validation fails at the 
Optiq layer. 

But from the perspective of an end-user, Drill's behavior must be consistent 
across data formats. Also having a view by definition should abstract out this 
information.

Here, both the view and parquet were created with type information. 

*Parquet-meta*
{code}
parquet-schema /mapr/abhi311/data/parquet/tpcds/scale1/date_dim/0_0_0.parquet 
message root {
  optional int32 d_date_sk;
  optional binary d_date_id (UTF8);
  optional binary d_date (UTF8);
  optional int32 d_month_seq;
  optional int32 d_week_seq;
  optional int32 d_quarter_seq;
  optional int32 d_year;
  optional int32 d_dow;
  optional int32 d_moy;
  optional int32 d_dom;
  optional int32 d_qoy;
  optional int32 d_fy_year;
  optional int32 d_fy_quarter_seq;
  optional int32 s_fy_week_seq;
  optional binary d_day_name (UTF8);
  optional binary d_quarter_name (UTF8);
  optional binary d_holiday (UTF8);
  optional binary d_weekend (UTF8);
  optional binary d_following_holiday (UTF8);
  optional int32 d_first_dom;
  optional int32 d_last_dom;
  optional int32 d_same_day_ly;
  optional int32 d_same_day_lq;
  optional binary d_current_day (UTF8);
  optional binary d_current_week (UTF8);
  optional binary d_current_month (UTF8);
  optional binary d_current_quarter (UTF8);
  optional binary d_current_year (UTF8);
}
{code}

*Describe View*
{code:sql}
> describe date_dim;
+---------------------+-----------+-------------+
| COLUMN_NAME         | DATA_TYPE | IS_NULLABLE |
+---------------------+-----------+-------------+
| d_date_sk           | INTEGER   | NO          |
| d_date_id           | VARCHAR   | NO          |
| d_date              | DATE      | NO          |
| d_month_seq         | INTEGER   | NO          |
| d_week_seq          | INTEGER   | NO          |
| d_quarter_seq       | INTEGER   | NO          |
| d_year              | INTEGER   | NO          |
| d_dow               | INTEGER   | NO          |
| d_moy               | INTEGER   | NO          |
| d_dom               | INTEGER   | NO          |
| d_qoy               | INTEGER   | NO          |
| d_fy_year           | INTEGER   | NO          |
| d_fy_quarter_seq    | INTEGER   | NO          |
| s_fy_week_seq       | INTEGER   | NO          |
| d_day_name          | VARCHAR   | NO          |
| d_quarter_name      | VARCHAR   | NO          |
| d_holiday           | VARCHAR   | NO          |
| d_weekend           | VARCHAR   | NO          |
| d_following_holiday | VARCHAR   | NO          |
| d_first_dom         | INTEGER   | NO          |
| d_last_dom          | INTEGER   | NO          |
| d_same_day_ly       | INTEGER   | NO          |
| d_same_day_lq       | INTEGER   | NO          |
| d_current_day       | VARCHAR   | NO          |
| d_current_week      | VARCHAR   | NO          |
| d_current_month     | VARCHAR   | NO          |
| d_current_quarter   | VARCHAR   | NO          |
| d_current_year      | VARCHAR   | NO          |
+---------------------+-----------+-------------+
28 rows selected (0.137 seconds)
{code}



For an end





[jira] [Created] (DRILL-2333) Parquet writer reports negative count for very large number of records

2015-02-26 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-2333:
-

 Summary: Parquet writer reports negative count for very large 
number of records
 Key: DRILL-2333
 URL: https://issues.apache.org/jira/browse/DRILL-2333
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Parquet
Reporter: Aman Sinha
Assignee: Steven Phillips


When doing large scale CTAS, the counter for keeping track of number of records 
written overflows.  See below.  The actual number of records written by each 
minor fragment was about 45B rows. 

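+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 1_1        | -1271142674               |
| 1_0        | -1276497044               |
+------------+---------------------------+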
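
The symptom is what a 32-bit counter produces once the count passes Integer.MAX_VALUE. The sketch below is only an illustration, assuming the counter is a Java int: the class name is hypothetical, and the count of 45,973,497,582 is an assumed value chosen because it is about 45B and its low 32 bits match the figure reported for fragment 1_1. It is not a measured number.

```java
// Illustration of int narrowing; RecordCounterOverflow is a hypothetical name.
final class RecordCounterOverflow {
    /** Narrowing a long to int keeps only the low 32 bits of the value. */
    static int asIntCounter(long records) {
        return (int) records;
    }

    public static void main(String[] args) {
        long assumedCount = 45_973_497_582L; // assumed true count, about 45B
        System.out.println(asIntCounter(assumedCount)); // prints -1271142674
        System.out.println(assumedCount);               // a long holds it intact
    }
}
```

Keeping the count in a long avoids the wraparound for any realistic number of records.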





[jira] [Created] (DRILL-2334) Text record reader should fail gracefully when encountering bad records

2015-02-26 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-2334:
-

 Summary: Text record reader should fail gracefully when 
encountering bad records
 Key: DRILL-2334
 URL: https://issues.apache.org/jira/browse/DRILL-2334
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Text & CSV
Reporter: Aman Sinha
Assignee: Hanifi Gunes


The attached file has 1 bad record.   Running a simple count(*) query on this 
file errors out with IOBE and/or possible schema change exception.

The hex dump of the file shows a bunch of 0's (the '*' below indicates more 
lines of 0's):
{code}
1c0 3a 35 35 2e 35 30 35 35 30 00 00 00 00 00 00 00
1d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
02a01c0 00 00 00 00 00 00 00 00 00 35 35 35 0a 35 35 35
{code}

{code}
0: jdbc:drill:zk=local> select count(*) from `badRecords2.dat`;
+------------+
|   EXPR$0   |
+------------+
Query failed: RemoteRpcException: Failure while running fragment., You tried to 
do a batch data read operation when you were in a state of STOP.  You can only 
do this type of operation when you are in a state of OK or OK_NEW_SCHEMA.
{code}

The log file also shows an IndexOutOfBoundsException (IOBE) related to this: 

{code}
18:49:00.003 [2b1024e4-5639-b4ec-392e-8d5879c3d4db:frag:0:0] DEBUG o.a.d.exec.physical.impl.ScanBatch - Failed to read the batch. Stopping...
java.lang.IndexOutOfBoundsException: index: 374, length: 2752540 (expected: range(0, 65536))
    at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1143) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:272) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
    at io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:390) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
    at io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
    at io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:651) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
    at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:481) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
    at org.apache.drill.exec.vector.RepeatedVarCharVector$Mutator.addSafe(RepeatedVarCharVector.java:451) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
    at org.apache.drill.exec.store.text.DrillTextRecordReader.next(DrillTextRecordReader.java:172) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
{code}
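
One possible shape for "failing gracefully" is to check the field length against the space left in the target buffer before copying, and to raise an error that names the offending record. The sketch below is only an illustration under that assumption: BoundedFieldWriter, BadRecordException and writeField are hypothetical names, and a plain java.nio.ByteBuffer stands in for the reader's real buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a pre-copy bounds check; not DrillTextRecordReader code.
final class BoundedFieldWriter {
    /** Hypothetical exception type that identifies the bad record. */
    static final class BadRecordException extends RuntimeException {
        BadRecordException(String message) {
            super(message);
        }
    }

    /** Copies a field into the target, or fails with a descriptive message. */
    static void writeField(ByteBuffer target, byte[] field, long recordNumber) {
        if (field.length > target.remaining()) {
            throw new BadRecordException("Record " + recordNumber
                + ": field of " + field.length
                + " bytes exceeds remaining buffer capacity of "
                + target.remaining() + " bytes");
        }
        target.put(field);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(65536);
        writeField(buffer, new byte[374], 1L); // fits
        try {
            writeField(buffer, new byte[2752540], 2L); // too large
        } catch (BadRecordException e) {
            System.out.println(e.getMessage());
        }
    }
}
```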







Re: Review Request 31107: Ability to make PartitionSender multithreaded - useful in case of LocalExchange being enabled, as it allows to deal with high volume of incoming data

2015-02-26 Thread Yuliya Feldman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31107/
---

(Updated Feb. 26, 2015, 7:19 p.m.)


Review request for drill, Chris Westin, Jacques Nadeau, Steven Phillips, and 
Venki Korukanti.


Changes
---

Added calculation of number of threads based on the cost (number of rows).

Formula is:  cost/slicetarget/#senders/threadfactor

threadfactor is set to 4 by default

An additional config param is the max number of threads, set to 32 by default

We will need to experiment with these params to find a good combination
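
As a rough illustration of the formula above, here is a minimal Java sketch. The class and method names are hypothetical and are not taken from the patch. Rounding up, the floor of one thread, and the sample slice target used in main are assumptions too, since the message only gives the division and the two defaults (thread factor 4, max 32 threads).

```java
public class SenderThreadEstimate {

    // Hypothetical helper, not Drill's API:
    // cost / sliceTarget / senders / threadFactor, clamped to [1, maxThreads].
    static int estimateThreads(double cost, long sliceTarget, int senders,
                               int threadFactor, int maxThreads) {
        double raw = cost / sliceTarget / senders / threadFactor;
        // Assumption: round up so that any non-zero cost gets at least one thread.
        int threads = (int) Math.ceil(raw);
        return Math.max(1, Math.min(threads, maxThreads));
    }

    public static void main(String[] args) {
        // 1,600,000 rows, slice target 100,000 (assumed), 2 senders, factor 4, cap 32
        System.out.println(estimateThreads(1_600_000, 100_000, 2, 4, 32));
    }
}
```

With these sample inputs the division gives 16, then 8, then 2, so two threads would be used; very large costs hit the cap of 32.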


Bugs: DRILL-2210
https://issues.apache.org/jira/browse/DRILL-2210


Repository: drill-git


Description
---

In addition to the description:

Fixed a few classes that did not handle multithreading well
Added/changed some Stats behavior to allow merging stats from multiple threads, 
since this class is not suitable for use in a multithreaded environment
Introduced a new decorator class to handle multithreading (or not), to minimize 
changes to the PartitionSenderRootExec class


Diffs (updated)
-

  exec/java-exec/src/main/java/org/apache/drill/exec/compile/CodeCompiler.java 
7cc350e 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
e413921 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorStats.java 
0e9da0e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/config/IteratorValidator.java
 64cf7c5 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/SendingAccountor.java
 7af7b65 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java
 f09acaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/Partitioner.java
 5ed9c39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerDecorator.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionerTemplate.java
 4292c09 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/Materializer.java
 961b603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 faa8546 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java 
83a89df 
  
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 aa0a5ad 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
99c6ab8 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestOptiqPlans.java
 478 

Diff: https://reviews.apache.org/r/31107/diff/


Testing
---

Still need to provide Unit Tests.

Functional tests are passing

Performance tests were run and look promising for some queries


Thanks,

Yuliya Feldman



[jira] [Created] (DRILL-2335) Error message must be updated to exclude unsupported operators when queries fail to parse

2015-02-26 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-2335:
--

 Summary: Error message must be updated to exclude unsupported 
operators when queries fail to parse
 Key: DRILL-2335
 URL: https://issues.apache.org/jira/browse/DRILL-2335
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Abhishek Girish
Assignee: Jinfeng Ni
Priority: Minor


When queries fail to parse due to errors in query syntax, an error is thrown 
with a list of expected operators, which include some which we do not support 
at present. 

I understand that the SQL validation errors come from the Calcite layer. But 
since we do not support all operators for now (for example: INTERSECT), the 
message must be updated if possible, to correctly reflect what is supported. 

This would make sure contradicting messages aren't thrown (for example: first 
complaining with parse error indicating INTERSECT is a valid operator and then 
upon correction, failing saying INTERSECT isn't supported). 

{code:sql}
Query failed: ParseException: Encountered ";" at line 1, column 89.
Was expecting one of:
 
"ORDER" ...
"LIMIT" ...
"OFFSET" ...
"FETCH" ...
"UNION" ...
"INTERSECT" ...
"EXCEPT" ...
"NOT" ...
"IN" ...
"BETWEEN" ...
"LIKE" ...
"SIMILAR" ...
"=" ...
">" ...
"<" ...
"<=" ...
">=" ...
"<>" ...
"+" ...
"-" ...
"*" ...
"/" ...
"||" ...
"AND" ...
"OR" ...
"IS" ...
"MEMBER" ...
"SUBMULTISET" ...
"MULTISET" ...
"[" ...
"OVERLAPS" ...
"YEAR" ...
"MONTH" ...
"DAY" ...
"HOUR" ...
"MINUTE" ...
"SECOND" ...
Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}









Re: Review Request 30313: Fixing Mongo join issue when * is selected

2015-02-26 Thread Kamesh B


> On Feb. 27, 2015, 12:19 a.m., Hanifi Gunes wrote:
> > contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java,
> >  line 200
> > 
> >
> > This does not seem quite right to me. What if multiple writes fail? 
> > Also note that this will change the order of records being written into 
> > vectors.

The reason for doing this is that if there is any issue while storing documents 
into value vectors, and the vectors don't have any buffers, it fails to 
write the document. But by that time, the Mongo iterator has already advanced. 
When the flow reaches the next method again, we first need to process the 
previously read document; otherwise, we will miss that document.
But I think that with DRILL-1960 we may not encounter this scenario. 

Will update the patch.
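
The situation described above, where the cursor has already advanced by the time a write fails, can be sketched roughly as follows. This is not the code from the patch: the class name is hypothetical, strings stand in for Mongo documents, and a full batch stands in for a failed vector write. The sketch keeps a single pending slot and drains it first on the next call, so records stay in cursor order.

```java
import java.util.Iterator;
import java.util.List;

public class PendingDocumentReader {
    private final Iterator<String> cursor; // stands in for the Mongo cursor
    private String pending;                // document read but not yet written

    PendingDocumentReader(Iterator<String> cursor) {
        this.cursor = cursor;
    }

    // Writes up to `capacity` documents into `out` and returns the number written.
    int next(List<String> out, int capacity) {
        int written = 0;
        while (pending != null || cursor.hasNext()) {
            // Drain the document left over from the previous call first.
            String doc = (pending != null) ? pending : cursor.next();
            pending = null;
            if (written == capacity) {
                // Simulated write failure: the cursor has already advanced,
                // so remember the document instead of dropping it.
                pending = doc;
                break;
            }
            out.add(doc);
            written++;
        }
        return written;
    }
}
```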


- Kamesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30313/#review74406
---


On Jan. 27, 2015, 10:27 a.m., Kamesh B wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30313/
> ---
> 
> (Updated Jan. 27, 2015, 10:27 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Jinfeng Ni.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Fixing Mongo join issue when * is selected.
> 
> 
> Diffs
> -
> 
>   
> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
>  4b7360057e9d95fb0176b035469bca02128a0b34 
> 
> Diff: https://reviews.apache.org/r/30313/diff/
> 
> 
> Testing
> ---
> 
> Tested the patch by selecting * in join operation. It is working fine. This 
> patch also fixes some of issues caused by earlier commits.
> 
> 
> Thanks,
> 
> Kamesh B
> 
>



Re: Review Request 31507: DRILL-2289: User email is still pointing to the old ( incubator.apache.org) should be u...@drill.apache.org

2015-02-26 Thread Ted Dunning

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31507/#review74441
---

Ship it!


Ship It!

- Ted Dunning


On Feb. 26, 2015, 11 p.m., Bridget Bridget wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31507/
> ---
> 
> (Updated Feb. 26, 2015, 11 p.m.)
> 
> 
> Review request for drill and Aditya Kishore.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> URL : http://drill.apache.org/faq/
> FAQ: "How can I ask questions and provide feedback?" has the following:
> The email address drill-u...@incubator.apache.org should be u...@drill.apache.org
> 
> 
> Diffs
> -
> 
>   faq.md 0c805cc 
> 
> Diff: https://reviews.apache.org/r/31507/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bridget Bridget
> 
>



New ideas for contribution

2015-02-26 Thread Yash Sharma
Hi Team,
I am looking for new tasks and areas for contribution.

Could someone suggest a few areas that are priorities for Drill and/or have
been left out for a while? I would start exploring them and reach out in case
of any questions.

Thanks.


Re: Review Request 31435: DRILL-2245-core: core changes for improved drillbit stability

2015-02-26 Thread Jacques Nadeau

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31435/#review74282
---



exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java


Did you reorder for a specific reason?  Ideally we should unregister from 
the coordination service before shutting things down.  In fact, we should 
probably also pause before shutting things down after unregistering.



exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java


Do you make sure that Foremans no longer leak exceptions?  This was here to 
ensure that we get an error rather than the thread terminating and the error 
showing up in standard err.



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java


Is this guaranteed to be called, or can something above prematurely end this 
method via an exception?



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java


Can we just use if (logger.isInfoEnabled()) for this?



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java


Canceled should be a terminal state from the user's perspective.  Not sure 
that happens here or if you manage this some other way.



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java


Shouldn't this be protected?



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java


Can we make this be the first exception with rest of the exceptions tacked 
on as suppressed exceptions?
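
One way to read this suggestion, as a minimal sketch: keep the first failure as the exception that is reported and attach the rest with Throwable.addSuppressed (available since Java 7). The class and method names here are hypothetical and are not part of Foreman.

```java
import java.util.List;

public class FirstWithSuppressed {

    // Returns the first failure with all later ones attached as suppressed,
    // or null when nothing failed. Hypothetical helper for illustration only.
    static Exception combine(List<Exception> failures) {
        if (failures.isEmpty()) {
            return null;
        }
        Exception first = failures.get(0);
        for (Exception later : failures.subList(1, failures.size())) {
            first.addSuppressed(later);
        }
        return first;
    }
}
```

The stack trace of the returned exception then lists the remaining failures under "Suppressed:", so none of them is lost.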



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java


Can you pick a message for the assertion so that if we hit it, we get 
something other than "null"?
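
For context, a small sketch of why the message matters. An assert without a detail expression produces an AssertionError whose getMessage() is null, while `assert cond : "text"` carries the text. The invariant and the names below are made up for illustration and are not from QueryManager.

```java
public class AssertMessageDemo {

    // Hypothetical invariant check; the detail message after ':' is what
    // shows up in logs when assertions are enabled (-ea) and the check fails.
    static void checkRemaining(int remaining) {
        assert remaining >= 0 : "remaining fragment count is negative: " + remaining;
    }

    // Renders the message the way string concatenation in a log line would.
    static String messageOf(AssertionError e) {
        return String.valueOf(e.getMessage());
    }

    public static void main(String[] args) {
        System.out.println(messageOf(new AssertionError()));
        System.out.println(messageOf(new AssertionError("remaining fragment count is negative: -1")));
    }
}
```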



exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java


No, but that is on purpose.  We keep them forever for log analysis



exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java


We need to make sure we have everything closed before returning success.  
You've removed that functionality by eating (only logging) the exceptions.  If 
a query leaks anything, I want the query to fail so it shows up in all the 
tests.  I'm okay disabling this functionality in production mode but definitely 
want it to be the case in normal development and testing.  If you disagree, we 
should discuss further because I see this as a show stopper for this patch.
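
A minimal sketch of the behavior asked for here, under stated assumptions: a hypothetical `strict` flag selects between failing on close errors (development and testing) and only logging them (production). None of these names come from FragmentExecutor, and how the mode would actually be chosen in Drill is left open.

```java
import java.util.List;

public class StrictClose {

    // Closes every resource. In strict mode the first close failure is rethrown
    // so the query fails; otherwise failures are only logged and counted.
    static int closeAll(List<? extends AutoCloseable> resources, boolean strict)
            throws Exception {
        Exception first = null;
        int failures = 0;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                failures++;
                if (first == null) {
                    first = e;
                }
            }
        }
        if (first != null) {
            if (strict) {
                throw first; // surface the leak in tests
            }
            System.err.println(failures + " close failure(s), first: " + first);
        }
        return failures;
    }
}
```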



exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java


This is a stylistic change.  I think you need to propose this on the list 
so we can come to a group consensus.  As it is, this is inconsistent with the 
whole rest of the code base.  Please discuss on the dev list before changing 
this everywhere.



exec/java-exec/src/test/java/org/apache/drill/exec/TestWithZookeeper.java


Why is this commented out?  I see no downside to having a not-yet-used logger 
in a class.


- Jacques Nadeau


On Feb. 25, 2015, 9:42 p.m., Chris Westin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31435/
> ---
> 
> (Updated Feb. 25, 2015, 9:42 p.m.)
> 
> 
> Review request for drill and Jacques Nadeau.
> 
> 
> Bugs: DRILL-2245
> https://issues.apache.org/jira/browse/DRILL-2245
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-2245-core: Clean up query setup and execution kickoff in 
> Foreman/WorkManager in order to ensure consistent handling, and avoid hangs 
> and races, with the goal of improving Drillbit robustness.
> 
> I did my best to keep these clean when I split them up, but this core 
> commit
> may depend on some minor changes in the hygiene commit that is also
> associated with this bug, so either both should be applied, or neither.
> The core commit should be applied first.
> 
> AutoCloseables
> - created org.apache.drill.common.AutoCloseables to handle closing these
>   quietly
> 
> BaseTestQuery, and derivatives
> - factored out pieces into QueryTestUtil so they can be reused
> 
> Drillbit
> - uses AutoCloseables for the WorkManager and for the storeProvider
> - allow start