[GitHub] drill pull request: DRILL-3497: Throw UserException#validationErro...

2015-09-03 Thread jaltekruse
Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/98#discussion_r38660966
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/testing/ControlsInjectionUtil.java ---
@@ -74,7 +75,7 @@ public static void setControls(final UserSession session, final String controls)
 
 final OptionManager options = session.getOptions();
 try {
-  options.getAdmin().validate(opValue);
+  SystemOptionManager.getValidator(DRILLBIT_CONTROL_INJECTIONS).validate(opValue);
--- End diff --

This could just be a reference to the validator itself, as it is also 
public and static, rather than looking it up by name.
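
A minimal sketch of the difference between the two approaches, assuming a simplified registry. SketchValidator, SketchOptions, and the option name "sketch.controls" are hypothetical stand-ins, not Drill's actual classes or option names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an option validator; not Drill's actual class.
final class SketchValidator {
  private final String optionName;

  SketchValidator(String optionName) {
    this.optionName = optionName;
  }

  String getOptionName() {
    return optionName;
  }

  // Rejects null or empty values; a real validator would check type and range.
  void validate(String value) {
    if (value == null || value.isEmpty()) {
      throw new IllegalArgumentException("Invalid value for option " + optionName);
    }
  }
}

// Hypothetical registry that exposes validators both by name and as constants.
final class SketchOptions {
  static final String CONTROLS_OPTION_NAME = "sketch.controls";

  // Public static validator, so callers can reference it directly.
  static final SketchValidator CONTROLS_VALIDATOR = new SketchValidator(CONTROLS_OPTION_NAME);

  private static final Map<String, SketchValidator> BY_NAME = new HashMap<>();
  static {
    BY_NAME.put(CONTROLS_OPTION_NAME, CONTROLS_VALIDATOR);
  }

  // Lookup by name: an extra map access that fails only at runtime on a typo.
  static SketchValidator getValidator(String name) {
    SketchValidator v = BY_NAME.get(name);
    if (v == null) {
      throw new IllegalArgumentException("Unknown option: " + name);
    }
    return v;
  }
}
```

Both SketchOptions.getValidator(SketchOptions.CONTROLS_OPTION_NAME) and SketchOptions.CONTROLS_VALIDATOR yield the same object here; the direct reference is simply checked by the compiler instead of at runtime.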


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: fix missing indirect dependency

2015-09-03 Thread julienledem
Github user julienledem closed the pull request at:

https://github.com/apache/drill/pull/121




[GitHub] drill pull request: fix missing indirect dependency

2015-09-03 Thread julienledem
Github user julienledem commented on the pull request:

https://github.com/apache/drill/pull/121#issuecomment-137532822
  
Fixed in 4b8e85ad6fb40554e6752144f09bdfb474d62d9b




[jira] [Resolved] (DRILL-3669) fix missing direct dependency

2015-09-03 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3669.

Resolution: Fixed

Fixed in 4b8e85ad6fb40554e6752144f09bdfb474d62d9b

> fix missing direct dependency
> -
>
> Key: DRILL-3669
> URL: https://issues.apache.org/jira/browse/DRILL-3669
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Jason Altekruse
> Attachments: DRILL-3669.1.patch.txt, DRILL-3669.2.patch.txt
>
>
> This prevents generating a compiling project with mvn eclipse:eclipse
> pull request here:
> https://github.com/apache/drill/pull/121/files



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Identifying the source of problematic records

2015-09-03 Thread Jason Altekruse
@Jacques,

On your point a) about expressing failures and the compilation model, I had
previously thought about using the interpreter to figure out which
expression against the current row failed, once we have caught an exception
out of some part of the complete code-generated expression evaluation. Do
you think this would address your concern? Do you think anything
more than the problematic input data and the expression that failed would
be produced by the functions in this new standardized error format?

- Jason

On Wed, Sep 2, 2015 at 8:43 PM, Jacques Nadeau  wrote:

> I'd like to propose a few things to solve this:
>
> a) Functions should be able to express failures in a standardized way. I'm
> thinking a new type of injectable and/or a certain type of exception
> (although more dangerous/possibly requires rewrite given compilation
> model).
> b) Users (session/system level) should be able to set a setting where
> function errors are handled a certain way. Options could include query
> failure, ignore + inform as warning/notice, and save records for later
> analysis (maybe in v2).
> c) Readers that have a notorious problem (e.g. Text) should support
> projection/expression pushdown so that they can create these kinds of
> errors and provide additional context as part of that.
> d) We should also implement dot drill files so that users can prescribe
> this projection/data validation process by default for files/directories
> (which would provide the same behavior as c above).
> e) We should get more serious about providing useful virtual fields.  This
> should include filename (similar to directory name).
>
> Once a record leaves an operator, I don't think we should carry any
> additional provenance with it. It would be too heavy weight as a default
> behavior.
>
>
>
>
>
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Tue, Sep 1, 2015 at 9:08 AM, Aman Sinha  wrote:
>
> > Drill can point out the filename and location of corrupted records in a
> > file but we don't have a good mechanism to deal with the following
> > scenario:
> >
> > Consider a text file with 2 records:
> > $ cat t4.csv
> > 10,2001
> > 11,http://www.cnn.com
> >
> > 0: jdbc:drill:zk=local> alter session set `exec.errors.verbose` = true;
> >
> > 0: jdbc:drill:zk=local> select cast(columns[0] as int), cast(columns[1] as bigint) from dfs.`/Users/asinha/data/t4.csv`;
> >
> > Error: SYSTEM ERROR: NumberFormatException: http://www.cnn.com
> >
> > Fragment 0:0
> >
> > [Error Id: 72aad22c-a345-4100-9a57-dcd8436105f7 on 10.250.56.140:31010]
> >
> >   (java.lang.NumberFormatException) http://www.cnn.com
> > org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeL():91
> > org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToLong():62
> > org.apache.drill.exec.test.generated.ProjectorGen1.doEval():62
> > org.apache.drill.exec.test.generated.ProjectorGen1.projectRecords():62
> > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():172
> >
> > The problem is that the user does not have a clue about the original source
> > of this error.  This is a pain point especially when dealing with thousands
> > of files.
> >
> > 1.  We can start by providing the column index where the problem occurred.
> > 2.  Can a scan batch keep track of the file it originated from?  Since the
> >  Project in the above query is pushed right above the scan, it could get
> >  the filename from the record batch (assuming we can store this piece of
> >  information).  This won't be possible for other Projects elsewhere in
> >  the plan.
> > 3.  What about the location within the file?  Unless the projection is
> >  pushed into the scan itself, I don't see a good way to provide this
> >  information.
> >
> > A related topic is how to tell Drill to ignore such records when doing a
> > query or a CTAS ?
> > That could be a separate discussion.
> >
> > Thoughts ?
> > Aman
> >
>
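
To make the options in this thread concrete, here is a minimal sketch of a per-session error policy (proposal b) combined with the file, line, and column context Aman asked for. It is only an illustration under stated assumptions: CastErrorSketch, ErrorPolicy, and CastFailure are hypothetical names, not Drill APIs, and the sketch assumes every input line contains the requested column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are Drill APIs.
final class CastErrorSketch {

  // How a function error is handled, mirroring two of the options in proposal (b).
  enum ErrorPolicy { FAIL_QUERY, SKIP_AND_WARN }

  // Context attached to a failed conversion: file, line, column, and bad input.
  static final class CastFailure extends RuntimeException {
    CastFailure(String file, int line, int column, String input) {
      super("Cannot cast '" + input + "' to BIGINT at " + file
          + ", line " + line + ", column " + column);
    }
  }

  // Warnings collected when the policy is SKIP_AND_WARN.
  final List<String> warnings = new ArrayList<>();

  // Casts one column of each CSV line to long, applying the given policy.
  // Assumes every line has at least column + 1 fields.
  List<Long> castColumn(String file, List<String> lines, int column, ErrorPolicy policy) {
    List<Long> out = new ArrayList<>();
    for (int i = 0; i < lines.size(); i++) {
      String[] fields = lines.get(i).split(",", -1);
      try {
        out.add(Long.parseLong(fields[column]));
      } catch (NumberFormatException e) {
        CastFailure failure = new CastFailure(file, i + 1, column, fields[column]);
        if (policy == ErrorPolicy.FAIL_QUERY) {
          throw failure;
        }
        // Record the problem and skip the bad record.
        warnings.add(failure.getMessage());
      }
    }
    return out;
  }
}
```

With the two records from t4.csv above, SKIP_AND_WARN keeps 2001 and records one warning that names the file, line 2, and column 1, while FAIL_QUERY throws with the same context.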


[GitHub] drill pull request: DRILL-2304: Case sensitivity - system and sess...

2015-09-03 Thread jaltekruse
Github user jaltekruse commented on the pull request:

https://github.com/apache/drill/pull/90#issuecomment-137505054
  
@sudheeshkatkam I reviewed this change while I was looking at 3497 in #98. 
Other than the few small comments over there this looks good.




Re: Apache drill jdbc driver - can i connect to a drillbit?

2015-09-03 Thread Tomer Shiran
If you want to connect to a random drillbit in the cluster you would use
ZooKeeper in the connection URL:

jdbc:drill:zk=/drill/

If you want to connect to a specific drillbit you could specify that
directly by replacing "zk=" with "drillbit="

On Thu, Sep 3, 2015 at 12:28 AM, Rajkumar Singh  wrote:

> This is a sample code snippet to connect to drill using Drill-Jdbc-all
> Driver.
>
> Class.forName("org.apache.drill.jdbc.Driver");
> Connection connection =DriverManager.getConnection("jdbc:drill:zk=
> node3.mynode.com:5181/drill/my_cluster_com-drillbits");
> Statement st = connection.createStatement();
> ResultSet rs = st.executeQuery("SELECT * from cp.`employee`");
> while(rs.next()){
> System.out.println(rs.getString(1));
> }
>
>
> Rajkumar Singh
> MapR Technologies
>
>
> > On Sep 3, 2015, at 12:50 PM, Sudip Mukherjee 
> wrote:
> >
> > Hi Devs,
> >
> > Is there a way to connect to a drillbit using the JDBC driver? Could you
> > please point me to an example if there is one?
> >
> > Thanks,
> > Sudip
> >
> >
> >
> > ***Legal Disclaimer***
> > "This communication may contain confidential and privileged material for
> the
> > sole use of the intended recipient. Any unauthorized review, use or
> distribution
> > by others is strictly prohibited. If you have received the message by
> mistake,
> > please advise the sender by reply email and delete the message. Thank
> you."
> > **
>
>


-- 
Tomer Shiran
CEO and Co-Founder, Dremio
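
A small sketch of the two URL shapes described above. DrillUrlSketch is a hypothetical helper, and the host names, ports, and ZooKeeper path are example values taken from or modeled on this thread, not required settings:

```java
// Hypothetical helper; only the two URL prefixes come from the thread above.
final class DrillUrlSketch {

  // Lets ZooKeeper pick a drillbit from the cluster registered under the given path.
  static String viaZooKeeper(String zkHostPort, String zkPath) {
    return "jdbc:drill:zk=" + zkHostPort + zkPath;
  }

  // Connects directly to one specific drillbit.
  static String viaDrillbit(String drillbitHostPort) {
    return "jdbc:drill:drillbit=" + drillbitHostPort;
  }

  public static void main(String[] args) {
    // Example values only; substitute your own hosts, ports, and cluster path.
    System.out.println(viaZooKeeper("node3.mynode.com:5181", "/drill/my_cluster_com-drillbits"));
    System.out.println(viaDrillbit("node3.mynode.com:31010"));
  }
}
```

Either string can then be passed to DriverManager.getConnection(...) once the Drill JDBC driver is on the classpath.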


[GitHub] drill pull request: DRILL-3566: PreparedStatement fix and DRILL-33...

2015-09-03 Thread dsbos
Github user dsbos closed the pull request at:

https://github.com/apache/drill/pull/111




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
GitHub user dsbos opened a pull request:

https://github.com/apache/drill/pull/143

DRILL-3566: Fix:  PreparedStatement.executeQuery() got ClassCastException.

Main:
Restored DrillResultSetImpl(...)'s statement parameter from overly
restrictive DrillStatementImpl to AvaticaStatement and removed caller
cast that was throwing.  (Relatedly, adjusted getStatement() and moved
internal casting from statement to connection.)

Added basic test of querying via PreparedStatement.  [PreparedStatementTest]
Added some case test of statement-creation methods.  [ConnectionTest]
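
The shape of the fix can be sketched roughly as follows. All class names here are hypothetical stand-ins for the statement and result-set types named above, chosen only to show why a caller-side downcast throws and why widening the parameter type avoids it:

```java
// Hypothetical class names; they only mirror the shape of the fix described above.
class BaseStatementSketch { }

class PlainStatementSketch extends BaseStatementSketch { }

class PreparedStatementSketch extends BaseStatementSketch { }

final class ResultSetSketch {
  final BaseStatementSketch statement;

  // Accepts the general statement type, so no caller-side downcast is needed.
  ResultSetSketch(BaseStatementSketch statement) {
    this.statement = statement;
  }
}

final class FactorySketch {
  // Before: the downcast fails for any statement that is not a PlainStatementSketch.
  static ResultSetSketch newResultSetWithCast(BaseStatementSketch statement) {
    return new ResultSetSketch((PlainStatementSketch) statement);
  }

  // After: the statement is passed through unchanged.
  static ResultSetSketch newResultSet(BaseStatementSketch statement) {
    return new ResultSetSketch(statement);
  }
}
```

Passing a PreparedStatementSketch to newResultSetWithCast throws ClassCastException, while newResultSet accepts it.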

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dsbos/incubator-drill bugs/drill-3566

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #143


commit 1a870538c66fa59070facce7e2a4342d9869b51e
Author: dbarclay 
Date:   2015-07-28T02:27:50Z

DRILL-3566: Fix:  PreparedStatement.executeQuery() got ClassCastException.

Main:
Restored DrillResultSetImpl(...)'s statement parameter from overly
restrictive DrillStatementImpl to AvaticaStatement and removed caller
cast that was throwing.  (Relatedly, adjusted getStatement() and moved
internal casting from statement to connection.)

Added basic test of querying via PreparedStatement.  [PreparedStatementTest]
Added some case test of statement-creation methods.  [ConnectionTest]






[jira] [Resolved] (DRILL-3348) NPE when two different window functions are used in projection list and order by clauses

2015-09-03 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3348.
--
Resolution: Fixed

> NPE when two different window functions are used in projection list and order 
> by clauses
> 
>
> Key: DRILL-3348
> URL: https://issues.apache.org/jira/browse/DRILL-3348
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_function
> Fix For: 1.2.0
>
>
> {code:sql}
> select 
> a1, 
> rank() over(partition by b1 order by a1) 
> from 
> t1 
> order by 
> row_number() over(partition by b1 order by a1);
> {code}
> {code}
> 0: jdbc:drill:schema=dfs> select a1, rank() over(partition by b1 order by a1) 
> from t1 order by row_number() over(partition by b1 order by a1);
> Error: SYSTEM ERROR: org.apache.drill.exec.work.foreman.ForemanException: 
> Unexpected exception during fragment initialization: null
> [Error Id: ba3e0fda-cc78-4650-a49b-51e4fd7d625d on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> drillbit.log
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: null
> [Error Id: ba3e0fda-cc78-4650-a49b-51e4fd7d625d on atsqa4-133.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
>  ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:738)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:840)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:782)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:784)
>  [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:893) 
> [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
> [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
> ... 4 common frames omitted
> Caused by: java.lang.NullPointerException: null
> at org.apache.calcite.rex.RexBuilder.makeCast(RexBuilder.java:465) 
> ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at org.apache.calcite.rex.RexBuilder.ensureType(RexBuilder.java:955) 
> ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1763)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3938)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3327)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:609)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:564)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2741)
>  ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:522)
>  

[GitHub] drill pull request: DRILL-3347: VARCHAR ResultSet.getObject return...

2015-09-03 Thread dsbos
GitHub user dsbos opened a pull request:

https://github.com/apache/drill/pull/144

DRILL-3347: VARCHAR ResultSet.getObject returned ...hadoop.io.Text, not String.


Core fix:
- Fixed {,Nullable}VarCharAccessor's getObject() to return String instead of
  value vector's internal org.apache.hadoop.io.Text.
- Updated unit tests (to expect only String now).
  [DatabaseMetaDataGetColumnsTest, ResultSetMetaDataTest]

Also Added getObject check in tracing proxy test.  [TracingProxyDriverTest]
Changed hard references to Hadoop's Text and JodaTime's Period to strings in
warning check in tracing proxy.  [InvocationReporterImpl]

Cleanup:
- Added @Override annotations.  [SqlAccessors]
- (Unintentionally) fixed (undetected) missing comma.  
[ValueVectorTypes.tdd]
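
As an illustration of the core fix, here is a minimal sketch of an accessor that converts an internal UTF-8 holder to java.lang.String before returning it. InternalTextSketch and VarCharAccessorSketch are hypothetical stand-ins; they are not Hadoop's Text or the actual accessor classes:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for an internal UTF-8 text holder such as Hadoop's Text.
final class InternalTextSketch {
  private final byte[] utf8;

  InternalTextSketch(String s) {
    this.utf8 = s.getBytes(StandardCharsets.UTF_8);
  }

  byte[] getBytes() {
    return utf8;
  }
}

// Hypothetical accessor; not the actual VarCharAccessor.
final class VarCharAccessorSketch {
  private final InternalTextSketch[] values;

  VarCharAccessorSketch(InternalTextSketch... values) {
    this.values = values;
  }

  // Returns a java.lang.String (or null), never the internal holder type.
  Object getObject(int index) {
    InternalTextSketch v = values[index];
    return v == null ? null : new String(v.getBytes(), StandardCharsets.UTF_8);
  }
}
```

JDBC callers can then rely on getObject() yielding a String for VARCHAR columns without depending on an internal class.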

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dsbos/incubator-drill bugs/drill-3347

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #144


commit ccd6ded4387b2b27849762f607a13fa4236351b6
Author: dbarclay 
Date:   2015-08-04T23:51:07Z

DRILL-3347: VARCHAR ResultSet.getObject returned ...hadoop.io.Text, not 
String.

Core fix:
- Fixed {,Nullable}VarCharAccessor's getObject() to return String instead of
  value vector's internal org.apache.hadoop.io.Text.
- Updated unit tests (to expect only String now).
  [DatabaseMetaDataGetColumnsTest, ResultSetMetaDataTest]

Also Added getObject check in tracing proxy test.  [TracingProxyDriverTest]
Changed hard references to Hadoop's Text and JodaTime's Period to strings in
warning check in tracing proxy.  [InvocationReporterImpl]

Cleanup:
- Added @Override annotations.  [SqlAccessors]
- (Unintentionally) fixed (undetected) missing comma.  
[ValueVectorTypes.tdd]






[GitHub] drill pull request: DRILL-3497: Throw UserException#validationErro...

2015-09-03 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/98#discussion_r38692395
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/FallbackOptionManager.java
 ---
@@ -35,61 +42,65 @@ public FallbackOptionManager(OptionManager fallback) {
 
   @Override
   public Iterator iterator() {
-return Iterables.concat(fallback, optionIterable()).iterator();
+return Iterables.concat(fallback, getLocalOptions()).iterator();
   }
 
   @Override
-  public OptionValue getOption(String name) {
-final OptionValue opt = getLocalOption(name);
-if(opt == null && fallback != null){
+  public OptionValue getOption(final String name) {
+final OptionValue value = getLocalOption(name);
+if (value == null && fallback != null) {
--- End diff --

Will add a precondition.
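
One way such a precondition could look, assuming the intent is that the fallback is never null. This is a guess at the intent, not the actual change; the types below are hypothetical and option values are reduced to plain strings:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified option lookup; values are plain strings here.
interface OptionLookupSketch {
  String getOption(String name);
}

final class FallbackLookupSketch implements OptionLookupSketch {
  private final OptionLookupSketch fallback;
  private final Map<String, String> local = new HashMap<>();

  // Precondition: the fallback must be present, so getOption needs no null check.
  FallbackLookupSketch(OptionLookupSketch fallback) {
    this.fallback = Objects.requireNonNull(fallback, "fallback must not be null");
  }

  void setLocalOption(String name, String value) {
    local.put(name, value);
  }

  // Local values win; anything not set locally is delegated to the fallback.
  @Override
  public String getOption(String name) {
    String value = local.get(name);
    return value != null ? value : fallback.getOption(name);
  }
}
```

Checking once in the constructor moves the failure to construction time and lets getOption drop the `fallback != null` test.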




[GitHub] drill pull request: DRILL-3497: Throw UserException#validationErro...

2015-09-03 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/98#discussion_r38692377
  
--- Diff: 
common/src/main/java/org/apache/drill/common/map/CaseInsensitiveMap.java ---
@@ -0,0 +1,141 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.common.map;
+
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Maps;
+
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * A special type of {@link Map} with {@link String}s as keys, and the 
case of a key is ignored for operations involving
+ * keys like {@link #put}, {@link #get}, etc. The keys are stored and 
retrieved in lower case. Use the static methods to
+ * create instances of this class (e.g. {@link #newConcurrentMap}).
+ *
+ * @param  the type of values to be stored in the map
+ */
+public class CaseInsensitiveMap implements Map {
+
+  /**
+   * Returns a new instance of {@link java.util.concurrent.ConcurrentMap} 
with key case-insensitivity. See
+   * {@link java.util.concurrent.ConcurrentMap}.
+   *
+   * @param  type of values to be stored in the map
+   * @return key case-insensitive concurrent map
+   */
+  public static  CaseInsensitiveMap newConcurrentMap() {
+return new CaseInsensitiveMap<>(Maps.newConcurrentMap());
+  }
+
+  /**
+   * Returns a new instance of {@link java.util.HashMap} with key 
case-insensitivity. See {@link java.util.HashMap}.
+   *
+   * @param  type of values to be stored in the map
+   * @return key case-insensitive hash map
+   */
+  public static  CaseInsensitiveMap newHashMap() {
+return new CaseInsensitiveMap<>(Maps.newHashMap());
+  }
+
+  /**
+   * Returns a new instance of {@link ImmutableMap} with key 
case-insensitivity. This map is built from the given
+   * map. See {@link ImmutableMap}.
+   *
+   * @param map map to copy from
+   * @param  type of values to be stored in the map
+   * @return key case-insensitive immutable map
+   */
+  public static  CaseInsensitiveMap newImmutableMap(final 
Map map) {
+final ImmutableMap.Builder builder = 
ImmutableMap.builder();
+for (final Entry entry : 
map.entrySet()) {
+  builder.put(entry.getKey().toLowerCase(), entry.getValue());
+}
+return new CaseInsensitiveMap<>(builder.build());
+  }
+
+  private final Map underlyingMap;
+
+  protected CaseInsensitiveMap(final Map underlyingMap) {
+this.underlyingMap = underlyingMap;
--- End diff --

That's not possible because the ctor is protected. I'll make the ctor 
private, and add appropriate comments.
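
A simplified sketch of the private-constructor arrangement. CaseInsensitiveMapSketch is hypothetical: it wraps a map instead of implementing java.util.Map, and it lower-cases keys with Locale.ROOT, which is an assumption rather than what the patch does:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch; it wraps a map rather than implementing java.util.Map.
final class CaseInsensitiveMapSketch<V> {
  private final Map<String, V> underlying;

  // Private: instances are created only through the static factories below,
  // so callers cannot pass in a map that already holds mixed-case keys.
  private CaseInsensitiveMapSketch(Map<String, V> underlying) {
    this.underlying = underlying;
  }

  static <V> CaseInsensitiveMapSketch<V> newHashMap() {
    return new CaseInsensitiveMapSketch<>(new HashMap<String, V>());
  }

  static <V> CaseInsensitiveMapSketch<V> newConcurrentMap() {
    return new CaseInsensitiveMapSketch<>(new ConcurrentHashMap<String, V>());
  }

  // Keys are lower-cased on every operation, so lookups ignore case.
  V put(String key, V value) {
    return underlying.put(key.toLowerCase(Locale.ROOT), value);
  }

  V get(String key) {
    return underlying.get(key.toLowerCase(Locale.ROOT));
  }

  boolean containsKey(String key) {
    return underlying.containsKey(key.toLowerCase(Locale.ROOT));
  }

  int size() {
    return underlying.size();
  }
}
```

Because every instance starts from an empty map created inside a factory, all stored keys are guaranteed to be lower case.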




Re: Apache drill jdbc driver - can i connect to a drillbit?

2015-09-03 Thread Rajkumar Singh
Here is a sample code snippet that connects to Drill using the drill-jdbc-all driver.

// Load the Drill JDBC driver (requires the drill-jdbc-all JAR on the classpath).
Class.forName("org.apache.drill.jdbc.Driver");
// try-with-resources closes the result set, statement, and connection.
try (Connection connection = DriverManager.getConnection(
         "jdbc:drill:zk=node3.mynode.com:5181/drill/my_cluster_com-drillbits");
     Statement st = connection.createStatement();
     ResultSet rs = st.executeQuery("SELECT * from cp.`employee`")) {
  while (rs.next()) {
    System.out.println(rs.getString(1));
  }
}


Rajkumar Singh
MapR Technologies


> On Sep 3, 2015, at 12:50 PM, Sudip Mukherjee  wrote:
> 
> Hi Devs,
> 
> Is there a way to connect to a drillbit using the JDBC driver? Could you please 
> point me to an example if there is one?
> 
> Thanks,
> Sudip
> 
> 
> 
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material for the
> sole use of the intended recipient. Any unauthorized review, use or 
> distribution
> by others is strictly prohibited. If you have received the message by mistake,
> please advise the sender by reply email and delete the message. Thank you."
> **



Apache drill jdbc driver - can i connect to a drillbit?

2015-09-03 Thread Sudip Mukherjee
Hi Devs,

Is there a way to connect to a drillbit using the JDBC driver? Could you please 
point me to an example if there is one?

Thanks,
Sudip



***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**

[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread adeneche
Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38705073
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillJdbc41Factory.java ---
@@ -103,9 +103,9 @@ public DrillResultSetImpl newResultSet(AvaticaStatement 
statement,
  TimeZone timeZone) {
 final ResultSetMetaData metaData =
 newResultSetMetaData(statement, prepareResult.getColumnList());
-return new DrillResultSetImpl( (DrillStatementImpl) statement,
-   (DrillPrepareResult) prepareResult,
-   metaData, timeZone);
+return new DrillResultSetImpl(statement,
+  (DrillPrepareResult) prepareResult,
--- End diff --

The cast is redundant here; DrillResultSetImpl() expects an 
AvaticaPrepareResult object.




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread adeneche
Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38706070
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -1334,8 +1335,6 @@ public String getQueryId() throws SQLException {
 
   @Override
   protected DrillResultSetImpl execute() throws SQLException{
-DrillConnectionImpl connection = (DrillConnectionImpl) statement.getConnection();
-
 connection.getClient().runQuery(QueryType.SQL, this.prepareResult.getSql(),
--- End diff --

Why not use `this.client` instead of calling `connection.getClient()`?




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread adeneche
Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38706036
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -87,12 +88,12 @@
   boolean hasPendingCancelationNotification;
 
 
-  DrillResultSetImpl(DrillStatementImpl statement, AvaticaPrepareResult prepareResult,
+  DrillResultSetImpl(AvaticaStatement statement, AvaticaPrepareResult prepareResult,
  ResultSetMetaData resultSetMetaData, TimeZone timeZone) {
 super(statement, prepareResult, resultSetMetaData, timeZone);
-this.statement = statement;
+connection = (DrillConnectionImpl) statement.getConnection();
 final int batchQueueThrottlingThreshold =
-this.getStatement().getConnection().getClient().getConfig().getInt(
+connection.getClient().getConfig().getInt(
--- End diff --

If you move the line `this.client = client;` before this line, you should be 
able to reuse `this.client` instead of calling `connection.getClient()`.
You can also reuse `this.connection` instead of creating another instance 
`c` below.
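
A rough sketch of the suggestion: assign the field early in the constructor and reuse it afterwards. ClientSketch, ConnectionSketch, and ResultSetFieldsSketch are hypothetical types that only mirror the shape of the constructor under review, and the threshold value is made up:

```java
// Hypothetical types that only mirror the shape of the constructor under review.
final class ClientSketch {
  int getBatchThreshold() {
    return 100; // made-up value for illustration
  }
}

final class ConnectionSketch {
  private final ClientSketch client = new ClientSketch();
  int getClientCalls = 0; // counts accessor calls, for illustration only

  ClientSketch getClient() {
    getClientCalls++;
    return client;
  }
}

final class ResultSetFieldsSketch {
  private final ConnectionSketch connection;
  private final ClientSketch client;
  private final int threshold;

  ResultSetFieldsSketch(ConnectionSketch connection) {
    this.connection = connection;
    // Assign the field first, then reuse it instead of calling getClient() again.
    this.client = connection.getClient();
    this.threshold = this.client.getBatchThreshold();
  }

  int execute() {
    // Later methods use the cached field as well.
    return client.getBatchThreshold() + threshold;
  }
}
```

In this sketch the accessor is called exactly once, no matter how often execute() runs.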




[GitHub] drill pull request: DRILL-3455: If fragments on unregistered Drill...

2015-09-03 Thread sudheeshkatkam
GitHub user sudheeshkatkam opened a pull request:

https://github.com/apache/drill/pull/145

DRILL-3455: If fragments on unregistered Drillbits finished successfu…

…lly, do not fail the query

+ DRILL-3448: Flipped the atLeastOneFailure condition in QueryManager
+ fixes in DrillbitStatusListener interface
+ logs from implementations of DrillbitStatusListener
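
The intended behavior can be sketched as below, assuming the rule is that a query fails only when a fragment on an unregistered Drillbit has not finished. UnregisteredDrillbitSketch and its Fragment type are a hypothetical model, not the actual QueryManager logic:

```java
import java.util.List;
import java.util.Set;

// Hypothetical model; not the actual QueryManager logic.
final class UnregisteredDrillbitSketch {

  // A fragment, the drillbit it ran on, and whether it finished successfully.
  static final class Fragment {
    final String drillbit;
    final boolean finished;

    Fragment(String drillbit, boolean finished) {
      this.drillbit = drillbit;
      this.finished = finished;
    }
  }

  // The query fails only if some fragment on an unregistered drillbit has not finished.
  static boolean shouldFailQuery(List<Fragment> fragments, Set<String> unregistered) {
    for (Fragment f : fragments) {
      if (unregistered.contains(f.drillbit) && !f.finished) {
        return true;
      }
    }
    return false;
  }
}
```

If every fragment on the departed Drillbit already finished, shouldFailQuery returns false and the query is allowed to complete.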

Already got a +1 on [RB](https://reviews.apache.org/r/36208/), rebased on 
master. @adeneche can you review/ merge this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sudheeshkatkam/drill DRILL-3455

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/145.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #145


commit 8016ee3b4f9850c1ede8bdc06b2365710a59b195
Author: Sudheesh Katkam 
Date:   2015-07-23T00:16:29Z

DRILL-3455: If fragments on unregistered Drillbits finished successfully, 
do not fail the query

+ DRILL-3448: Flipped the atLeastOneFailure condition in QueryManager
+ fixes in DrillbitStatusListener interface
+ logs from implementations of DrillbitStatusListener






[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38708736
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillJdbc41Factory.java ---
@@ -103,9 +103,9 @@ public DrillResultSetImpl newResultSet(AvaticaStatement 
statement,
  TimeZone timeZone) {
 final ResultSetMetaData metaData =
 newResultSetMetaData(statement, prepareResult.getColumnList());
-return new DrillResultSetImpl( (DrillStatementImpl) statement,
-   (DrillPrepareResult) prepareResult,
-   metaData, timeZone);
+return new DrillResultSetImpl(statement,
+  (DrillPrepareResult) prepareResult,
--- End diff --

Simplified.




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38710996
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -1334,8 +1335,6 @@ public String getQueryId() throws SQLException {
 
   @Override
   protected DrillResultSetImpl execute() throws SQLException{
-DrillConnectionImpl connection = (DrillConnectionImpl) statement.getConnection();
-
 connection.getClient().runQuery(QueryType.SQL, this.prepareResult.getSql(),
--- End diff --

I don't know why the code was like that.  (Maybe to be symmetric with 
connection.getDriver(), or maybe from before client was kept?)

Eliminated redundant call to getClient().




[GitHub] drill pull request: DRILL-3455: If fragments on unregistered Drill...

2015-09-03 Thread adeneche
Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/145#issuecomment-137599425
  
+1 LGTM (pending test results)




[jira] [Resolved] (DRILL-3287) Changing session level parameter back to the default value does not change it's status back to DEFAULT

2015-09-03 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman resolved DRILL-3287.
-
Resolution: Fixed

Resolved in 1.2.0, see comments in DRILL-3122

> Changing session level parameter back to the default value does not change 
> it's status back to DEFAULT
> --
>
> Key: DRILL-3287
> URL: https://issues.apache.org/jira/browse/DRILL-3287
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Victoria Markman
>Assignee: Sudheesh Katkam
> Fix For: 1.2.0
>
>
> Initial state:
> {code}
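> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |  type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
> +-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
> 1 row selected (0.247 seconds)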
> {code}
> I changed session parameter:
> {code}
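> 0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashjoin` = false;
> +-------+-----------------------------------+
> |  ok   |              summary              |
> +-------+-----------------------------------+
> | true  | planner.enable_hashjoin updated.  |
> +-------+-----------------------------------+
> 1 row selected (0.1 seconds)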
> {code}
> So far, so good: it appears on changed options list: 
> {code}
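> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |   type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM   | CHANGED  | null     | null        | true      | null       |
> | planner.enable_hashjoin           | BOOLEAN  | SESSION  | CHANGED  | null     | null        | false     | null       |
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> 2 rows selected (0.133 seconds)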
> {code}
> I changed session parameter back to its default value:
> {code}
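> 0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashjoin` = true;
> +-------+-----------------------------------+
> |  ok   |              summary              |
> +-------+-----------------------------------+
> | true  | planner.enable_hashjoin updated.  |
> +-------+-----------------------------------+
> 1 row selected (0.096 seconds)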
> {code}
> {color:red} It still appears on changed list, even though it has default value:{color}
> {code}
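> 0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> |               name                |   kind   |   type   |  status  | num_val  | string_val  | bool_val  | float_val  |
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> | planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM   | CHANGED  | null     | null        | true      | null       |
> | planner.enable_hashjoin           | BOOLEAN  | SESSION  | CHANGED  | null     | null        | true      | null       |
> +-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
> 2 rows selected (0.124 seconds)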
> {code}
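
The expected behavior described in the report can be sketched roughly as follows. This is only an illustration, assuming a hypothetical SessionOptionsSketch class with registerDefault, set and statusOf methods; none of these names come from Drill, and the actual fix is the one discussed in DRILL-3122. The idea is that an option counts as CHANGED only while its value differs from the default, so setting it back to the default returns it to DEFAULT.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of the expected status behavior; not Drill's option manager.
final class SessionOptionsSketch {
  enum Status { DEFAULT, CHANGED }

  private final Map<String, Object> defaults = new HashMap<>();
  private final Map<String, Object> overrides = new HashMap<>();

  void registerDefault(String name, Object value) {
    defaults.put(name, value);
  }

  // Setting an option to its default value drops the override instead of storing it.
  void set(String name, Object value) {
    if (!defaults.containsKey(name)) {
      throw new IllegalArgumentException("Unknown option: " + name);
    }
    if (Objects.equals(defaults.get(name), value)) {
      overrides.remove(name);
    } else {
      overrides.put(name, value);
    }
  }

  // An option is CHANGED only while an override that differs from the default exists.
  Status statusOf(String name) {
    return overrides.containsKey(name) ? Status.CHANGED : Status.DEFAULT;
  }

  public static void main(String[] args) {
    SessionOptionsSketch session = new SessionOptionsSketch();
    session.registerDefault("planner.enable_hashjoin", true);
    session.set("planner.enable_hashjoin", false);
    System.out.println(session.statusOf("planner.enable_hashjoin"));
    session.set("planner.enable_hashjoin", true);
    System.out.println(session.statusOf("planner.enable_hashjoin"));
  }
}
```

With this approach the second query in the report would no longer list planner.enable_hashjoin after it is set back to true.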



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3737) CTAS from empty text file fails with NPE

2015-09-03 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3737:


 Summary: CTAS from empty text file fails with NPE
 Key: DRILL-3737
 URL: https://issues.apache.org/jira/browse/DRILL-3737
 Project: Apache Drill
  Issue Type: Bug
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu
Priority: Critical


{code}
create table a(aa) as select columns[0] from `empty.csv`;
{code}

shows:

Error: SYSTEM ERROR: NullPointerException
Fragment 0:0





[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread adeneche
Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38705830
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -850,9 +851,9 @@ public void moveToCurrentRow() throws SQLException {
   }
 
   @Override
-  public DrillStatementImpl getStatement() {
+  public AvaticaStatement getStatement() {
--- End diff --

Any specific reason to have this method override its parent implementation? 
It's basically doing the same thing.




[GitHub] drill pull request: DRILL-3555: Changing defaults for planner.memo...

2015-09-03 Thread adeneche
Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/137#issuecomment-137599080
  
@vkorukanti can you please review ? thx




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38709955
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -850,9 +851,9 @@ public void moveToCurrentRow() throws SQLException {
   }
 
   @Override
-  public DrillStatementImpl getStatement() {
+  public AvaticaStatement getStatement() {
--- End diff --

One reason for leaving it in (after changing the return type back) is to 
hold that comment pointing out that this method doesn't call checkNotClosed() 
as most other methods do.

Fixed copy/paste/forgot-to-edit error "close()"  to "getStatement()" in 
comment.





Windows build

2015-09-03 Thread Parth Chandra
As of commit 9baec8a, the build completes and all Unit tests pass on
Windows. Let's hope it stays that way.


[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
Github user dsbos commented on a diff in the pull request:

https://github.com/apache/drill/pull/143#discussion_r38710714
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -87,12 +88,12 @@
   boolean hasPendingCancelationNotification;
 
 
-  DrillResultSetImpl(DrillStatementImpl statement, AvaticaPrepareResult prepareResult,
+  DrillResultSetImpl(AvaticaStatement statement, AvaticaPrepareResult prepareResult,
  ResultSetMetaData resultSetMetaData, TimeZone timeZone) {
 super(statement, prepareResult, resultSetMetaData, timeZone);
-this.statement = statement;
+connection = (DrillConnectionImpl) statement.getConnection();
 final int batchQueueThrottlingThreshold =
-this.getStatement().getConnection().getClient().getConfig().getInt(
+connection.getClient().getConfig().getInt(
--- End diff --

Yeah, I don't know why the code was like that.

Reworked a bit to eliminate redundant calls and code.




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread dsbos
Github user dsbos commented on the pull request:

https://github.com/apache/drill/pull/143#issuecomment-137607549
  
> can you confirm that this patch passes all our "jdbc" tests ?

It passed before the post-review update commit.  I'm re-running regular 
tests now.




[GitHub] drill pull request: DRILL-3566: Fix: PreparedStatement.executeQuer...

2015-09-03 Thread adeneche
Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/143#issuecomment-137597475
  
@dsbos can you confirm that this patch passes all our "jdbc" tests ?




[GitHub] drill pull request: DRILL-2304: Case sensitivity - system and sess...

2015-09-03 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the pull request:

https://github.com/apache/drill/pull/90#issuecomment-137604942
  
Passes all unit and regression tests.




Re: Identifying the source of problematic records

2015-09-03 Thread Jacques Nadeau
Interesting idea.  The question I have is how would this work when you have
a combination of generated code related to expressions and code not related
to expressions.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Thu, Sep 3, 2015 at 11:31 AM, Jason Altekruse 
wrote:

> @Jacques,
>
> On your point a) about expressing failures and the compilation model, I had
> thought about previously using the interpreter to figure out which
> expression against the current row failed, once we have caught an exception
> out of some part of the complete code-generated expression evaluation. Do
> you think this would possibly address your concern? Do you think anything
> more than the problematic input data and the expression that failed would
> be produced by the functions in this new standardized error format?
>
> - Jason
>
> On Wed, Sep 2, 2015 at 8:43 PM, Jacques Nadeau  wrote:
>
> > I'd like to propose a few things to solve this:
> >
> > a) Functions should be able to express failures in a standardized way. I'm
> > thinking a new type of injectable and/or a certain type of exception
> > (although more dangerous/possibly requires rewrite given compilation
> > model).
> > b) Users (session/system level) should be able to set a setting where
> > function errors are handled a certain way. Options could include query
> > failure, ignore + inform as warning/notice, and save records for later
> > analysis (maybe in v2).
> > c) Readers that have a notorious problem (e.g. Text) should support
> > projection/expression pushdown so that they can create these kinds of
> > errors and provide additional context as part of that.
> > d) We should also implement dot drill files so that users can prescribe
> > this projection/data validation process by default for files/directories
> > (which would provide the same behavior as c above).
> > e) We should get more serious about providing useful virtual fields. This
> > should include filename (similar to directory name).
> >
> > Once a record leaves an operator, I don't think we should carry any
> > additional provenance with it. It would be too heavy weight as a default
> > behavior.
> >
> >
> >
> >
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Tue, Sep 1, 2015 at 9:08 AM, Aman Sinha  wrote:
> >
> > > Drill can point out the filename and location of corrupted records in a
> > > file but we don't have a good mechanism to deal with the following
> > > scenario:
> > >
> > > Consider a text file with 2 records:
> > > $ cat t4.csv
> > > 10,2001
> > > 11,http://www.cnn.com
> > >
> > > 0: jdbc:drill:zk=local> alter session set `exec.errors.verbose` = true;
> > >
> > > 0: jdbc:drill:zk=local> select cast(columns[0] as int), cast(columns[1] as
> > > bigint) from dfs.`/Users/asinha/data/t4.csv`;
> > >
> > > Error: SYSTEM ERROR: NumberFormatException: http://www.cnn.com
> > >
> > > Fragment 0:0
> > >
> > > [Error Id: 72aad22c-a345-4100-9a57-dcd8436105f7 on 10.250.56.140:31010]
> > >
> > >   (java.lang.NumberFormatException) http://www.cnn.com
> > > org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeL():91
> > > org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varCharToLong():62
> > > org.apache.drill.exec.test.generated.ProjectorGen1.doEval():62
> > > org.apache.drill.exec.test.generated.ProjectorGen1.projectRecords():62
> > > org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():172
> > > The problem is that the user does not have a clue about the original
> > > source of this error.  This is a pain point especially when dealing with
> > > thousands of files.
> > >
> > > 1.  We can start by providing the column index where the problem occurred.
> > > 2.  Can a scan batch keep track of the file it originated from?  Since the
> > >     Project in the above query is pushed right above the scan, it could
> > >     get the filename from the record batch (assuming we can store this
> > >     piece of information).  This won't be possible for other Projects
> > >     elsewhere in the plan.
> > > 3.  What about the location within the file?  Unless the projection is
> > >     pushed into the scan itself, I don't see a good way to provide this
> > >     information.
> > >
> > > A related topic is how to tell Drill to ignore such records when doing a
> > > query or a CTAS.  That could be a separate discussion.
> > >
> > > Thoughts?
> > > Aman
> > >
> >
>
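
As a rough illustration of the kind of error context discussed in the quoted thread (file name, location within the file, and column index), here is a minimal sketch. It assumes the cast is evaluated where the file and line are still known, as when the projection is pushed into the scan. CastErrorContextSketch, BadRecordException and castAll are hypothetical names for illustration only, not Drill APIs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the class and exception names are not Drill APIs.
final class CastErrorContextSketch {

  // Carries the provenance that the plain NumberFormatException lacks.
  static final class BadRecordException extends RuntimeException {
    final String file;
    final int line;    // 1-based line within the file
    final int column;  // 0-based index, as in columns[n]
    final String value;

    BadRecordException(String file, int line, int column, String value, Throwable cause) {
      super(String.format("Cannot cast '%s' to BIGINT (file=%s, line=%d, columns[%d])",
          value, file, line, column), cause);
      this.file = file;
      this.line = line;
      this.column = column;
      this.value = value;
    }
  }

  // Casts every comma-separated field to long, wrapping failures with their origin.
  static long[][] castAll(String file, List<String> lines) {
    long[][] out = new long[lines.size()][];
    for (int row = 0; row < lines.size(); row++) {
      String[] cols = lines.get(row).split(",", -1);
      out[row] = new long[cols.length];
      for (int col = 0; col < cols.length; col++) {
        try {
          out[row][col] = Long.parseLong(cols[col].trim());
        } catch (NumberFormatException e) {
          throw new BadRecordException(file, row + 1, col, cols[col], e);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    try {
      castAll("t4.csv", Arrays.asList("10,2001", "11,http://www.cnn.com"));
    } catch (BadRecordException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

For the two-record t4.csv example, the failure would be attributed to line 2, columns[1], rather than surfacing only the offending string.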


Re: Identifying the source of problematic records

2015-09-03 Thread Jason Altekruse
I was thinking we would just put a catch around the calls to evaluate the
generated code and re-evaluate each individual expression with the
interpreter to find out which one caused the exception.

Thinking about it a little more, the call to the generated code actually
happens inside of the loop in the ProjectTemplate/FilterTemplate classes
today. This is where the information about the index in the current batch
is known, but the list of expressions is not known at this level. We might
have to add an interface to extract the last index we tried to evaluate
from the Template, so that we could use this to evaluate against the
correct row back in the RecordBatch where we have access to expressions
which can be used to materialize the interpreter.
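
A minimal sketch of that idea, under stated assumptions: FusedEvaluator stands in for the generated Template code and exposes the last index it tried, and each NamedExpr stands in for an expression the interpreter could evaluate on its own. All of these names are hypothetical and are not the actual ProjectTemplate or interpreter APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of "catch, then re-evaluate per expression"; not Drill code.
final class InterpreterFallbackSketch {

  // Stand-in for one expression that can also be evaluated individually.
  static final class NamedExpr {
    final String text;
    final Function<String[], Object> eval;

    NamedExpr(String text, Function<String[], Object> eval) {
      this.text = text;
      this.eval = eval;
    }
  }

  // Stand-in for the generated template: evaluates all expressions for each row
  // and remembers the last row index it attempted.
  static final class FusedEvaluator {
    private final List<NamedExpr> exprs;
    private int lastIndex = -1;

    FusedEvaluator(List<NamedExpr> exprs) {
      this.exprs = exprs;
    }

    void evalBatch(List<String[]> batch) {
      for (int i = 0; i < batch.size(); i++) {
        lastIndex = i;
        for (NamedExpr e : exprs) {
          e.eval.apply(batch.get(i));
        }
      }
    }

    int lastIndexAttempted() {
      return lastIndex;
    }
  }

  // Returns null if the batch evaluates cleanly, otherwise "row N: <expression>".
  static String diagnose(List<NamedExpr> exprs, List<String[]> batch) {
    FusedEvaluator fused = new FusedEvaluator(exprs);
    try {
      fused.evalBatch(batch);
      return null;
    } catch (RuntimeException fastPathFailure) {
      int row = fused.lastIndexAttempted();
      // Slow path: re-run each expression alone against the failing row.
      for (NamedExpr e : exprs) {
        try {
          e.eval.apply(batch.get(row));
        } catch (RuntimeException ex) {
          return "row " + row + ": " + e.text;
        }
      }
      throw fastPathFailure; // could not reproduce the failure per expression
    }
  }

  static String demo() {
    List<NamedExpr> exprs = Arrays.asList(
        new NamedExpr("cast(columns[0] as int)", r -> Integer.parseInt(r[0])),
        new NamedExpr("cast(columns[1] as bigint)", r -> Long.parseLong(r[1])));
    List<String[]> batch = Arrays.asList(
        new String[] {"10", "2001"},
        new String[] {"11", "http://www.cnn.com"});
    return diagnose(exprs, batch);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The slow path only runs after a failure, so the cost of the per-expression re-evaluation would be paid once per failing batch in this sketch.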


[GitHub] drill pull request: DRILL-3589: Update JDBC driver to shade and mi...

2015-09-03 Thread jacques-n
Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/116#issuecomment-137621150
  
Can you start by reviewing my branch? Let's get it right before we try 
rebasing.

With regards to slf4j and logging: I have corrected the behavior versus the 
old packaging. We support any slf4j logging tool but we don't package any. This 
means that a user can leverage their existing logging framework. If we include 
a logging framework, then a user can't use their own for centralized logging. 
We'll need to doc this but it is on purpose.




Re: Patch Reviews (Mongo & Avro related issues)

2015-09-03 Thread Jacques Nadeau
Andrew and I are working on them. We hope to get back to you soon.

Thanks for your patience!

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Wed, Sep 2, 2015 at 7:06 AM, Kamesh  wrote:

> Hi All,
> I submitted patches for the following issues. Could somebody review them?
> DRILL-1666 
> DRILL-2879 
> DRILL-3720 
> DRILL-3458 
> DRILL-3699 
>
> --
> Kamesh.
>


Re: The meaning of the methods in StoragePlugin and EasyFormatPlugin

2015-09-03 Thread Daniel Barclay

I wrote:

... Below are some notes on the detailed requirements I had extracted from
the code.  ...

I found a later copy of my (still rough) notes.

See the Google Docs document at
[Notes for] Instructions on Creating Storage Plug-ins 
.

Daniel

--
Daniel Barclay
MapR Technologies