[GitHub] drill issue #514: DRILL-4694: CTAS in JSON format produces extraneous NULL f...

2016-06-06 Thread amansinha100
Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/514
  
LGTM.  +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #480: DRILL-4606: Add DrillClient.Builder class

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/480#discussion_r65997722
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/client/DrillClient.java ---
@@ -74,81 +78,173 @@
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.fasterxml.jackson.databind.node.ArrayNode;
-import com.google.common.base.Strings;
-import com.google.common.util.concurrent.AbstractCheckedFuture;
-import com.google.common.util.concurrent.SettableFuture;
 
 /**
  * Thin wrapper around a UserClient that handles connect/close and transforms
  * String into ByteBuf.
+ *
+ * To create non-default objects, use {@link DrillClient.Builder the builder class}.
+ * E.g.
+ * 
+ *   DrillClient client = DrillClient.newBuilder()
+ *   .setConfig(...)
+ *   .setIsDirectConnection(true)
+ *   .build();
+ * 
+ *
+ * Except for {@link #runQuery} and {@link #cancelQuery}, this class is generally not thread safe.
  */
 public class DrillClient implements Closeable, ConnectionThrottle {
   private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(DrillClient.class);
 
   private static final ObjectMapper objectMapper = new ObjectMapper();
   private final DrillConfig config;
-  private UserClient client;
-  private UserProperties props = null;
-  private volatile ClusterCoordinator clusterCoordinator;
-  private volatile boolean connected = false;
   private final BufferAllocator allocator;
-  private int reconnectTimes;
-  private int reconnectDelay;
-  private boolean supportComplexTypes;
-  private final boolean ownsZkConnection;
+  private final EventLoopGroup eventLoopGroup;
+  private final ExecutorService executor;
+  private final boolean isDirectConnection;
+  private final int reconnectTimes;
+  private final int reconnectDelay;
+
+  // TODO: clusterCoordinator should be initialized in the constructor.
+  // Currently, initialization is tightly coupled with #connect.
+  private ClusterCoordinator clusterCoordinator;
+
+  // checks if this client owns these resources (used when closing)
   private final boolean ownsAllocator;
-  private final boolean isDirectConnection; // true if the connection bypasses zookeeper and connects directly to a drillbit
-  private EventLoopGroup eventLoopGroup;
-  private ExecutorService executor;
+  private final boolean ownsZkConnection;
+  private final boolean ownsEventLoopGroup;
+  private final boolean ownsExecutor;
 
-  public DrillClient() throws OutOfMemoryException {
-    this(DrillConfig.create(), false);
+  // once #setSupportComplexTypes() is removed, make this final
+  private boolean supportComplexTypes;
+
+  private UserClient client;
+  private UserProperties props;
+  private boolean connected;
+
+  public DrillClient() {
+    this(newBuilder());
   }
 
-  public DrillClient(boolean isDirect) throws OutOfMemoryException {
-    this(DrillConfig.create(), isDirect);
+  /**
+   * @deprecated Create a DrillClient using {@link DrillClient.Builder}.
+   */
+  @Deprecated
+  public DrillClient(boolean isDirect) {
+    this(newBuilder()
+        .setDirectConnection(isDirect));
   }
 
-  public DrillClient(String fileName) throws OutOfMemoryException {
-    this(DrillConfig.create(fileName), false);
+  /**
+   * @deprecated Create a DrillClient using {@link DrillClient.Builder}.
+   */
+  @Deprecated
+  public DrillClient(String fileName) {
+    this(newBuilder()
+        .setConfigFromFile(fileName));
   }
 
-  public DrillClient(DrillConfig config) throws OutOfMemoryException {
-    this(config, null, false);
+  /**
+   * @deprecated Create a DrillClient using {@link DrillClient.Builder}.
+   */
+  @Deprecated
+  public DrillClient(DrillConfig config) {
+    this(newBuilder()
+        .setConfig(config));
   }
 
-  public DrillClient(DrillConfig config, boolean isDirect)
-  throws OutOfMemoryException {
-    this(config, null, isDirect);
+  /**
+   * @deprecated Create a DrillClient using {@link DrillClient.Builder}.
+   */
+  @Deprecated
+  public DrillClient(DrillConfig config, boolean isDirect) {
+    this(newBuilder()
+        .setConfig(config)
+        .setDirectConnection(isDirect));
   }
 
-  public DrillClient(DrillConfig config, ClusterCoordinator coordinator)
-throws OutOfMemoryException {
-    this(config, coordinator, null, false);
+  /**
+   * @deprecated Create a DrillClient using {@link DrillClient.Builder}.
+   */
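
The diff above combines a nested builder with "owns" flags that record which
resources the client created itself, so that close() releases only those. A
minimal sketch of that pattern follows. It is not the PR's code: SketchClient,
its Builder, and every method on them are hypothetical names chosen for
illustration, and a plain ExecutorService stands in for the real resources.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a client that is configured through a nested Builder. */
final class SketchClient implements AutoCloseable {
  private final ExecutorService executor;
  private final boolean ownsExecutor;      // true only if this client created the executor
  private final boolean directConnection;

  private SketchClient(Builder b) {
    // If the caller supplied no executor, create one and remember that we own it.
    this.ownsExecutor = (b.executor == null);
    this.executor = ownsExecutor ? Executors.newSingleThreadExecutor() : b.executor;
    this.directConnection = b.directConnection;
  }

  static Builder newBuilder() {
    return new Builder();
  }

  boolean ownsExecutor() { return ownsExecutor; }
  boolean isDirectConnection() { return directConnection; }
  ExecutorService executor() { return executor; }

  /** Shuts down only the resources this client created itself. */
  @Override
  public void close() {
    if (ownsExecutor) {
      executor.shutdown();
    }
  }

  /** Collects optional settings; anything left unset falls back to a default. */
  static final class Builder {
    private ExecutorService executor;      // null means "let the client create one"
    private boolean directConnection;

    Builder setExecutor(ExecutorService executor) {
      this.executor = executor;
      return this;
    }

    Builder setDirectConnection(boolean direct) {
      this.directConnection = direct;
      return this;
    }

    SketchClient build() {
      return new SketchClient(this);
    }
  }
}
```

With this shape, a caller that passes in a shared executor keeps responsibility
for shutting it down, while a client built with defaults cleans up after itself.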
  

[GitHub] drill pull request #480: DRILL-4606: Add DrillClient.Builder class

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/480#discussion_r65997533
  

[GitHub] drill pull request #480: DRILL-4606: Add DrillClient.Builder class

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/480#discussion_r65997389
  

[GitHub] drill pull request #480: DRILL-4606: Add DrillClient.Builder class

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/480#discussion_r65997345
  

Re: median, quantile

2016-06-06 Thread Julian Hyde
I’ve thought for some time that SQL aggregate functions should have an 
“APPROXIMATE ( … )” clause. Users don’t WANT to call a TD_MEDIAN function, they 
want the MEDIAN that gives them an answer to their desired accuracy (within X, 
within Y%, or within a given confidence interval), and TD_MEDIAN may be the way 
to achieve that.

In fact the user might just set “SET APPROXIMATE = '95%'” in their session, and 
the APPROXIMATE clause would then be implicit in every query they write.

Approximate aggregate functions are all the rage right now but I’m not aware of 
any effort to standardize them across databases.
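
One way to read "within Y%" is as a bound on relative error against the exact
answer. The sketch below only illustrates that reading; ApproxCheck and its
methods are hypothetical names, it is not tied to t-digest or to any SQL
syntax, and a real engine would not compute the exact median just to check.

```java
import java.util.Arrays;

/** Hypothetical helper: exact median plus a relative-error check for an estimate. */
final class ApproxCheck {
  /** Exact median by sorting a copy; averages the two middle values for even sizes. */
  static double exactMedian(double[] values) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    int n = sorted.length;
    return (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
  }

  /** True if the estimate lies within maxRelativeError (0.05 means 5%) of the exact value. */
  static boolean withinRelative(double estimate, double exact, double maxRelativeError) {
    return Math.abs(estimate - exact) <= maxRelativeError * Math.abs(exact);
  }
}
```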

Julian


> On Jun 6, 2016, at 5:58 PM, Parth Chandra  wrote:
> 
> Hey Steven,
> Somehow I missed this one when you posted it.
> Since you asked, I would suggest a different name from median, quartile
> since that might mislead. How about td_median, td_quantile ?
> 
> On Wed, Apr 13, 2016 at 11:51 AM, Steven Phillips  wrote:
> 
>> I submitted a pull request a little while ago that introduces (approximate)
>> median and quantile functions using the tdigest library.
>> 
>> https://github.com/apache/drill/pull/456
>> 
>> It would be great if I could get some feedback on this. Specifically, is it
>> ok to call these functions median and quantile, given that they are not
>> exact.
>> 



Re: median, quantile

2016-06-06 Thread Parth Chandra
Hey Steven,
Somehow I missed this one when you posted it.
Since you asked, I would suggest a different name from median, quantile
since that might mislead. How about td_median, td_quantile?

On Wed, Apr 13, 2016 at 11:51 AM, Steven Phillips wrote:

> I submitted a pull request a little while ago that introduces (approximate)
> median and quantile functions using the tdigest library.
>
> https://github.com/apache/drill/pull/456
>
> It would be great if I could get some feedback on this. Specifically, is it
> ok to call these functions median and quantile, given that they are not
> exact.
>


[GitHub] drill pull request #512: Drill 4573 fix issue with unicode chars

2016-06-06 Thread jinfengni
Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/512#discussion_r65994802
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/expr/fn/impl/TestStringFunctions.java
 ---
@@ -114,6 +114,19 @@ public void testRegexpMatches() throws Exception {
   }
 
   @Test
+  public void testRegexpReplace() throws Exception {
+    testBuilder()
+        .sqlQuery("select regexp_replace(a, 'a|c', 'x') res1, regexp_replace(b, 'd', 'zzz') res2 " +
+                  "from (values('abc', 'bcd'), ('bcd', 'abc')) as t(a,b)")
+        .unOrdered()
+        .baselineColumns("res1", "res2")
+        .baselineValues("xbx", "bczzz")
+        .baselineValues("bxd", "abc")
+        .build()
+        .run();
+  }
+
--- End diff --

Can you add at least one test for the non-ASCII case, since one of the 
issues this PR tries to address is the incorrect result of regex functions over 
non-ASCII input?  

A new unit test will help verify that this PR fixes the problem, and make 
sure that future changes do not regress this functionality. 
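
As a rough sketch of what a non-ASCII case has to exercise, the snippet below
decodes UTF-8 bytes and applies a regex replacement with plain java.util.regex.
It does not use Drill's test builder; NonAsciiRegexCheck and replaceUtf8 are
hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing the decode-then-match path for UTF-8 input. */
final class NonAsciiRegexCheck {
  /**
   * Decodes the bytes as UTF-8 and applies a regex replacement.
   * A character such as U+00FC occupies two bytes in UTF-8, so treating each
   * byte as one char would shift offsets and corrupt the result.
   */
  static String replaceUtf8(byte[] utf8, String regex, String replacement) {
    String input = new String(utf8, StandardCharsets.UTF_8);
    return input.replaceAll(regex, replacement);
  }
}
```

A test input whose byte length differs from its character length is the useful
one here, since pure ASCII input cannot expose an offset or decoding mistake.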





[GitHub] drill pull request #512: Drill 4573 fix issue with unicode chars

2016-06-06 Thread jinfengni
Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/512#discussion_r65994671
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StringFunctions.java
 ---
@@ -268,7 +284,13 @@ public void eval() {
 
 @Override
 public void setup() {
-  matcher = java.util.regex.Pattern.compile(org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(pattern.start, pattern.end,  pattern.buffer)).matcher("");
+  CharSequenceWrapper patternWrapper = new CharSequenceWrapper(pattern.start, pattern.end, pattern.buffer);
+  char[] chars = new char[patternWrapper.length()];
+  for(int i=0; i

[GitHub] drill pull request #512: Drill 4573 fix issue with unicode chars

2016-06-06 Thread jinfengni
Github user jinfengni commented on a diff in the pull request:

https://github.com/apache/drill/pull/512#discussion_r65994494
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/CharSequenceWrapper.java
 ---
@@ -17,13 +17,52 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.Charset;
+import java.nio.charset.CharsetDecoder;
+import java.nio.charset.CoderResult;
+import java.util.regex.Matcher;
+
 import io.netty.buffer.DrillBuf;
 
+/**
+ * A CharSequence is a readable sequence of char values. This interface provides
+ * uniform, read-only access to many different kinds of char sequences. A char
+ * value represents a character in the Basic Multilingual Plane (BMP) or a
+ * surrogate. Refer to Unicode Character Representation for details.
+ * Specifically this implementation of the CharSequence adapts a Drill
+ * {@link DrillBuf} to the CharSequence. The implementation is meant to be
+ * re-used that is allocated once and then passed DrillBuf to adapt. This can be
+ * handy to exploit API that consume CharSequence avoiding the need to create
+ * string objects.
+ *
+ */
 public class CharSequenceWrapper implements CharSequence {
 
+// The adapted drill buffer (in the case of US-ASCII)
+private DrillBuf buffer;
+// The converted bytes in the case of non ASCII
+private CharBuffer charBuffer;
--- End diff --

In the non-ASCII case, did we see any improvement compared to the approach 
used before DRILL-4573? If there is no noticeable improvement, can we switch 
back to the original call?

String i = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(input.start, input.end, input.buffer);

It seems to me that this CharBuffer does exactly the same job as that 
function: decoding the byte array as UTF-8. Having to init / re-allocate makes 
things complicated and could introduce bugs here and there. The original method 
calls a Java library method, which can be assumed to be well tested.

Can we 1) use CharSequenceWrapper only for ASCII, and 2) construct a String 
for non-ASCII input in the body of the regex function (regexp_matches)?
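
The two-path suggestion could look roughly like the sketch below. It is an
assumption-laden illustration, not the PR's code: a plain byte[] stands in for
DrillBuf, and AsciiOrString, AsciiView, isAscii and toCharSequence are
hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: zero-copy view for ASCII, decoded String otherwise. */
final class AsciiOrString {
  /** Read-only CharSequence over ASCII bytes; valid only if every byte is below 0x80. */
  static final class AsciiView implements CharSequence {
    private final byte[] bytes;
    private final int start;
    private final int end;

    AsciiView(byte[] bytes, int start, int end) {
      this.bytes = bytes;
      this.start = start;
      this.end = end;
    }

    @Override public int length() { return end - start; }

    @Override public char charAt(int index) { return (char) bytes[start + index]; }

    @Override public CharSequence subSequence(int from, int to) {
      return new AsciiView(bytes, start + from, start + to);
    }

    @Override public String toString() {
      return new String(bytes, start, end - start, StandardCharsets.US_ASCII);
    }
  }

  /** True if bytes[start, end) are all 7-bit ASCII. */
  static boolean isAscii(byte[] bytes, int start, int end) {
    for (int i = start; i < end; i++) {
      if (bytes[i] < 0) {   // high bit set: part of a multi-byte UTF-8 sequence
        return false;
      }
    }
    return true;
  }

  /** ASCII input is wrapped without copying; anything else is decoded into a String. */
  static CharSequence toCharSequence(byte[] bytes, int start, int end) {
    return isAscii(bytes, start, end)
        ? new AsciiView(bytes, start, end)
        : new String(bytes, start, end - start, StandardCharsets.UTF_8);
  }
}
```

Either result can be handed to java.util.regex.Matcher, which accepts any
CharSequence, so only the non-ASCII path pays for allocating a String.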
 




[GitHub] drill pull request #507: DRILL-4690: CORS in REST API

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/507#discussion_r65994192
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -111,7 +111,14 @@ drill.exec: {
 enabled: true,
 ssl_enabled: false,
 port: 8047
-session_max_idle_secs: 3600 # Default value 1hr
+session_max_idle_secs: 3600, # Default value 1hr
+cors: {
+  enabled: true,
--- End diff --

I would default cors.enabled to false and/or set the 
Access-Control-Allow-Origin value to null. Ideally, only the end user should be 
able to enable CORS for all sites. 
Otherwise looks good to me.
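
To illustrate the "off unless the user opts in" default, here is a small
sketch of the decision a server could make per request. CorsPolicy and its
methods are hypothetical and do not reflect the PR's implementation; the only
real-world name used is the standard Access-Control-Allow-Origin header.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

/** Hypothetical policy object deciding which CORS header, if any, to emit. */
final class CorsPolicy {
  private final boolean enabled;
  private final Set<String> allowedOrigins;

  CorsPolicy(boolean enabled, Set<String> allowedOrigins) {
    this.enabled = enabled;
    this.allowedOrigins = allowedOrigins;
  }

  /** Conservative default: CORS is off and no origin is allowed. */
  static CorsPolicy disabledByDefault() {
    return new CorsPolicy(false, Collections.<String>emptySet());
  }

  /** Headers to add for a request from the given origin; empty when not allowed. */
  Map<String, String> responseHeaders(String requestOrigin) {
    if (!enabled || requestOrigin == null) {
      return Collections.emptyMap();
    }
    if (allowedOrigins.contains("*")) {
      // Only reachable if the user explicitly configured the wildcard.
      return Collections.singletonMap("Access-Control-Allow-Origin", "*");
    }
    if (allowedOrigins.contains(requestOrigin)) {
      return Collections.singletonMap("Access-Control-Allow-Origin", requestOrigin);
    }
    return Collections.emptyMap();
  }
}
```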




[GitHub] drill pull request #515: DRILL-4707: Fix memory leak or incorrect query resu...

2016-06-06 Thread jinfengni
GitHub user jinfengni opened a pull request:

https://github.com/apache/drill/pull/515

DRILL-4707: Fix memory leak or incorrect query result in case two column
names are case-insensitive identical

Fix is mainly in CALCITE-528.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jinfengni/incubator-drill DRILL-4707

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/515.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #515


commit 1bab184e862be5baa5f47804f506bdb56b35b374
Author: Jinfeng Ni 
Date:   2016-06-06T00:37:22Z

DRILL-4707: Fix memory leak or incorrect query result in case two column 
names are case-insensitive identical

Fix is mainly in CALCITE-528.
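
For readers unfamiliar with the failure mode, the sketch below shows one
generic way to make a list of column names unique under case-insensitive
comparison. It is an illustration only and makes no claim about what
CALCITE-528 or this pull request actually does; ColumnNames and
uniquifyIgnoreCase are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper that removes case-insensitive name collisions. */
final class ColumnNames {
  /** Returns the names in order, appending a numeric suffix to any case-insensitive duplicate. */
  static List<String> uniquifyIgnoreCase(List<String> names) {
    Set<String> seen = new HashSet<>();
    List<String> result = new ArrayList<>();
    for (String name : names) {
      String candidate = name;
      int suffix = 0;
      // Keep trying suffixed variants until the lower-cased form is unused.
      while (!seen.add(candidate.toLowerCase(Locale.ROOT))) {
        candidate = name + suffix++;
      }
      result.add(candidate);
    }
    return result;
  }
}
```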






[GitHub] drill issue #507: DRILL-4690: CORS in REST API

2016-06-06 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/507
  
I am not familiar with CORS. One question: why is this enabled by default?

Also, there is a discussion about not increasing the size of the jdbc-all 
jar (subject: _drill-jdbc-all-1.7.0-SNAPSHOT.jar max size_). Any way to avoid 
that change?




[GitHub] drill issue #507: DRILL-4690: CORS in REST API

2016-06-06 Thread PythonicNinja
Github user PythonicNinja commented on the issue:

https://github.com/apache/drill/pull/507
  
@hnfgns @adeneche @sudheeshkatkam any updates?




[jira] [Created] (DRILL-4710) Document Drill's JSON processing rules

2016-06-06 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-4710:
--

 Summary: Document Drill's JSON processing rules
 Key: DRILL-4710
 URL: https://issues.apache.org/jira/browse/DRILL-4710
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Reporter: Paul Rogers
Priority: Minor


One of Drill's key benefits is the ability to query JSON-formatted data. Much 
great work has been done. But, unless someone happens to be a Drill developer, 
the details of exactly how Drill handles various JSON formats can be hard to 
find.

We should document how Drill handles various JSON scenarios.

* SELECT * (schema inferred)
* SELECT a, b, c (schema implied by query)

And various JSON structures:

* Top-level structure (list of maps. Can we handle an array of maps? A list of 
scalars?)
* Changes of the top-level map structure across rows.
** New field appears later in the file. (Was {a: 1, b: "s"}, now is {a: 1, b: 
"s", c: 10}.)
** Fields disappear later in the file
** Fields change type
** Start of file has many nulls for a field, later in file has non-null values.
* How Drill handles array fields
** Array field is null: { a: [10, 20]}, { a: null }
** Array contains nulls: { a: [10, null, 20] }
** Array contains single scalar type (number or string)
** Array contains multiple scalar types (number and string)
** Array contains structured types (array, map)
* How Drill handles nested maps
** Explicit select: a, b.c, b.d: {a: 1, b: { c: "s", d: 10 }}
** Implicit select: *
** How data is delivered to Drill client
** How data is delivered to JDBC/ODBC clients
* Size issues
** Very large records (what is max size?)
** Very large strings
** Very large arrays

Along with any other detailed information not covered by the above list.
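
To make the schema-change scenarios above concrete, the sketch below surveys
which value types each field takes across a sequence of rows. It says nothing
about how Drill behaves in these cases, which is exactly what the
documentation would need to state. Rows are modeled as already-parsed maps,
and FieldTypeSurvey and observedTypes are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical survey of the value types seen per field across rows. */
final class FieldTypeSurvey {
  /** Maps each field name to the set of type names observed for it ("null" for null values). */
  static Map<String, Set<String>> observedTypes(List<Map<String, Object>> rows) {
    Map<String, Set<String>> types = new TreeMap<>();
    for (Map<String, Object> row : rows) {
      for (Map.Entry<String, Object> field : row.entrySet()) {
        Object value = field.getValue();
        String typeName = (value == null) ? "null" : value.getClass().getSimpleName();
        types.computeIfAbsent(field.getKey(), k -> new TreeSet<>()).add(typeName);
      }
    }
    return types;
  }
}
```

A field that maps to more than one type name changed type or was null in some
rows, and a field missing from early rows appeared later in the file; each of
those outcomes corresponds to one bullet in the list.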



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4709) Document the included Foodmart sample data

2016-06-06 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-4709:
--

 Summary: Document the included Foodmart sample data
 Key: DRILL-4709
 URL: https://issues.apache.org/jira/browse/DRILL-4709
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.6.0
Reporter: Paul Rogers
Priority: Minor


Drill includes a JSON version of the Mondrian FoodMart sample data. This data 
appears in the $DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar jar file, 
accessible using the class path storage plugin.

The documentation mentions using the cp plugin to access customers.json. 
However, the FoodMart data set is quite rich, with many example files.

As it is, unless someone is a curious developer, and good with Google, they 
won't be able to find the other data sets or the source of the FoodMart data.

The data appears to be a JSON version of the SQL sample data for the Mondrian 
project. A schema description is here: 
https://github.com/pentaho/mondrian/blob/master/demo/FoodMart.xml

The Mondrian data appears to have originated at Microsoft to highlight their 
circa 2000 OLAP projects, but has since been discontinued. See

* http://sqlmag.com/development/dts-2000-action
* https://technet.microsoft.com/en-us/library/aa217032(v=sql.80).aspx
* http://sqlmag.com/sql-server/desperately-seeking-samples

Or do a Google search for "microsoft foodmart database".

The request is to:

1. Credit MS and Mondrian for the data.
2. Either explain the data (which is quite a bit of work), or
3. Explain how to extract the files from the jar file to explore manually.
4. Provide a pointer to a description of the schema (if such can be found.)

For option 3:

cd $DRILL_HOME/jars/3rdparty
unzip foodmart-data-json-0.4.jar -d ~/foodmart
cd ~/foodmart
ls
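
The same exploration can be done without unpacking the jar. This is a minimal Python sketch that lists the JSON entries using the standard zipfile module; the list_json_files helper is hypothetical, and it assumes DRILL_HOME is set in the environment and that the data files end in .json.

```python
import os
import zipfile


def list_json_files(jar_path):
    """Return (name, uncompressed size) for each .json entry in the jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            (info.filename, info.file_size)
            for info in jar.infolist()
            if info.filename.endswith(".json")
        )


if __name__ == "__main__":
    # Assumes DRILL_HOME is set; falls back to the current directory.
    drill_home = os.environ.get("DRILL_HOME", ".")
    jar = os.path.join(drill_home, "jars", "3rdparty",
                       "foodmart-data-json-0.4.jar")
    if os.path.exists(jar):
        for name, size in list_json_files(jar):
            print(f"{size:>12}  {name}")
    else:
        print("jar not found:", jar)
```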

Looking at the data, it is clear that SOME description is needed to understand 
the many tables and how they might work with Drill.





[GitHub] drill pull request #514: DRILL-4694: CTAS in JSON format produces extraneous...

2016-06-06 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/514#discussion_r65928727
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/JsonOutputRecordWriter.java ---
@@ -61,7 +62,13 @@
 
 @Override
 public void startField() throws IOException {
+  <#if mode.prefix = "Nullable" >
--- End diff --

Dumb mistake. Thanks for catching it!




[GitHub] drill issue #514: DRILL-4694: CTAS in JSON format produces extraneous NULL f...

2016-06-06 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/514
  
Updated




Re: drill-jdbc-all-1.7.0-SNAPSHOT.jar max size

2016-06-06 Thread Jacques Nadeau
I bet some thorough class cleansing would let us keep this limit rather than
increase it.

I suggest the JIRA instead be for someone to reduce the current size by 2 MB.
In the past I've done this by expanding the archive and determining all the
large chunks of classes that shouldn't be included. Note that the current
filter lists at [1] and [2] need to be continuously updated. It looks like
neither has been updated since January.

[1] https://github.com/apache/drill/blob/master/exec/jdbc-all/pom.xml#L280
[2] https://github.com/apache/drill/blob/master/exec/jdbc-all/pom.xml#L386
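
One way to find the large chunks without manually expanding the archive is to sum entry sizes per package prefix. This is only a rough Python sketch using the standard zipfile module; the size_by_package helper and the default depth of 3 are my own choices for illustration, not anything in the Drill build.

```python
import collections
import sys
import zipfile


def size_by_package(jar_path, depth=3):
    """Sum compressed entry sizes per package prefix, largest first."""
    totals = collections.Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            parts = info.filename.split("/")[:-1]  # drop the file name
            prefix = "/".join(parts[:depth]) or "(root)"
            totals[prefix] += info.compress_size
    return totals.most_common()


if __name__ == "__main__":
    # Usage: python jar_sizes.py path/to/drill-jdbc-all-1.7.0-SNAPSHOT.jar
    for jar_path in sys.argv[1:]:
        for prefix, size in size_by_package(jar_path)[:20]:
            print(f"{size:>12}  {prefix}")
```

The prefixes at the top of the output are the first candidates to check against the filter lists.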




--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Mon, Jun 6, 2016 at 6:52 AM, Arina Yelchiyeva  wrote:

> Hi all!
>
> Drill has an enforcer for the drill-jdbc-all-1.7.0-SNAPSHOT.jar max size.
> The max size is 20000000 bytes.
> Currently on master the jar size is 19956787 bytes.
> Only 43213 bytes are left until the limit. I exceeded this limit just by
> adding a couple of new classes.
>
> I am going to create a JIRA to update this limit.
> Just wanted to know your opinion on the new max size. Will 30000000 be OK?
>
>
> Kind regards
> Arina
>


drill-jdbc-all-1.7.0-SNAPSHOT.jar max size

2016-06-06 Thread Arina Yelchiyeva
Hi all!

Drill has an enforcer for the drill-jdbc-all-1.7.0-SNAPSHOT.jar max size.
The max size is 20000000 bytes.
Currently on master the jar size is 19956787 bytes.
Only 43213 bytes are left until the limit. I exceeded this limit just by
adding a couple of new classes.

I am going to create a JIRA to update this limit.
Just wanted to know your opinion on the new max size. Will 30000000 be OK?


Kind regards
Arina