Re: Release process for Maven artifacts?

2018-07-24 Thread Joseph Glanville
Hi Gian,

I was able to publish our own artifacts by creating a new oss-parent pom
pointing to our Maven repository and updating the Druid root pom to point
to it.
It was non-obvious where the release plugin was configured until I spotted
the parent reference.

Joseph.

On Tue, Jul 24, 2018 at 9:09 AM, Gian Merlino  wrote:

> Hi Joseph,
>
> For official releases we just do mvn release:prepare release:perform. The
> poms encode everything that has to happen, including sources jars, javadocs
> jars, signing everything, and pushing it up to Maven Central. You might be
> able to modify them to push to a different repo.
>
> On Mon, Jul 23, 2018 at 5:06 PM Joseph Glanville 
> wrote:
>
> > Hi,
> >
> > Is the release process for publishing Maven artifacts documented
> > somewhere?
> > We have been building tar archives with `mvn package` successfully, but we
> > would also like to publish our own Maven artifacts, including `-sources`
> > JARs, so we can add them as dependencies to projects outside of the Druid
> > source tree.
> >
> > Joseph.
> >
>


Problem on druid-s3-extensions

2018-07-24 Thread Dongjin Lee
Hello. I encountered a problem building Druid. *In short,
`TestAWSCredentialsProvider` fails as follows*:

```
mvn -pl extensions-core/s3-extensions test

...

testWithFileSessionCredentials(io.druid.storage.s3.TestAWSCredentialsProvider)
Time elapsed: 6.615 sec  <<< ERROR!
com.amazonaws.SdkClientException: Unable to find a region via the region
provider chain. Must provide an explicit region in the builder or setup
environment to supply a region.
at
io.druid.storage.s3.TestAWSCredentialsProvider.testWithFileSessionCredentials(TestAWSCredentialsProvider.java:98)

testWithFixedAWSKeys(io.druid.storage.s3.TestAWSCredentialsProvider)  Time
elapsed: 4.345 sec  <<< ERROR!
com.amazonaws.SdkClientException: Unable to find a region via the region
provider chain. Must provide an explicit region in the builder or setup
environment to supply a region.
at
io.druid.storage.s3.TestAWSCredentialsProvider.testWithFixedAWSKeys(TestAWSCredentialsProvider.java:67)
```

After digging into the code, I found the following:

1. `S3StorageDruidModule` does not provide a way to explicitly set the
`Region`; instead, it detects the region automatically with the default
credential/region provider chain (`S3StorageDruidModule#getAmazonS3Client
-> AmazonS3Client#builder -> AmazonS3ClientBuilder#standard`).
2. The default region provider chain[^1][^2] tries to determine the region in
the following order[^3][^4]:
a. Environment variable[^5]
b. JVM system property[^6]
c. AWS configuration file[^7]
d. EC2 instance metadata service[^8]

If all of the above fail, it throws an exception.
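
For reference, here is a tiny sketch (standard AWS SDK v1 API) that checks what
the default chain resolves to on a given host; when nothing is configured it
throws the same `SdkClientException` shown in the test output above:

```java
import com.amazonaws.SdkClientException;
import com.amazonaws.regions.DefaultAwsRegionProviderChain;

public class RegionCheck
{
  public static void main(String[] args)
  {
    try {
      // Walks the same chain: env var -> JVM property -> AWS config file -> EC2 metadata.
      String region = new DefaultAwsRegionProviderChain().getRegion();
      System.out.println("Resolved region: " + region);
    } catch (SdkClientException e) {
      System.out.println("No region could be resolved: " + e.getMessage());
    }
  }
}
```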

*In short, there is a possibility that `S3StorageDruidModule` can't
determine the region. It would be better to provide a way to set the region
explicitly.*
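
To make the proposal concrete, here is a minimal sketch of what an explicit
override might look like; the `configuredRegion` parameter and where it would
be wired into the module are assumptions, while `withRegion` is the standard
AWS SDK builder API:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class S3ClientFactory
{
  /**
   * Builds an S3 client, preferring an explicitly configured region (hypothetical
   * config value) and falling back to the default region provider chain otherwise.
   */
  public static AmazonS3 create(String configuredRegion)
  {
    AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard();
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      builder.withRegion(configuredRegion); // bypasses the region provider chain
    }
    return builder.build();
  }
}
```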

What do you think?

Best,
Dongjin

[^1]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/DefaultAwsRegionProviderChain.java
[^2]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/AwsRegionProviderChain.java
[^3]:
https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/java-dg-region-selection.html
[^4]: The example client in documentation is `AmazonEC2ClientBuilder` but
it also applies to the other clients like `AmazonS3ClientBuilder`.
[^5]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/AwsEnvVarOverrideRegionProvider.java
[^6]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/AwsSystemPropertyRegionProvider.java
[^7]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/AwsProfileRegionProvider.java
[^8]:
https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/regions/InstanceMetadataRegionProvider.java

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github: github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare: www.slideshare.net/dongjinleekr*


Re: Build failure on 0.13.SNAPSHOT

2018-07-24 Thread Dongjin Lee
After some experiments, I figured out the following:

1. Druid uses more than 8 GB of memory for testing. (building-druid.png)
2. With 8 GB (physical) + 4 GB (swap) of memory, the tests succeed regardless of
Maven version (3.3.9, 3.5.2, 3.5.4) or MAVEN_OPTS. However, with
8 GB (physical) + 2 GB (swap) of memory[^1], some tests failed, and the list of
failing tests differs between Maven 3.3.9 and 3.5.2.

In short, having sufficient memory solved the problem - *it seems like
12 GB of memory is a reasonable recommendation for building Druid.* (I guess many
of you are working on a MacBook Pro with 16 GB of RAM, right? In that case,
you would not have encountered this problem.)

If you are okay with it, may I update the build documentation for newbies
like me?

Thanks,
Dongjin

+1. While building Druid, I found another problem, but it should be
discussed in a separate thread.

[^1]: You know, the other processes also occupy the memory.


On Tue, Jul 24, 2018 at 3:07 AM Jihoon Son  wrote:

> I'm also using Maven 3.5.2 and am not using any special configuration for
> Maven, but I have never seen that error either.
> Most of our Travis jobs have been working with only 512 MB of direct
> memory. Only the 'strict compilation' Travis job requires 3 GB of memory.
>
> I think it's worthwhile to look into this more. Maybe we somehow use more
> memory when we run all tests via 'mvn install'. Maybe this relates to the
> frequent transient failures of 'processing module test', one of our Travis
> jobs.
>
> Jihoon
>
> On Mon, Jul 23, 2018 at 9:32 AM Gian Merlino  wrote:
>
> > Interesting. Fwiw, I am using Maven 3.5.2 for building Druid and it has
> > been working for me. I don't think I'm using any special Maven
> > overrides (at least, I don't see anything interesting in my ~/.m2 directory
> > or in my environment variables). It might have to do with how much memory
> > our machines have? I do most of my builds on a Mac with 16GB RAM. Maybe try
> > checking .travis.yml in the druid repo. It sets -Xmx3000m for mvn install
> > commands, which might be needed for lower-memory environments.
> >
> > $ mvn --version
> > Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d;
> > 2017-10-18T00:58:13-07:00)
> > Maven home: /usr/local/Cellar/maven/3.5.2/libexec
> > Java version: 1.8.0_161, vendor: Oracle Corporation
> > Java home:
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_161.jdk/Contents/Home/jre
> > Default locale: en_US, platform encoding: UTF-8
> > OS name: "mac os x", version: "10.13.5", arch: "x86_64", family: "mac"
> >
> > On Mon, Jul 23, 2018 at 6:40 AM Dongjin Lee  wrote:
> >
> > > Finally, it seems like I found the reason. It was a combination of
> > > several problems:
> > >
> > > - Druid should not be built with Maven 3.5.x. With 3.5.2, test suites
> > > like `GroupByQueryRunnerFailureTest` fail. After I switched to 3.3.9,
> > > which is bundled with the latest version of IntelliJ, those errors
> > > disappeared. It seems like Maven 3.5.x is not stable yet - it applied
> > > a drastic change, which is also why they skipped 3.4.x.
> > > - It seems like Druid requires a MaxDirectMemorySize configuration for
> > > some test suites. With a JVM parameter like `-XX:MaxDirectMemorySize=4g`,
> > > some test suites passed, but not all. I am now trying the other
> > > options with enlarged swap space.
> > >
> > > Question: what MaxDirectMemorySize setting are you using?
> > >
> > > Best,
> > > Dongjin
> > >
> > > On Sat, Jul 21, 2018 at 3:01 AM Jihoon Son  wrote:
> > >
> > > > Hi Dongjin,
> > > >
> > > > That is weird. It looks like the VM crashed because it ran out of
> > > > memory while testing.
> > > > It might or might not be a real issue.
> > > > Have you set any memory configuration for your Maven?
> > > >
> > > > Jihoon
> > > >
> > > > On Thu, Jul 19, 2018 at 7:09 PM Dongjin Lee 
> > wrote:
> > > >
> > > > > Hi Jihoon,
> > > > >
> > > > > I ran `mvn clean package` following development/build
> > > > > <
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-druid/blob/master/docs/content/development/build.md
> > > > > >
> > > > > .
> > > > >
> > > > > Dongjin
> > > > >
> > > > > On Fri, Jul 20, 2018 at 12:30 AM Jihoon Son 
> > > > wrote:
> > > > >
> > > > > > Hi Dongjin,
> > > > > >
> > > > > > what maven command did you run?
> > > > > >
> > > > > > Jihoon
> > > > > >
> > > > > > On Wed, Jul 18, 2018 at 10:38 PM Dongjin Lee  >
> > > > wrote:
> > > > > >
> > > > > > > Hello. I am trying to build druid, but it fails. My environment
> > is
> > > > like
> > > > > > the
> > > > > > > following:
> > > > > > >
> > > > > > > - CPU: Intel(R) Core(TM) i7-7560U CPU @ 2.40GHz
> > > > > > > - RAM: 7704 MB
> > > > > > > - OS: ubuntu 18.04
> > > > > > > - JDK: openjdk version "1.8.0_171" (default configuration, with MaxHeapSize = 1928 MB)
> > > > > > > - Branch: master (commit: cd8ea3d)
> > > > > > >
> > > > > > > The error message I got is:
> > > > > > >
> > > 

Re: GitBox review comments

2018-07-24 Thread Julian Hyde
Ah, I see. Thanks for clarifying.

> On Jul 24, 2018, at 6:43 PM, Gian Merlino  wrote:
> 
> There is a github feature to make multiple comments as a single "review",
> although from what I can see, gitbox splits those up into multiple emails
> anyway, so it doesn't help. I have poked Infra again on our ticket:
> https://issues.apache.org/jira/browse/INFRA-16674
> 
> On Tue, Jul 24, 2018 at 5:55 PM Julian Hyde  wrote:
> 
>> I know there is an open request with INFRA to route git review comments to
>> a list other than dev. But until then, we are still getting a lot of
>> messages on the list each day. A lot of them come in groups, as a reviewer
>> makes multiple comments on a particular PR.
>> 
>> I believe git has a feature called “review” where a reviewer can make
>> multiple comments and they all appear in the same email. Is this feature
>> supported in gitbox? If so, could reviewers consider using it, in order to
>> reduce the email volume?
>> 
>> Julian
>> 
>> 
>> 





Re: GitBox review comments

2018-07-24 Thread Gian Merlino
There is a github feature to make multiple comments as a single "review",
although from what I can see, gitbox splits those up into multiple emails
anyway, so it doesn't help. I have poked Infra again on our ticket:
https://issues.apache.org/jira/browse/INFRA-16674

On Tue, Jul 24, 2018 at 5:55 PM Julian Hyde  wrote:

> I know there is an open request with INFRA to route git review comments to
> a list other than dev. But until then, we are still getting a lot of
> messages on the list each day. A lot of them come in groups, as a reviewer
> makes multiple comments on a particular PR.
>
> I believe git has a feature called “review” where a reviewer can make
> multiple comments and they all appear in the same email. Is this feature
> supported in gitbox? If so, could reviewers consider using it, in order to
> reduce the email volume?
>
> Julian
>
>
>
>


GitBox review comments

2018-07-24 Thread Julian Hyde
I know there is an open request with INFRA to route git review comments to a 
list other than dev. But until then, we are still getting a lot of messages on 
the list each day. A lot of them come in groups, as a reviewer makes multiple 
comments on a particular PR.

I believe git has a feature called “review” where a reviewer can make multiple 
comments and they all appear in the same email. Is this feature supported in 
gitbox? If so, could reviewers consider using it, in order to reduce the email 
volume?

Julian





[GitHub] asdf2014 commented on issue #5980: Remove redundant type parameters and enforce some other style and inspection rules

2018-07-24 Thread GitBox
asdf2014 commented on issue #5980: Remove redundant type parameters and enforce 
some other style and inspection rules
URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-407597210
 
 
   Hi, @jihoonson. Thanks for your comments. Added.





[GitHub] jihoonson commented on a change in pull request #6041: Synchronize scheduled poll() calls in SQLMetadataSegmentManager

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #6041: Synchronize scheduled 
poll() calls in SQLMetadataSegmentManager
URL: https://github.com/apache/incubator-druid/pull/6041#discussion_r204953651
 
 

 ##
 File path: server/src/main/java/io/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -96,9 +94,18 @@
   private final AtomicReference> dataSourcesRef;
   private final SQLMetadataConnector connector;
 
-  private volatile ListeningScheduledExecutorService exec = null;
-  private volatile ListenableFuture future = null;
-  private volatile boolean started;
+  /** The number of times this SQLMetadataSegmentManager was started. */
+  private long startCount = 0;
+  /**
+   * Equal to the current {@link #startCount} value, if the SQLMetadataSegmentManager is currently started; -1 if
+   * currently stopped.
+   *
+   * This field is used to implement a simple stamp mechanism instead of just a boolean "started" flag to prevent
+   * the theoretical situation of two tasks scheduled in {@link #start()} calling {@link #poll()} concurrently, if
+   * the sequence of {@link #start()} - {@link #stop()} - {@link #start()} actions occurs quickly.
+   */
 
 Review comment:
   Sounds like a good convention. I also hadn't noticed it until I checked the changes
in https://github.com/apache/incubator-druid/pull/5554, which introduced this bug.
   
   Added a comment.
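
   For list readers following along, here is a minimal self-contained sketch of the
stamp idea described in the javadoc above (names, locking, and scheduling details are
simplified and are not the actual Druid code):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative only: a stale task scheduled by an earlier start() carries an old
// stamp and skips polling after a quick start() -> stop() -> start() sequence.
class StampedPoller
{
  private final Object lock = new Object();
  private final ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();

  private long startCount = 0;          // number of times start() has been called
  private long currentStartOrder = -1;  // == startCount while started, -1 while stopped

  void start()
  {
    synchronized (lock) {
      if (currentStartOrder >= 0) {
        return; // already started
      }
      final long startOrder = ++startCount;
      currentStartOrder = startOrder;
      exec.scheduleWithFixedDelay(() -> poll(startOrder), 0, 60, TimeUnit.SECONDS);
    }
  }

  void stop()
  {
    synchronized (lock) {
      currentStartOrder = -1;
    }
  }

  private void poll(long startOrder)
  {
    synchronized (lock) {
      if (startOrder != currentStartOrder) {
        return; // stopped, or restarted after this task was scheduled
      }
      // ... do the actual database poll here ...
    }
  }
}
```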





[GitHub] jihoonson commented on a change in pull request #6041: Synchronize scheduled poll() calls in SQLMetadataSegmentManager

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #6041: Synchronize scheduled 
poll() calls in SQLMetadataSegmentManager
URL: https://github.com/apache/incubator-druid/pull/6041#discussion_r204953478
 
 

 ##
 File path: server/src/main/java/io/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -340,7 +357,9 @@ public boolean removeSegment(String ds, final String segmentID)
   @Override
   public boolean isStarted()
   {
-return started;
+synchronized (lock) {
 
 Review comment:
   Good point. Changed to use `ReentrantReadWriteLock`.
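
   As a side note for list readers, a tiny illustrative sketch of the read-lock pattern
for a cheap getter like `isStarted()` (not the actual patch):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: readers take the shared read lock, while start()/stop()
// would take the write lock when they mutate the state.
class StartedFlag
{
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private boolean started;

  boolean isStarted()
  {
    lock.readLock().lock();
    try {
      return started;
    } finally {
      lock.readLock().unlock();
    }
  }

  void setStarted(boolean value)
  {
    lock.writeLock().lock();
    try {
      started = value;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```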





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204937406
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableBufferAggregator.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+import io.druid.common.config.NullHandling;
+import io.druid.guice.annotations.PublicApi;
+import io.druid.segment.BaseNullableColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The result of a NullableBufferAggregator will be null if all the values to 
be aggregated are null values or no values
+ * are aggregated at all. If any of the value is non-null, the result would be 
the aggregated value of the delegate
+ * aggregator. Note that the delegate aggregator is not required to perform 
check for
+ * {@link BaseNullableColumnValueSelector#isNull()} on the selector as only 
non-null values will be passed to the
+ * delegate aggregator. This class is only used when SQL compatible null 
handling is enabled.
+ */
+@PublicApi
+public final class NullableBufferAggregator implements BufferAggregator
+{
+
+  private final BufferAggregator delegate;
+  private final BaseNullableColumnValueSelector selector;
+
+  public NullableBufferAggregator(BufferAggregator delegate, 
BaseNullableColumnValueSelector selector)
+  {
+this.delegate = delegate;
+this.selector = selector;
+  }
+
+  @Override
+  public void init(ByteBuffer buf, int position)
+  {
+buf.put(position, NullHandling.IS_NULL_BYTE);
+delegate.init(buf, position + Byte.BYTES);
+  }
+
+  @Override
+  public void aggregate(ByteBuffer buf, int position)
+  {
+boolean isNotNull = !selector.isNull();
+if (isNotNull) {
+  if (buf.get(position) == NullHandling.IS_NULL_BYTE) {
+buf.put(position, NullHandling.IS_NOT_NULL_BYTE);
+  }
+  delegate.aggregate(buf, position + Byte.BYTES);
+}
+  }
+
+  @Override
+  @Nullable
+  public Object get(ByteBuffer buf, int position)
+  {
+if (buf.get(position) == NullHandling.IS_NULL_BYTE) {
+  return null;
+}
+return delegate.get(buf, position + Byte.BYTES);
+  }
+
+  @Override
+  public float getFloat(ByteBuffer buf, int position)
+  {
+if (buf.get(position) == NullHandling.IS_NULL_BYTE) {
+  throw new IllegalStateException("Cannot return float for Null Value");
+}
+return delegate.getFloat(buf, position + Byte.BYTES);
+  }
+
+  @Override
+  public long getLong(ByteBuffer buf, int position)
+  {
+if (buf.get(position) == NullHandling.IS_NULL_BYTE) {
+  throw new IllegalStateException("Cannot return long for Null Value");
+}
+return delegate.getLong(buf, position + Byte.BYTES);
+  }
+
+  @Override
+  public double getDouble(ByteBuffer buf, int position)
+  {
+if (buf.get(position) == NullHandling.IS_NULL_BYTE) {
+  throw new IllegalStateException("Cannot return double for Null Value");
+}
+return delegate.getDouble(buf, position + Byte.BYTES);
+  }
+
+  @Override
+  public boolean isNull(ByteBuffer buf, int position)
+  {
+return buf.get(position) == NullHandling.IS_NULL_BYTE || delegate.isNull(buf, position + Byte.BYTES);
 
 Review comment:
   Do we need to check `delegate.isNull()` here?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204944312
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java
 ##
 @@ -1700,6 +1712,78 @@ public BufferComparator getBufferComparator()
 return bufferComparator;
   }
 }
+
+// This class is only used when SQL compatible null handling is enabled.
+private class NullableRowBasedKeySerdeHelper implements RowBasedKeySerdeHelper
 
 Review comment:
   Please add a comment about the memory layout.





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204933856
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregatorFactory.java
 ##
 @@ -0,0 +1,114 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+
+import io.druid.common.config.NullHandling;
+import io.druid.guice.annotations.ExtensionPoint;
+import io.druid.segment.BaseNullableColumnValueSelector;
+import io.druid.segment.ColumnSelectorFactory;
+import io.druid.segment.ColumnValueSelector;
+
+/**
+ * Abstract class with functionality to wrap {@link Aggregator}, {@link 
BufferAggregator} and {@link AggregateCombiner}
+ * to support nullable aggregations for SQL compatibility. Implementations of 
{@link AggregatorFactory} which need to
+ * Support Nullable Aggregations are encouraged to extend this class.
+ */
+@ExtensionPoint
+public abstract class NullableAggregatorFactory extends AggregatorFactory
+{
+  @Override
+  public final Aggregator factorize(ColumnSelectorFactory metricFactory)
+  {
+T selector = selector(metricFactory);
+Aggregator aggregator = factorize(metricFactory, selector);
+return NullHandling.replaceWithDefault() ? aggregator : new 
NullableAggregator(aggregator, selector);
+  }
+
+  @Override
+  public final BufferAggregator factorizeBuffered(ColumnSelectorFactory 
metricFactory)
+  {
+T selector = selector(metricFactory);
+BufferAggregator aggregator = factorizeBuffered(metricFactory, selector);
+return NullHandling.replaceWithDefault() ? aggregator : new 
NullableBufferAggregator(aggregator, selector);
+  }
+
+  @Override
+  public final AggregateCombiner makeAggregateCombiner()
+  {
+AggregateCombiner combiner = makeAggregateCombiner2();
+return NullHandling.replaceWithDefault() ? combiner : new 
NullableAggregateCombiner(combiner);
+  }
+
+  @Override
+  public final int getMaxIntermediateSize()
+  {
+return getMaxIntermediateSize2() + (NullHandling.replaceWithDefault() ? 0 
: Byte.BYTES);
+  }
+
+  //  ABSTRACT METHODS BELOW --
+
+  /**
+   * Creates a {@link ColumnValueSelector} for the aggregated column.
+   *
+   * @see ColumnValueSelector
+   */
+  protected abstract T selector(ColumnSelectorFactory metricFactory);
+
+  /**
+   * Creates an {@link Aggregator} to aggregate values from several rows, by 
using the provided selector.
+   * @param metricFactory metricFactory
+   * @param selector {@link ColumnValueSelector} for the column to aggregate.
+   *
+   * @see Aggregator
+   */
+  protected abstract Aggregator factorize(ColumnSelectorFactory metricFactory, 
T selector);
+
+  /**
+   * Creates an {@link BufferAggregator} to aggregate values from several rows 
into a ByteBuffer.
+   * @param metricFactory metricFactory
+   * @param selector {@link ColumnValueSelector} for the column to aggregate.
+   *
+   * @see BufferAggregator
+   */
+  protected abstract BufferAggregator factorizeBuffered(
+  ColumnSelectorFactory metricFactory,
+  T selector
+  );
+
+  /**
+   * Creates an {@link AggregateCombiner} to fold rollup aggregation results 
from serveral "rows" of different indexes
+   * during index merging. AggregateCombiner implements the same logic as 
{@link #combine}, with the difference that it
+   * uses {@link ColumnValueSelector} and it's subinterfaces to get inputs and 
implements {@code ColumnValueSelector}
+   * to provide output.
+   *
+   * @see AggregateCombiner
+   * @see io.druid.segment.IndexMerger
+   */
+  protected abstract AggregateCombiner makeAggregateCombiner2();
 
 Review comment:
   Would you rename this method to be more intuitive? Also, why do we need this 
method in addition to `AggregatorFactory.makeAggregateCombiner()`?



[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204931100
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregator.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+import io.druid.guice.annotations.PublicApi;
+import io.druid.segment.BaseNullableColumnValueSelector;
+
+import javax.annotation.Nullable;
+
+/**
+ * The result of a NullableAggregator will be null if all the values to be 
aggregated are null values
+ * or no values are aggregated at all. If any of the value is non-null, the 
result would be the aggregated
+ * value of the delegate aggregator. Note that the delegate aggregator is not 
required to perform check for
+ * {@link BaseNullableColumnValueSelector#isNull()} on the selector as only 
non-null values will be passed
+ * to the delegate aggregator. This class is only used when SQL compatible 
null handling is enabled.
+ */
+@PublicApi
+public final class NullableAggregator implements Aggregator
+{
+  private final Aggregator delegate;
+  private final BaseNullableColumnValueSelector selector;
+  private boolean isNullResult = true;
+
+  public NullableAggregator(Aggregator delegate, 
BaseNullableColumnValueSelector selector)
+  {
+this.delegate = delegate;
+this.selector = selector;
+  }
+
+  @Override
+  public void aggregate()
+  {
+boolean isNotNull = !selector.isNull();
+if (isNotNull) {
+  if (isNullResult) {
+isNullResult = false;
+  }
+  delegate.aggregate();
+}
+  }
+
+  @Override
+  @Nullable
+  public Object get()
+  {
+if (isNullResult) {
+  return null;
+}
+return delegate.get();
+  }
+
+  @Override
+  public float getFloat()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return float for Null Value");
+}
+return delegate.getFloat();
+  }
+
+  @Override
+  public long getLong()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return long for Null Value");
+}
+return delegate.getLong();
+  }
+
+  @Override
+  public double getDouble()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return double for Null Value");
+}
+return delegate.getDouble();
+  }
+
+  @Override
+  public boolean isNull()
 
 Review comment:
   Looks like this is the reason:
   
   > The default implementation always return false to enable smooth backward compatibility, re-implement if your aggregator is nullable.
   
   I would say it would be fine to make a breaking change for an `ExtensionPoint` in our major releases, but it's up to you. Do you have a rough plan to change this in the future?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204922799
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/DoubleMaxAggregatorFactory.java
 ##
 @@ -51,25 +53,44 @@ public DoubleMaxAggregatorFactory(String name, String 
fieldName)
   }
 
   @Override
-  public Aggregator factorize(ColumnSelectorFactory metricFactory)
+  protected BaseDoubleColumnValueSelector selector(ColumnSelectorFactory 
metricFactory)
   {
-return new DoubleMaxAggregator(getDoubleColumnSelector(metricFactory, 
Double.NEGATIVE_INFINITY));
+return getDoubleColumnSelector(
+metricFactory,
+Double.NEGATIVE_INFINITY
 
 Review comment:
   And do we need to change this to something else that can actually represent a null? If so, do you have a plan for it?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204937099
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableBufferAggregator.java
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+import io.druid.common.config.NullHandling;
+import io.druid.guice.annotations.PublicApi;
+import io.druid.segment.BaseNullableColumnValueSelector;
+
+import javax.annotation.Nullable;
+import java.nio.ByteBuffer;
+
+/**
+ * The result of a NullableBufferAggregator will be null if all the values to 
be aggregated are null values or no values
+ * are aggregated at all. If any of the value is non-null, the result would be 
the aggregated value of the delegate
+ * aggregator. Note that the delegate aggregator is not required to perform 
check for
+ * {@link BaseNullableColumnValueSelector#isNull()} on the selector as only 
non-null values will be passed to the
+ * delegate aggregator. This class is only used when SQL compatible null 
handling is enabled.
+ */
 
 Review comment:
   Would you please add a doc for buffer layout?
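
   For list readers, the layout implied by the quoted code could be documented roughly
as follows; this is only a sketch, not the PR's final javadoc:

```java
// Sketch of the buffer layout used by NullableBufferAggregator (derived from the
// quoted code; the exact wording of the final javadoc is up to the PR author):
//
//   byte at `position`        : NullHandling.IS_NULL_BYTE or IS_NOT_NULL_BYTE
//   bytes from `position + 1` : the delegate BufferAggregator's own storage
//
// Correspondingly, getMaxIntermediateSize() adds Byte.BYTES on top of the
// delegate's intermediate size when SQL compatible null handling is enabled.
```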





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204946252
 
 

 ##
 File path: 
sql/src/main/java/io/druid/sql/calcite/expression/builtin/CeilOperatorConversion.java
 ##
 @@ -80,7 +80,7 @@ public DruidExpression toDruidExpression(
   // So there is no simple extraction for this operator.
   return DruidExpression.fromFunctionCall(
   "timestamp_ceil",
-  ImmutableList.of(
+  Arrays.asList(
 
 Review comment:
   Why this change?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204943710
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/filter/SelectorDimFilter.java
 ##
 @@ -77,10 +81,12 @@ public SelectorDimFilter(
 byte[] valueBytes = (value == null) ? new byte[]{} : StringUtils.toUtf8(value);
 byte[] extractionFnBytes = extractionFn == null ? new byte[0] : extractionFn.getCacheKey();
 
-return ByteBuffer.allocate(3 + dimensionBytes.length + valueBytes.length + extractionFnBytes.length)
+return ByteBuffer.allocate(5 + dimensionBytes.length + valueBytes.length + extractionFnBytes.length)
 
 Review comment:
   nit: can we use `CacheKeyBuilder` here?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204937970
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregatorFactory.java
 ##
 @@ -0,0 +1,114 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+
+import io.druid.common.config.NullHandling;
+import io.druid.guice.annotations.ExtensionPoint;
+import io.druid.segment.BaseNullableColumnValueSelector;
+import io.druid.segment.ColumnSelectorFactory;
+import io.druid.segment.ColumnValueSelector;
+
+/**
+ * Abstract class with functionality to wrap {@link Aggregator}, {@link 
BufferAggregator} and {@link AggregateCombiner}
+ * to support nullable aggregations for SQL compatibility. Implementations of 
{@link AggregatorFactory} which need to
+ * Support Nullable Aggregations are encouraged to extend this class.
+ */
+@ExtensionPoint
+public abstract class NullableAggregatorFactory extends AggregatorFactory
 
 Review comment:
   Currently only numeric aggregatorFactories extend this class. Do you have a 
plan for making other aggregatorFactories nullable?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204945663
 
 

 ##
 File path: processing/src/main/java/io/druid/segment/filter/LikeFilter.java
 ##
 @@ -90,7 +90,10 @@ public boolean supportsSelectivityEstimation(
   {
 if (isSimpleEquals()) {
   // Verify that dimension equals prefix.
-  return ImmutableList.of(selector.getBitmapIndex(dimension, likeMatcher.getPrefix()));
+  return ImmutableList.of(selector.getBitmapIndex(
 
 Review comment:
   format
   
   ```java
  return ImmutableList.of(
      selector.getBitmapIndex(
          dimension,
          NullHandling.emptyToNullIfNeeded(likeMatcher.getPrefix())
      )
  );
   ```





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204919328
 
 

 ##
 File path: 
indexing-service/src/test/java/io/druid/indexing/common/task/AppenderatorDriverRealtimeIndexTaskTest.java
 ##
 @@ -1466,7 +1505,7 @@ public void close()
 );
   }
 
-  public long sumMetric(final Task task, final DimFilter filter, final String metric)
+  public Long sumMetric(final Task task, final DimFilter filter, final String metric)
 
 Review comment:
   Please annotate `@Nullable`.





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204930658
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregator.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+import io.druid.guice.annotations.PublicApi;
+import io.druid.segment.BaseNullableColumnValueSelector;
+
+import javax.annotation.Nullable;
+
+/**
+ * The result of a NullableAggregator will be null if all the values to be 
aggregated are null values
+ * or no values are aggregated at all. If any of the value is non-null, the 
result would be the aggregated
+ * value of the delegate aggregator. Note that the delegate aggregator is not 
required to perform check for
+ * {@link BaseNullableColumnValueSelector#isNull()} on the selector as only 
non-null values will be passed
+ * to the delegate aggregator. This class is only used when SQL compatible 
null handling is enabled.
+ */
+@PublicApi
+public final class NullableAggregator implements Aggregator
+{
+  private final Aggregator delegate;
+  private final BaseNullableColumnValueSelector selector;
+  private boolean isNullResult = true;
+
+  public NullableAggregator(Aggregator delegate, 
BaseNullableColumnValueSelector selector)
+  {
+this.delegate = delegate;
+this.selector = selector;
+  }
+
+  @Override
+  public void aggregate()
+  {
+boolean isNotNull = !selector.isNull();
+if (isNotNull) {
+  if (isNullResult) {
+isNullResult = false;
+  }
+  delegate.aggregate();
+}
+  }
+
+  @Override
+  @Nullable
+  public Object get()
+  {
+if (isNullResult) {
+  return null;
+}
+return delegate.get();
+  }
+
+  @Override
+  public float getFloat()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return float for Null Value");
+}
+return delegate.getFloat();
+  }
+
+  @Override
+  public long getLong()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return long for Null Value");
+}
+return delegate.getLong();
+  }
+
+  @Override
+  public double getDouble()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return double for Null Value");
+}
+return delegate.getDouble();
+  }
+
+  @Override
+  public boolean isNull()
 
 Review comment:
   BTW, I wonder why we need two different code paths for primitive types and the `Object` type. Can we simply share the same code path, e.g. by enforcing that `isNull()` is always called first for all types?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204930332
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregator.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+import io.druid.guice.annotations.PublicApi;
+import io.druid.segment.BaseNullableColumnValueSelector;
+
+import javax.annotation.Nullable;
+
+/**
+ * The result of a NullableAggregator will be null if all the values to be 
aggregated are null values
+ * or no values are aggregated at all. If any of the value is non-null, the 
result would be the aggregated
+ * value of the delegate aggregator. Note that the delegate aggregator is not 
required to perform check for
+ * {@link BaseNullableColumnValueSelector#isNull()} on the selector as only 
non-null values will be passed
+ * to the delegate aggregator. This class is only used when SQL compatible 
null handling is enabled.
+ */
+@PublicApi
+public final class NullableAggregator implements Aggregator
+{
+  private final Aggregator delegate;
+  private final BaseNullableColumnValueSelector selector;
+  private boolean isNullResult = true;
+
+  public NullableAggregator(Aggregator delegate, 
BaseNullableColumnValueSelector selector)
+  {
+this.delegate = delegate;
+this.selector = selector;
+  }
+
+  @Override
+  public void aggregate()
+  {
+boolean isNotNull = !selector.isNull();
+if (isNotNull) {
+  if (isNullResult) {
+isNullResult = false;
+  }
+  delegate.aggregate();
+}
+  }
+
+  @Override
+  @Nullable
+  public Object get()
+  {
+if (isNullResult) {
+  return null;
+}
+return delegate.get();
+  }
+
+  @Override
+  public float getFloat()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return float for Null Value");
+}
+return delegate.getFloat();
+  }
+
+  @Override
+  public long getLong()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return long for Null Value");
+}
+return delegate.getLong();
+  }
+
+  @Override
+  public double getDouble()
+  {
+if (isNullResult) {
+  throw new IllegalStateException("Cannot return double for Null Value");
+}
+return delegate.getDouble();
+  }
+
+  @Override
+  public boolean isNull()
 
 Review comment:
   Does it make more sense to rename this method to `isNumericNull()`?





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204941324
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/post/DoubleGreatestPostAggregator.java
 ##
 @@ -81,13 +77,15 @@ public Comparator getComparator()
   public Object compute(Map values)
   {
 Iterator fieldsIter = fields.iterator();
-double retVal = Double.NEGATIVE_INFINITY;
-if (fieldsIter.hasNext()) {
-  retVal = ((Number) fieldsIter.next().compute(values)).doubleValue();
-  while (fieldsIter.hasNext()) {
-double other = ((Number) fieldsIter.next().compute(values)).doubleValue();
-if (other > retVal) {
-  retVal = other;
+Double retVal = NullHandling.replaceWithDefault() ? Double.NEGATIVE_INFINITY : null;
+while (fieldsIter.hasNext()) {
+  Number nextVal = ((Number) fieldsIter.next().compute(values));
 
 Review comment:
   It looks like we need a specialized `PostAggregator` for primitive types and a new `isNull()` method to avoid boxing/unboxing. Would you file an issue for this?
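
   To illustrate the suggested shape: the sketch below uses a purely hypothetical
interface (the real `PostAggregator` has no `isNull(Map)` or `computeDouble(Map)`
today); it only shows how a double-specialized "greatest" could avoid boxing:

```java
import java.util.List;
import java.util.Map;

// Entirely hypothetical sketch: these methods do not exist on the current
// PostAggregator interface; they only illustrate a boxing-free loop.
interface DoublePostAggregator
{
  boolean isNull(Map<String, Object> values);
  double computeDouble(Map<String, Object> values);
}

class DoubleGreatestSketch
{
  private final List<DoublePostAggregator> fields;

  DoubleGreatestSketch(List<DoublePostAggregator> fields)
  {
    this.fields = fields;
  }

  /** Returns the greatest non-null value, or null if every field is null. */
  Double compute(Map<String, Object> values)
  {
    boolean anyNonNull = false;
    double retVal = Double.NEGATIVE_INFINITY;
    for (DoublePostAggregator field : fields) {
      if (!field.isNull(values)) {
        double other = field.computeDouble(values); // no Number boxing in the loop
        if (!anyNonNull || other > retVal) {
          retVal = other;
        }
        anyNonNull = true;
      }
    }
    return anyNonNull ? retVal : null;
  }
}
```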





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204916166
 
 

 ##
 File path: 
extensions-core/lookups-cached-single/src/test/java/io/druid/server/lookup/LoadingLookupTest.java
 ##
 @@ -43,13 +44,20 @@
   public void testApplyEmptyOrNull()
   {
 Assert.assertEquals(null, loadingLookup.apply(null));
-Assert.assertEquals(null, loadingLookup.apply(""));
+if (NullHandling.sqlCompatible()) {
+  // empty string should also have same behavior
+  Assert.assertEquals(null, loadingLookup.apply(""));
+}
   }
 
   @Test
   public void testUnapplyNull()
   {
-Assert.assertEquals(Collections.EMPTY_LIST, loadingLookup.unapply(null));
+if (NullHandling.sqlCompatible()) {
+  Assert.assertEquals(Collections.emptyList(), loadingLookup.unapply(null));
+} else {
+  Assert.assertEquals(null, loadingLookup.unapply(null));
 
 Review comment:
   `Assert.assertNull()`





[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204933933
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/NullableAggregatorFactory.java
 ##
 @@ -0,0 +1,114 @@
+/*
+ * Licensed to Metamarkets Group Inc. (Metamarkets) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. Metamarkets licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package io.druid.query.aggregation;
+
+
+import io.druid.common.config.NullHandling;
+import io.druid.guice.annotations.ExtensionPoint;
+import io.druid.segment.BaseNullableColumnValueSelector;
+import io.druid.segment.ColumnSelectorFactory;
+import io.druid.segment.ColumnValueSelector;
+
+/**
+ * Abstract class with functionality to wrap {@link Aggregator}, {@link 
BufferAggregator} and {@link AggregateCombiner}
+ * to support nullable aggregations for SQL compatibility. Implementations of 
{@link AggregatorFactory} which need to
+ * Support Nullable Aggregations are encouraged to extend this class.
+ */
+@ExtensionPoint
+public abstract class NullableAggregatorFactory extends AggregatorFactory
+{
+  @Override
+  public final Aggregator factorize(ColumnSelectorFactory metricFactory)
+  {
+T selector = selector(metricFactory);
+Aggregator aggregator = factorize(metricFactory, selector);
+return NullHandling.replaceWithDefault() ? aggregator : new 
NullableAggregator(aggregator, selector);
+  }
+
+  @Override
+  public final BufferAggregator factorizeBuffered(ColumnSelectorFactory 
metricFactory)
+  {
+T selector = selector(metricFactory);
+BufferAggregator aggregator = factorizeBuffered(metricFactory, selector);
+return NullHandling.replaceWithDefault() ? aggregator : new 
NullableBufferAggregator(aggregator, selector);
+  }
+
+  @Override
+  public final AggregateCombiner makeAggregateCombiner()
+  {
+AggregateCombiner combiner = makeAggregateCombiner2();
+return NullHandling.replaceWithDefault() ? combiner : new 
NullableAggregateCombiner(combiner);
+  }
+
+  @Override
+  public final int getMaxIntermediateSize()
+  {
+return getMaxIntermediateSize2() + (NullHandling.replaceWithDefault() ? 0 
: Byte.BYTES);
+  }
+
+  //  ABSTRACT METHODS BELOW --
+
+  /**
+   * Creates a {@link ColumnValueSelector} for the aggregated column.
+   *
+   * @see ColumnValueSelector
+   */
+  protected abstract T selector(ColumnSelectorFactory metricFactory);
+
+  /**
+   * Creates an {@link Aggregator} to aggregate values from several rows, by 
using the provided selector.
+   * @param metricFactory metricFactory
+   * @param selector {@link ColumnValueSelector} for the column to aggregate.
+   *
+   * @see Aggregator
+   */
+  protected abstract Aggregator factorize(ColumnSelectorFactory metricFactory, 
T selector);
+
+  /**
+   * Creates a {@link BufferAggregator} to aggregate values from several rows
into a ByteBuffer.
+   * @param metricFactory metricFactory
+   * @param selector {@link ColumnValueSelector} for the column to aggregate.
+   *
+   * @see BufferAggregator
+   */
+  protected abstract BufferAggregator factorizeBuffered(
+  ColumnSelectorFactory metricFactory,
+  T selector
+  );
+
+  /**
+   * Creates an {@link AggregateCombiner} to fold rollup aggregation results from several "rows" of different indexes
+   * during index merging. AggregateCombiner implements the same logic as {@link #combine}, with the difference that it
+   * uses {@link ColumnValueSelector} and its subinterfaces to get inputs and implements {@code ColumnValueSelector}
+   * to provide output.
+   *
+   * @see AggregateCombiner
+   * @see io.druid.segment.IndexMerger
+   */
+  protected abstract AggregateCombiner makeAggregateCombiner2();
+
+  /**
+   * Returns the maximum size that this aggregator will require in bytes for 
intermediate storage of results.
+   *
+   * @return the maximum number of bytes that an aggregator of this type will 
require for intermediate result storage.
+   */
+  protected abstract int getMaxIntermediateSize2();
 
 Review comment:
   Same here.
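
(Editorial illustration, not part of the PR: a minimal, self-contained sketch of the wrapping idea in this file. The names `SimpleAggregator` and `NullableSimpleAggregator` are invented stand-ins, not Druid's Aggregator API; the point is the delegate-plus-null-flag pattern and the extra `Byte.BYTES` of intermediate storage.)

```java
// Invented stand-in types, not Druid's Aggregator API.
interface SimpleAggregator
{
  void aggregate(Long value);        // null means "no value" in SQL-compatible mode
  Long get();
  int maxIntermediateSize();
}

final class NullableSimpleAggregator implements SimpleAggregator
{
  private final SimpleAggregator delegate;
  private boolean sawNonNullValue = false;

  NullableSimpleAggregator(SimpleAggregator delegate)
  {
    this.delegate = delegate;
  }

  @Override
  public void aggregate(Long value)
  {
    if (value != null) {
      sawNonNullValue = true;        // remember that at least one real value arrived
      delegate.aggregate(value);
    }
  }

  @Override
  public Long get()
  {
    return sawNonNullValue ? delegate.get() : null;   // null result if only nulls were seen
  }

  @Override
  public int maxIntermediateSize()
  {
    return delegate.maxIntermediateSize() + Byte.BYTES;  // one extra byte for the null flag
  }
}
```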


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

[GitHub] jihoonson commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5958: Part 2 of changes for 
SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204942207
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/extraction/StrlenExtractionFn.java
 ##
 @@ -40,6 +41,9 @@ public static StrlenExtractionFn instance()
   @Override
   public String apply(@Nullable String value)
   {
+if (NullHandling.sqlCompatible() && value == null) {
+  return null;
 
 Review comment:
   Please annotate `@Nullable`.
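
(Illustration only, not the PR's code: a standalone sketch of where the `@Nullable` annotation would go and how the two null-handling modes differ. `strlenOrNull` and the `sqlCompatible` flag here are invented stand-ins.)

```java
import javax.annotation.Nullable;

class StrlenSketch
{
  // In SQL-compatible mode a null input maps to a null output; otherwise null is treated as "".
  @Nullable
  static String strlenOrNull(@Nullable String value, boolean sqlCompatible)
  {
    if (sqlCompatible && value == null) {
      return null;
    }
    return String.valueOf(value == null ? 0 : value.length());
  }
}
```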


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] AlexanderSaydakov closed issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
AlexanderSaydakov closed issue #6037: snapshot quickstart error
URL: https://github.com/apache/incubator-druid/issues/6037
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] AlexanderSaydakov commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
AlexanderSaydakov commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407570981
 
 
   I found a machine with java 1.8.0_161, built druid on it, and there is no 
problem.
   I think we can close this issue.
   Thank you.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] leventov commented on a change in pull request #6041: Synchronize scheduled poll() calls in SQLMetadataSegmentManager

2018-07-24 Thread GitBox
leventov commented on a change in pull request #6041: Synchronize scheduled 
poll() calls in SQLMetadataSegmentManager
URL: https://github.com/apache/incubator-druid/pull/6041#discussion_r204921786
 
 

 ##
 File path: 
server/src/main/java/io/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -340,7 +357,9 @@ public boolean removeSegment(String ds, final String 
segmentID)
   @Override
   public boolean isStarted()
   {
-return started;
+synchronized (lock) {
 
 Review comment:
   Is the perf change unimportant? It might wait for a long scheduled `poll()`.
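
(One possible shape, purely illustrative and not the PR's code: if `isStarted()` only needs a consistent snapshot, reading a volatile stamp avoids blocking behind a long `poll()` that holds the lock. The class and field names are invented.)

```java
// Hypothetical alternative: the stamp is volatile, so isStarted() never takes the lock.
class StartStopStamp
{
  private volatile long currentStartOrder = -1;   // -1 while stopped

  boolean isStarted()
  {
    return currentStartOrder >= 0;                // cheap read, no contention with poll()
  }

  void markStarted(long startOrder)
  {
    currentStartOrder = startOrder;
  }

  void markStopped()
  {
    currentStartOrder = -1;
  }
}
```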


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] leventov commented on a change in pull request #6041: Synchronize scheduled poll() calls in SQLMetadataSegmentManager

2018-07-24 Thread GitBox
leventov commented on a change in pull request #6041: Synchronize scheduled 
poll() calls in SQLMetadataSegmentManager
URL: https://github.com/apache/incubator-druid/pull/6041#discussion_r204922700
 
 

 ##
 File path: 
server/src/main/java/io/druid/metadata/SQLMetadataSegmentManager.java
 ##
 @@ -96,9 +94,18 @@
   private final AtomicReference> 
dataSourcesRef;
   private final SQLMetadataConnector connector;
 
-  private volatile ListeningScheduledExecutorService exec = null;
-  private volatile ListenableFuture future = null;
-  private volatile boolean started;
+  /** The number of times this SQLMetadataSegmentManager was started. */
+  private long startCount = 0;
+  /**
+   * Equal to the current {@link #startCount} value, if the 
SQLMetadataSegmentManager is currently started; -1 if
+   * currently stopped.
+   *
+   * This field is used to implement a simple stamp mechanism instead of just 
a boolean "started" flag to prevent
+   * the theoretical situation of two tasks scheduled in {@link #start()} 
calling {@link #poll()} concurrently, if
+   * the sequence of {@link #start()} - {@link #stop()} - {@link #start()} 
actions occurs quickly.
+   */
 
 Review comment:
   Please cross-reference these things in both classes. For example, I changed one class and didn't know that there is another one that demands a similar change. You knew that, but this knowledge should be in the code for any developer.
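
(For illustration, a minimal self-contained sketch of the stamp mechanism described in the javadoc above; `schedule()` and `poll()` are abstract placeholders, and the class name is invented.)

```java
abstract class StampedPoller
{
  private final Object lock = new Object();
  private long startCount = 0;          // number of times start() has been called
  private long currentStartOrder = -1;  // equal to startCount while started, -1 while stopped

  void start()
  {
    synchronized (lock) {
      if (currentStartOrder >= 0) {
        return;                         // already started
      }
      startCount++;
      currentStartOrder = startCount;
      final long localStartOrder = currentStartOrder;
      schedule(() -> {
        synchronized (lock) {
          // Only the task scheduled by the latest start() may poll; a stale task left over from
          // a quick start() - stop() - start() sequence sees a mismatched stamp and does nothing.
          if (localStartOrder == currentStartOrder) {
            poll();
          }
        }
      });
    }
  }

  void stop()
  {
    synchronized (lock) {
      currentStartOrder = -1;           // invalidates tasks scheduled by earlier start() calls
    }
  }

  protected abstract void schedule(Runnable task);

  protected abstract void poll();
}
```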


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204904971
 
 

 ##
 File path: server/src/main/java/io/druid/server/coordination/ServerManager.java
 ##
 @@ -109,6 +110,14 @@ public ServerManager(
 this.serverConfig = serverConfig;
   }
 
+  private DataSource getTableDataSource(DataSource dataSource)
 
 Review comment:
   nit: `getInnerMostDataSource()` sounds more intuitive.
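
(Toy model of the recursion behind the suggested name; the types here are simplified stand-ins, not Druid's real `DataSource`/`QueryDataSource` classes.)

```java
interface DataSource {}

final class TableDataSource implements DataSource
{
  final String name;
  TableDataSource(String name) { this.name = name; }
}

final class QueryDataSource implements DataSource
{
  final DataSource inner;               // the data source of the wrapped (inner) query
  QueryDataSource(DataSource inner) { this.inner = inner; }
}

final class DataSources
{
  // Unwraps nested query data sources until a concrete (table) data source is reached.
  static DataSource getInnerMostDataSource(DataSource dataSource)
  {
    while (dataSource instanceof QueryDataSource) {
      dataSource = ((QueryDataSource) dataSource).inner;
    }
    return dataSource;
  }
}
```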


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204899786
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/epinephelinae/GroupByRowProcessor.java
 ##
 @@ -98,7 +112,18 @@
? BooleanValueMatcher.of(true)
: 
filter.makeMatcher(columnSelectorFactory);
 
-final FilteredSequence filteredSequence = new FilteredSequence<>(
+final FilteredSequence filteredSequence = wasQueryPushedDown ? new 
FilteredSequence<>(
+rows,
+new Predicate()
+{
+  @Override
+  public boolean apply(Row input)
+  {
+rowSupplier.set(input);
+return true; // nothing to filter since it has already been done 
on the push down
 
 Review comment:
   If this is the case, I believe it's much faster to call `final 
AggregateResult retVal = rows.accumulate(AggregateResult.ok(), accumulator);` 
at 
https://github.com/apache/incubator-druid/pull/5471/files#diff-dfac707f8edd4ee002b9f3372cfc2699R192.
 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204903519
 
 

 ##
 File path: server/src/main/java/io/druid/server/QueryLifecycle.java
 ##
 @@ -249,8 +249,7 @@ public QueryResponse execute()
 
 final Map responseContext = 
DirectDruidClient.makeResponseContextForQuery();
 
-final Sequence res = QueryPlus.wrap(baseQuery)
-  
.withIdentity(authenticationResult.getIdentity())
+final Sequence res = 
QueryPlus.wrap(baseQuery).withIdentity(authenticationResult.getIdentity())
 
 Review comment:
   It looks to be the right formatting before being changed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204891826
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/GroupByQueryConfig.java
 ##
 @@ -29,6 +29,8 @@
   public static final String CTX_KEY_STRATEGY = "groupByStrategy";
   public static final String CTX_KEY_FORCE_LIMIT_PUSH_DOWN = 
"forceLimitPushDown";
   public static final String CTX_KEY_APPLY_LIMIT_PUSH_DOWN = 
"applyLimitPushDown";
+  public static final String CTX_KEY_FORCE_PUSH_DOWN_NESTED_QUERY = 
"forcePushDownNestedQuery";
+  public static final String CTX_KEY_EXECUTING_NESTED_QUERY = 
"executinNestedQuery";
 
 Review comment:
   typo: `executinNestedQuery` -> `executingNestedQuery`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204904422
 
 

 ##
 File path: processing/src/main/java/io/druid/query/BaseQuery.java
 ##
 @@ -257,4 +257,17 @@ public int hashCode()
 
 return Objects.hash(dataSource, descending, context, querySegmentSpec, 
duration, granularity);
   }
+
+  /**
+   * @param query
+   *
+   * @return intervals to query. For nested queries, return the intervals for 
the inner most query.
+   */
+  public static List getIntervals(BaseQuery query)
 
 Review comment:
   Since every query implementation extends `BaseQuery`, I think it's better to add this method to `Query`, which is a higher-level interface than `BaseQuery`. What do you think about adding a method to `Query` like below:
   
   ```java
   default List<Interval> getIntervalsOfInnerMostQuery()
   {
     if (getDataSource() instanceof QueryDataSource) {
       //noinspection unchecked
       return ((QueryDataSource) getDataSource()).getQuery().getIntervalsOfInnerMostQuery();
     } else {
       return getIntervals();
     }
   }
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204897508
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/epinephelinae/GroupByRowProcessor.java
 ##
 @@ -66,18 +70,28 @@
   final GroupByQueryResource resource,
   final ObjectMapper spillMapper,
   final String processingTmpDir,
-  final int mergeBufferSize
+  final int mergeBufferSize,
+  final boolean wasQueryPushedDown
   )
   {
-final GroupByQuery query = (GroupByQuery) queryParam;
+GroupByQuery queryWithRenamedDimensions = null;
+if (wasQueryPushedDown) {
+  List newDims = new ArrayList<>(((GroupByQuery) 
queryParam).getDimensions().size());
 
 Review comment:
   nit: can be simpler.
   
   ```java
   final List newDims = ((GroupByQuery) queryParam).getDimensions()
                                                   .stream()
                                                   .map(RenamedDimensionSpec::new)
                                                   .collect(Collectors.toList());
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on a change in pull request #5471: Implement force push down for nested group by query

2018-07-24 Thread GitBox
jihoonson commented on a change in pull request #5471: Implement force push 
down for nested group by query
URL: https://github.com/apache/incubator-druid/pull/5471#discussion_r204894682
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/GroupByQueryQueryToolChest.java
 ##
 @@ -162,10 +162,11 @@ public GroupByQueryQueryToolChest(
   Map context
   )
   {
+if (isNestedQueryPushDown(query, groupByStrategy)) {
 
 Review comment:
   I believe that once `isNestedQueryPushDown(query, groupByStrategy)` returns false, it never changes no matter how many times we call it in this method. Since `mergeGroupByResults()` can internally call itself again if `dataSource` is a `queryDataSource` (https://github.com/apache/incubator-druid/pull/5471/files#diff-631de2c7bba9484c3ffd4a58b4cf691eR193), this might be checked again and again redundantly. I would suggest adding new methods which are called in `mergeGroupByResults()` like below:
   
   ```
   mergeGroupByResults() -> mergeGroupByResultsWithoutPushDown() -> mergeGroupByResultsWithoutPushDown() if needed ...
                        |-> mergeGroupByResultsWithPushDown()
   ```
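
(Toy illustration of the suggested split; only the two method names mirror the suggestion, while everything else, including the `ToyQuery` type and the string results, is invented to keep the sketch self-contained.)

```java
final class ToyQuery
{
  final ToyQuery innerQuery;            // null when this is the innermost query
  final boolean nestedQueryPushDown;

  ToyQuery(ToyQuery innerQuery, boolean nestedQueryPushDown)
  {
    this.innerQuery = innerQuery;
    this.nestedQueryPushDown = nestedQueryPushDown;
  }
}

final class ToyMerger
{
  // The push-down check happens exactly once, at the top-level entry point.
  String mergeGroupByResults(ToyQuery query)
  {
    return query.nestedQueryPushDown
           ? mergeGroupByResultsWithPushDown(query)
           : mergeGroupByResultsWithoutPushDown(query);
  }

  // The recursive path never re-evaluates the push-down flag.
  private String mergeGroupByResultsWithoutPushDown(ToyQuery query)
  {
    if (query.innerQuery != null) {
      return "merge(" + mergeGroupByResultsWithoutPushDown(query.innerQuery) + ")";
    }
    return "base";
  }

  private String mergeGroupByResultsWithPushDown(ToyQuery query)
  {
    return "mergePushedDown(base)";
  }
}
```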


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] himanshug edited a comment on issue #6016: Druid 'Shapeshifting' Columns

2018-07-24 Thread GitBox
himanshug edited a comment on issue #6016: Druid 'Shapeshifting' Columns
URL: https://github.com/apache/incubator-druid/pull/6016#issuecomment-407536584
 
 
   This is impressive.
   
   I haven't read the code yet, just the description. I had a few doubts...
   
   The first "Time to select rows" benchmark appears to have a peak just before 3M. Assuming that peak is at x M, it says performance would be better when selecting (x+delta) rows instead of x rows. I'm interested in whether there is an explanation for it. That peak shows up at different points on the x-axis very consistently in all similar graphs.
   
   In "The bad" section, it seems ShapeShiftingColumn would outperform the current impl in *all* cases if it could use blocks of varying sizes. That sounds great and deserves at least validating. If true, then I think it is well worth it even with a little bit of extra heap required, given that this feature already requires re-tuning the heap and maybe other jvm params.
   
   Does the new design for reading data increase heap requirements? If yes, that would not be the end of the world, but it deserves mention in the "bad" section (so re-tuning at historicals as well). Also, the new read design introduces mutation, and I hope this doesn't mean requiring locking etc. in the face of concurrent segment reads, or else that might cause more problems than it solves.
   That said, it appears that minor changes here can whack the performance very easily (maybe even different jvm versions and hardware will have different performance); I'm unsure whether the same code produces different performance on different jvm versions and hardware. This is a hard problem.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] himanshug commented on issue #4865: Unable to query large data set with `scan-query` via `broker`

2018-07-24 Thread GitBox
himanshug commented on issue #4865: Unable to query large data set with 
`scan-query` via `broker`
URL: 
https://github.com/apache/incubator-druid/issues/4865#issuecomment-407538797
 
 
   There are multiple PRs trying to solve this by limiting the memory used in DirectDruidClient, which is great in general.
   However, just a thought I had: this could also benefit if the scan query didn't go through the regular merging process at the broker and instead simply streamed results from historicals to the client as they show up on the broker.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] himanshug commented on issue #6016: Druid 'Shapeshifting' Columns

2018-07-24 Thread GitBox
himanshug commented on issue #6016: Druid 'Shapeshifting' Columns
URL: https://github.com/apache/incubator-druid/pull/6016#issuecomment-407536584
 
 
   This is impressive.
   
   I haven't read the code yet, just the description. I had a few doubts...
   
   The first "Time to select rows" benchmark appears to have a peak just before 3M. Assuming that peak is at x M, it says performance would be better when selecting (x+delta) rows instead of x rows. I'm interested in whether there is an explanation for it. That peak shows up at different points on the x-axis very consistently in all similar graphs.
   
   In "The bad" section, it seems ShapeShiftingColumn would outperform the current impl in *all* cases if it could use blocks of varying sizes. That sounds great and deserves at least validating. If true, then I think it is well worth it even with a little bit of extra heap required, given that this feature already requires re-tuning the heap and maybe other jvm params.
   
   Does the new design for reading data increase heap requirements? If yes, that would be a bigger bad IMO; not the end of the world, but it deserves mention in the "bad" section. Also, the new read design introduces mutation, and I hope this doesn't mean requiring locking etc. in the face of concurrent segment reads, or else that might cause more problems than it solves.
   That said, it appears that minor changes here can whack the performance very easily (maybe even different jvm versions and hardware will have different performance); I'm unsure whether the same code produces different performance on different jvm versions and hardware. This is a hard problem.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] clintropolis commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
clintropolis commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407533842
 
 
   This looks similar to this issue 
https://github.com/apache/incubator-druid/pull/5591#issuecomment-379393497


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
jihoonson commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407531966
 
 
   The latest version of Java 8 is `1.8.0_181`. Try that one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] AlexanderSaydakov commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
AlexanderSaydakov commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407531280
 
 
   I built the package on my Mac laptop with java 1.8.0_40.
   I tried building on the target Linux machine with 1.8.0_60, but got exactly 
the same result


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson opened a new pull request #6041: Synchronize scheduled poll() calls in SQLMetadataSegmentManager

2018-07-24 Thread GitBox
jihoonson opened a new pull request #6041: Synchronize scheduled poll() calls 
in SQLMetadataSegmentManager
URL: https://github.com/apache/incubator-druid/pull/6041
 
 
   Similar issue to https://github.com/apache/incubator-druid/issues/6028.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson closed issue #6028: Error in SqlMetadataRuleManagerTest

2018-07-24 Thread GitBox
jihoonson closed issue #6028: Error in SqlMetadataRuleManagerTest
URL: https://github.com/apache/incubator-druid/issues/6028
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] jihoonson closed pull request #6033: Synchronize scheduled poll() calls in SQLMetadataRuleManager to prevent flakiness in SqlMetadataRuleManagerTest

2018-07-24 Thread GitBox
jihoonson closed pull request #6033: Synchronize scheduled poll() calls in 
SQLMetadataRuleManager to prevent flakiness in SqlMetadataRuleManagerTest
URL: https://github.com/apache/incubator-druid/pull/6033
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/server/src/main/java/io/druid/metadata/SQLMetadataRuleManager.java 
b/server/src/main/java/io/druid/metadata/SQLMetadataRuleManager.java
index 4883a8e1100..6900dff018c 100644
--- a/server/src/main/java/io/druid/metadata/SQLMetadataRuleManager.java
+++ b/server/src/main/java/io/druid/metadata/SQLMetadataRuleManager.java
@@ -27,9 +27,6 @@
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Lists;
 import com.google.common.collect.Maps;
-import com.google.common.util.concurrent.ListenableFuture;
-import com.google.common.util.concurrent.ListeningScheduledExecutorService;
-import com.google.common.util.concurrent.MoreExecutors;
 import com.google.inject.Inject;
 import io.druid.audit.AuditEntry;
 import io.druid.audit.AuditInfo;
@@ -63,6 +60,7 @@
 import java.util.Arrays;
 import java.util.List;
 import java.util.Map;
+import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicReference;
 
@@ -72,7 +70,6 @@
 public class SQLMetadataRuleManager implements MetadataRuleManager
 {
 
-
   public static void createDefaultRule(
   final IDBI dbi,
   final String ruleTable,
@@ -142,13 +139,19 @@ public Void withHandle(Handle handle) throws Exception
   private final AuditManager auditManager;
 
   private final Object lock = new Object();
-
-  private volatile boolean started = false;
-
-  private volatile ListeningScheduledExecutorService exec = null;
-  private volatile ListenableFuture future = null;
-
-  private volatile long retryStartTime = 0;
+  /** The number of times this SQLMetadataRuleManager was started. */
+  private long startCount = 0;
+  /**
+   * Equal to the current {@link #startCount} value, if the 
SQLMetadataRuleManager is currently started; -1 if
+   * currently stopped.
+   *
+   * This field is used to implement a simple stamp mechanism instead of just 
a boolean "started" flag to prevent
+   * the theoretical situation of two tasks scheduled in {@link #start()} 
calling {@link #poll()} concurrently, if
+   * the sequence of {@link #start()} - {@link #stop()} - {@link #start()} 
actions occurs quickly.
+   */
+  private long currentStartOrder = -1;
+  private ScheduledExecutorService exec = null;
+  private long retryStartTime = 0;
 
   @Inject
   public SQLMetadataRuleManager(
@@ -169,9 +172,7 @@ public SQLMetadataRuleManager(
 
Preconditions.checkNotNull(config.getAlertThreshold().toStandardDuration());
 Preconditions.checkNotNull(config.getPollDuration().toStandardDuration());
 
-this.rules = new AtomicReference<>(
-ImmutableMap.>of()
-);
+this.rules = new AtomicReference<>(ImmutableMap.of());
   }
 
   @Override
@@ -179,21 +180,34 @@ public SQLMetadataRuleManager(
   public void start()
   {
 synchronized (lock) {
-  if (started) {
+  if (currentStartOrder >= 0) {
 return;
   }
 
-  exec = 
MoreExecutors.listeningDecorator(Execs.scheduledSingleThreaded("DatabaseRuleManager-Exec--%d"));
+  startCount++;
+  currentStartOrder = startCount;
+  long localStartedOrder = currentStartOrder;
+
+  exec = Execs.scheduledSingleThreaded("DatabaseRuleManager-Exec--%d");
 
   createDefaultRule(dbi, getRulesTable(), config.getDefaultRule(), 
jsonMapper);
-  future = exec.scheduleWithFixedDelay(
+  exec.scheduleWithFixedDelay(
   new Runnable()
   {
 @Override
 public void run()
 {
   try {
-poll();
+// poll() is synchronized together with start() and stop() to ensure that when stop() exits, poll()
+// won't actually run anymore after that (it could only enter the synchronized section and exit
+// immediately because the localStartedOrder doesn't match the 
new currentStartOrder). It's needed
+// to avoid flakiness in SQLMetadataRuleManagerTest.
+// See https://github.com/apache/incubator-druid/issues/6028
+synchronized (lock) {
+  if (localStartedOrder == currentStartOrder) {
+poll();
+  }
+}
   }
   catch (Exception e) {
 log.error(e, "uncaught exception in rule manager polling 
thread");
@@ -204,8 +218,6 @@ public void run()
   config.getPollDuration().toStandardDuration().getMillis(),
   TimeUnit.MILLISECONDS
   );
-
-  

Re: Question about sketches aggregation in druid

2018-07-24 Thread Eshcar Hillel
 Thanks Himanshu!
I will update when we have the ConcurrentUnion in the DataSketches library, or 
earlier if we get interesting performance results with the union 
implementations.
On Tuesday, July 24, 2018, 8:39:25 PM GMT+3, Himanshu 
 wrote:  
 
 This came up in the dev sync today.

Here is the gist.

- Union is necessary because we merge sketches in multiple cases, e.g. at
query time, while persisting the final segment to be pushed to deep storage,
or when indexing a user's data that itself contains sketches (e.g. someone ran
batch pipelines with Pig etc. and created data that already has sketches in
it).
- SynchronizedUnion is necessary only because of the realtime indexing and
querying use case, as Gian mentioned (it is a single-writer, multiple-readers
use case). If we have a ConcurrentUnion implementation that performs as well
as the non-thread-safe Union for a single thread, then we should totally be
able to remove SynchronizedUnion.

-- Himanshu
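
(Editorial illustration, not from the thread: a toy sketch of the single-writer / multiple-readers wrapping described above. `Union` here is an invented stand-in interface, not the DataSketches class; the point is only that updates from the ingestion thread and reads from query threads are serialized until a truly concurrent union is available.)

```java
interface Union<S>
{
  void update(S sketch);   // called by the single ingestion (writer) thread
  S getResult();           // called by query (reader) threads
}

final class SynchronizedUnion<S> implements Union<S>
{
  private final Union<S> delegate;

  SynchronizedUnion(Union<S> delegate)
  {
    this.delegate = delegate;
  }

  @Override
  public synchronized void update(S sketch)
  {
    delegate.update(sketch);
  }

  @Override
  public synchronized S getResult()
  {
    return delegate.getResult();
  }
}
```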

On Sun, Jul 22, 2018 at 9:10 AM, Eshcar Hillel 
wrote:

>  I think part of my confusion stems from the gap in the level of
> abstraction. Druid terminology focuses on aggregation. But in sketches
> there are two levels of aggregation: a sketch is a first-level
> aggregation which holds the gist of the stream,
> and a union is a second-level aggregation which can aggregate sketches.
> Adding 2 questions to the questions below:
> The locks are used to synchronize the access to the union - I assume the
> union is a second-level aggregation merging sketches that are built
> during ingestion.
> 3) If this is not the case, then why does druid apply a union and not
> simply use a sketch to aggregate the data?
> 4) If it is the case, then is it guaranteed that the merged sketches are
> immutable? Otherwise wrapping the union with locks is not enough.
>
> I hope my questions make more sense now.
> Thanks, Eshcar
>
>    On Sunday, July 22, 2018, 4:15:02 PM GMT+3, Eshcar Hillel <
> esh...@oath.com> wrote:
>
>  Thanks Gian - I was missing the part about aggregation during
> ingestion-time roll-up.
> I looked at the SketchAggregator code and read the druid overview
> document; let me verify that I got this right. Consider the best-effort
> roll-up mode: as events arrive they are ingested into multiple segments,
> call these s0...s9, but they should belong to a single segment. Then a
> roll-up process ru aggregates s0...s9 one-by-one (?) into a single
> segment. During the roll-up, ru can be queried and therefore needs to be
> thread safe.
> 1) Who is the "owner" of the roll-up process? What triggers the roll-up
> thread? Is it considered part of the ingestion/indexing time, or is it
> done in the background as a kind of optimization?
>
> 2) The document says "Data is queryable as soon as it is ingested by the
> realtime processing logic." Does this mean that queries can apply get()
> to s0..s9? Should they be thread safe as well?
>
>
>    On Thursday, July 19, 2018, 10:16:34 PM GMT+3, Gian Merlino <
> g...@apache.org> wrote:
>
>  Hi Eshcar,
>
> I don't think I 100% understand what you are asking, but I will say some
> things, and hopefully they will be helpful.
>
> In Druid we use aggregators for two things: aggregation during ingestion
> (for ingestion-time rollup) and aggregation during queries. During queries
> the aggregators are only ever used by one thread at a time. At ingestion
> time, "aggregate" and "get" can be called simultaneously. It happens
> because "aggregate" is called from an ingestion thread (because we update
> running aggregators during ingestion), and "get" is called by query threads
> (because they "get" those aggregator values from the ingestion aggregator
> object to feed them to a query aggregator object). These calls are not
> synchronized by Druid, so individual aggregators need to do it themselves
> if necessary. There was some effort to address this systematically:
> https://github.com/apache/incubator-druid/pull/3956, although it hasn't
> been finished yet. Check out some of the discussion on that patch for more
> background, and a question I just posted there: does it make more sense for
> ingestion-time aggregator thread-safety to be handled systematically (at
> the IncrementalIndex) or for each aggregator to need to be thread safe?
>
> If you're looking at "aggregate" and "get" in this file, those are the two
> that could get called simultaneously:
> https://github.com/apache/incubator-druid/blob/master/
> extensions-core/datasketches/src/main/java/io/druid/query/
> aggregation/datasketches/theta/SketchAggregator.java
>
> On Sun, Jul 15, 2018 at 12:11 AM Eshcar Hillel 
> wrote:
>
> >  Apologies, I must be missing something very basic in how incremental
> > indexing is working. A sketch is by itself an aggregator - it can absorb
> > millions of updates before it exceeds its space limit or is flushed to
> disk.
> > I assumed the ingestion thread aggregates data in multiple sketches in
> > parallel, then at query time a union operation is invoked to merge
> relevant
> 

[GitHub] jihoonson commented on issue #5980: Remove redundant type parameters and enforce some other style and inspection rules

2018-07-24 Thread GitBox
jihoonson commented on issue #5980: Remove redundant type parameters and 
enforce some other style and inspection rules
URL: https://github.com/apache/incubator-druid/pull/5980#issuecomment-407504015
 
 
   @asdf2014 would you update the PR description to include additional changes? 
E.g., Added new inspection/checkstyle rules.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] himanshug edited a comment on issue #3956: Thread safe reads for aggregators in IncrementalIndex

2018-07-24 Thread GitBox
himanshug edited a comment on issue #3956: Thread safe reads for aggregators in 
IncrementalIndex
URL: https://github.com/apache/incubator-druid/pull/3956#issuecomment-407502404
 
 
   In general, I agree with @leventov here because different aggregators can handle concurrency with varying degrees of efficiency.
   Unless, of course, there is a systematic way to do things that takes care of the above, e.g. introducing a "boolean isThreadSafe()" method or something like that on Aggregator and then, based on the answer, handling things correctly in IncrementalIndex. Then aggregators can make the choice.
   Or else, I think aggregators not handling it properly are just buggy and should be fixed. Maybe update the aggregator doc with some blurbs on thread-safety requirements.
   
   That said, we need synchronization only for the realtime indexing code path, and historical nodes pay the penalty of thread safety unnecessarily. If we could do something systematic to change the two code paths in a way that lets historicals avoid paying for thread safety, that would be good.
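
(Editorial illustration of the "boolean isThreadSafe()" idea above; the `Aggregator` interface here is a toy stand-in, not Druid's, and the wrapping point is invented. The idea: only the realtime path wraps aggregators that declare themselves unsafe, so historicals pay no synchronization cost.)

```java
interface Aggregator
{
  void aggregate(long value);

  long get();

  // Aggregators that already handle concurrent aggregate()/get() can opt out of wrapping.
  default boolean isThreadSafe()
  {
    return false;
  }
}

final class SynchronizedAggregator implements Aggregator
{
  private final Aggregator delegate;

  SynchronizedAggregator(Aggregator delegate)
  {
    this.delegate = delegate;
  }

  @Override
  public synchronized void aggregate(long value)
  {
    delegate.aggregate(value);
  }

  @Override
  public synchronized long get()
  {
    return delegate.get();
  }

  @Override
  public boolean isThreadSafe()
  {
    return true;
  }
}

final class Aggregators
{
  // Only the realtime (concurrent) code path would call this; historicals use the raw aggregator.
  static Aggregator threadSafe(Aggregator aggregator)
  {
    return aggregator.isThreadSafe() ? aggregator : new SynchronizedAggregator(aggregator);
  }
}
```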


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] gianm commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
gianm commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407496867
 
 
   The release should work, since the issue (I believe) is with the javac 
compiler, not with the runtime. Maybe you could try building the package with a 
newer jdk and then running it on the _60 environment?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] AlexanderSaydakov commented on issue #6037: snapshot quickstart error

2018-07-24 Thread GitBox
AlexanderSaydakov commented on issue #6037: snapshot quickstart error
URL: 
https://github.com/apache/incubator-druid/issues/6037#issuecomment-407494838
 
 
   0.12.1 release works, but doesn't have the module I wanted to play with.
   1.8.0_60 jdk seems to be the latest available in our corporate environment 
at the moment.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Re: Druid 0.12.2-rc1 vote

2018-07-24 Thread Gian Merlino
+1

On Thu, Jul 19, 2018 at 1:09 PM Jihoon Son  wrote:

> Hi all,
>
> we have no open issues and PRs for 0.12.2 (
> https://github.com/apache/incubator-druid/milestone/27). The 0.12.2 branch
> is already available and all PRs for 0.12.2 have merged into that branch.
>
> Let's vote on releasing RC1. Here is my +1.
>
> This is a non-ASF release.
>
> Best,
> Jihoon
>


Re: Dev sync

2018-07-24 Thread Jihoon Son
I can host today.

Here is the link to join:
https://hangouts.google.com/call/_KpvUiS0CO1-BMS8g2XFAAEE.

Jihoon

On Tue, Jul 24, 2018 at 8:40 AM Charles Allen  wrote:

> Is someone else able to start the dev sync this week? I'm out of town at a
> conference.
>


[GitHub] nishantmonu51 commented on issue #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on issue #5958: Part 2 of changes for SQL Compatible 
Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#issuecomment-407462520
 
 
   @leventov @jihoonson: Thanks for the review comments, I've addressed them.
   Do you have any more comments?
   Also, can you please add some info on how much of the review is pending and when we can expect to merge this one in?
   It's blocking feature development/patches for the Apache Hive integration for handling nulls coming from druid properly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204816952
 
 

 ##
 File path: common/src/main/java/io/druid/math/expr/ExprEval.java
 ##
 @@ -245,36 +252,63 @@ public final ExprType type()
 @Override
 public final int asInt()
 {
-  if (value == null) {
+  Number number = asNumber();
+  if (number == null) {
 assert NullHandling.replaceWithDefault();
 return 0;
   }
-
-  final Integer theInt = Ints.tryParse(value);
-  assert NullHandling.replaceWithDefault() || theInt != null;
-  return theInt == null ? 0 : theInt;
+  return number.intValue();
 }
 
 @Override
 public final long asLong()
 {
-  // GuavaUtils.tryParseLong handles nulls, no need for special null 
handling here.
-  final Long theLong = GuavaUtils.tryParseLong(value);
-  assert NullHandling.replaceWithDefault() || theLong != null;
-  return theLong == null ? 0L : theLong;
+  Number number = asNumber();
+  if (number == null) {
+assert NullHandling.replaceWithDefault();
 
 Review comment:
   I'm unable to find the discussion among the many review comments.
   I believe the intent here was to catch any coding errors/places where we may have missed handling nulls properly.
   @leventov can probably add more context here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204815667
 
 

 ##
 File path: common/src/main/java/io/druid/math/expr/ExprEval.java
 ##
 @@ -99,10 +98,10 @@ public Object value()
 return value;
   }
 
-  public boolean isNull()
 
 Review comment:
   This method is replaced with isNumericNull, as that is what we need to check for expressions.
   isNumericNull checks whether the primitive long/float/double equivalent is null or not.
   It is possible for an expression to have a non-null String value that still parses to null as a primitive long/float/double.
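
(Toy illustration of the distinction described above, not the real ExprEval API: a string value can be non-null while its numeric interpretation is still null.)

```java
import javax.annotation.Nullable;

final class ToyEval
{
  @Nullable
  private final String value;

  ToyEval(@Nullable String value)
  {
    this.value = value;
  }

  boolean isNull()
  {
    return value == null;
  }

  boolean isNumericNull()
  {
    return asNumber() == null;          // null when the string has no numeric interpretation
  }

  @Nullable
  Long asNumber()
  {
    if (value == null) {
      return null;
    }
    try {
      return Long.parseLong(value);
    }
    catch (NumberFormatException e) {
      return null;
    }
  }
}

// new ToyEval("abc").isNull() is false, but new ToyEval("abc").isNumericNull() is true.
```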


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811571
 
 

 ##
 File path: sql/src/main/java/io/druid/sql/calcite/rel/DruidSemiJoin.java
 ##
 @@ -294,7 +295,11 @@ public RelOptCost computeSelfCost(final RelOptPlanner 
planner, final RelMetadata
 
 for (int i : rightKeys) {
   final Object value = row[i];
-  final String stringValue = value != null ? String.valueOf(value) 
: "";
+  if (value == null) {
+// NULLS are not supposed to match NULLs in a join. So ignore 
them.
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811510
 
 

 ##
 File path: sql/src/main/java/io/druid/sql/calcite/planner/Calcites.java
 ##
 @@ -107,6 +108,7 @@ public static SchemaPlus createRootSchema(final Schema 
druidSchema, final Author
 
   public static String escapeStringLiteral(final String s)
   {
+Preconditions.checkNotNull(s);
 if (s == null) {
 
 Review comment:
   removed the == null check, it was not needed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811220
 
 

 ##
 File path: 
processing/src/main/java/io/druid/segment/virtual/SingleStringInputCachingExpressionColumnValueSelector.java
 ##
 @@ -139,7 +139,7 @@ private ExprEval eval()
   @Override
   public boolean isNull()
   {
-return eval().isNull();
+return eval().isNumericNull();
 
 Review comment:
   added comment to code. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811378
 
 

 ##
 File path: 
server/src/main/java/io/druid/server/listener/announcer/ListeningAnnouncerConfig.java
 ##
 @@ -93,7 +93,7 @@ public String getAnnouncementPath(String listenerName)
   {
 return ZKPaths.makePath(
 getListenersPath(), Preconditions.checkNotNull(
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811156
 
 

 ##
 File path: 
processing/src/main/java/io/druid/segment/virtual/ExpressionSelectors.java
 ##
 @@ -248,23 +248,32 @@ public void inspectRuntimeShape(RuntimeShapeInspector 
inspector)
   {
 final Map> suppliers = Maps.newHashMap();
 for (String columnName : Parser.findRequiredBindings(expression)) {
-  final ColumnCapabilities columnCapabilities = 
columnSelectorFactory.getColumnCapabilities(columnName);
+  final ColumnCapabilities columnCapabilities = columnSelectorFactory
+  .getColumnCapabilities(columnName);
   final ValueType nativeType = columnCapabilities != null ? 
columnCapabilities.getType() : null;
   final Supplier supplier;
 
   if (nativeType == ValueType.FLOAT) {
-supplier = 
columnSelectorFactory.makeColumnValueSelector(columnName)::getFloat;
+ColumnValueSelector selector = columnSelectorFactory
+.makeColumnValueSelector(columnName);
+supplier = makeNullableSupplier(selector, selector::getFloat);
   } else if (nativeType == ValueType.LONG) {
-supplier = 
columnSelectorFactory.makeColumnValueSelector(columnName)::getLong;
+ColumnValueSelector selector = columnSelectorFactory
+.makeColumnValueSelector(columnName);
+supplier = makeNullableSupplier(selector, selector::getLong);
   } else if (nativeType == ValueType.DOUBLE) {
-supplier = 
columnSelectorFactory.makeColumnValueSelector(columnName)::getDouble;
+ColumnValueSelector selector = columnSelectorFactory
+.makeColumnValueSelector(columnName);
+supplier = makeNullableSupplier(selector, selector::getDouble);
   } else if (nativeType == ValueType.STRING) {
 supplier = supplierFromDimensionSelector(
-columnSelectorFactory.makeDimensionSelector(new 
DefaultDimensionSpec(columnName, columnName))
+columnSelectorFactory
+.makeDimensionSelector(new 
DefaultDimensionSpec(columnName, columnName))
 );
   } else if (nativeType == null) {
 // Unknown ValueType. Try making an Object selector and see if that 
gives us anything useful.
-supplier = 
supplierFromObjectSelector(columnSelectorFactory.makeColumnValueSelector(columnName));
+supplier = supplierFromObjectSelector(columnSelectorFactory
+.makeColumnValueSelector(columnName));
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811346
 
 

 ##
 File path: 
processing/src/main/java/io/druid/segment/virtual/SingleLongInputCachingExpressionColumnValueSelector.java
 ##
 @@ -150,6 +150,6 @@ private ExprEval eval(final long value)
   @Override
   public boolean isNull()
   {
-return getObject().isNull();
+return getObject().isNumericNull();
 
 Review comment:
   It is possible for an expression to have a non-null String value but it can 
return null when parsed as a primitive long/float/double.
   ExprEval.isNumericNull checks whether the parsed primitive value is null or 
not.
   Added comment to code. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810950
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/orderby/DefaultLimitSpec.java
 ##
 @@ -269,12 +270,22 @@ public int compare(Row left, Row right)
 
   private Ordering metricOrdering(final String column, final Comparator 
comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> row.getRaw(column), 
Comparator.nullsLast(comparator)));
+if (NullHandling.sqlCompatible()) {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsFirst(comparator)));
+} else {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsLast(comparator)));
+}
   }
 
   private Ordering dimensionOrdering(final String dimension, final 
StringComparator comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> 
row.getDimension(dimension).isEmpty() ? null : 
row.getDimension(dimension).get(0), Comparator.nullsFirst(comparator)));
+return Ordering.from(
+Comparator.comparing(
+(Row row) -> row.getDimension(dimension).isEmpty()
 
 Review comment:
   extracted method to avoid calling getDimension twice. 
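
(Sketch of the kind of extraction mentioned above, with an invented stand-in `Row` type rather than Druid's: the dimension values are fetched once and reused for the null-vs-first-value decision.)

```java
import java.util.List;

interface Row
{
  List<String> getDimension(String dimension);
}

final class Dimensions
{
  // Single getDimension() call; callers compare the result with a nulls-first/last comparator.
  static String firstDimensionValue(Row row, String dimension)
  {
    final List<String> values = row.getDimension(dimension);
    return values.isEmpty() ? null : values.get(0);
  }
}
```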


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204811034
 
 

 ##
 File path: processing/src/main/java/io/druid/query/topn/TopNMapFn.java
 ##
 @@ -56,14 +56,10 @@
 return longVal == null ? DimensionHandlerUtils.ZERO_LONG : longVal;
   };
 
-  private static Function FLOAT_TRANSFORMER = input -> {
-final Float floatVal = DimensionHandlerUtils.convertObjectToFloat(input);
-return floatVal == null ? DimensionHandlerUtils.ZERO_FLOAT : floatVal;
-  };
-  private static Function DOUBLE_TRANSFORMER = input -> {
-final Double doubleValue = 
DimensionHandlerUtils.convertObjectToDouble(input);
-return doubleValue == null ? DimensionHandlerUtils.ZERO_DOUBLE : 
doubleValue;
-  };
+  private static Function FLOAT_TRANSFORMER = input -> 
DimensionHandlerUtils.convertObjectToFloat(input);
+
+  private static Function DOUBLE_TRANSFORMER = input -> 
DimensionHandlerUtils.convertObjectToDouble(
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810994
 
 

 ##
 File path: processing/src/main/java/io/druid/query/topn/TopNMapFn.java
 ##
 @@ -56,14 +56,10 @@
 return longVal == null ? DimensionHandlerUtils.ZERO_LONG : longVal;
   };
 
-  private static Function FLOAT_TRANSFORMER = input -> {
-final Float floatVal = DimensionHandlerUtils.convertObjectToFloat(input);
-return floatVal == null ? DimensionHandlerUtils.ZERO_FLOAT : floatVal;
-  };
-  private static Function DOUBLE_TRANSFORMER = input -> {
-final Double doubleValue = 
DimensionHandlerUtils.convertObjectToDouble(input);
-return doubleValue == null ? DimensionHandlerUtils.ZERO_DOUBLE : 
doubleValue;
-  };
+  private static Function FLOAT_TRANSFORMER = input -> 
DimensionHandlerUtils.convertObjectToFloat(input);
 
 Review comment:
   done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810789
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/orderby/DefaultLimitSpec.java
 ##
 @@ -269,12 +270,22 @@ public int compare(Row left, Row right)
 
   private Ordering metricOrdering(final String column, final Comparator 
comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> row.getRaw(column), 
Comparator.nullsLast(comparator)));
+if (NullHandling.sqlCompatible()) {
 
 Review comment:
   added comment. 
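
   For reference, a self-contained sketch of what the two branches do (plain JDK 
Comparator behavior; only the sqlCompatible() check in the diff is Druid's):

```
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NullOrderingDemo
{
  public static void main(String[] args)
  {
    final List<Integer> values = Arrays.asList(3, null, 1);

    // SQL-compatible branch: nulls sort first.
    values.sort(Comparator.nullsFirst(Comparator.<Integer>naturalOrder()));
    System.out.println(values); // [null, 1, 3]

    // Default-value branch: nulls sort last.
    values.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
    System.out.println(values); // [1, 3, null]
  }
}
```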


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810693
 
 

 ##
 File path: processing/src/main/java/io/druid/query/filter/ValueGetter.java
 ##
 @@ -27,5 +29,7 @@
   // converted to strings. We should also add functions
   // for these and modify ColumnComparisonFilter to handle
   // comparing Long and Float columns to eachother.
+  // Returns null when the underlying Long/Float value is null.
 
 Review comment:
   done.
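
   For context, a minimal hypothetical sketch of that contract (not the actual 
ValueGetter interface): the getter reports a missing numeric value as null 
rather than coercing it to zero.

```
class NullableGetterSketch
{
  // Hypothetical getter: null means "no value", not 0.
  interface LongGetter
  {
    Long get();
  }

  public static void main(String[] args)
  {
    final LongGetter missing = () -> null;
    final LongGetter present = () -> 42L;

    System.out.println(missing.get()); // null, not 0
    System.out.println(present.get()); // 42
  }
}
```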


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810633
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/filter/SelectorDimFilter.java
 ##
 @@ -90,7 +95,7 @@ public SelectorDimFilter(
   @Override
   public DimFilter optimize()
   {
-return new InDimFilter(dimension, ImmutableList.of(value), 
extractionFn).optimize();
+return new InDimFilter(dimension, Arrays.asList(value), 
extractionFn).optimize();
 
 Review comment:
   done.
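
   For context, a hedged standalone illustration of the practical difference 
(general Guava/JDK behavior, assuming Guava on the classpath; not the PR's code, 
though it is presumably the motivation here since the filter's value may now be 
null): ImmutableList.of rejects a null element, while Arrays.asList accepts one.

```
import com.google.common.collect.ImmutableList;

import java.util.Arrays;
import java.util.List;

class NullElementListDemo
{
  public static void main(String[] args)
  {
    final String value = null;

    // Arrays.asList allows a single null element.
    final List<String> withNull = Arrays.asList(value);
    System.out.println(withNull); // [null]

    // ImmutableList.of does not: this throws NullPointerException.
    try {
      ImmutableList.of(value);
    }
    catch (NullPointerException e) {
      System.out.println("ImmutableList.of rejects null elements");
    }
  }
}
```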


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204810580
 
 

 ##
 File path: processing/src/main/java/io/druid/query/expression/ExprUtils.java
 ##
 @@ -73,7 +74,14 @@ public static PeriodGranularity toPeriodGranularity(
 } else {
   Chronology chronology = timeZone == null ? 
ISOChronology.getInstanceUTC() : ISOChronology.getInstance(timeZone);
   final Object value = originArg.eval(bindings).value();
-  origin = value != null ? new DateTime(value, chronology) : null;
+  if (value instanceof String && StringUtils.isEmpty((String) value)) {
 
 Review comment:
   done. 
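
   For context, a minimal sketch of the pattern the diff above applies 
(illustrative names only, not Druid's ExprUtils): an empty-String origin is 
treated like a missing origin instead of being handed to the date parser.

```
class OriginSketch
{
  // Hypothetical normalization: null and "" both mean "no origin supplied".
  static String normalizeOrigin(Object value)
  {
    if (value == null) {
      return null;
    }
    if (value instanceof String && ((String) value).isEmpty()) {
      return null; // "" would not parse as a timestamp, so treat it as absent
    }
    return value.toString();
  }

  public static void main(String[] args)
  {
    System.out.println(normalizeOrigin(""));           // null
    System.out.println(normalizeOrigin("2018-01-01")); // 2018-01-01
  }
}
```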


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Dev sync

2018-07-24 Thread Charles Allen
Is someone else able to start the dev sync this week? I'm out of town at a
conference.


[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204801468
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/orderby/DefaultLimitSpec.java
 ##
 @@ -269,12 +270,22 @@ public int compare(Row left, Row right)
 
   private Ordering metricOrdering(final String column, final Comparator 
comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> row.getRaw(column), 
Comparator.nullsLast(comparator)));
+if (NullHandling.sqlCompatible()) {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsFirst(comparator)));
+} else {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsLast(comparator)));
+}
   }
 
   private Ordering dimensionOrdering(final String dimension, final 
StringComparator comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> 
row.getDimension(dimension).isEmpty() ? null : 
row.getDimension(dimension).get(0), Comparator.nullsFirst(comparator)));
+return Ordering.from(
+Comparator.comparing(
+(Row row) -> row.getDimension(dimension).isEmpty()
 
 Review comment:
   It's just a formatting change because the line was longer than 120 columns; 
no logic changed in this PR. Reformatted to make it more readable, and also 
removed the duplicate call to getDimension.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204801468
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/orderby/DefaultLimitSpec.java
 ##
 @@ -269,12 +270,22 @@ public int compare(Row left, Row right)
 
   private Ordering metricOrdering(final String column, final Comparator 
comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> row.getRaw(column), 
Comparator.nullsLast(comparator)));
+if (NullHandling.sqlCompatible()) {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsFirst(comparator)));
+} else {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsLast(comparator)));
+}
   }
 
   private Ordering dimensionOrdering(final String dimension, final 
StringComparator comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> 
row.getDimension(dimension).isEmpty() ? null : 
row.getDimension(dimension).get(0), Comparator.nullsFirst(comparator)));
+return Ordering.from(
+Comparator.comparing(
+(Row row) -> row.getDimension(dimension).isEmpty()
 
 Review comment:
   It's just a formatting change because the line was longer than 120 columns; 
no logic changed in this PR. Reformatted to make it more readable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204801468
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/groupby/orderby/DefaultLimitSpec.java
 ##
 @@ -269,12 +270,22 @@ public int compare(Row left, Row right)
 
   private Ordering metricOrdering(final String column, final Comparator 
comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> row.getRaw(column), 
Comparator.nullsLast(comparator)));
+if (NullHandling.sqlCompatible()) {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsFirst(comparator)));
+} else {
+  return Ordering.from(Comparator.comparing((Row row) -> 
row.getRaw(column), Comparator.nullsLast(comparator)));
+}
   }
 
   private Ordering dimensionOrdering(final String dimension, final 
StringComparator comparator)
   {
-return Ordering.from(Comparator.comparing((Row row) -> 
row.getDimension(dimension).isEmpty() ? null : 
row.getDimension(dimension).get(0), Comparator.nullsFirst(comparator)));
+return Ordering.from(
+Comparator.comparing(
+(Row row) -> row.getDimension(dimension).isEmpty()
 
 Review comment:
   It's just a formatting change; no logic changed in this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204797189
 
 

 ##
 File path: processing/src/main/java/io/druid/query/expression/ExprUtils.java
 ##
 @@ -73,7 +74,14 @@ public static PeriodGranularity toPeriodGranularity(
 } else {
   Chronology chronology = timeZone == null ? 
ISOChronology.getInstanceUTC() : ISOChronology.getInstance(timeZone);
   final Object value = originArg.eval(bindings).value();
-  origin = value != null ? new DateTime(value, chronology) : null;
+  if (value instanceof String && StringUtils.isEmpty((String) value)) {
 
 Review comment:
   https://github.com/apache/incubator-druid/issues/6040
   Will fix this line in this PR and prohibit the API in a future PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 opened a new issue #6040: Review comment - Prohibit Usage of isNullOrEmpty and isEmpty for null handling

2018-07-24 Thread GitBox
nishantmonu51 opened a new issue #6040: Review comment - Prohibit Usage of 
isNullOrEmpty and isEmpty for null handling
URL: https://github.com/apache/incubator-druid/issues/6040
 
 
   It seems to me that usage of com.google.common.base.Strings.isNullOrEmpty(), 
java.lang.String.isEmpty(), etc. should be prohibited too, in favor of 
NullHandling.isNullOrEquivalent() and 
StringUtils.isNullOrEmptyNonDruidDataString() (such a method should be created).
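
   For context, a hedged illustration of why the combined check is problematic 
(the isSqlNull helper below is illustrative, not the proposed Druid method): in 
SQL-compatible null handling the empty string and null are distinct values, so 
a check that conflates them loses that distinction.

```
import com.google.common.base.Strings;

class NullVsEmptyDemo
{
  // Illustrative SQL-compatible check: only null is treated as null; "" is a value.
  static boolean isSqlNull(String value)
  {
    return value == null;
  }

  public static void main(String[] args)
  {
    final String empty = "";

    System.out.println(Strings.isNullOrEmpty(empty)); // true: conflates "" with null
    System.out.println(isSqlNull(empty));             // false: "" is a real value
  }
}
```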


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 opened a new issue #6039: Review comment - Extract Combining Factories for Float/Double/Long First/Last aggregators

2018-07-24 Thread GitBox
nishantmonu51 opened a new issue #6039: Review comment - Extract Combining 
Factories for Float/Double/Long First/Last aggregators
URL: https://github.com/apache/incubator-druid/issues/6039
 
 
   The type parameter of NullableAggregatorFactory is not the output selector 
type, it's the input selector type, so it should be BaseFloatColumnValueSelector 
for FloatFirstAggregatorFactory. The reason this refactoring couldn't be done 
right now is that getCombiningFactory() returns an anonymous subclass of 
FloatFirstAggregatorFactory, which is IMO wrong. It seems to me that it should 
be a separate class, FloatFirstCombiningAggregatorFactory extends 
NullableAggregatorFactory<...>
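
   For context, a rough, heavily simplified sketch of the proposed shape 
(hypothetical signatures only; the real Druid classes have different members 
and type arguments):

```
// Hypothetical, heavily simplified: the generic parameter is the *input*
// selector type, and the combining variant is a named class instead of an
// anonymous subclass returned from getCombiningFactory().
abstract class NullableAggregatorFactory<TSelector>
{
  abstract Object factorize(TSelector selector);
}

interface BaseFloatColumnValueSelector
{
  float getFloat();
}

class FloatFirstAggregatorFactory extends NullableAggregatorFactory<BaseFloatColumnValueSelector>
{
  @Override
  Object factorize(BaseFloatColumnValueSelector selector)
  {
    return selector.getFloat(); // placeholder for the real first-value aggregator
  }

  NullableAggregatorFactory<?> getCombiningFactory()
  {
    return new FloatFirstCombiningAggregatorFactory(); // named class, not anonymous
  }
}

// The separate combining factory suggested above; its input selector type would
// differ from the primary factory's (simplified here to Object).
class FloatFirstCombiningAggregatorFactory extends NullableAggregatorFactory<Object>
{
  @Override
  Object factorize(Object selector)
  {
    return selector;
  }
}
```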


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] nishantmonu51 commented on a change in pull request #5958: Part 2 of changes for SQL Compatible Null Handling

2018-07-24 Thread GitBox
nishantmonu51 commented on a change in pull request #5958: Part 2 of changes 
for SQL Compatible Null Handling
URL: https://github.com/apache/incubator-druid/pull/5958#discussion_r204795900
 
 

 ##
 File path: 
processing/src/main/java/io/druid/query/aggregation/first/FloatFirstAggregatorFactory.java
 ##
 @@ -32,19 +32,21 @@
 import io.druid.query.aggregation.AggregatorFactory;
 import io.druid.query.aggregation.AggregatorUtil;
 import io.druid.query.aggregation.BufferAggregator;
+import io.druid.query.aggregation.NullableAggregatorFactory;
 import io.druid.query.monomorphicprocessing.RuntimeShapeInspector;
-import io.druid.segment.BaseObjectColumnValueSelector;
 import io.druid.segment.ColumnSelectorFactory;
+import io.druid.segment.ColumnValueSelector;
 import io.druid.segment.column.Column;
 
+import javax.annotation.Nullable;
 import java.nio.ByteBuffer;
 import java.util.Arrays;
 import java.util.Comparator;
 import java.util.List;
 import java.util.Map;
 import java.util.Objects;
 
-public class FloatFirstAggregatorFactory extends AggregatorFactory
+public class FloatFirstAggregatorFactory extends 
NullableAggregatorFactory
 
 Review comment:
   Not a blocker for this PR; I would like to do it in a follow-up PR: 
https://github.com/apache/incubator-druid/issues/6039


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



[GitHub] undertruck opened a new issue #6038: Ingesting Parquet file gives UTF-8 error [Druid 0.12.0]

2018-07-24 Thread GitBox
undertruck opened a new issue #6038: Ingesting Parquet file gives UTF-8 error 
[Druid 0.12.0]
URL: https://github.com/apache/incubator-druid/issues/6038
 
 
   I have an AWS Glue-generated Parquet file. I've installed the Parquet and 
Avro extensions (tried with both 0.12.0 and 0.12.1) and I get the following 
error in each case:
   
   $ >curl -X 'POST' -H 'Content-Type:application/json' -d 
@quickstart/master.parquet localhost:8090/druid/indexer/v1/task
   
   
   
   Error 500 
   
   
   HTTP ERROR: 500
   Problem accessing /druid/indexer/v1/task. Reason:
   javax.servlet.ServletException: 
com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 middle byte 0x27
at [Source: 
HttpInputOverHTTP@149d71fc[c=8000,q=1,[0]=Content@519fed0b{HeapByteBufferR@67183cce[p=8000,l=8192,c=8192,r=192]={PAR1\x15\x04\x15\xC0\x81\x01\x15\xF4L\x15\xA0\t...X\xA2\xC7\x1c\xB7\xCc\x81\xC9\x1c\x984\x82I#s42\xC7\x1dtB\xC7\x1cs\xC0\xE3H\x1fx\xCc\x81...\xE2\x08$\xAa`R\x87#\xB0`RI\x1d\x90\xD4}},s=STREAM];
 line: 1, column: 14]
   Powered by Jetty:// 9.3.19.v20170502 (http://eclipse.org/jetty)
   
   
   
   == JSON config file ==
   
   Slide-Analytics: ~/downloads/druid-0.12.0 >more quickstart/master.json
   {
     "type" : "index_hadoop",
     "spec" : {
       "ioConfig" : {
         "type" : "hadoop",
         "inputSpec" : {
           "type" : "static",
           "inputFormat": "io.druid.data.input.parquet.DruidParquetInputFormat",
           "paths" : "quickstart/master.parquet"
         }
       },
       "dataSchema" : {
         "dataSource" : "master",
         "granularitySpec" : {
           "type" : "uniform",
           "segmentGranularity" : "day",
           "queryGranularity" : "none",
           "intervals" : ["2010-03-01/2020-05-28"]
         },
         "parser" : {
           "type" : "parquet",
           "parseSpec" : {
             "format" : "timeAndDims",
             "dimensionsSpec" : {
               "dimensions" : []
             },
             "timestampSpec" : {
               "format" : "auto",
               "column" : "ndate"
             }
           }
         },
         "metricsSpec" : [
           {
             "name" : "count",
             "type" : "count"
           },
           {
             "name" : "collection_USD_SUM",
             "type" : "longSum",
             "fieldName" : "collection_USD"
           },
           {
             "name" : "order_count",
             "type" : "hyperUnique",
             "fieldName" : "orderNumber"
           },
           {
             "name" : "lead_count",
             "type" : "count",
             "fieldName" : "Sales.leads"
           }
         ]
       },
       "tuningConfig" : {
         "type" : "hadoop",
         "partitionsSpec" : {
           "type" : "hashed",
           "targetPartitionSize" : 500
         },
         "jobProperties" : {}
       }
     }
   }
   
   
   ==
   
   Please help to resolve this problem.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org