[GitHub] drill issue #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 Stabali...
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1023 Travis build is flaky. Rerunning the build. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
GitHub user ilooner reopened a pull request: https://github.com/apache/drill/pull/1023

DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 Stabilize TestValueVector tests.

## DRILL-5922
- QueryContext was never closed when the Foreman finished, so the query child allocator was never closed.
- The PlanSplitter created a QueryContext temporarily to construct an RPC message but never closed it.
- The waitForExit method was error-prone. Changed it to use the standard condition-variable pattern (a sketch follows this message).

## DRILL-5926
The TestValueVector tests would run out of memory. I simply increased the MaxDirectMemorySize for the forked test processes in the pom to avoid this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilooner/drill DRILL-5922

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1023.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1023

commit 005a478238ff8a0ec085e051c099a109a73d3e8a
Author: Timothy Farkas
Date: 2017-11-10T19:13:18Z

DRILL-5922:
- The QueryContext was never closed when the Foreman finished, so its child allocator was never closed. Now it is.
- The PlanSplitter created a QueryContext temporarily to construct an RPC message but never closed it. Now the temp QueryContext is closed.
- The waitForExit method was error-prone. Changed it to use the standard condition-variable pattern.
- Fixed timeouts in graceful shutdown tests

commit 551c54eca5ed9a69b990daa2a300c850e11e9376
Author: Timothy Farkas
Date: 2017-11-10T20:59:42Z

DRILL-5926: The TestValueVector tests would run out of memory. Increased the MaxDirectMemorySize for the forked test processes in the pom to avoid this.

commit 31175885bcd60abde5ec88ff3fcece589619265f
Author: Timothy Farkas
Date: 2018-01-09T09:20:09Z

DRILL-6003: Fixed sporadic test failures caused by the creation of duplicate clusters in TestDynamicUDFSupport.

---
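A minimal sketch of the standard condition-variable pattern referenced above — the lock, flag, and method names are illustrative, not Drill's actual Foreman code:

```
// Illustrative sketch only — not Drill's actual waitForExit implementation.
private final Object lock = new Object();
private boolean exited = false;

public void waitForExit(long timeoutMs) throws InterruptedException {
  synchronized (lock) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    // Loop to guard against spurious wakeups; re-check the condition each time.
    while (!exited) {
      long remaining = deadline - System.currentTimeMillis();
      if (remaining <= 0) {
        break; // timed out
      }
      lock.wait(remaining);
    }
  }
}

public void signalExit() {
  synchronized (lock) {
    exited = true;
    lock.notifyAll(); // wake every waiter
  }
}
```

The while-loop re-check is the key point of the pattern: a bare wait/notify without a guarded condition is what makes ad-hoc implementations error-prone.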
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner closed the pull request at: https://github.com/apache/drill/pull/1023 ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user ilooner closed the pull request at: https://github.com/apache/drill/pull/1073 ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
GitHub user ilooner reopened a pull request: https://github.com/apache/drill/pull/1073

DRILL-5967: Fixed memory leak in OrderedPartitionSender

The OrderedPartitionSender was leaking memory every time it was created because it created a wrapper RecordBatch which allocated memory but was never closed.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilooner/drill DRILL-5967

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1073.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1073

commit 13cd2fe7e8dccb2ac7546de508fbfbed8e19b48b
Author: Timothy Farkas
Date: 2017-12-14T18:48:27Z
DRILL-5967: Fixed memory leak in OrderedPartitionSender

commit e48e274649be7b9a14746c200d0ec5f5b3740664
Author: Timothy Farkas
Date: 2017-12-18T20:01:30Z
- Applied review comments

commit 2233e6b32c1dd9f1b89fe4d6fa557e8bd894592e
Author: Timothy Farkas
Date: 2017-12-18T21:54:01Z
- Applied more review comments

commit f3aef285fc168cad07b32e5e99cd2cd7811d765c
Author: Timothy Farkas
Date: 2017-12-18T21:56:13Z
- Applied review comments

commit 0cdb2f3c4944f6253c72d318276603fc329546b5
Author: Timothy Farkas
Date: 2017-12-19T19:27:19Z
- Removed unnecessary creation of a list in OrderedPartitionSenderCreator

commit f6d4aec39aad7326a735b883647772b28e2b86ab
Author: Timothy Farkas
Date: 2018-01-11T22:19:18Z
- Added comment documenting the workaround.

---
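The leak described above follows a common ownership rule; a hedged sketch with hypothetical helper names (createWrapperBatch and runSender are illustrative, not the actual OrderedPartitionSenderCreator code):

```
// Illustrative sketch of the ownership rule behind DRILL-5967: whoever
// creates a batch that allocates memory must ensure it is closed, or its
// buffers leak from the child allocator.
CloseableRecordBatch wrapper = createWrapperBatch(incoming, context); // allocates buffers
try {
  runSender(wrapper);
} finally {
  wrapper.close(); // the close that was effectively missing, so the buffers leaked
}
```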
[jira] [Created] (DRILL-6083) RestClientFixture does not connect to the correct webserver port
Karthikeyan Manivannan created DRILL-6083:
Summary: RestClientFixture does not connect to the correct webserver port
Key: DRILL-6083
URL: https://issues.apache.org/jira/browse/DRILL-6083
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Affects Versions: Future
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan
Fix For: 1.13.0

RestClientFixture always connects to the default http port (8047) instead of connecting to the webserver port of the cluster. The cluster's webserver port won't be 8047 if there are other Drillbits running when the cluster is launched.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
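A hedged sketch of the fix direction — building the REST target from the cluster's actual webserver port (Drill's `drill.exec.http.port` setting) rather than the 8047 default; the helper below is hypothetical, not the actual test-framework API:

```
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.WebTarget;

// Sketch: resolve the REST target from the port the cluster actually bound
// (e.g. read from its "drill.exec.http.port" config) instead of assuming 8047.
static WebTarget restTarget(int webServerPort) {
  Client client = ClientBuilder.newClient();
  return client.target("http://localhost:" + webServerPort);
}
```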
[GitHub] drill issue #1073: DRILL-5967: Fixed memory leak in OrderedPartitionSender
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1073 @paul-rogers Added the comment. Please take a look and let me know if you'd like any other changes. ---
[GitHub] drill pull request #1066: DRILL-3993: Changes to support Calcite 1.15
Github user chunhui-shi commented on a diff in the pull request: https://github.com/apache/drill/pull/1066#discussion_r161077516

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/AggPruleBase.java ---
@@ -82,4 +83,19 @@ protected boolean create2PhasePlan(RelOptRuleCall call, DrillAggregateRel aggreg
     }
     return true;
   }
+
+  /**
+   * Returns group-by keys with the remapped arguments for specified aggregate.
+   *
+   * @param groupSet ImmutableBitSet of aggregate rel node, whose group-by keys should be remapped.
+   * @return {@link ImmutableBitSet} instance with remapped keys.
+   */
+  public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
--- End diff --

What is the reason we are doing this remap with the new Calcite? And if the result depends only on the size of groupSet, we don't really need to iterate through the groupSet.

---
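For context, a sketch of what such a remap typically does — renumber the set group-by keys onto the dense range 0..n-1, which is also why the result depends only on the set's cardinality. This illustrates the idea and is not necessarily the exact code under review:

```
import java.util.ArrayList;
import java.util.List;
import org.apache.calcite.util.ImmutableBitSet;

public static ImmutableBitSet remapGroupSet(ImmutableBitSet groupSet) {
  // Map whatever key ordinals are set (e.g. {2, 5, 7}) onto the dense
  // range {0, 1, 2}: the output is fully determined by the set's size.
  List<Integer> newGroupSet = new ArrayList<>();
  int newKey = 0;
  for (int ignored : groupSet) {
    newGroupSet.add(newKey++);
  }
  return ImmutableBitSet.of(newGroupSet);
}
```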
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user daveoshinsky commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161069079 --- Diff: common/src/main/java/org/apache/drill/common/util/CoreDecimalUtility.java --- @@ -32,6 +32,14 @@ public static long getDecimal18FromBigDecimal(BigDecimal input, int scale, int p } public static int getMaxPrecision(TypeProtos.MinorType decimalType) { +/* --- End diff -- The two DAO comments are removed in my local merge environment. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user daveoshinsky commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161067313 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastDecimalVarchar.java --- @@ -150,6 +150,14 @@ public void setup() { public void eval() { +<#if type.from.contains("VarDecimal")> +java.math.BigDecimal bigDecimal = org.apache.drill.exec.util.DecimalUtility.getBigDecimalFromDrillBuf(in.buffer, in.start, in.end - in.start, in.scale); +String str = bigDecimal.toString(); +out.buffer = buffer; +out.start = 0; +out.end = Math.min((int)len.value, str.length()); +out.buffer.setBytes(0, str.getBytes()); --- End diff -- I take that back. Your change ensures it doesn't copy too many bytes. I'm including it in the remerged file. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user daveoshinsky commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161066659

--- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastDecimalVarchar.java ---
@@ -150,6 +150,14 @@ public void setup() {
 public void eval() {
+<#if type.from.contains("VarDecimal")>
+java.math.BigDecimal bigDecimal = org.apache.drill.exec.util.DecimalUtility.getBigDecimalFromDrillBuf(in.buffer, in.start, in.end - in.start, in.scale);
+String str = bigDecimal.toString();
+out.buffer = buffer;
+out.start = 0;
+out.end = Math.min((int)len.value, str.length());
+out.buffer.setBytes(0, str.getBytes());
--- End diff --

The Math.min use here ensures that we don't overrun the output buffer. If the string doesn't fit, it's truncated to the length available in the output buffer. Your change makes the code harder to understand than what I had, and it is not necessary (the Math.min again). I am forced to manually remerge quite a few of my changes because of the way git rebase didn't merge to master the way I had wished, so I am now working on this file. Leaving the line as I had it for now...

---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161065237 --- Diff: protocol/src/main/protobuf/UserBitShared.proto --- @@ -148,6 +148,8 @@ message SerializedField { optional int32 value_count = 4; optional int32 var_byte_length = 5; optional int32 buffer_length = 7; + optional bool is_dup = 8; + optional int32 logical_value_count = 9; --- End diff -- Just had an offline conversation with Paul and Kunal. The agreement is that we should introduce a versioning mechanism at the protocol level so that clients could advertise their version identifier; this way the server can use this information to turn on / off features based on the client capabilities. The task is only to create the preliminary versioning infrastructure; more sophistication can be added later on. ---
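A minimal sketch of the kind of server-side gating such versioning would enable; the constant, threshold, and field names are assumptions, not an agreed Drill API:

```
// Illustrative only: the server checks the version the client advertised at
// handshake before emitting new on-the-wire fields such as is_dup.
static final int MIN_RPC_VERSION_FOR_DUP_VECTORS = 6; // assumed threshold

boolean clientSupportsDupVectors(int clientRpcVersion) {
  return clientRpcVersion >= MIN_RPC_VERSION_FOR_DUP_VECTORS;
}
```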
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161040070 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -309,7 +314,7 @@ public void setInitialCapacity(final int valueCount) { if (size > MAX_ALLOCATION_SIZE) { throw new OversizedAllocationException("Requested amount of memory is more than max allowed allocation size"); } -allocationSizeInBytes = (int) size; +allocationSizeInBytes = (int)size; --- End diff -- Fixed. ---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161039122

--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -874,6 +880,46 @@ public void setSafe(int index, BigDecimal value) {
     set(index, value);
   }
+
+/**
+ * Copies the bulk input into this value vector and extends its capacity if necessary.
+ * @param input bulk input
+ */
+public void setSafe(VLBulkInput input) {
+  setSafe(input, null);
+}
+
+/**
+ * Copies the bulk input into this value vector and extends its capacity if necessary. The callback
+ * mechanism allows decoration as caller is invoked for each bulk entry.
+ *
+ * @param input bulk input
+ * @param callback a bulk input callback object (optional)
+ */
+public void setSafe(VLBulkInput input, VLBulkInput.BulkInputCallback callback) {
--- End diff --

This code is not Parquet specific. Instead, it can be triggered by any Reader which wants to load data in a bulk fashion. Vectors currently expose Mutator APIs for loading single values; I see no good reason to prevent us from passing bulk values instead of a single value at a time, which prevents code optimization. Look at the ByteBuffer APIs: they allow you to pass single byte values but also byte arrays.

---
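The ByteBuffer analogy, concretely — the bulk overload amortizes the per-call overhead (bounds checks, position updates) that the single-value path pays on every element:

```
import java.nio.ByteBuffer;

public class BulkVsSingle {
  public static void main(String[] args) {
    byte[] src = new byte[1024];
    ByteBuffer buf = ByteBuffer.allocate(2048);

    // Single-value path: one method call, bounds check, and position
    // update per byte.
    for (byte b : src) {
      buf.put(b);
    }

    buf.clear();

    // Bulk path: one call copies the whole array, amortizing the overhead.
    buf.put(src);
  }
}
```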
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161036145

--- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/util/MemoryUtils.java ---
@@ -0,0 +1,186 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.util;
+
+import java.lang.reflect.Field;
+import java.nio.ByteOrder;
+
+import sun.misc.Unsafe;
+
+/** Exposes advanced Memory Access APIs for Little-Endian / Unaligned platforms */
+@SuppressWarnings("restriction")
+public final class MemoryUtils {
+
+  // Ensure this is a little-endian hardware
+  static {
+    if (ByteOrder.nativeOrder() != ByteOrder.LITTLE_ENDIAN) {
+      throw new IllegalStateException("Drill only runs on LittleEndian systems.");
+    }
+  }
+
+  /** Java's unsafe object */
+  private static Unsafe UNSAFE;
+
+  static {
+    try {
+      Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
+      theUnsafe.setAccessible(true);
+      UNSAFE = (Unsafe) theUnsafe.get(null);
+    }
+    catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+  /** Byte arrays offset */
+  private static final long BYTE_ARRAY_OFFSET = UNSAFE.arrayBaseOffset(byte[].class);
+
+  /** Number of bytes in a long */
+  public static final int LONG_NUM_BYTES = 8;
+  /** Number of bytes in an int */
+  public static final int INT_NUM_BYTES = 4;
+  /** Number of bytes in a short */
+  public static final int SHORT_NUM_BYTES = 2;
+
+  //
+  // APIs
+  //
+
+  /**
+   * @param data source byte array
+   * @param index index within the byte array
+   * @return short value starting at data+index
+   */
+  public static short getShort(byte[] data, int index) {
--- End diff --

Please refer to my reply to Parth. I personally think the memory package should be our main abstraction (of course, along with guidelines). Putting all memory access APIs within the DrillBuf class will only clutter that class. I'd rather have a centralized memory utility class for advanced processing that does not involve a DrillBuf.

---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161045748 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -386,7 +391,7 @@ public void reAlloc() { throw new OversizedAllocationException("Unable to expand the buffer. Max allowed buffer size is reached."); } -logger.trace("Reallocating VarChar, new size {}", newAllocationSize); +logger.trace("Reallocating VarChar, new size {}",newAllocationSize); --- End diff -- Fixed. ---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161030963 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java --- @@ -449,7 +451,7 @@ public void close() throws Exception { // been created. Gracefully handle that case. if (options != null) { - options.close(); -} +options.close(); } } +} --- End diff -- The issue was with the close() method which was not properly indented. Fixed it. ---
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161032779 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableColumnReader.java --- @@ -165,17 +181,133 @@ + "Run Length: {} \t Null Run Length: {} \t readCount: {} \t writeCount: {} \t " + "recordsReadInThisIteration: {} \t valuesReadInCurrentPass: {} \t " + "totalValuesRead: {} \t readStartInBytes: {} \t readLength: {} \t pageReader.byteLength: {} \t " - + "definitionLevelsRead: {} \t pageReader.currentPageCount: {}", + + "currPageValuesProcessed: {} \t pageReader.currentPageCount: {}", recordsToReadInThisPass, runLength, nullRunLength, readCount, writeCount, recordsReadInThisIteration, valuesReadInCurrentPass, totalValuesRead, readStartInBytes, readLength, pageReader.byteLength, - definitionLevelsRead, pageReader.currentPageCount); + currPageValuesProcessed, pageReader.currentPageCount); } valueVec.getMutator().setValueCount(valuesReadInCurrentPass); } + private final void processPagesBulk(long recordsToReadInThisPass) throws IOException { +readStartInBytes = 0; +readLength = 0; +readLengthInBits = 0; +recordsReadInThisIteration = 0; +vectorData = castedBaseVector.getBuffer(); + +// values need to be spaced out where nulls appear in the column +// leaving blank space for nulls allows for random access to values +// to optimize copying data out of the buffered disk stream, runs of defined values +// are located and copied together, rather than copying individual values + +int valueCount = 0; +final int maxValuesToProcess = Math.min((int) recordsToReadInThisPass, valueVec.getValueCapacity()); + +// To handle the case where the page has been already loaded +if (pageReader.definitionLevels != null && currPageValuesProcessed == 0) { + definitionLevelWrapper.set(pageReader.definitionLevels, pageReader.currentPageCount); +} + +while (valueCount < maxValuesToProcess) { + + // read a page if needed + if (!pageReader.hasPage() || (currPageValuesProcessed == pageReader.currentPageCount)) { +if (!pageReader.next()) { + break; +} + +//New page. Reset the definition level. +currPageValuesProcessed= 0; +recordsReadInThisIteration = 0; +readStartInBytes = 0; + +// Update the Definition Level reader +definitionLevelWrapper.set(pageReader.definitionLevels, pageReader.currentPageCount); + } + + definitionLevelWrapper.readFirstIntegerIfNeeded(); + + int numNullValues = 0; + int numNonNullValues= 0; + final int remaining = maxValuesToProcess - valueCount; + int currBatchSz = Math.min(remaining, (pageReader.currentPageCount - currPageValuesProcessed)); + assert currBatchSz > 0; + + // Let's skip the next run of nulls if any ... + int idx; + for (idx = 0; idx < currBatchSz; ++idx) { +if (definitionLevelWrapper.readCurrInteger() == 1) { + break; // non-value encountered +} +definitionLevelWrapper.nextIntegerIfNotEOF(); + } + numNullValues += idx; + + // Write the nulls if any --- End diff -- This is the original logic for processing fixed precision columns (except more efficient); the intent is to figure out how many null values there are before processing non-null values. We could have avoided this for-loop if we controlled the Parquet ValuesReader API; I was tempted to do that but decided to prioritize my tasks. The good news is that I recorded all such improvement opportunities and will try to tackle them at some point. ---
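The null-run detection idea in isolation — a hedged sketch over a plain array of Parquet definition levels (for a flat nullable column, level 0 means null); the real code reads levels through Parquet's ValuesReader rather than an array:

```
// Returns the length of the run of nulls starting at 'start', capped at
// 'batchSize' values. Sketch only — variable names are illustrative.
static int leadingNullRun(int[] defLevels, int start, int batchSize) {
  int i = start;
  while (i < start + batchSize && defLevels[i] == 0) {
    i++;
  }
  return i - start;
}
```

Finding the whole run up front lets the writer reserve the null slots in one step and then bulk-copy the following run of defined values, instead of branching per value.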
[jira] [Created] (DRILL-6082) RpcExceptionHandler log doesn't print "cause" for exception
Sorabh Hamirwasia created DRILL-6082:
Summary: RpcExceptionHandler log doesn't print "cause" for exception
Key: DRILL-6082
URL: https://issues.apache.org/jira/browse/DRILL-6082
Project: Apache Drill
Issue Type: Bug
Components: Execution - RPC
Affects Versions: 1.12.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
Fix For: 1.13.0

In [RpcExceptionHandler|https://github.com/apache/drill/blob/master/exec/rpc/src/main/java/org/apache/drill/exec/rpc/RpcExceptionHandler.java#L35] we are printing the warning log with the cause, but in the wrong format. This results in the loss of the actual "cause" in the log file.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
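This is the classic SLF4J formatting pitfall; a sketch of the wrong and right forms (message text is illustrative, not the actual RpcExceptionHandler code):

```
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class RpcLogging {
  private static final Logger logger = LoggerFactory.getLogger(RpcLogging.class);

  void onException(Throwable cause) {
    // Wrong format: the throwable is consumed by the "{}" placeholder, so only
    // cause.toString() is printed and the stack trace is lost.
    logger.warn("Exception in pipeline: {}", cause);

    // Correct: with no placeholder left for it, SLF4J treats the trailing
    // throwable as the exception and logs its full stack trace.
    logger.warn("Exception in pipeline", cause);
  }
}
```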
[GitHub] drill pull request #1060: DRILL-5846: Improve parquet performance for Flat D...
Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1060#discussion_r161028252 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java --- @@ -471,8 +519,8 @@ public void close() throws Exception { container.clear(); mutator.clear(); if (currentReader != null) { - currentReader.close(); -} +currentReader.close(); + } --- End diff -- done. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161015751 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -418,6 +422,16 @@ public void get(int index, Nullable${minor.class}Holder holder){ <#switch minor.class> +<#case "VarDecimal"> +@Override +public ${friendlyType} getObject(int index) { + byte[] b = get(index); + BigInteger bi = new BigInteger(b); + BigDecimal bd = new BigDecimal(bi, getField().getScale()); + //System.out.println("VarDecimal getObject " + index + " scale " + getField().getScale() + " len " + b.length + " bd " + bd); --- End diff -- Please remove this comment ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161006318

--- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastVarCharDecimal.java ---
@@ -85,6 +85,30 @@ public void eval() {
 // Assign the scale and precision
 out.scale = (int) scale.value;
+
+<#if type.to.endsWith("VarDecimal")>
+
+// VarDecimal gets its own cast logic
+int readIndex = in.start;
+int endIndex = in.end;
+StringBuffer sb = new StringBuffer();
--- End diff --

I think it would be easier to get the string using this code:
```
byte[] buf = new byte[in.end - in.start];
buffer.getBytes(in.start, buf, 0, in.end - in.start);
String s = new String(buf, Charsets.UTF_8);
```

---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161004122 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastIntDecimal.java --- @@ -68,15 +68,31 @@ public void setup() { public void eval() { out.scale = (int) scale.value; + +<#if !type.to.endsWith("VarDecimal")> out.precision = (int) precision.value; + -<#if type.to == "Decimal9" || type.to == "Decimal18"> +<#if type.to.endsWith("VarDecimal")> +out.start = 0; +out.buffer = buffer; +String s = Long.toString((long)in.value); +for (int i = 0; i < out.scale; ++i) { // add 0's to get unscaled integer +s += "0"; +} +java.math.BigInteger bi = new java.math.BigInteger(s); +java.math.BigDecimal bd = new java.math.BigDecimal(bi, out.scale); +//System.out.println("CastIntDecimal in " + in.value + " scale " + out.scale + " string " + s + " bd " + bd); --- End diff -- Please remove this comment. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160963518 --- Diff: exec/vector/src/main/codegen/templates/NullableValueVectors.java --- @@ -327,13 +327,17 @@ public Mutator getMutator(){ return v; } - public void copyFrom(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){ + protected void copyFromUnsafe(int fromIndex, int thisIndex, Nullable${minor.class}Vector from){ --- End diff -- Since we already have `copyFromSafe()` method in this class, `copyFrom()` is supposed to be unsafe. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160941530

--- Diff: exec/java-exec/src/main/codegen/templates/SqlAccessors.java ---
@@ -127,6 +127,25 @@ public String getString(int index) {
 }
 <#break>
+<#case "VarDecimal">
+
+@Override
+public String getString(int index){
+<#if mode=="Nullable">
+if(ac.isNull(index)){
+  return null;
+}
+
+try {
--- End diff --

I think it would be better to just throw the exception. Then it is easier to detect the case where data was stored in the vector incorrectly, or, if it came from stored data, to inform the user that the data was corrupted.

---
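A sketch of the suggested direction — fail loudly rather than swallow the error; the accessor call and exception type are illustrative, not the generated SqlAccessors code:

```
@Override
public String getString(int index) {
  try {
    return ac.getObject(index).toString();
  } catch (RuntimeException e) {
    // Surface the problem: either the vector was populated incorrectly,
    // or the stored data itself is corrupted.
    throw new IllegalStateException("Unreadable VarDecimal value at index " + index, e);
  }
}
```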
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160980556

--- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java ---
@@ -539,7 +553,12 @@ public void setValueLengthSafe(int index, int length) {
 }
-public void setSafe(int index, int start, int end, DrillBuf buffer){
+<#if type.minor == "VarDecimal">
--- End diff --

Sorry, perhaps I made a mistake; yes, it looks OK. But do we need to pass `scale` for `VarDecimal` even if we don't use it in this method? I suppose a change should be made in `ComplexWriters.java` to avoid using this method with the additional argument.

---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160987506 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/CastDecimalVarchar.java --- @@ -150,6 +150,14 @@ public void setup() { public void eval() { +<#if type.from.contains("VarDecimal")> +java.math.BigDecimal bigDecimal = org.apache.drill.exec.util.DecimalUtility.getBigDecimalFromDrillBuf(in.buffer, in.start, in.end - in.start, in.scale); +String str = bigDecimal.toString(); +out.buffer = buffer; +out.start = 0; +out.end = Math.min((int)len.value, str.length()); +out.buffer.setBytes(0, str.getBytes()); --- End diff -- I guess we should do the same thing as for other decimals: ``` out.buffer.setBytes(0, String.valueOf(str.substring(0, out.end)).getBytes()); ``` ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160939410 --- Diff: exec/java-exec/src/main/codegen/templates/Decimal/DecimalFunctions.java --- @@ -102,7 +111,578 @@ <#-- For each DECIMAL... type (in DecimalTypes.tdd) ... --> <#list comparisonTypesDecimal.decimalTypes as type> -<#if type.name.endsWith("Sparse")> +<#if type.name.endsWith("VarDecimal")> + +<@pp.changeOutputFile name="/org/apache/drill/exec/expr/fn/impl/${type.name}Functions.java" /> + +<#include "/@includes/license.ftl" /> + +package org.apache.drill.exec.expr.fn.impl; + +<#include "/@includes/vv_imports.ftl" /> + +import org.apache.drill.exec.expr.DrillSimpleFunc; +import org.apache.drill.exec.expr.annotations.FunctionTemplate; +import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling; +import org.apache.drill.exec.expr.annotations.Output; +import org.apache.drill.exec.expr.annotations.Param; +import org.apache.drill.exec.expr.annotations.Workspace; +import org.apache.drill.exec.expr.fn.FunctionGenerationHelper; +import org.apache.drill.exec.expr.holders.*; +import org.apache.drill.exec.record.RecordBatch; + +import io.netty.buffer.ByteBuf; +import io.netty.buffer.DrillBuf; + +import java.nio.ByteBuffer; + +@SuppressWarnings("unused") +public class ${type.name}Functions { +private static void initBuffer(DrillBuf buffer) { +// for VarDecimal, this method of setting initial size is actually only a very rough heuristic. +int size = (${type.storage} * (org.apache.drill.exec.util.DecimalUtility.INTEGER_SIZE)); +buffer = buffer.reallocIfNeeded(size); + } + +@FunctionTemplate(name = "subtract", scope = FunctionTemplate.FunctionScope.DECIMAL_ADD_SCALE, nulls = NullHandling.NULL_IF_NULL) --- End diff -- By default `checkPrecision` is false, so it is not required to specify it explicitly in this case. ---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r160958360

--- Diff: exec/vector/src/main/codegen/templates/ComplexWriters.java ---
@@ -99,7 +99,7 @@ public void write(Nullable${minor.class?cap_first}Holder h) {
 <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
 public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
-mutator.addSafe(idx(), <#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+mutator.addSafe(idx(), <#list fields as field><#if field.name == "scale"><#break></#if>${field.name}<#if field_has_next && fields[field_index+1].name != "scale">, </#if></#list>);
--- End diff --

I meant to replace this line with something like this:
```
<#if minor.class == "VarDecimal">
mutator.addSafe(idx()<#list fields as field><#if field.include!true>, ${field.name}</#if></#list>); // or something better
<#else>
mutator.addSafe(idx(), <#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
</#if>
```
Since `field.include` is used only for Decimal types, we can replace all this code by this line:
```
mutator.addSafe(idx()<#list fields as field><#if field.include!true>, ${field.name}</#if></#list>);
```
But do we use this method for VarDecimal somewhere? If not, it would be better to add the VarDecimal `minor.class` to the if condition with the other decimal types above. BTW, please modify similar changes below in this class and in `RepeatedValueVectors.java`.

---
[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/570#discussion_r161002913

--- Diff: common/src/main/java/org/apache/drill/common/util/CoreDecimalUtility.java ---
@@ -32,6 +32,14 @@ public static long getDecimal18FromBigDecimal(BigDecimal input, int scale, int p
 }
 public static int getMaxPrecision(TypeProtos.MinorType decimalType) {
+/*
--- End diff --

This method is used in `TypeCastRules.getCost()` and in `CoreDecimalUtility.getPrecisionRange()`. Regarding its usage in `TypeCastRules.getCost()`, we may return `RelDataTypeSystem.getMaxNumericPrecision()` for VarDecimal. But `getPrecisionRange()` is used in the DecimalScalePrecisionFunction classes for calculating the resulting precision. I suppose these methods should be rewritten in the same way as it is implemented in Calcite's `RelDataTypeFactoryImpl`, but I think it would be better to make those changes in a separate Jira. For now, please remove these comments.

---
[GitHub] drill issue #1074: DRILL-6025: Display execution time of a query in RUNNING ...
Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1074 +1, LGTM. ---
[GitHub] drill pull request #1074: DRILL-6025: Display execution time of a query in R...
Github user prasadns14 commented on a diff in the pull request: https://github.com/apache/drill/pull/1074#discussion_r161012925 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java --- @@ -211,16 +211,25 @@ public String getExecutionDuration() { return NOT_AVAILABLE_LABEL; } +long queryEndTime; + +// Check if State is RUNNING, set end time to current time +if (profile.getState() == QueryState.RUNNING) { --- End diff -- @arina-ielchiieva, thank you for catching the mistake. Fixed it ---
Odd Error on Login with 1.13-SNAPSHOT
I am probably missing something minor here, but I am working with Ted Dunning on some PCAP plugin stuff, so I built his 1.13 SNAPSHOT, and when I try to log in I see

{ "errorMessage" : "No configuration setting found for key 'drill.exec.http.auth'" }

I am guessing that something was added that I need to fill out in my config? Is there a JIRA or something that can guide me on this? Thanks
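The missing key corresponds to Drill's web-server authentication settings; a hedged guess at the drill-override.conf shape, with key names inferred from the error message and Drill's 1.13 HTTP auth work (verify against the commit that introduced the setting):

```
# drill-override.conf — sketch only; the exact keys and values are a best guess.
drill.exec.http.auth: {
  mechanisms: ["FORM"]
}
```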
[GitHub] drill pull request #1088: DRILL-6081: Set query end time before writing fina...
GitHub user arina-ielchiieva opened a pull request: https://github.com/apache/drill/pull/1088

DRILL-6081: Set query end time before writing final profile

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arina-ielchiieva/drill DRILL-6081

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1088.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1088

commit e60baf9dc194853090173c13ac959f40b938032a
Author: Arina Ielchiieva
Date: 2018-01-11T15:40:04Z
DRILL-6081: Set query end time before writing final profile

---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160996060 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSelection.java --- @@ -424,6 +429,23 @@ public MetadataContext getMetaContext() { return metaContext; } + /** + * @return true if this file selectionRoot points to an empty directory, false otherwise + */ + public boolean isEmptyDirectory() { +return emptyDirectory; + } + + /** + * Setting this as true allows to identify this as empty directory file selection + * + * @param emptyDirectory empty directory flag + */ + public void setEmptyDirectory(boolean emptyDirectory) { --- End diff -- Done ---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160997779

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java ---
@@ -250,20 +250,12 @@ private boolean metaDataFileExists(FileSystem fs, FileStatus dir) throws IOExcep
 }
 boolean isDirReadable(DrillFileSystem fs, FileStatus dir) {
-  Path p = new Path(dir.getPath(), ParquetFileWriter.PARQUET_METADATA_FILE);
   try {
-    if (fs.exists(p)) {
-      return true;
-    } else {
-      if (metaDataFileExists(fs, dir)) {
-        return true;
-      }
-      List statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
-      return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
-    }
+    // There should be at least one file, which is readable by Drill
+    List statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
+    return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
--- End diff --

I have reverted my changes here. A SchemalessScan will be created if the ParquetGroupScan has empty entries and can't be used. The fact that the parquet metadata files are invalid will be logged.

---
[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/1083#discussion_r160936248 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/SchemalessScan.java --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to you under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.physical.base; + +import com.fasterxml.jackson.annotation.JsonTypeName; +import com.google.common.base.Preconditions; +import org.apache.drill.common.exceptions.ExecutionSetupException; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.exec.physical.PhysicalOperatorSetupException; +import org.apache.drill.exec.planner.logical.DynamicDrillTable; +import org.apache.drill.exec.proto.CoordinationProtos; + +import java.util.List; + +/** + * The type of scan operator, which allows to scan schemaless tables ({@link DynamicDrillTable} with null selection) + */ +@JsonTypeName("schemaless-scan") +public class SchemalessScan extends AbstractGroupScan implements SubScan { + + private String selectionRoot; --- End diff -- Done ---
[jira] [Created] (DRILL-6081) Duration of completed queries is continuously increasing.
Anton Gozhiy created DRILL-6081:
Summary: Duration of completed queries is continuously increasing.
Key: DRILL-6081
URL: https://issues.apache.org/jira/browse/DRILL-6081
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.13.0
Reporter: Anton Gozhiy

Steps:
1. Execute some query (for example "select * from sys.version")
2. Go to the profiles page: http://node1:8047/profiles
3. Open the details page for the query
4. Expand the duration section

Expected result: The duration should be adequate and shouldn't change after the page is reloaded.
Actual result: The duration is continuously increasing (reload the page to notice this):

||Planning||Queued||Execution||Total||
|0.092 sec|0.012 sec|59 min 36.487 sec|59 min 36.591 sec|

Workaround: none
Note: The issue was introduced by the following fix: https://issues.apache.org/jira/browse/DRILL-5963

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1074: DRILL-6025: Display execution time of a query in R...
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1074#discussion_r160982037

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileWrapper.java ---
@@ -211,16 +211,25 @@ public String getExecutionDuration() {
   return NOT_AVAILABLE_LABEL;
 }
+long queryEndTime;
+
+// Check if State is RUNNING, set end time to current time
+if (profile.getState() == QueryState.RUNNING) {
--- End diff --

@prasadns14 I think that after the rebase you have lost one of the initial changes. Since you did not remove the check for `profile.getState() == QueryState.RUNNING` in the previous if statement, this if statement will never be executed.

---
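For context, the corrected flow under discussion, in sketch form (the accessors are illustrative):

```
// Sketch: a RUNNING query has no stored end time yet, so fall back to "now";
// a completed query uses its recorded end time, so the displayed duration
// stops growing on every page reload (the DRILL-6081 symptom).
long queryEndTime = (profile.getState() == QueryState.RUNNING)
    ? System.currentTimeMillis()
    : profile.getEnd();
long totalDurationMillis = queryEndTime - profile.getStart();
```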
[GitHub] drill issue #1078: DRILL-6054: don't try to split the filter when it is not ...
Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1078 +1, LGTM. ---
[GitHub] drill issue #1074: DRILL-6025: Display execution time of a query in RUNNING ...
Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1074 Thanks, +1. ---