[jira] [Commented] (DRILL-6611) Add [meta]-[Enter] js handler for query form submission
[ https://issues.apache.org/jira/browse/DRILL-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555164#comment-16555164 ]

ASF GitHub Bot commented on DRILL-6611:
---

kkhatua commented on issue #1392: Implements DRILL-6611 to enable meta-enter query submission in web query interface
URL: https://github.com/apache/drill/pull/1392#issuecomment-407639822

@hrbrmstr can you also change the PR's title to
**DRILL-6611: Enable meta-enter query submission in web query interface**
This will allow the Apache JIRA system to pick up the PR and link to it automatically.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add [meta]-[Enter] js handler for query form submission
> ---
>
> Key: DRILL-6611
> URL: https://issues.apache.org/jira/browse/DRILL-6611
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.14.0
> Reporter: Bob Rudis
> Assignee: Bob Rudis
> Priority: Minor
> Labels: doc-impacting
> Fix For: 1.15.0
>
>
> The new ACE-based SQL query editor is great. Being able to submit the form
> without using a mouse would be even better.
> Adding:
>
> {noformat}
> document.getElementById('queryForm')
>   .addEventListener('keydown', function(e) {
>     if (!(e.keyCode == 13 && e.metaKey)) return;
>     if (e.target.form) doSubmitQueryWithUserName();
>   });
> {noformat}
> {{to ./exec/java-exec/src/main/resources/rest/query/query.ftl adds such support.}}
> I can file a PR with the code if desired.
> --
> Functionality (for the documentation):
> This JIRA's commit introduces the following to Drill:
> When composing queries in the web query editor, it is now possible to submit
> the query text by using the {{Meta+Enter}} key combination. This triggers
> the same action as pressing the {{Submit}} button. On Mac keyboards,
> {{Meta+Enter}} is {{Cmd+Enter}}. On Windows or Linux it is {{Ctrl+Enter}}, though
> Linux users may have mapped the {{Meta}} key to another physical keyboard key.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555113#comment-16555113 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on issue #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#issuecomment-407627190

@amansinha100 thanks for your valuable review. Since I am on vacation, I will respond to the other comments and push updates later.

> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
> Issue Type: New Feature
> Components: Server, Execution - Flow
> Affects Versions: 1.14.0
> Reporter: weijie.tong
> Assignee: weijie.tong
> Priority: Major
>
> This feature adds support for JPPD (Join Predicate Push Down). It will
> improve HashJoin and Broadcast HashJoin performance by reducing the number
> of rows sent across the network and the memory consumed. Impala already
> supports this feature, which it calls RuntimeFilter
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
> The first PR will push down a bloom filter from the HashJoin node to
> Parquet’s scan node. The proposed basic procedure is as follows:
> 1. The HashJoin build side accumulates the rows of the equality join condition
> to construct a bloom filter, then sends the bloom filter to the foreman node.
> 2. The foreman node passively accepts the bloom filters from all the fragments
> that have the HashJoin operator. It then aggregates the bloom filters to form
> a global bloom filter.
> 3. The foreman node broadcasts the global bloom filter to all the probe-side
> scan nodes, which may already have sent partial data to the hash join
> nodes (currently the hash join node prefetches one batch from both sides).
> 4. The scan node accepts the global bloom filter from the foreman node.
> It then filters the remaining rows, keeping only those that satisfy the bloom filter.
>
> To implement the above execution flow, the main new concepts are described below:
> 1. RuntimeFilter
> A filter container which may contain a BloomFilter or a MinMaxFilter.
> 2. RuntimeFilterReporter
> It wraps the logic that sends the hash join’s bloom filter to the foreman. The
> serialized bloom filter is sent through the data tunnel. This object
> is instantiated by the FragmentExecutor and passed to the
> FragmentContext, so the HashJoin operator can obtain it through the
> FragmentContext.
> 3. RuntimeFilterRequestHandler
> It is responsible for accepting a SendRuntimeFilterRequest RPC and extracting the
> actual BloomFilter from the network. It then passes this filter to the
> WorkerBee’s new interface method registerRuntimeFilter.
> Another RPC type is BroadcastRuntimeFilterRequest. Its handler registers the
> accepted global bloom filter with the WorkerBee through the registerRuntimeFilter
> method and then propagates it to the FragmentContext, through which the probe-side
> scan node can fetch the aggregated bloom filter.
> 4. RuntimeFilterManager
> The foreman instantiates a RuntimeFilterManager, which indirectly receives
> every RuntimeFilter through the WorkerBee. Once all the BloomFilters have been
> accepted and aggregated, it broadcasts the aggregated bloom filter to
> all the probe-side scan nodes through the data tunnel with a
> BroadcastRuntimeFilterRequest RPC.
> 5. RuntimeFilterEnableOption
> A global option will be added to decide whether to enable this new feature.
>
> Suggestions and advice are welcome. The related PR will be presented as
> soon as possible.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
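The four-step procedure in the DRILL-6385 description above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code from PR #1334: `SimpleBloomFilter`, `JppdFlowSketch`, the filter size, the hash scheme, and the choice of a bitwise OR as the aggregation step are all hypothetical, and the RPC transport between fragments and the foreman is replaced by plain method calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy bloom filter; a stand-in for illustration, not Drill's BloomFilter class. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Position of the i-th probe, derived from two base hashes (double hashing). */
    private int position(long key, int i) {
        int h1 = Long.hashCode(key);
        int h2 = Long.hashCode(key * 0x9E3779B97F4A7C15L + 1);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(long key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** False means "definitely absent"; true means "possibly present". */
    boolean mightContain(long key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Bitwise OR of two filters that share the same parameters. */
    void merge(SimpleBloomFilter other) {
        if (other.numBits != numBits || other.numHashes != numHashes) {
            throw new IllegalArgumentException("incompatible filters");
        }
        bits.or(other.bits);
    }
}

final class JppdFlowSketch {
    static final int NUM_BITS = 1 << 12;   // assumed size, chosen arbitrarily
    static final int NUM_HASHES = 3;       // assumed hash count

    /** Step 1: one build-side fragment turns its join keys into a partial filter. */
    static SimpleBloomFilter buildSideFilter(long[] buildKeys) {
        SimpleBloomFilter filter = new SimpleBloomFilter(NUM_BITS, NUM_HASHES);
        for (long key : buildKeys) {
            filter.add(key);
        }
        return filter;
    }

    /** Step 2: the foreman merges the partial filters into a global one (assumed: union). */
    static SimpleBloomFilter aggregate(List<SimpleBloomFilter> partials) {
        SimpleBloomFilter global = new SimpleBloomFilter(NUM_BITS, NUM_HASHES);
        for (SimpleBloomFilter partial : partials) {
            global.merge(partial);
        }
        return global;
    }

    /** Steps 3-4: a probe-side scan keeps only rows whose key may match the build side. */
    static List<Long> probeSideScan(long[] probeKeys, SimpleBloomFilter global) {
        List<Long> survivors = new ArrayList<>();
        for (long key : probeKeys) {
            if (global.mightContain(key)) {
                survivors.add(key);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        SimpleBloomFilter f1 = buildSideFilter(new long[] {1, 2, 3});
        SimpleBloomFilter f2 = buildSideFilter(new long[] {10, 20});
        SimpleBloomFilter global = aggregate(Arrays.asList(f1, f2));
        System.out.println(probeSideScan(new long[] {2, 5, 10, 99}, global));
    }
}
```

Because a bloom filter never reports a false negative, every probe row that has a join partner survives the scan; rows without a partner are usually, but not always, dropped, so the HashJoin still has to evaluate the join condition itself.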
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555105#comment-16555105 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r204974446

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/BloomFilterCreator.java ##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+
+public class BloomFilterCreator {

Review comment:
The current implementation uses one bloom filter per join column, so a multi-column join like the one in your example generates two bloom filters. The reason for this design was to read a single column vector's memory with a single hash computation. However, the Murmur hash's complex computation ate up the pipeline performance, so the resulting performance was not very good. I will therefore change it to the implementation you suggested.

This is an automated message from the Apache Git Service.
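The review comment above contrasts one bloom filter per join column with a different layout for multi-column joins. A minimal sketch of why the two differ in selectivity follows, assuming the alternative is a single filter keyed on all join columns combined (the reviewer's actual suggestion is not quoted in this message). Exact `HashSet`s stand in for bloom filters so that the outcome is deterministic, and every name here is hypothetical rather than taken from the PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical comparison of per-column filters and one combined-key filter. */
final class MultiColumnFilterSketch {
    // Per-column layout: one set (stand-in for a bloom filter) per join column.
    final Set<Object> firstColumn = new HashSet<>();
    final Set<Object> secondColumn = new HashSet<>();
    // Combined layout: one set keyed on the pair of join column values.
    final Set<List<Object>> combined = new HashSet<>();

    /** Record one build-side row of a two-column equality join. */
    void addBuildRow(Object a, Object b) {
        firstColumn.add(a);
        secondColumn.add(b);
        combined.add(Arrays.asList(a, b));
    }

    /** Passes when each column value was seen somewhere, not necessarily in the same row. */
    boolean passesPerColumn(Object a, Object b) {
        return firstColumn.contains(a) && secondColumn.contains(b);
    }

    /** Passes only when this exact pair of values was seen in one build-side row. */
    boolean passesCombined(Object a, Object b) {
        return combined.contains(Arrays.asList(a, b));
    }

    public static void main(String[] args) {
        MultiColumnFilterSketch sketch = new MultiColumnFilterSketch();
        sketch.addBuildRow(1, "a");
        sketch.addBuildRow(2, "b");
        // The probe row (1, "b") has no join partner on the build side.
        System.out.println(sketch.passesPerColumn(1, "b")); // true: both values exist separately
        System.out.println(sketch.passesCombined(1, "b"));  // false: the pair never occurred
    }
}
```

With real bloom filters both layouts can additionally report false positives, but only the per-column layout lets through rows whose values match different build rows.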
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555102#comment-16555102 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r204973381

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java ##
@@ -0,0 +1,586 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.filter;
+
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.JoinInfo;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.metadata.RelMetadataQuery;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.AccountingDataTunnel;
+import org.apache.drill.exec.ops.Consumer;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.ops.SendingAccountor;
+import org.apache.drill.exec.ops.StatusHandler;
+import org.apache.drill.exec.physical.PhysicalPlan;
+
+import org.apache.drill.exec.physical.base.AbstractPhysicalVisitor;
+import org.apache.drill.exec.physical.base.Exchange;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.config.BroadcastExchange;
+import org.apache.drill.exec.physical.config.HashJoinPOP;
+import org.apache.drill.exec.planner.fragment.Fragment;
+import org.apache.drill.exec.planner.fragment.Wrapper;
+import org.apache.drill.exec.planner.physical.HashJoinPrel;
+import org.apache.drill.exec.planner.physical.ScanPrel;
+import org.apache.drill.exec.proto.BitData;
+import org.apache.drill.exec.proto.CoordinationProtos;
+import org.apache.drill.exec.proto.GeneralRPCProtos;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.rpc.RpcException;
+import org.apache.drill.exec.rpc.RpcOutcomeListener;
+import org.apache.drill.exec.rpc.data.DataTunnel;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.util.Pointer;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * This class traverses the physical operator tree to find the HashJoin operator
+ * for which is JPPD (join predicate push down) is possible. The prerequisite to do JPPD
+ * is:
+ * 1. The join condition is equality
+ * 2. The physical join node is a HashJoin one
+ * 3. The probe side children of the HashJoin node should not contain a blocking operator like HashAgg
+ */
+public class RuntimeFilterManager {
+
+  private Wrapper rootWrapper;
+  //HashJoin node's major fragment id to its corresponding probe side nodes's endpoints
+  private Map<Integer, List<CoordinationProtos.DrillbitEndpoint>> joinMjId2probdeScanEps = new HashMap<>();
+  //HashJoin node's major fragment id to its corresponding probe side nodes's number
+  private Map<Integer, Integer> joinMjId2scanSize = new ConcurrentHashMap<>();
+  //HashJoin node's major fragment id to its corresponding probe side scan node's belonging major fragment id
+  private Map<Integer, Integer> joinMjId2ScanMjId = new HashMap<>();
+
+  private RuntimeFilterWritable aggregatedRuntimeFilter;
+
+  private DrillbitContext drillbitContext;
+
+  private SendingAccountor sendingAccountor = new SendingAccountor();
+
+  private String lineSeparator;
+
+  private static final Logger logger = LoggerFactory.getLogger(RuntimeFilterManager.class);
+
+  /**
+   * This class maintains context for the runtime join push down's filter management. It
+   * does a traversal of the physical operators by leveraging the root wrapper which indirectly
+   * holds the global PhysicalOperator tree a
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555101#comment-16555101 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r204973345

## File path: exec/java-exec/src/main/resources/drill-module.conf ##
@@ -455,6 +455,8 @@ drill.exec.options: {
     exec.hashjoin.num_partitions: 32,
     exec.hashjoin.num_rows_in_batch: 1024,
     exec.hashjoin.max_batches_in_memory: 0,
+    exec.hashjoin.enable.runtime_filter: true,

Review comment:
agree

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555100#comment-16555100 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r204973201

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java ##
@@ -0,0 +1,586 @@
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555031#comment-16555031 ]

ASF GitHub Bot commented on DRILL-6385:
---

weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#discussion_r204965485

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/RuntimeFilterManager.java ##
@@ -0,0 +1,586 @@
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554957#comment-16554957 ]

ASF GitHub Bot commented on DRILL-6589:
---

gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects
URL: https://github.com/apache/drill/pull/1372#discussion_r204952967

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/RuleInstance.java ##
@@ -140,4 +145,14 @@
   SubQueryRemoveRule SUB_QUERY_JOIN_REMOVE_RULE =
       new SubQueryRemoveRule.SubQueryJoinRemoveRule(DrillRelFactories.LOGICAL_BUILDER);
+
+  FilterAggregateTransposeRule DRILL_FILTER_AGGREGATE_TRANSPOSE_RULE =

Review comment:
Done.

> Push transitive closure generated predicates past aggregates/projects
> ---
>
> Key: DRILL-6589
> URL: https://issues.apache.org/jira/browse/DRILL-6589
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Gautam Kumar Parai
> Assignee: Gautam Kumar Parai
> Priority: Major
> Fix For: 1.15.0
>
>
> Here is a sample query that may benefit from this optimization:
> SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2);
> Here the transitive predicate a2 = 5 would be pushed past the aggregate due
> to this optimization.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
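The sample query in the DRILL-6589 description relies on transitive inference: from `a1 = 5` and the equality `a1 = a2` implied by the `IN` subquery, the planner can derive `a2 = 5`. The following is a minimal sketch of that inference step only. It assumes predicates are modeled as plain strings and integers, and `TransitivePredicateSketch` is a hypothetical name; it does not use Calcite's or Drill's planner classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of transitive-closure predicate inference. */
final class TransitivePredicateSketch {

    /**
     * Given column equalities (pairs of column names) and known "column = constant"
     * predicates, derive every additional "column = constant" predicate that follows.
     */
    static Map<String, Integer> infer(List<String[]> equalities, Map<String, Integer> constants) {
        Map<String, Integer> known = new TreeMap<>(constants);
        boolean changed = true;
        // Repeat until no new predicate appears, so chains like a = b, b = c are covered.
        while (changed) {
            changed = false;
            for (String[] eq : equalities) {
                changed |= propagate(known, eq[0], eq[1]);
                changed |= propagate(known, eq[1], eq[0]);
            }
        }
        return known;
    }

    /** Copy the constant bound to {@code from} onto {@code to} if {@code to} has none yet. */
    private static boolean propagate(Map<String, Integer> known, String from, String to) {
        Integer value = known.get(from);
        if (value == null || known.containsKey(to)) {
            return false;
        }
        known.put(to, value);
        return true;
    }

    public static void main(String[] args) {
        // T1.a1 IN (SELECT a2 FROM T2) implies the join equality T1.a1 = T2.a2.
        List<String[]> equalities = Arrays.asList(new String[][] {{"T1.a1", "T2.a2"}});
        Map<String, Integer> constants = Collections.singletonMap("T1.a1", 5);
        System.out.println(infer(equalities, constants));
    }
}
```

Once `T2.a2 = 5` is known, it is a filter on the subquery side of the join; the optimization described in the issue is about moving that filter below the aggregate or project that sits between the join and the scan of T2.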
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554958#comment-16554958 ]

ASF GitHub Bot commented on DRILL-6589:
---

gparai commented on issue #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects
URL: https://github.com/apache/drill/pull/1372#issuecomment-407595809

@vdiravka thanks for the review. I have addressed your review comments. Please take a look.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554955#comment-16554955 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204952905 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushFilterPastProjectRule.java ## @@ -50,9 +54,12 @@ } private DrillPushFilterPastProjectRule(RelBuilderFactory relBuilderFactory) { -super(operand(LogicalFilter.class, operand(LogicalProject.class, any())), relBuilderFactory,null); +super(operand(LogicalFilter.class, operand(LogicalProject.class, any())), relBuilderFactory, null); } + private DrillPushFilterPastProjectRule(RelBuilderFactory relBuilderFactory, boolean forDrill) { Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554953#comment-16554953 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204952377 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/RuleInstance.java ## @@ -140,4 +145,14 @@ SubQueryRemoveRule SUB_QUERY_JOIN_REMOVE_RULE = new SubQueryRemoveRule.SubQueryJoinRemoveRule(DrillRelFactories.LOGICAL_BUILDER); + + FilterAggregateTransposeRule DRILL_FILTER_AGGREGATE_TRANSPOSE_RULE = + new FilterAggregateTransposeRule(Filter.class, + DrillRelBuilder.proto(DrillRelFactories.DRILL_LOGICAL_FILTER_FACTORY, + DrillRelFactories.DRILL_LOGICAL_AGGREGATE_FACTORY), Aggregate.class); + + FilterProjectTransposeRule DRILL_FILTER_PROJECT_TRANSPOSE_RULE = Review comment: Removed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554945#comment-16554945 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204951832 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java ## @@ -122,4 +127,16 @@ public RelNode createJoin(RelNode left, RelNode right, } } + private static class DrillAggregateFactoryImpl implements RelFactories.AggregateFactory { + +@Override +public RelNode createAggregate(RelNode input, boolean indicator, ImmutableBitSet groupSet, + ImmutableList groupSets, List aggCalls) { + try { +return new DrillAggregateRel(input.getCluster(), input.getTraitSet(), input, indicator, groupSet, groupSets, aggCalls); + } catch (InvalidRelException ex) { Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554944#comment-16554944 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204951732 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java ## @@ -122,4 +127,16 @@ public RelNode createJoin(RelNode left, RelNode right, } } + private static class DrillAggregateFactoryImpl implements RelFactories.AggregateFactory { Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554942#comment-16554942 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204951494 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunctions.java ## @@ -732,6 +732,25 @@ public void testPushFilterInExprPastAgg() throws Exception { .build().run(); } + @Test + public void testTransitiveFilterPushPastAgg() throws Exception { Review comment: Moved testcase. I decided to remove the push filter past project rule from TC. It was causing too many side-effects (plan patterns breaking etc.). Moreover, it may not be very useful from a cost perspective. We can re-introduce it if it were to be applied in a cost-based manner (via Volcano planner). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates/projects
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554943#comment-16554943 ] ASF GitHub Bot commented on DRILL-6589: --- gparai commented on a change in pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#discussion_r204951637 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java ## @@ -92,7 +97,7 @@ public RelNode createProject(RelNode child, /** * Implementation of {@link RelFactories.FilterFactory} that - * returns a vanilla {@link LogicalFilter}. + * returns a vanilla LogicalFilter. Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates/projects > - > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
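The sample query quoted in DRILL-6589 pairs `a1 = 5` with `a1 IN (SELECT a2 FROM T2)`, which implies `a2 = 5`. The inference itself can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `TransitivePredicateSketch` and its `infer` method are hypothetical names, predicates are modelled as plain column-to-constant and column-to-column equalities, and none of this is the planner code touched by the pull request.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Drill or Calcite code.
final class TransitivePredicateSketch {

    /**
     * Given "column = constant" predicates and "column = column" equalities,
     * returns every "column = constant" predicate that follows transitively.
     */
    public static Map<String, Integer> infer(Map<String, Integer> constants,
                                             List<String[]> equalities) {
        Map<String, Integer> known = new TreeMap<>(constants);
        boolean changed = true;
        while (changed) {              // repeat until no new predicate is derived
            changed = false;
            for (String[] eq : equalities) {
                changed |= propagate(known, eq[0], eq[1]);
                changed |= propagate(known, eq[1], eq[0]);
            }
        }
        return known;
    }

    // Copies the constant bound to `from` onto `to` if `to` has none yet.
    private static boolean propagate(Map<String, Integer> known, String from, String to) {
        Integer value = known.get(from);
        if (value == null || known.containsKey(to)) {
            return false;
        }
        known.put(to, value);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> constants = new HashMap<>();
        constants.put("a1", 5);                                      // WHERE a1 = 5
        List<String[]> equalities =
            Arrays.asList(new String[][] {{"a1", "a2"}});            // a1 IN (SELECT a2 ...)
        System.out.println(infer(constants, equalities));            // prints {a1=5, a2=5}
    }
}
```

The derived `a2 = 5` is the predicate that the issue proposes to push past the aggregate on the `T2` side.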
[jira] [Commented] (DRILL-6632) drill-jdbc-all jar size limit too small for release build
[ https://issues.apache.org/jira/browse/DRILL-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554841#comment-16554841 ] ASF GitHub Bot commented on DRILL-6632: --- Ben-Zvi closed pull request #1396: DRILL-6632: Increase jdbc-all jar size limit to 3650 URL: https://github.com/apache/drill/pull/1396 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/jdbc-all/pom.xml b/exec/jdbc-all/pom.xml index f7af5110e50..983a98f4e2a 100644 --- a/exec/jdbc-all/pom.xml +++ b/exec/jdbc-all/pom.xml @@ -506,7 +506,7 @@ This is likely due to you adding new dependencies to a java-exec and not updating the excludes in this module. This is important as it minimizes the size of the dependency of Drill application users. - 3600 + 3650 1500 ${project.build.directory}/drill-jdbc-all-${project.version}.jar This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > drill-jdbc-all jar size limit too small for release build > - > > Key: DRILL-6632 > URL: https://issues.apache.org/jira/browse/DRILL-6632 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.14.0 >Reporter: Boaz Ben-Zvi >Assignee: Boaz Ben-Zvi >Priority: Blocker > Fix For: 1.14.0 > > > Among the changes for DRILL-6294, the limit for the drill-jdbc-all jar file > size was increased to 3600, about what was needed to accommodate the new > Calcite version. 
> However a Release build requires a slightly larger size (probably due to > adding several of those > *org.codehaus.plexus.compiler.javac.JavacCompiler6931842185404907145arguments*). > Proposed Fix: Increase the size limit to 36,500,000
[jira] [Created] (DRILL-6632) drill-jdbc-all jar size limit too small for release build
Boaz Ben-Zvi created DRILL-6632: --- Summary: drill-jdbc-all jar size limit too small for release build Key: DRILL-6632 URL: https://issues.apache.org/jira/browse/DRILL-6632 Project: Apache Drill Issue Type: Bug Components: Tools, Build & Test Affects Versions: 1.14.0 Reporter: Boaz Ben-Zvi Assignee: Boaz Ben-Zvi Fix For: 1.14.0 Among the changes for DRILL-6294, the limit for the drill-jdbc-all jar file size was increased to 3600, about what was needed to accommodate the new Calcite version. However a Release build requires a slightly larger size (probably due to adding several of those *org.codehaus.plexus.compiler.javac.JavacCompiler6931842185404907145arguments*). Proposed Fix: Increase the size limit to 36,500,000
[jira] [Commented] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
[ https://issues.apache.org/jira/browse/DRILL-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554825#comment-16554825 ] ASF GitHub Bot commented on DRILL-6629: --- ppadma commented on a change in pull request #1395: DRILL-6629 BitVector split and transfer does not work correctly for transfer length < 8 URL: https://github.com/apache/drill/pull/1395#discussion_r204918913 ## File path: exec/vector/src/main/java/org/apache/drill/exec/vector/BitVector.java ## @@ -323,7 +323,8 @@ public void splitAndTransferTo(int startIndex, int length, BitVector target) { if (length % 8 != 0) { // start is not byte aligned so we have to copy some bits from the last full byte read in the // previous loop -byte lastButOneByte = byteIPlus1; +// if numBytesHoldingSourceBits == 1, lastButOneByte is the first byte, but we have not read it yet, so read it +byte lastButOneByte = (numBytesHoldingSourceBits == 1) ? this.data.getByte(firstByteIndex) : byteIPlus1; Review comment: @bitblender I think there could be a problem here; please check. Say you are copying from firstBitOffset 2 with length 4. We want to copy only 4 bits, but this might copy 6 bits, because bitsFromLastButOneByte will be all bits from firstBitOffset to the end of the byte. > BitVector split and transfer does not work correctly for transfer length < 8 > > > Key: DRILL-6629 > URL: https://issues.apache.org/jira/browse/DRILL-6629 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Data Types > Environment: BitVector split and transfer does not work correctly for > transfer length < 8. 
>Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.15.0
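The concern raised in the review comment on pull request #1395 (offset 2, length 4, yet 6 bits copied) comes down to a missing length mask. A minimal Java sketch of the difference follows. It assumes bits are numbered from the least-significant end of the byte, and the class and method names are hypothetical; it is not the code in `BitVector.splitAndTransferTo`.

```java
// Hypothetical illustration only; not the BitVector implementation.
final class BitCopySketch {

    /** Returns every bit from firstBitOffset to the end of the byte (8 - firstBitOffset bits). */
    public static int bitsToEndOfByte(byte source, int firstBitOffset) {
        return (source & 0xFF) >>> firstBitOffset;
    }

    /** Returns exactly `length` bits starting at firstBitOffset. */
    public static int bitsInRange(byte source, int firstBitOffset, int length) {
        if (firstBitOffset < 0 || length < 0 || firstBitOffset + length > 8) {
            throw new IllegalArgumentException("range must lie within one byte");
        }
        int mask = (1 << length) - 1;          // `length` low-order one bits
        return ((source & 0xFF) >>> firstBitOffset) & mask;
    }

    public static void main(String[] args) {
        byte source = (byte) 0xFF;             // all eight bits set
        // Without a mask, offset 2 yields 6 bits even though only 4 were requested.
        System.out.println(Integer.bitCount(bitsToEndOfByte(source, 2)));  // prints 6
        System.out.println(Integer.bitCount(bitsInRange(source, 2, 4)));   // prints 4
    }
}
```

Whether the real method needs such a mask depends on how the target byte is assembled afterwards, which the quoted diff does not show.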
[jira] [Assigned] (DRILL-6631) Wrong result from LateralUnnest query with aggregation and order by
[ https://issues.apache.org/jira/browse/DRILL-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker reassigned DRILL-6631: Assignee: Parth Chandra > Wrong result from LateralUnnest query with aggregation and order by > --- > > Key: DRILL-6631 > URL: https://issues.apache.org/jira/browse/DRILL-6631 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Parth Chandra >Assignee: Parth Chandra >Priority: Major > Fix For: 1.15.0 > > > Reported by Chun: > The following query gives correct result: > {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > where customer.c_custkey = 101276; > ++-+-+ > | c_custkey | c_name| totalprice | > ++-+-+ > | 101276 | Customer#000101276 | 82657.72| > ++-+-+ > 1 row selected (6.184 seconds) > {noformat} > But if I remove the where clause and replace it with order by and limit, I > got the following empty result set. This is wrong. 
> {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > order by customer.c_custkey limit 50; > ++-+-+ > | c_custkey | c_name | totalprice | > ++-+-+ > ++-+-+ > No rows selected (2.753 seconds) > {noformat} > Here is the plan for the query giving the correct result: > {noformat} > 00-00Screen : rowType = RecordType(ANY c_custkey, ANY c_name, ANY > totalprice): rowcount = 472783.35, cumulative cost = {8242193.734985 > rows, 4.10218543349E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14410 > 00-01 Project(c_custkey=[$0], c_name=[$1], totalprice=[$2]) : rowType = > RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, > cumulative cost = {8194915.399985 rows, 4.0974575E7 cpu, 0.0 io, > 5.80956180479E9 network, 0.0 memory}, id = 14409 > 00-02UnionExchange : rowType = RecordType(ANY c_custkey, ANY c_name, > ANY totalprice): rowcount = 472783.35, cumulative cost = {7722132.04999 > rows, 3.955622594996E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14408 > 01-01 LateralJoin(correlation=[$cor1], joinType=[inner], > requiredColumns=[{0}], column excluded from output: =[`c_orders`]) : rowType > = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = > 472783.35, cumulative cost = {7249348.6 rows, 3.577395915E7 cpu, 0.0 > io, 0.0 network, 0.0 memory}, id = 14407 > 01-03SelectionVectorRemover : rowType = RecordType(ANY c_orders, > ANY c_custkey, ANY c_name): rowcount = 472783.35, cumulative cost = > {6776561.35 rows, 2.442713975E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = > 14403 > 01-05 Filter(condition=[=($1, 101276)]) : rowType = > RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 472783.35, > cumulative cost = {6303778.0 rows, 
2.39543564E7 cpu, 0.0 io, 0.0 network, 0.0 > memory}, id = 14402 > 01-07Scan(groupscan=[EasyGroupScan > [selectionRoot=maprfs:/drill/testdata/lateral/tpchsf1/json/customer, > numFiles=10, columns=[`c_orders`, `c_custkey`, `c_name`], > files=[maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_6.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_4.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_3.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_7.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_5.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_2.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_0.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_8.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_1.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_9.json]]]) : > rowType = RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = > 3151889.0, cumulative cost = {3151889.0 rows, 9455667.0 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 14401 > 01-02StreamAgg(group=[{}],
[jira] [Updated] (DRILL-6631) Wrong result from LateralUnnest query with aggregation and order by
[ https://issues.apache.org/jira/browse/DRILL-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6631: - Fix Version/s: 1.15.0 > Wrong result from LateralUnnest query with aggregation and order by > --- > > Key: DRILL-6631 > URL: https://issues.apache.org/jira/browse/DRILL-6631 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Parth Chandra >Assignee: Parth Chandra >Priority: Major > Fix For: 1.15.0 > > > Reported by Chun: > The following query gives correct result: > {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > where customer.c_custkey = 101276; > ++-+-+ > | c_custkey | c_name| totalprice | > ++-+-+ > | 101276 | Customer#000101276 | 82657.72| > ++-+-+ > 1 row selected (6.184 seconds) > {noformat} > But if I remove the where clause and replace it with order by and limit, I > got the following empty result set. This is wrong. 
> {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > order by customer.c_custkey limit 50; > ++-+-+ > | c_custkey | c_name | totalprice | > ++-+-+ > ++-+-+ > No rows selected (2.753 seconds) > {noformat} > Here is the plan for the query giving the correct result: > {noformat} > 00-00Screen : rowType = RecordType(ANY c_custkey, ANY c_name, ANY > totalprice): rowcount = 472783.35, cumulative cost = {8242193.734985 > rows, 4.10218543349E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14410 > 00-01 Project(c_custkey=[$0], c_name=[$1], totalprice=[$2]) : rowType = > RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, > cumulative cost = {8194915.399985 rows, 4.0974575E7 cpu, 0.0 io, > 5.80956180479E9 network, 0.0 memory}, id = 14409 > 00-02UnionExchange : rowType = RecordType(ANY c_custkey, ANY c_name, > ANY totalprice): rowcount = 472783.35, cumulative cost = {7722132.04999 > rows, 3.955622594996E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14408 > 01-01 LateralJoin(correlation=[$cor1], joinType=[inner], > requiredColumns=[{0}], column excluded from output: =[`c_orders`]) : rowType > = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = > 472783.35, cumulative cost = {7249348.6 rows, 3.577395915E7 cpu, 0.0 > io, 0.0 network, 0.0 memory}, id = 14407 > 01-03SelectionVectorRemover : rowType = RecordType(ANY c_orders, > ANY c_custkey, ANY c_name): rowcount = 472783.35, cumulative cost = > {6776561.35 rows, 2.442713975E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = > 14403 > 01-05 Filter(condition=[=($1, 101276)]) : rowType = > RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 472783.35, > cumulative cost = {6303778.0 rows, 
2.39543564E7 cpu, 0.0 io, 0.0 network, 0.0 > memory}, id = 14402 > 01-07Scan(groupscan=[EasyGroupScan > [selectionRoot=maprfs:/drill/testdata/lateral/tpchsf1/json/customer, > numFiles=10, columns=[`c_orders`, `c_custkey`, `c_name`], > files=[maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_6.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_4.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_3.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_7.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_5.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_2.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_0.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_8.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_1.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_9.json]]]) : > rowType = RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = > 3151889.0, cumulative cost = {3151889.0 rows, 9455667.0 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 14401 > 01-02StreamAgg(group=[{}], totalpric
[jira] [Commented] (DRILL-6631) Wrong result from LateralUnnest query with aggregation and order by
[ https://issues.apache.org/jira/browse/DRILL-6631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554664#comment-16554664 ] Parth Chandra commented on DRILL-6631: -- The issue is caused by incorrect handling of empty batches in the streaming aggregator. When the input is empty and there is no group by, streaming agg sends out a 'special' batch with no (or null) records and a row count of 1. Once that special batch has been sent, streaming agg always returns a NONE outcome on subsequent calls to next(). In a lateral/unnest subquery, this behaviour needs to be emulated for every empty batch produced by unnest. There, however, we cannot return NONE after sending out such a batch; we must reset the state instead. Streaming agg handles this incorrectly and returns NONE, causing the query to terminate early. There are other issues with the handling of state in such a case. However, none of these issues is caught by the unit tests, because they all have a group-by. > Wrong result from LateralUnnest query with aggregation and order by > --- > > Key: DRILL-6631 > URL: https://issues.apache.org/jira/browse/DRILL-6631 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Parth Chandra >Priority: Major > > Reported by Chun: > The following query gives correct result: > {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > where customer.c_custkey = 101276; > ++-+-+ > | c_custkey | c_name| totalprice | > ++-+-+ > | 101276 | Customer#000101276 | 82657.72| > ++-+-+ > 1 row selected (6.184 seconds) > {noformat} > But if I remove the where clause and replace it with order by and limit, I > got the following empty result set. This is wrong. 
> {noformat} > 0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, > customer.c_name, orders.totalprice from customer, lateral (select > sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE > t.o.o_totalprice in > (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders > order by customer.c_custkey limit 50; > ++-+-+ > | c_custkey | c_name | totalprice | > ++-+-+ > ++-+-+ > No rows selected (2.753 seconds) > {noformat} > Here is the plan for the query giving the correct result: > {noformat} > 00-00Screen : rowType = RecordType(ANY c_custkey, ANY c_name, ANY > totalprice): rowcount = 472783.35, cumulative cost = {8242193.734985 > rows, 4.10218543349E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14410 > 00-01 Project(c_custkey=[$0], c_name=[$1], totalprice=[$2]) : rowType = > RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, > cumulative cost = {8194915.399985 rows, 4.0974575E7 cpu, 0.0 io, > 5.80956180479E9 network, 0.0 memory}, id = 14409 > 00-02UnionExchange : rowType = RecordType(ANY c_custkey, ANY c_name, > ANY totalprice): rowcount = 472783.35, cumulative cost = {7722132.04999 > rows, 3.955622594996E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 > memory}, id = 14408 > 01-01 LateralJoin(correlation=[$cor1], joinType=[inner], > requiredColumns=[{0}], column excluded from output: =[`c_orders`]) : rowType > = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = > 472783.35, cumulative cost = {7249348.6 rows, 3.577395915E7 cpu, 0.0 > io, 0.0 network, 0.0 memory}, id = 14407 > 01-03SelectionVectorRemover : rowType = RecordType(ANY c_orders, > ANY c_custkey, ANY c_name): rowcount = 472783.35, cumulative cost = > {6776561.35 rows, 2.442713975E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = > 14403 > 01-05 Filter(condition=[=($1, 101276)]) : rowType = > RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 472783.35, > cumulative cost = {6303778.0 rows, 
2.39543564E7 cpu, 0.0 io, 0.0 network, 0.0 > memory}, id = 14402 > 01-07Scan(groupscan=[EasyGroupScan > [selectionRoot=maprfs:/drill/testdata/lateral/tpchsf1/json/customer, > numFiles=10, columns=[`c_orders`, `c_custkey`, `c_name`], > files=[maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_6.json, > maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_4.json, > maprfs:///drill/testdata/lateral/tpchsf1/
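The empty-batch behaviour described in Parth Chandra's comment on DRILL-6631 can be illustrated with a toy model. The Java sketch below is an assumption-laden simplification: each inner list stands for the rows that unnest produces for one customer, `sum` mimics a no-group-by aggregate that returns one null row for empty input, and all class and method names are hypothetical. It does not reproduce Drill's streaming aggregator or its batch outcomes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical toy model; not Drill's streaming aggregate operator.
final class EmptyBatchAggSketch {

    /** SUM without GROUP BY: empty input still produces one row, whose value is null. */
    public static Double sum(List<Double> batch) {
        if (batch.isEmpty()) {
            return null;
        }
        double total = 0.0;
        for (double value : batch) {
            total += value;
        }
        return total;
    }

    /** Intended behaviour: one output row per unnested input, with state reset in between. */
    public static List<Double> aggregateEach(List<List<Double>> batches) {
        List<Double> rows = new ArrayList<>();
        for (List<Double> batch : batches) {
            rows.add(sum(batch));
        }
        return rows;
    }

    /** Faulty behaviour: after the first empty input the aggregator stops for good. */
    public static List<Double> aggregateUntilFirstEmpty(List<List<Double>> batches) {
        List<Double> rows = new ArrayList<>();
        for (List<Double> batch : batches) {
            rows.add(sum(batch));
            if (batch.isEmpty()) {
                break;                 // models returning NONE instead of resetting
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<Double>> batches = Arrays.asList(
            Collections.<Double>emptyList(),
            Collections.singletonList(82657.72),
            Collections.<Double>emptyList());
        System.out.println(aggregateEach(batches).size());            // prints 3
        System.out.println(aggregateUntilFirstEmpty(batches).size()); // prints 1
    }
}
```

In this model the customer with a matching order is lost in the faulty variant because processing ends at the first customer without one, which is consistent with the early termination the comment describes.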
[jira] [Created] (DRILL-6631) Wrong result from LateralUnnest query with aggregation and order by
Parth Chandra created DRILL-6631:

Summary: Wrong result from LateralUnnest query with aggregation and order by
Key: DRILL-6631
URL: https://issues.apache.org/jira/browse/DRILL-6631
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.14.0
Reporter: Parth Chandra

Reported by Chun:

The following query gives correct result:
{noformat}
0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, customer.c_name, orders.totalprice from customer, lateral (select sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE t.o.o_totalprice in (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders where customer.c_custkey = 101276;
+------------+---------------------+-------------+
| c_custkey  | c_name              | totalprice  |
+------------+---------------------+-------------+
| 101276     | Customer#000101276  | 82657.72    |
+------------+---------------------+-------------+
1 row selected (6.184 seconds)
{noformat}
But if I remove the where clause and replace it with order by and limit, I got the following empty result set. This is wrong.
{noformat}
0: jdbc:drill:zk=10.10.30.166:5181> select customer.c_custkey, customer.c_name, orders.totalprice from customer, lateral (select sum(t.o.o_totalprice) as totalprice from unnest(customer.c_orders) t(o) WHERE t.o.o_totalprice in (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders order by customer.c_custkey limit 50;
+------------+---------+-------------+
| c_custkey  | c_name  | totalprice  |
+------------+---------+-------------+
+------------+---------+-------------+
No rows selected (2.753 seconds)
{noformat}
Here is the plan for the query giving the correct result:
{noformat}
00-00 Screen : rowType = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, cumulative cost = {8242193.734985 rows, 4.10218543349E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 memory}, id = 14410
00-01   Project(c_custkey=[$0], c_name=[$1], totalprice=[$2]) : rowType = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, cumulative cost = {8194915.399985 rows, 4.0974575E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 memory}, id = 14409
00-02     UnionExchange : rowType = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, cumulative cost = {7722132.04999 rows, 3.955622594996E7 cpu, 0.0 io, 5.80956180479E9 network, 0.0 memory}, id = 14408
01-01       LateralJoin(correlation=[$cor1], joinType=[inner], requiredColumns=[{0}], column excluded from output: =[`c_orders`]) : rowType = RecordType(ANY c_custkey, ANY c_name, ANY totalprice): rowcount = 472783.35, cumulative cost = {7249348.6 rows, 3.577395915E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14407
01-03         SelectionVectorRemover : rowType = RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 472783.35, cumulative cost = {6776561.35 rows, 2.442713975E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14403
01-05           Filter(condition=[=($1, 101276)]) : rowType = RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 472783.35, cumulative cost = {6303778.0 rows, 2.39543564E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14402
01-07             Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/lateral/tpchsf1/json/customer, numFiles=10, columns=[`c_orders`, `c_custkey`, `c_name`], files=[maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_6.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_4.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_3.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_7.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_5.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_2.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_0.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_8.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_1.json, maprfs:///drill/testdata/lateral/tpchsf1/json/customer/0_0_9.json]]]) : rowType = RecordType(ANY c_orders, ANY c_custkey, ANY c_name): rowcount = 3151889.0, cumulative cost = {3151889.0 rows, 9455667.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14401
01-02         StreamAgg(group=[{}], totalprice=[SUM($0)]) : rowType = RecordType(ANY totalprice): rowcount = 1.0, cumulative cost = {4.0 rows, 19.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14406
01-04           Filter(condition=[OR(=($0, 89230.03), =($0, 270087.44), =($0, 246408.53), =($0, 82657.72), =($0, 153941.38), =($0, 65277.06), =($0, 180309.76))]) : rowType = RecordType(ANY ITEM): rowcount = 1.0, cumulative cost = {3.0 rows, 7.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 14405
01-0
[jira] [Updated] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
[ https://issues.apache.org/jira/browse/DRILL-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6629: - Reviewer: Sorabh Hamirwasia > BitVector split and transfer does not work correctly for transfer length < 8 > > > Key: DRILL-6629 > URL: https://issues.apache.org/jira/browse/DRILL-6629 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Data Types > Environment: BitVector split and transfer does not work correctly for > transfer length < 8. >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
[ https://issues.apache.org/jira/browse/DRILL-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554588#comment-16554588 ] ASF GitHub Bot commented on DRILL-6629: --- bitblender commented on issue #1395: DRILL-6629 BitVector split and transfer does not work correctly for transfer length < 8 URL: https://github.com/apache/drill/pull/1395#issuecomment-407495625 @HanumathRao @sohami Previous fix for the BitVector split and transfer was missing a case. Please review this. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > BitVector split and transfer does not work correctly for transfer length < 8 > > > Key: DRILL-6629 > URL: https://issues.apache.org/jira/browse/DRILL-6629 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Data Types > Environment: BitVector split and transfer does not work correctly for > transfer length < 8. >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
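For readers unfamiliar with this class of bug, the following is a minimal sketch of what splitting a bit-packed vector involves when the requested range is shorter than one byte and does not start on a byte boundary. It is not Drill's `BitVector` code: the class `BitSliceSketch`, the methods `getBit` and `sliceBits`, and the LSB-first bit order within each byte are all assumptions made for illustration.

```java
final class BitSliceSketch {
    // Reads bit `index` from a bit-packed array.
    // Assumption: bits are stored LSB-first within each byte.
    static boolean getBit(byte[] data, int index) {
        return ((data[index >>> 3] >>> (index & 7)) & 1) == 1;
    }

    // Copies `length` bits starting at bit `start` into a new array,
    // re-aligned so that the first copied bit lands at bit 0.
    // Works for any length, including lengths below 8 that straddle a byte boundary.
    static byte[] sliceBits(byte[] src, int start, int length) {
        byte[] dst = new byte[(length + 7) / 8];
        for (int i = 0; i < length; i++) {
            if (getBit(src, start + i)) {
                dst[i >>> 3] |= (byte) (1 << (i & 7));
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = {(byte) 0b1011_0100, (byte) 0b0000_0001};
        // Bits 6, 7 and 8 are 0, 1, 1: the range is 3 bits long and crosses into the second byte.
        byte[] out = sliceBits(src, 6, 3);
        System.out.println(Integer.toBinaryString(out[0] & 0xFF)); // prints "110"
    }
}
```

A bit-by-bit copy like this is slow but easy to check, which makes it a plausible reference to compare an optimized byte-shifting implementation against in a unit test for short transfer lengths.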
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554546#comment-16554546 ] ASF GitHub Bot commented on DRILL-5796: --- jbimbert commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204839799 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -62,91 +62,126 @@ private ParquetIsPredicate(LogicalExpression expr, BiPredicate, Ra return visitor.visitUnknown(this, value); } - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + /** + * Apply the filter condition against the meta of the rowgroup. + */ + public RowsMatch matches(RangeExprEvaluator evaluator) { Statistics exprStat = expr.accept(evaluator, null); -if (isNullOrEmpty(exprStat)) { - return false; -} +return isNullOrEmpty(exprStat) ? RowsMatch.SOME : predicate.apply(exprStat, evaluator); + } -return predicate.test(exprStat, evaluator); + /** + * After the applying of the filter against the statistics of the rowgroup, if the result is RowsMatch.ALL, + * then we still must know if the rowgroup contains some null values, because it can change the filter result. + * If it contains some null values, then we change the RowsMatch.ALL into RowsMatch.SOME, which sya that maybe + * some values (the null ones) should be disgarded. + */ + private static RowsMatch checkNull(Statistics exprStat) { +return hasNoNulls(exprStat) ? 
RowsMatch.ALL : RowsMatch.SOME; } + /** + * Return true if exprStat.getMin is defined and true + */ + private static Boolean minIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin(); } + + /** + * Return true if exprStat.getMin is defined and false + */ + private static Boolean minIsFalse(Statistics exprStat) { return exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMin(); } + + /** + * Return true if exprStat.getMax is defined and true + */ + private static Boolean maxIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMax(); } + + /** + * Return true if exprStat.getMax is defined and false + */ + private static Boolean maxIsFalse(Statistics exprStat) { return exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMax(); } + /** * IS NULL predicate. */ private static > LogicalExpression createIsNullPredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, -//if there are no nulls -> canDrop -(exprStat, evaluator) -> hasNoNulls(exprStat)) { - private final boolean isArray = isArray(expr); - - private boolean isArray(LogicalExpression expression) { -if (expression instanceof TypedFieldExpr) { - TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expression; - SchemaPath schemaPath = typedFieldExpr.getPath(); - return schemaPath.isArray(); -} -return false; - } - - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + (exprStat, evaluator) -> { // for arrays we are not able to define exact number of nulls // [1,2,3] vs [1,2] -> in second case 3 is absent and thus it's null but statistics shows no nulls -return !isArray && super.canDrop(evaluator); - } -}; +if (expr instanceof TypedFieldExpr) { + TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expr; + if (typedFieldExpr.getPath().isArray()) { +return RowsMatch.SOME; + } +} +if (hasNoNulls(exprStat)) { + return RowsMatch.NONE; +} +return isAllNulls(exprStat, 
evaluator.getRowCount()) ? RowsMatch.ALL : RowsMatch.SOME; + }); } /** * IS NOT NULL predicate. */ private static > LogicalExpression createIsNotNullPredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, -//if there are all nulls -> canDrop -(exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount()) + (exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount()) ? RowsMatch.NONE : checkNull(exprStat) ); } /** * IS TRUE predicate. */ private static LogicalExpression createIsTruePredicate(LogicalExpression expr) { -return new ParquetIsPredicate(expr, (exprStat, evaluator) -> -//if max value is not true or if there are all nulls -> canDrop -isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMax() -); +return new ParquetIsPredicate(expr, (exprStat, evaluator) -> { + if (isAllNulls(exprStat, evaluator.getRowCou
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554545#comment-16554545 ] ASF GitHub Bot commented on DRILL-5796: --- jbimbert commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204839765 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -62,91 +62,126 @@ private ParquetIsPredicate(LogicalExpression expr, BiPredicate, Ra return visitor.visitUnknown(this, value); } - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + /** + * Apply the filter condition against the meta of the rowgroup. + */ + public RowsMatch matches(RangeExprEvaluator evaluator) { Statistics exprStat = expr.accept(evaluator, null); -if (isNullOrEmpty(exprStat)) { - return false; -} +return isNullOrEmpty(exprStat) ? RowsMatch.SOME : predicate.apply(exprStat, evaluator); + } -return predicate.test(exprStat, evaluator); + /** + * After the applying of the filter against the statistics of the rowgroup, if the result is RowsMatch.ALL, + * then we still must know if the rowgroup contains some null values, because it can change the filter result. + * If it contains some null values, then we change the RowsMatch.ALL into RowsMatch.SOME, which sya that maybe + * some values (the null ones) should be disgarded. + */ + private static RowsMatch checkNull(Statistics exprStat) { +return hasNoNulls(exprStat) ? RowsMatch.ALL : RowsMatch.SOME; } + /** + * Return true if exprStat.getMin is defined and true + */ + private static Boolean minIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin(); } Review comment: Functions suppressed because of next comment This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Filter pruning for multi rowgroup parquet file > -- > > Key: DRILL-5796 > URL: https://issues.apache.org/jira/browse/DRILL-5796 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Reporter: Damien Profeta >Assignee: Jean-Blas IMBERT >Priority: Major > Fix For: 1.14.0 > > > Today, filter pruning use the file name as the partitioning key. This means > you can remove a partition only if the whole file is for the same partition. > With parquet, you can prune the filter if the rowgroup make a partition of > your dataset as the unit of work if the rowgroup not the file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
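The diff quoted in this comment replaces the boolean `canDrop` with a three-valued `RowsMatch` result. As a rough sketch of that idea for the IS NULL and IS NOT NULL predicates, the code below works on plain null and row counts instead of Parquet `Statistics` objects. The class `RowsMatchSketch` and its method signatures are hypothetical, and the array special case from the diff is left out.

```java
final class RowsMatchSketch {
    // ALL: every row in the row group satisfies the predicate.
    // NONE: no row does, so the row group can be skipped.
    // SOME: the statistics cannot decide, so the rows must be filtered.
    enum RowsMatch { ALL, NONE, SOME }

    // IS NULL: no nulls means no match, only nulls means every row matches.
    static RowsMatch isNull(long nullCount, long rowCount) {
        if (nullCount == 0) {
            return RowsMatch.NONE;
        }
        return nullCount == rowCount ? RowsMatch.ALL : RowsMatch.SOME;
    }

    // IS NOT NULL: only nulls means no match, no nulls means every row matches.
    static RowsMatch isNotNull(long nullCount, long rowCount) {
        if (nullCount == rowCount) {
            return RowsMatch.NONE;
        }
        return nullCount == 0 ? RowsMatch.ALL : RowsMatch.SOME;
    }

    public static void main(String[] args) {
        System.out.println(isNull(0, 100));     // NONE
        System.out.println(isNull(100, 100));   // ALL
        System.out.println(isNotNull(40, 100)); // SOME
    }
}
```

The third value is what makes row-group-level pruning safe in both directions: `NONE` lets the reader skip the row group, while `ALL` is the case in which the filter itself may become unnecessary.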
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554547#comment-16554547 ] ASF GitHub Bot commented on DRILL-5796: --- jbimbert commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204839867 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java ## @@ -165,12 +167,29 @@ protected void doOnMatch(RelOptRuleCall call, FilterPrel filter, ProjectPrel pro return; } - RelNode newScan = ScanPrel.create(scan, scan.getTraitSet(), newGroupScan, scan.getRowType());; if (project != null) { newScan = project.copy(project.getTraitSet(), ImmutableList.of(newScan)); } + +if (newGroupScan instanceof AbstractParquetGroupScan) { + RowsMatch matchAll = RowsMatch.ALL; + List rowGroupInfos = ((AbstractParquetGroupScan) newGroupScan).rowGroupInfos; + for (RowGroupInfo rowGroup : rowGroupInfos) { +if (rowGroup.getRowsMatch() != RowsMatch.ALL) { + matchAll = RowsMatch.SOME; + break; +} + } + if (matchAll == ParquetFilterPredicate.RowsMatch.ALL) { +call.transformTo(newScan); + } +} else { + final RelNode newFilter = filter.copy(filter.getTraitSet(), ImmutableList.of(newScan)); + call.transformTo(newFilter); +} + final RelNode newFilter = filter.copy(filter.getTraitSet(), ImmutableList.of(newScan)); Review comment: Exact. Suppressed duplicate lines This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554543#comment-16554543 ] ASF GitHub Bot commented on DRILL-5796: --- jbimbert commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204839620 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -124,8 +124,7 @@ private static LogicalExpression createIsTruePredicate(LogicalExpression expr) { */ private static LogicalExpression createIsFalsePredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, (exprStat, evaluator) -> -//if min value is not false or if there are all nulls -> canDrop -isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin() + isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin() ? RowsMatch.NONE : checkNull(exprStat) Review comment: Done added 12 unit tests for cases a. ST:[min: true, max: true, num_nulls: 0] b. ST:[min: false, max: false, num_nulls: 0] c. ST:[min: false, max: true, num_nulls: 0]
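To make the three statistics cases from the review comment concrete, here is one possible way to evaluate an IS FALSE predicate from boolean min/max statistics. This is a sketch, not the code from the pull request: `IsFalseSketch`, its `isFalse` signature, and the row count of 10 used in the example are assumptions, and the results shown are what this sketch yields rather than outcomes quoted from the PR's unit tests.

```java
final class IsFalseSketch {
    enum RowsMatch { ALL, NONE, SOME }

    // Evaluates "col IS FALSE" against boolean column statistics.
    // min and max are only meaningful when at least one value is non-null.
    static RowsMatch isFalse(boolean min, boolean max, long nullCount, long rowCount) {
        if (nullCount == rowCount) {
            return RowsMatch.NONE; // only nulls, so nothing is false
        }
        if (min) {
            return RowsMatch.NONE; // smallest non-null value is true, so no value is false
        }
        if (!max && nullCount == 0) {
            return RowsMatch.ALL;  // largest value is false and there are no nulls
        }
        return RowsMatch.SOME;     // mixed values, or false values alongside nulls
    }

    public static void main(String[] args) {
        System.out.println(isFalse(true, true, 0, 10));   // case a: NONE
        System.out.println(isFalse(false, false, 0, 10)); // case b: ALL
        System.out.println(isFalse(false, true, 0, 10));  // case c: SOME
    }
}
```

Case c is the one that separates a two-valued check from a three-valued one: a row group holding both `false` and `true` values cannot be dropped, but the filter cannot be removed either.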
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554456#comment-16554456 ] ASF GitHub Bot commented on DRILL-5796: --- vrozov commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204817724 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetPushDownFilter.java ## @@ -165,12 +167,29 @@ protected void doOnMatch(RelOptRuleCall call, FilterPrel filter, ProjectPrel pro return; } - RelNode newScan = ScanPrel.create(scan, scan.getTraitSet(), newGroupScan, scan.getRowType());; if (project != null) { newScan = project.copy(project.getTraitSet(), ImmutableList.of(newScan)); } + +if (newGroupScan instanceof AbstractParquetGroupScan) { + RowsMatch matchAll = RowsMatch.ALL; + List rowGroupInfos = ((AbstractParquetGroupScan) newGroupScan).rowGroupInfos; + for (RowGroupInfo rowGroup : rowGroupInfos) { +if (rowGroup.getRowsMatch() != RowsMatch.ALL) { + matchAll = RowsMatch.SOME; + break; +} + } + if (matchAll == ParquetFilterPredicate.RowsMatch.ALL) { +call.transformTo(newScan); + } +} else { + final RelNode newFilter = filter.copy(filter.getTraitSet(), ImmutableList.of(newScan)); + call.transformTo(newFilter); +} + final RelNode newFilter = filter.copy(filter.getTraitSet(), ImmutableList.of(newScan)); Review comment: How it works in case `newGroupScan` is not an instance of `AbstractParquetGroupScan`? Will not `filter.copy` be called twice? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
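The review question above is about control flow: whether the filter is copied and applied more than once when the scan is not a Parquet group scan. One way to avoid that kind of duplication is to reduce the decision to a single boolean and perform exactly one transformation afterwards. The sketch below illustrates that shape only. `FilterDecisionSketch` and `keepFilter` are hypothetical names, and the planner types from the diff (`RelNode`, `RelOptRuleCall`, `AbstractParquetGroupScan`) are replaced by a flag and a list so that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.List;

final class FilterDecisionSketch {
    enum RowsMatch { ALL, NONE, SOME }

    // Returns true when the filter must stay above the scan.
    // The filter can be dropped only for a Parquet scan whose row groups all report ALL.
    static boolean keepFilter(boolean isParquetScan, List<RowsMatch> rowGroupMatches) {
        if (!isParquetScan) {
            return true; // no row-group statistics to rely on
        }
        for (RowsMatch match : rowGroupMatches) {
            if (match != RowsMatch.ALL) {
                return true; // at least one row group still needs filtering
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<RowsMatch> allMatch = Arrays.asList(RowsMatch.ALL, RowsMatch.ALL);
        List<RowsMatch> mixed = Arrays.asList(RowsMatch.ALL, RowsMatch.SOME);
        // A caller would branch once on this result: transform to the bare scan
        // when it is false, or to a single copy of the filter over the scan when it is true.
        System.out.println(keepFilter(true, allMatch));  // false
        System.out.println(keepFilter(true, mixed));     // true
        System.out.println(keepFilter(false, allMatch)); // true
    }
}
```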
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554414#comment-16554414 ] ASF GitHub Bot commented on DRILL-5796: --- vrozov commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204807292 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -124,8 +124,7 @@ private static LogicalExpression createIsTruePredicate(LogicalExpression expr) { */ private static LogicalExpression createIsFalsePredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, (exprStat, evaluator) -> -//if min value is not false or if there are all nulls -> canDrop -isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin() + isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin() ? RowsMatch.NONE : checkNull(exprStat) Review comment: Please add unit testing. As you can see, integration tests may result in false positive.
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554406#comment-16554406 ] ASF GitHub Bot commented on DRILL-5796: --- vrozov commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204805759 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -62,91 +62,126 @@ private ParquetIsPredicate(LogicalExpression expr, BiPredicate, Ra return visitor.visitUnknown(this, value); } - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + /** + * Apply the filter condition against the meta of the rowgroup. + */ + public RowsMatch matches(RangeExprEvaluator evaluator) { Statistics exprStat = expr.accept(evaluator, null); -if (isNullOrEmpty(exprStat)) { - return false; -} +return isNullOrEmpty(exprStat) ? RowsMatch.SOME : predicate.apply(exprStat, evaluator); + } -return predicate.test(exprStat, evaluator); + /** + * After the applying of the filter against the statistics of the rowgroup, if the result is RowsMatch.ALL, + * then we still must know if the rowgroup contains some null values, because it can change the filter result. + * If it contains some null values, then we change the RowsMatch.ALL into RowsMatch.SOME, which sya that maybe + * some values (the null ones) should be disgarded. + */ + private static RowsMatch checkNull(Statistics exprStat) { +return hasNoNulls(exprStat) ? 
RowsMatch.ALL : RowsMatch.SOME; } + /** + * Return true if exprStat.getMin is defined and true + */ + private static Boolean minIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin(); } + + /** + * Return true if exprStat.getMin is defined and false + */ + private static Boolean minIsFalse(Statistics exprStat) { return exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMin(); } + + /** + * Return true if exprStat.getMax is defined and true + */ + private static Boolean maxIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMax(); } + + /** + * Return true if exprStat.getMax is defined and false + */ + private static Boolean maxIsFalse(Statistics exprStat) { return exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMax(); } + /** * IS NULL predicate. */ private static > LogicalExpression createIsNullPredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, -//if there are no nulls -> canDrop -(exprStat, evaluator) -> hasNoNulls(exprStat)) { - private final boolean isArray = isArray(expr); - - private boolean isArray(LogicalExpression expression) { -if (expression instanceof TypedFieldExpr) { - TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expression; - SchemaPath schemaPath = typedFieldExpr.getPath(); - return schemaPath.isArray(); -} -return false; - } - - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + (exprStat, evaluator) -> { // for arrays we are not able to define exact number of nulls // [1,2,3] vs [1,2] -> in second case 3 is absent and thus it's null but statistics shows no nulls -return !isArray && super.canDrop(evaluator); - } -}; +if (expr instanceof TypedFieldExpr) { + TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expr; + if (typedFieldExpr.getPath().isArray()) { +return RowsMatch.SOME; + } +} +if (hasNoNulls(exprStat)) { + return RowsMatch.NONE; +} +return isAllNulls(exprStat, 
evaluator.getRowCount()) ? RowsMatch.ALL : RowsMatch.SOME; + }); } /** * IS NOT NULL predicate. */ private static > LogicalExpression createIsNotNullPredicate(LogicalExpression expr) { return new ParquetIsPredicate(expr, -//if there are all nulls -> canDrop -(exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount()) + (exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount()) ? RowsMatch.NONE : checkNull(exprStat) ); } /** * IS TRUE predicate. */ private static LogicalExpression createIsTruePredicate(LogicalExpression expr) { -return new ParquetIsPredicate(expr, (exprStat, evaluator) -> -//if max value is not true or if there are all nulls -> canDrop -isAllNulls(exprStat, evaluator.getRowCount()) || exprStat.hasNonNullValue() && !((BooleanStatistics) exprStat).getMax() -); +return new ParquetIsPredicate(expr, (exprStat, evaluator) -> { + if (isAllNulls(exprStat, evaluator.getRowCount
[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554363#comment-16554363 ] ASF GitHub Bot commented on DRILL-5796: --- vrozov commented on a change in pull request #1298: DRILL-5796: Filter pruning for multi rowgroup parquet file URL: https://github.com/apache/drill/pull/1298#discussion_r204797790 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java ## @@ -62,91 +62,126 @@ private ParquetIsPredicate(LogicalExpression expr, BiPredicate, Ra return visitor.visitUnknown(this, value); } - @Override - public boolean canDrop(RangeExprEvaluator evaluator) { + /** + * Apply the filter condition against the meta of the rowgroup. + */ + public RowsMatch matches(RangeExprEvaluator evaluator) { Statistics exprStat = expr.accept(evaluator, null); -if (isNullOrEmpty(exprStat)) { - return false; -} +return isNullOrEmpty(exprStat) ? RowsMatch.SOME : predicate.apply(exprStat, evaluator); + } -return predicate.test(exprStat, evaluator); + /** + * After the applying of the filter against the statistics of the rowgroup, if the result is RowsMatch.ALL, + * then we still must know if the rowgroup contains some null values, because it can change the filter result. + * If it contains some null values, then we change the RowsMatch.ALL into RowsMatch.SOME, which sya that maybe + * some values (the null ones) should be disgarded. + */ + private static RowsMatch checkNull(Statistics exprStat) { +return hasNoNulls(exprStat) ? RowsMatch.ALL : RowsMatch.SOME; } + /** + * Return true if exprStat.getMin is defined and true + */ + private static Boolean minIsTrue(Statistics exprStat) { return exprStat.hasNonNullValue() && ((BooleanStatistics) exprStat).getMin(); } Review comment: why **B**oolean? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
[jira] [Created] (DRILL-6630) Extra spaces are ignored while publishing results in Drill Web UI
Anton Gozhiy created DRILL-6630:

Summary: Extra spaces are ignored while publishing results in Drill Web UI
Key: DRILL-6630
URL: https://issues.apache.org/jira/browse/DRILL-6630
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.14.0
Reporter: Anton Gozhiy

*Prerequisites:* Use Drill Web UI to submit queries

*Query:*
{code:sql}
select ' sdssada' from (values(1))
{code}
*Expected Result:*
{noformat}
" sdssada"
{noformat}
*Actual Result:*
{noformat}
"sds sada"
{noformat}
*Note:* Inspecting the element using Chrome Developer Tools, you can see that it contains the real string. So something should be done with HTML formatting.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
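The note in this report points at HTML rendering: browsers collapse runs of ordinary spaces unless told otherwise. One conceivable server-side workaround is to emit non-breaking spaces when a cell value is written into the page. The sketch below shows that idea only. `HtmlSpaceSketch` and `preserveSpaces` are hypothetical and are not taken from Drill's web UI code, and styling the result cells with the CSS `white-space: pre` property would be an alternative that leaves the text itself unchanged.

```java
final class HtmlSpaceSketch {
    // Escapes the characters that are special in HTML text and turns every
    // space into a non-breaking space so the browser cannot collapse runs of them.
    static String preserveSpaces(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (char c : value.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case ' ': sb.append("&nbsp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(preserveSpaces("a  b")); // prints "a&nbsp;&nbsp;b"
    }
}
```

A drawback of this approach is that text copied from the rendered page contains non-breaking spaces rather than ordinary ones, which is one reason a CSS-based fix may be preferable.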