[PR] DRILL-8479: mergejoin leak when depleting incoming batches throws an exception… (drill)
shfshihuafeng opened a new pull request, #2878: URL: https://github.com/apache/drill/pull/2878 … (#2876)

# [DRILL-8479](https://issues.apache.org/jira/browse/DRILL-8479): mergejoin leak when depleting incoming batches throws an exception

## Description
When a fragment fails, close() is called on MergeJoinBatch. But if leftIterator.close() throws an exception, rightIterator.close() is never reached, so the memory it holds cannot be released.

## Documentation
(Please describe user-visible changes similar to what should appear in the Drill documentation.)

## Testing
The test method is the same as in [#2875](https://github.com/apache/drill/pull/2875); only one parameter needs to be modified: set planner.enable_hashjoin = false to ensure the mergejoin operator is used.
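For readers following the thread, a minimal sketch of the failure mode and the fix pattern described above — guaranteeing rightIterator is closed even when leftIterator.close() throws. The try/finally shape is an illustration of the idea, not the exact patch in this PR:

```java
// Sketch of MergeJoinBatch.close(): with a plain sequential close, an exception
// thrown from leftIterator.close() skips rightIterator.close() entirely, leaking
// the buffers it still holds. A try/finally closes both regardless.
@Override
public void close() {
  try {
    leftIterator.close();    // may throw while depleting incoming batches
  } finally {
    rightIterator.close();   // still runs when the left close fails
  }
}
```

If both closes can throw, a utility along the lines of Drill's org.apache.drill.common.AutoCloseables.close(...) gives the same guarantee for a whole list of resources (assuming the iterators are AutoCloseable), attempting every close and collecting the failures.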
Re: [PR] Merge join leak (drill)
shfshihuafeng closed pull request #2877: Merge join leak URL: https://github.com/apache/drill/pull/2877
[PR] Merge join leak (drill)
shfshihuafeng opened a new pull request, #2877: URL: https://github.com/apache/drill/pull/2877

# [DRILL-](https://issues.apache.org/jira/browse/DRILL-): PR Title (Please replace `PR Title` with actual PR Title)

## Description
(Please describe the change. If more than one ticket is fixed, include a reference to those tickets.)

## Documentation
(Please describe user-visible changes similar to what should appear in the Drill documentation.)

## Testing
(Please describe how this PR has been tested.)
Re: [I] mergejoin memory leak when depleting incoming batches throws an exception (drill)
shfshihuafeng commented on issue #2876: URL: https://github.com/apache/drill/issues/2876#issuecomment-1907424021 The scenario is reliably reproduced by the test above. With my fix applied, I can no longer find the leak.
Re: [I] mergejoin memory leak when depleting incoming batches throws an exception (drill)
shfshihuafeng commented on issue #2876: URL: https://github.com/apache/drill/issues/2876#issuecomment-1907421433 https://github.com/apache/drill/issues/2871
Re: [I] mergejoin memory leak when depleting incoming batches throws an exception (drill)
shfshihuafeng commented on issue #2876: URL: https://github.com/apache/drill/issues/2876#issuecomment-1907419612 I fixed it; see the attached patch: [0001-mergejoin-leak.patch](https://github.com/apache/drill/files/14033613/0001-mergejoin-leak.patch)
[I] mergejoin memory leak when depleting incoming batches throws an exception (drill)
shfshihuafeng opened a new issue, #2876: URL: https://github.com/apache/drill/issues/2876

Before submitting a bug report, please verify that you are using the most current version of Drill.

**Describe the bug**
mergejoin memory leak when depleting incoming batches throws an exception, because we cannot close rightIterator when leftIterator throws an exception.

**To Reproduce**
Steps to reproduce the behavior:
1. prepare data for TPC-H scale factor 1
2. run TPC-H query 8 with 20 concurrent sessions
3. set direct memory to 5 GB
4. when an OutOfMemoryException occurs, stop all queries
5. look for the memory leak

**Expected behavior**
When all queries stop, direct memory should be 0 and no leak log lines like the following should appear.
`Allocator(op:2:0:11:MergeJoinPOP) 100/73728/4874240/100 (res/actual/peak/limit)`

**Error detail, log output or screenshots**
`Unable to allocate buffer of size XX (rounded from XX) due to memory limit (). Current allocation: xx`

**Drill version**
1.21.1

**Additional context**
`select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as all_nations group by o_year order by o_year`
[jira] [Created] (DRILL-8479) mergejoin memory leak on exception
shihuafeng created DRILL-8479: - Summary: mergejoin memory leak on exception Key: DRILL-8479 URL: https://issues.apache.org/jira/browse/DRILL-8479 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.21.1 Reporter: shihuafeng Attachments: 0001-mergejoin-leak.patch

*Describe the bug*
mergejoin leaks when RecordIterator fails to allocate memory and throws an OutOfMemoryException.

*Steps to reproduce the behavior*:
# prepare data for TPC-H scale factor 1
# set direct memory to 5 GB
# set planner.enable_hashjoin = false to ensure the mergejoin operator is used
# set drill.memory.debug.allocator = true (checks for memory leaks)
# run TPC-H query 8 with 20 concurrent sessions
# when an OutOfMemoryException (or a null pointer exception) occurs, stop all queries
# look for the memory leak

*Expected behavior*
When all queries stop, direct memory should be 0 and no leak log lines like the following should appear.
{code:java}
Allocator(op:2:0:11:MergeJoinPOP) 100/73728/4874240/100 (res/actual/peak/limit){code}

*Error detail, log output or screenshots*
{code:java}
Unable to allocate buffer of size XX (rounded from XX) due to memory limit (). Current allocation: xx{code}

[^0001-mergejoin-leak.patch]

sql
{code:java}
select o_year, sum(case when nation = 'CHINA' then volume else 0 end) / sum(volume) as mkt_share from ( select extract(year from o_orderdate) as o_year, l_extendedprice * 1.0 as volume, n2.n_name as nation from hive.tpch1s.part, hive.tpch1s.supplier, hive.tpch1s.lineitem, hive.tpch1s.orders, hive.tpch1s.customer, hive.tpch1s.nation n1, hive.tpch1s.nation n2, hive.tpch1s.region where p_partkey = l_partkey and s_suppkey = l_suppkey and l_orderkey = o_orderkey and o_custkey = c_custkey and c_nationkey = n1.n_nationkey and n1.n_regionkey = r_regionkey and r_name = 'ASIA' and s_nationkey = n2.n_nationkey and o_orderdate between date '1995-01-01' and date '1996-12-31' and p_type = 'LARGE BRUSHED BRASS') as all_nations group by o_year order by o_year
{code}
Re: Parquet files with microsecond columns
A reply on your actual topic now. I think that following the implementation of the int96_as_timestamp option will result in the type conversion being done "deeply enough" for Drill. I sympathise a lot with the design thinking in your second option, but I'd personally go the first route and only consider the second route if something wasn't working.

On 2024/01/22 11:36, Peter Franzen wrote:

Hi,

I am using Drill to query Parquet files that have fields of type timestamp_micros. By default, Drill truncates those microsecond values to milliseconds when reading the Parquet files in order to convert them to SQL timestamps.

In some of my use cases I need to read the original microsecond values (as 64-bit values, not SQL timestamps) through Drill, but this doesn’t seem to be possible (unless I’ve missed something).

I have explored a possible solution to this, and would like to run it by some developers more experienced with the Drill code base before I create a pull request.

My idea is to add two options similar to “store.parquet.reader.int96_as_timestamp" to control whether or not microsecond times and timestamps are truncated to milliseconds. These options would be added to “org.apache.drill.exec.ExecConstants" and "org.apache.drill.exec.server.options.SystemOptionManager", and to drill-module.conf:

store.parquet.reader.time_micros_as_int64: false,
store.parquet.reader.timestamp_micros_as_int64: false,

These options would then be used in the same places as “store.parquet.reader.int96_as_timestamp”:

org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter

to create an int64 reader instead of a time/timestamp reader when the corresponding option is set to true.

In addition to this, “org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector” must be altered to _not_ truncate the min and max values for time_micros/timestamp_micros if the corresponding option is true. This class doesn’t have a reference to an OptionManager, so my guess is that the two new options must be extracted from the OptionManager when the ParquetReaderConfig instance is created.

Filtering on microsecond columns would be done using 64-bit values rather than TIME/TIMESTAMP values, e.g.

select * from where = 1705914906694751;

I’ve tested the solution outlined above, and it seems to work when using sqlline and with the JDBC driver, but not with the web based interface. Any pointers to the relevant code for that would be appreciated.

An alternative solution to the above could be to intercept all reading of the Parquet schemas and modifying the schema to report the microsecond columns as int64 columns, i.e. to completely discard the information that the columns contain time/timestamp values. This could potentially make parts of the code where it is not obvious that the time/timestamp properties of columns are used behave as expected. However, this variant would not align with how INT96 timestamps are handled.

Any thoughts on this idea for how to access microsecond values would be highly appreciated.

Thanks,

/Peter
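To make the proposal concrete, here is a rough sketch of how the two new options might be declared, mirroring the existing int96 option. The BooleanValidator/OptionDescription form is assumed from neighboring options in ExecConstants, and the description strings are illustrative, not final wording:

```java
// Hypothetical declarations in org.apache.drill.exec.ExecConstants, modeled on
// PARQUET_READER_INT96_AS_TIMESTAMP; both would default to false in drill-module.conf.
String PARQUET_READER_TIME_MICROS_AS_INT64 =
    "store.parquet.reader.time_micros_as_int64";
OptionValidator PARQUET_READER_TIME_MICROS_AS_INT64_VALIDATOR =
    new BooleanValidator(PARQUET_READER_TIME_MICROS_AS_INT64,
        new OptionDescription("Read Parquet TIME_MICROS columns as INT64 "
            + "instead of truncating them to millisecond TIME values."));

String PARQUET_READER_TIMESTAMP_MICROS_AS_INT64 =
    "store.parquet.reader.timestamp_micros_as_int64";
OptionValidator PARQUET_READER_TIMESTAMP_MICROS_AS_INT64_VALIDATOR =
    new BooleanValidator(PARQUET_READER_TIMESTAMP_MICROS_AS_INT64,
        new OptionDescription("Read Parquet TIMESTAMP_MICROS columns as INT64 "
            + "instead of truncating them to millisecond TIMESTAMP values."));
```

As discussed later in this thread, either option could then be enabled for all sessions with ALTER SYSTEM SET so that web UI queries see it too.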
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
shfshihuafeng commented on code in PR #2875: URL: https://github.com/apache/drill/pull/2875#discussion_r1464222619

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/AbstractHashBinaryRecordBatch.java:
@@ -1312,7 +1312,9 @@ private void cleanup() {
 }
 // clean (and deallocate) each partition, and delete its spill file
 for (HashPartition partn : partitions) {
-  partn.close();
+  if (partn != null) {
+    partn.close();
+  }

Review Comment: The (partn != null) check is necessary; see the comment above on fix idea 1.
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
shfshihuafeng commented on code in PR #2875: URL: https://github.com/apache/drill/pull/2875#discussion_r1464211148

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java:
@@ -157,11 +162,11 @@ public HashPartition(FragmentContext context, BufferAllocator allocator, Chained
     .build(logger);
 } catch (SchemaChangeException sce) {
   throw new IllegalStateException("Unexpected Schema Change while creating a hash table",sce);
-}
-this.hjHelper = semiJoin ? null : new HashJoinHelper(context, allocator);
-tmpBatchesList = new ArrayList<>();
-if (numPartitions > 1) {
-  allocateNewCurrentBatchAndHV();
+} catch (OutOfMemoryException oom) {
+  close();

Review Comment:
### 1. Fix idea
The design is that if any operator fails, the entire operator stack is closed. But `partitions` is an array initialized with nulls; if a HashPartition object is not created successfully, its constructor throws an exception, and every array slot from the failing index onward stays null.

```
for (int part = 0; part < numPartitions; part++) {
  partitions[part] = new HashPartition(context, allocator, baseHashTable,
      buildBatch, probeBatch, semiJoin, RECORDS_PER_BATCH, spillSet, part,
      spilledState.getCycle(), numPartitions);
}
```

For example, with an array length of 32 (numPartitions = 32), if the constructor throws at part 10, then partitions[10..31] remain null. The partition at index 10 failed during construction but had already allocated memory; when close() is called, that partition cannot be closed, because its slot is null.

### 2. Another fix idea
Do not throw from the constructor and do not call close() there; instead catch the exception so that the HashPartition object is still created. Then, when close() is called later, its memory can be released:

```
// 1. Add an isException parameter to the HashPartition constructor
HashPartition(FragmentContext context, BufferAllocator allocator,
    ChainedHashTable baseHashTable, RecordBatch buildBatch, RecordBatch probeBatch,
    boolean semiJoin, int recordsPerBatch, SpillSet spillSet, int partNum,
    int cycleNum, int numPartitions, boolean isException)

// 2. Catch the exception so the HashPartition object is still constructed
} catch (OutOfMemoryException oom) {
  // do not call close(), do not rethrow
  isException = true;
}

// 3. Handle the failure in AbstractHashBinaryRecordBatch#initializeBuild
boolean isException = false;
try {
  for (int part = 0; part < numPartitions; part++) {
    if (isException) {
      break;
    }
    partitions[part] = new HashPartition(context, allocator, baseHashTable,
        buildBatch, probeBatch, semiJoin, RECORDS_PER_BATCH, spillSet, part,
        spilledState.getCycle(), numPartitions, isException);
  }
} catch (Exception e) {
  isException = true;
}
if (isException) {
  throw UserException.memoryError(exceptions[0])
      .message("Failed to allocate hash partition.")
      .build(logger);
}
```
Re: [PR] DRILL-8474: Add Daffodil Format Plugin (drill)
mbeckerle commented on PR #2836: URL: https://github.com/apache/drill/pull/2836#issuecomment-1906827568

Ok, so the geo-ip UDF stuff has no special mechanisms or description of those resource files, so the generic code that "scans" must find them and drag them along automatically. That's the behavior I want.

What is "Drill's 3rd party JAR folder"? If a magic folder just gets dragged over to all nodes, and Drill uses a class loader that arranges for jars in that folder to be searched, then there is very little to do, since a DFDL schema can be just a set of jar files containing related resources and the classes for Daffodil's own UDFs and layers, which are Java code extensions of Daffodil's own kind.
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
paul-rogers commented on code in PR #2875: URL: https://github.com/apache/drill/pull/2875#discussion_r1463921977

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/AbstractHashBinaryRecordBatch.java:
@@ -1312,7 +1312,9 @@ private void cleanup() {
 }
 // clean (and deallocate) each partition, and delete its spill file
 for (HashPartition partn : partitions) {
-  partn.close();
+  if (partn != null) {
+    partn.close();
+  }

Review Comment: The above is OK as a work-around. I wonder, however, where the code added a null pointer to the partition list. That should never happen. If it does, it should be fixed at the point where the null pointer is added to the list. Fixing it here is incomplete: there are other places where we loop through the list, and those will also fail.

## exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashPartition.java:
@@ -157,11 +162,11 @@ public HashPartition(FragmentContext context, BufferAllocator allocator, Chained
     .build(logger);
 } catch (SchemaChangeException sce) {
   throw new IllegalStateException("Unexpected Schema Change while creating a hash table",sce);
-}
-this.hjHelper = semiJoin ? null : new HashJoinHelper(context, allocator);
-tmpBatchesList = new ArrayList<>();
-if (numPartitions > 1) {
-  allocateNewCurrentBatchAndHV();
+} catch (OutOfMemoryException oom) {
+  close();

Review Comment: This call is _probably_ fine. However, the design is that if any operator fails, the entire operator stack is closed. So, `close()` should be called by the fragment executor. There is probably no harm in calling `close()` here, as long as the `close()` method is safe to call twice. If the fragment executor _does not_ call close when the failure occurs during setup, then there is a bug, since failing to call `close()` results in just this kind of error.
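A minimal sketch of the double-close safety the second comment asks for, using a simple guard flag; the field and helper names are illustrative, not the actual HashPartition members:

```java
// Hypothetical idempotent close(): the guard makes a second call a no-op, so
// both the constructor's error path and the fragment executor may safely call it.
private boolean closed = false;

@Override
public void close() {
  if (closed) {
    return;
  }
  closed = true;
  releaseBuffersAndSpillFiles();  // illustrative helper freeing whatever was allocated
}
```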
Re: [PR] DRILL-8474: Add Daffodil Format Plugin (drill)
cgivre commented on PR #2836: URL: https://github.com/apache/drill/pull/2836#issuecomment-1906689793

> > > @cgivre @paul-rogers is there an example of a Drill UDF that is not part of the drill repository tree?
> > > I'd like to understand the mechanisms for distributing any jar files and dependencies of the UDF that drill uses. I can't find any such in the quasi-UDFs that are in the Drill tree, because, well, since they are part of Drill, and so are their dependencies, this problem doesn't exist.
> >
> > @mbeckerle Here's an example: https://github.com/datadistillr/drill-humanname-functions. I'm sorry we weren't able to connect last week.
>
> If I understand this correctly, if a jar is on the classpath and has drill-module.conf in its root dir, then drill will find it and read that HOCON file to get the package to add to drill.classpath.scanning.packages.

I believe that is correct.

> Drill then appears to scan jars for class files for those packages. Not sure what it is doing with the class files. I imagine it is repackaging them somehow so Drill can use them on the drill distributed nodes. But it isn't yet clear to me how this aspect works. Do these classes just get loaded on the distributed drill nodes? Or is the classpath augmented in some way on the drill nodes so that they see a jar that contains all these classes?
>
> I have two questions:
>
> (1) what about dependencies? The UDF may depend on libraries which depend on other libraries, etc.

UDFs are a bit of a special case, but if they do have dependencies, you also have to include those JAR files in the UDF directory, or in Drill's 3rd party JAR folder. I'm not that good with Maven, but I've often wondered about making a so-called fat JAR which includes the dependencies as part of the UDF JAR file.

> (2) what about non-class files, e.g., things under src/main/resources of the project that go into the jar, but aren't "class" files? How do those things also get moved? How would code running in the drill node access these? The usual method is to call getResource() with a path that gives the location of the resource within a jar file.

Take a look at this UDF: https://github.com/datadistillr/drill-geoip-functions. It has a few external resources, including a CSV file and the MaxMind databases.

> Thanks for any info.
Re: [PR] DRILL-8474: Add Daffodil Format Plugin (drill)
mbeckerle commented on PR #2836: URL: https://github.com/apache/drill/pull/2836#issuecomment-1906561549

> > @cgivre @paul-rogers is there an example of a Drill UDF that is not part of the drill repository tree?
> > I'd like to understand the mechanisms for distributing any jar files and dependencies of the UDF that drill uses. I can't find any such in the quasi-UDFs that are in the Drill tree, because, well, since they are part of Drill, and so are their dependencies, this problem doesn't exist.
>
> @mbeckerle Here's an example: https://github.com/datadistillr/drill-humanname-functions. I'm sorry we weren't able to connect last week.

If I understand this correctly, if a jar is on the classpath and has drill-module.conf in its root dir, then Drill will find it and read that HOCON file to get the package to add to drill.classpath.scanning.packages.

Drill then appears to scan jars for class files for those packages. I'm not sure what it is doing with the class files. I imagine it is repackaging them somehow so Drill can use them on the distributed Drill nodes, but it isn't yet clear to me how this aspect works. Do these classes just get loaded on the distributed Drill nodes? Or is the classpath augmented in some way on the Drill nodes so that they see a jar that contains all these classes?

I have two questions:

(1) What about dependencies? The UDF may depend on libraries which depend on other libraries, etc.

(2) What about non-class files, e.g., things under src/main/resources of the project that go into the jar but aren't class files? How do those things also get moved? How would code running on the Drill node access these? The usual method is to call getResource() with a path that gives the location of the resource within a jar file.

Thanks for any info.
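On question (2), a minimal sketch of the usual pattern for reading a bundled resource. The class name and resource path are hypothetical, and this is plain JVM classpath behavior rather than a Drill-specific mechanism:

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical: read a file packaged under src/main/resources of a UDF jar.
// getResourceAsStream(String) resolves the path inside whichever jar the class
// loader sees on that node, so no separate file-distribution step is needed.
public static InputStream openSchema() throws IOException {
  InputStream in = MyDaffodilUdf.class.getClassLoader()
      .getResourceAsStream("schemas/example.dfdl.xsd");
  if (in == null) {
    throw new FileNotFoundException("schemas/example.dfdl.xsd not on classpath");
  }
  return in;
}
```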
Re: Parquet files with microsecond columns
A quick test with authentication enabled, each statement issued separately in the web UI.

set `window.enable` = false; -- disable window functions

select row_number() over (order by 1);

org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: Window functions are disabled See Apache Drill JIRA: DRILL-2559

On 2024/01/23 17:56, James Turton wrote: My experience has been that if you've switched on authentication in Drill then the web UI /does/ sustain a session. If not then it doesn't.

On 2024/01/23 09:48, Peter Franzen wrote: Hi Paul, Thanks for your comments. I wasn’t aware that the Web UI doesn’t have sessions; when setting the option at the system level the Web UI behaves as expected. I’ll go ahead and create a pull request within the next few days. /Peter

On 22 Jan 2024, at 21:40, Paul Rogers wrote: [...]
Re: Parquet files with microsecond columns
My experience has been that if you've switched on authentication in Drill then the web UI /does/ sustain a session. If not then it doesn't.

On 2024/01/23 09:48, Peter Franzen wrote: Hi Paul, Thanks for your comments. I wasn’t aware that the Web UI doesn’t have sessions; when setting the option at the system level the Web UI behaves as expected. I’ll go ahead and create a pull request within the next few days. /Peter

On 22 Jan 2024, at 21:40, Paul Rogers wrote: [...]
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
jnturton merged PR #2875: URL: https://github.com/apache/drill/pull/2875
Re: Parquet files with microsecond columns
Hi Paul,

Thanks for your comments. I wasn’t aware that the Web UI doesn’t have sessions; when setting the option at the system level the Web UI behaves as expected.

I’ll go ahead and create a pull request within the next few days.

/Peter

> On 22 Jan 2024, at 21:40, Paul Rogers wrote:
>
> Hi Peter,
>
> It sounds like you are on the right track: the new option is the quick short-term solution. The best long-term solution is to generalize Drill's date/time type, but that would take much more work. (Drill also has a bug where the treatment of timezones is incorrect, which forces Drill to run in the UTC time zone -- something that will also require difficult work.)
>
> Given that JDBC works, the problem must be in the web interface, not in your Parquet implementation. You've solved the problem with a new session option. The web interface, however, has no sessions: if you set an option in one call, and do your query in another, Drill will have "forgotten" your option. Instead, there is a way to attach options to each query. Are you using that feature?
>
> As I recall, the JSON message to submit a query has an additional field to hold session options. I do not recall, however, if the web UI added that feature. Does anyone else know? Two workarounds. First, use your favorite JSON request tool to submit a query with the option set. Second, set your option as a system option so it is available to all sessions: ALTER SYSTEM SET...
>
> Thanks,
>
> - Paul
>
> On Mon, Jan 22, 2024 at 1:38 AM Peter Franzen wrote:
>
>> Hi,
>>
>> I am using Drill to query Parquet files that have fields of type timestamp_micros. By default, Drill truncates those microsecond values to milliseconds when reading the Parquet files in order to convert them to SQL timestamps.
>>
>> In some of my use cases I need to read the original microsecond values (as 64-bit values, not SQL timestamps) through Drill, but this doesn’t seem to be possible (unless I’ve missed something).
>>
>> I have explored a possible solution to this, and would like to run it by some developers more experienced with the Drill code base before I create a pull request.
>>
>> My idea is to add two options similar to “store.parquet.reader.int96_as_timestamp" to control whether or not microsecond times and timestamps are truncated to milliseconds. These options would be added to “org.apache.drill.exec.ExecConstants" and "org.apache.drill.exec.server.options.SystemOptionManager", and to drill-module.conf:
>>
>> store.parquet.reader.time_micros_as_int64: false,
>> store.parquet.reader.timestamp_micros_as_int64: false,
>>
>> These options would then be used in the same places as “store.parquet.reader.int96_as_timestamp”:
>>
>> org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory
>> org.apache.drill.exec.store.parquet.columnreaders.ParquetToDrillTypeConverter
>> org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter
>>
>> to create an int64 reader instead of a time/timestamp reader when the corresponding option is set to true.
>>
>> In addition to this, “org.apache.drill.exec.store.parquet.metadata.FileMetadataCollector” must be altered to _not_ truncate the min and max values for time_micros/timestamp_micros if the corresponding option is true. This class doesn’t have a reference to an OptionManager, so my guess is that the two new options must be extracted from the OptionManager when the ParquetReaderConfig instance is created.
>>
>> Filtering on microsecond columns would be done using 64-bit values rather than TIME/TIMESTAMP values, e.g.
>>
>> select * from where = 1705914906694751;
>>
>> I’ve tested the solution outlined above, and it seems to work when using sqlline and with the JDBC driver, but not with the web based interface. Any pointers to the relevant code for that would be appreciated.
>>
>> An alternative solution to the above could be to intercept all reading of the Parquet schemas and modifying the schema to report the microsecond columns as int64 columns, i.e. to completely discard the information that the columns contain time/timestamp values. This could potentially make parts of the code where it is not obvious that the time/timestamp properties of columns are used behave as expected. However, this variant would not align with how INT96 timestamps are handled.
>>
>> Any thoughts on this idea for how to access microsecond values would be highly appreciated.
>>
>> Thanks,
>>
>> /Peter
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
shfshihuafeng commented on PR #2875: URL: https://github.com/apache/drill/pull/2875#issuecomment-1905599592

> [An unused import crept in](https://github.com/apache/drill/actions/runs/7622586264/job/20762475705#step:6:1277), could you remove it please?

Removed it.
Re: [PR] DRILL-8478. HashPartition memory leak when it fails to allocate memory with OutOfMemoryException (#2874) (drill)
jnturton commented on PR #2875: URL: https://github.com/apache/drill/pull/2875#issuecomment-1905598192

[An unused import crept in](https://github.com/apache/drill/actions/runs/7622586264/job/20762475705#step:6:1277), could you remove it please?