[jira] [Updated] (DRILL-4278) Memory leak when using LIMIT

2016-01-16 Thread jean-claude (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jean-claude updated DRILL-4278:
---
Description: 
Copy the parquet files in the sample-data directory so that you have 12 or so:
$ ls -lha /apache-drill-1.4.0/sample-data/nationsMF/
nationsMF1.parquet
nationsMF2.parquet
nationsMF3.parquet

Create a file with a few thousand lines like this:
select * from dfs.`/Users/jccote/apache-drill-1.4.0/sample-data/nationsMF` 
limit 500;
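The query file can be generated with a small script (a sketch; the output filename matches the one used below, the dfs path is the reporter's example, and 2000 is an arbitrary "few thousand"):

```shell
# Emit a few thousand copies of the same LIMIT query into the test file.
# Adjust the dfs path to your own install.
QUERY='select * from dfs.`/Users/jccote/apache-drill-1.4.0/sample-data/nationsMF` limit 500;'
for i in $(seq 1 2000); do
  echo "$QUERY"
done > test-memory-leak-using-limit.sql
wc -l test-memory-leak-using-limit.sql
```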

Start Drill in embedded mode:
$ /apache-drill-1.4.0/bin/drill-embedded

Reduce the slice target size to force Drill to use multiple fragments/threads:
jdbc:drill:zk=local> alter system set `planner.slice_target` = 10;

Now run the list of queries from the file you created above:
jdbc:drill:zk=local> !run /Users/jccote/test-memory-leak-using-limit.sql

The Java heap usage keeps going up until the old generation is at 100%, and 
eventually Drill fails with an OutOfMemoryError:

$ jstat -gccause 86850 5s
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT     LGCC                   GCC
  0.00   0.00 100.00 100.00  98.56  96.71   2279   26.682   240  458.139  484.821  GCLocker Initiated GC  Ergonomics
  0.00   0.00 100.00  99.99  98.56  96.71   2279   26.682   242  461.347  488.028  Allocation Failure     Ergonomics
  0.00   0.00 100.00  99.99  98.56  96.71   2279   26.682   245  466.630  493.311  Allocation Failure     Ergonomics
  0.00   0.00 100.00  99.99  98.56  96.71   2279   26.682   247  470.020  496.702  Allocation Failure     Ergonomics
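A back-of-the-envelope check on those samples (a sketch; the 5 s interval comes from the jstat command above, so the last three rows span roughly 15 s of wall clock): full-GC time (FGCT) climbs from 458.139 s to 470.020 s, i.e. the JVM is spending close to 80% of its time in full GC.

```shell
# Fraction of wall-clock time spent in full GC across the sampled window:
# (FGCT_last - FGCT_first) / elapsed seconds, using the jstat rows above.
awk 'BEGIN { printf "%.2f\n", (470.020 - 458.139) / 15.0 }'
```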


If you run the same test without the LIMIT clause, memory usage does not go up.

If you add a WHERE clause so that no results are returned, memory usage does 
not go up either.

Could it be something in the RPC layer?

It also seems sensitive to the number of fragments/threads: if you limit the 
query to a single fragment/thread, memory usage grows much more slowly.
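One way to pin the run to a single fragment per node is to lower the width option that appears in the sys.options output below (a sketch; option name as listed in Drill 1.4):

```sql
-- Cap minor fragments at one per node before re-running the query file
alter system set `planner.width.max_per_node` = 1;
```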

I have tried both parquet and CSV files; the behaviour is the same in either 
case.





> Memory leak when using LIMIT
> 
>
> Key: DRILL-4278
> URL: https://issues.apache.org/jira/browse/DRILL-4278
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.4.0
> Environment: OS X
>Reporter: jean-claude
>

[jira] [Updated] (DRILL-4278) Memory leak when using LIMIT

2016-01-16 Thread jean-claude (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jean-claude updated DRILL-4278:
---
Environment: 
OS X

0: jdbc:drill:zk=local> select * from sys.version;
+----------+-------------------------------------------+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+
| version  | commit_id                                 | commit_message                                      | commit_time                | build_email                | build_time                 |
+----------+-------------------------------------------+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+
| 1.4.0    | 32b871b24c7b69f59a1d2e70f444eed6e599e825  | [maven-release-plugin] prepare release drill-1.4.0  | 08.12.2015 @ 00:24:59 PST  | venki.koruka...@gmail.com  | 08.12.2015 @ 01:14:39 PST  |
+----------+-------------------------------------------+-----------------------------------------------------+----------------------------+----------------------------+----------------------------+

0: jdbc:drill:zk=local> select * from sys.options where status <> 'DEFAULT';
+-----------------------------+-------+---------+----------+----------+-------------+-----------+------------+
| name                        | kind  | type    | status   | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------+-------+---------+----------+----------+-------------+-----------+------------+
| planner.slice_target        | LONG  | SYSTEM  | CHANGED  | 10       | null        | null      | null       |
| planner.width.max_per_node  | LONG  | SYSTEM  | CHANGED  | 5        | null        | null      | null       |
+-----------------------------+-------+---------+----------+----------+-------------+-----------+------------+
2 rows selected (0.16 seconds)





