There is a new patch that optimizes partition pruning, improving both CPU and memory usage. I don't think it is in 0.7 yet. Can you try trunk and see how much memory you need?

BTW, 72k partitions is indeed quite a large number. In my experiments with the new patch, about 300MB of heap was needed for 20k partitions.
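As a rough sanity check (assuming memory scales roughly linearly with partition count, which is only an approximation): 300MB / 20,000 is about 15KB per partition, so ~72,000 partitions would need on the order of 1GB of heap for the partition metadata alone.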
On May 17, 2011, at 5:45 AM, Terje Marthinussen wrote:

> Hi,
>
> I was running on a 0.7 trunk build from February 2011 until last Friday and
> upgraded to trunk again then.
>
> Things work OK, except that memory usage when doing queries with a large
> number of partitions is up quite dramatically.
> I could previously query 12 months of data in one table with ~72k partitions
> (4 buckets each) without trouble with a heap of 768MB. With the new 0.7 build
> I have tried a heap of up to 4GB, but it still fails.
>
> I don't see a lot of other people complaining, so I wonder if I just have
> unusually many partitions, if there is a bug here, or if I have done
> something stupid in the upgrade.
>
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>         at java.util.ArrayList.<init>(ArrayList.java:112)
>         at java.util.ArrayList.<init>(ArrayList.java:119)
>         at org.datanucleus.store.rdbms.sql.SQLText.append(SQLText.java:158)
>         at org.datanucleus.store.rdbms.sql.SQLStatement.getSelectStatement(SQLStatement.java:1282)
>         at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStore.listIterator(RDBMSJoinListStore.java:116)
>         at org.datanucleus.store.mapped.scostore.AbstractListStore.listIterator(AbstractListStore.java:84)
>         at org.datanucleus.store.mapped.scostore.AbstractListStore.iterator(AbstractListStore.java:74)
>         at org.datanucleus.sco.backed.List.loadFromStore(List.java:241)
>         at org.datanucleus.sco.backed.List.iterator(List.java:507)
>         at org.apache.hadoop.hive.metastore.ObjectStore.convertToFieldSchemas(ObjectStore.java:866)
>         at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:920)
>         at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:929)
>         at org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:1063)
>         at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsWithAuth(ObjectStore.java:1138)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$25.run(HiveMetaStore.java:1595)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$25.run(HiveMetaStore.java:1592)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:319)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_with_auth(HiveMetaStore.java:1592)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:593)
>         at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1456)
>         at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:207)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:6583)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7113)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:376)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:334)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:843)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:218)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:302)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:531)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>
> Regards,
> Terje