On Oct 5, 2010, at 4:38 AM, Terje Marthinussen wrote:

> Just tested analyze table with a trunk build (from yesterday, oct 4th).
> 
> tried various variations (with or without partitions) of it, but regardless
> of what I try, I either get:
> --
> analyze table normalized compute statistics;
> 
> FAILED: Error in semantic analysis: Table is partitioned and partition
> specification is needed
> --
> Fair enough if it is not supported, but specifying no partitions seems to be
> supported according to the docs at
> http://wiki.apache.org/hadoop/Hive/StatsDev ?
> 
Sorry, the design spec was outdated. I've updated the wiki to reflect the
syntax change.
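
For example, since the table is partitioned, a partition spec is now required,
so against your table the statement would look something like this (the intdate
value here is made up):

  analyze table normalized partition (intdate='20101004') compute statistics;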

> --
> analyze table normalized partition(intdate) compute statistics;
> FAILED: Error in semantic analysis: line 1:36 Dynamic partition cannot be the parent of a static partition intdate
> --
If you have multiple partition columns, their order matters since it reflects
the hierarchical DFS directory structure: you have to specify the parent
partition first and then the sub-partitions, and the partition spec has to map
to *one* HDFS directory. So partition (parent='val', subpart) is allowed, but
partition (parent, subpart='val') is not, nor is leaving the parent out of the
spec entirely.
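
Concretely, for a hypothetical table t partitioned by (parent, subpart):

  -- allowed: static parent, dynamic sub-partition
  analyze table t partition (parent='val', subpart) compute statistics;
  -- rejected: dynamic parent above a static sub-partition
  analyze table t partition (parent, subpart='val') compute statistics;
  -- rejected: parent missing from the spec
  analyze table t partition (subpart='val') compute statistics;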

> OK, I think I understand this (or maybe not :)). It may be good to add some
> notes about it on the wiki, though.
> 
This may be a bug when the partition spec includes non-partition columns; I'll
verify and file a JIRA for it. In the partition spec you can only include
partition columns, in the order they appear in the CREATE TABLE DDL.
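
To illustrate with a hypothetical DDL (the column and partition names below are
made up):

  create table t (line string)
  partitioned by (intdate string, country string);

  -- valid: partition columns only, in the order they were declared
  analyze table t partition (intdate='20101004', country) compute statistics;
  -- invalid: 'line' is a regular column, not a partition column
  analyze table t partition (intdate='20101004', line='x') compute statistics;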

> Then the OOM:
> analyze table normalized
> partition(intdate,country,logtype,service,hostname,filedate,filedate_ext)
> compute statistics;
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>    at java.util.zip.InflaterInputStream.<init>(InflaterInputStream.java:71)
>    at java.util.zip.ZipFile$1.<init>(ZipFile.java:212)
>    at java.util.zip.ZipFile.getInputStream(ZipFile.java:212)
>    at java.util.zip.ZipFile.getInputStream(ZipFile.java:180)
>    at java.util.jar.JarFile.getManifestFromReference(JarFile.java:167)
>    at java.util.jar.JarFile.getManifest(JarFile.java:148)
>    at sun.misc.URLClassPath$JarLoader$2.getManifest(URLClassPath.java:696)
>    at java.net.URLClassLoader.defineClass(URLClassLoader.java:228)
>    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>    at java.security.AccessController.doPrivileged(Native Method)
>    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>    at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:262)
>    at org.datanucleus.jdo.state.JDOStateManagerImpl.isLoaded(JDOStateManagerImpl.java:2020)
>    at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.jdoGetsortCols(MStorageDescriptor.java)
>    at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.getSortCols(MStorageDescriptor.java:206)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:759)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:859)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToParts(ObjectStore.java:896)
>    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:886)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$21.run(HiveMetaStore.java:1333)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$21.run(HiveMetaStore.java:1330)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:234)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:1330)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_ps(HiveMetaStore.java:1760)
>    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:515)
>    at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1267)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.setupStats(SemanticAnalyzer.java:5793)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:5603)
> 
> 
> The actual stack is different for each execution of analyze.
> 
> Another version:
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>    at java.util.Arrays.copyOf(Arrays.java:2882)
>    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:597)
>    at java.lang.StringBuilder.append(StringBuilder.java:212)
>    at org.datanucleus.JDOClassLoaderResolver.newCacheKey(JDOClassLoaderResolver.java:382)
>    at org.datanucleus.JDOClassLoaderResolver.classForName(JDOClassLoaderResolver.java:173)
>    at org.datanucleus.JDOClassLoaderResolver.classForName(JDOClassLoaderResolver.java:412)
>    at org.datanucleus.store.mapped.mapping.EmbeddedMapping.getJavaType(EmbeddedMapping.java:574)
>    at org.datanucleus.store.mapped.mapping.EmbeddedMapping.getObject(EmbeddedMapping.java:455)
>    at org.datanucleus.store.mapped.scostore.ListStoreIterator.<init>(ListStoreIterator.java:94)
>    at org.datanucleus.store.rdbms.scostore.RDBMSListStoreIterator.<init>(RDBMSListStoreIterator.java:41)
>    at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStore.listIterator(RDBMSJoinListStore.java:158)
>    at org.datanucleus.store.mapped.scostore.AbstractListStore.listIterator(AbstractListStore.java:84)
>    at org.datanucleus.store.mapped.scostore.AbstractListStore.iterator(AbstractListStore.java:74)
>    at org.datanucleus.store.types.sco.backed.List.loadFromStore(List.java:241)
>    at org.datanucleus.store.types.sco.backed.List.iterator(List.java:494)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToFieldSchemas(ObjectStore.java:706)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:759)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToPart(ObjectStore.java:859)
>    at org.apache.hadoop.hive.metastore.ObjectStore.convertToParts(ObjectStore.java:896)
>    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:886)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$21.run(HiveMetaStore.java:1333)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$21.run(HiveMetaStore.java:1330)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:234)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:1330)
>    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_ps(HiveMetaStore.java:1760)
>    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:515)
>    at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1267)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.setupStats(SemanticAnalyzer.java:5793)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:5603)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5834)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6432)
> 
> 
> It makes no difference whether I limit this to a single partition or try any
> other variation of the partition specification.
> 
> It is a SequenceFile-based table with both dynamic and static partitions, as
> well as compression.
> 
> Best regards,
> Terje
