Sporadic `Unable to find class` with anonymous functions in a UDF
Using CDH 5.4.3 (Hive 1.1) via HiveServer. Does anyone have a suggestion about what to do / look for?

the error:

org.apache.hadoop.hive.ql.parse.SemanticException: Generate Map Join Task Error: Unable to find class: com.foursquare.hadoop.hive.udf.IsDefinedUDF$$anonfun$initialize$6
Serialization trace:
isDefinedFunc (com.foursquare.hadoop.hive.udf.IsDefinedUDF)
genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
predicate (org.apache.hadoop.hive.ql.plan.FilterDesc)
conf (org.apache.hadoop.hive.ql.exec.FilterOperator)
opParseCtxMap (org.apache.hadoop.hive.ql.plan.MapWork)
mapWork (org.apache.hadoop.hive.ql.plan.MapredWork)
        at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinTaskDispatcher.processCurrentTask(CommonJoinTaskDispatcher.java:517)
        at org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:180)
        at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
        at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)

the udf:

@Description(name = "isDefined",
  value = "returns true if the object is not null and not empty and not \"\"",
  extended = "Example:\n" + "SELECT isDefined(col)\n")
class IsDefinedUDF extends GenericUDF with Serializable {
  var isDefinedFunc: Option[Object] => Boolean = null

  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
    val arg = arguments.toVector
    if (arg.length != 1) {
      throw new UDFArgumentLengthException("isDefined only takes one argument.")
    }
    Option(arg.head) match {
      case Some(a: ListObjectInspector) => {
        isDefinedFunc = { obj => obj.map(o => !(a.getList(o).asScala.toList.isEmpty)).getOrElse(false) }
      }
      case Some(a: MapObjectInspector) => {
        isDefinedFunc = { obj => obj.map(o => !(a.getMap(o).asScala.toMap.isEmpty)).getOrElse(false) }
      }
      case Some(a: LazyStringObjectInspector) => {
        isDefinedFunc = { obj => a.getPrimitiveJavaObject(obj.getOrElse(new LazyString(a))) != "" }
      }
      case Some(a: StringObjectInspector) => {
        isDefinedFunc = { obj => a.getPrimitiveJavaObject(obj.getOrElse(new Text(""))) != "" }
      }
      case None => {
        isDefinedFunc = { x => false }
      }
      case _ => {
        isDefinedFunc = { obj => obj.isDefined }
      }
    }
    PrimitiveObjectInspectorFactory.javaBooleanObjectInspector
  }

  override def evaluate(arguments: Array[DeferredObject]): Object = {
    val arg = arguments.toVector.head
    isDefinedFunc(Option(arg.get())): java.lang.Boolean
  }

  override def getDisplayString(children: Array[String]) = {
    "isDefined(" + children(0) + ")"
  }
}
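[Editor's sketch] The class Kryo cannot find, `IsDefinedUDF$$anonfun$initialize$6`, is the synthetic class the Scala compiler generates for one of the anonymous functions assigned to `isDefinedFunc`; because that is a plain field, it ends up serialized into the query plan. One possible workaround (a sketch, not verified against CDH 5.4.3 / Hive 1.1): mark the field `@transient`. Kryo's FieldSerializer skips transient fields by default, and Hive calls `initialize()` again on the task side, so the closure is rebuilt after the plan is deserialized.

```scala
// Sketch of a possible fix (untested on CDH 5.4.3):
// keep the anonymous function out of the Kryo-serialized plan.
class IsDefinedUDF extends GenericUDF with Serializable {
  // @transient: Kryo's FieldSerializer skips transient fields, so the
  // $$anonfun$ class never has to be resolved when the plan is
  // deserialized; initialize() repopulates it on each task.
  @transient private var isDefinedFunc: Option[Object] => Boolean = null

  // ... initialize / evaluate / getDisplayString unchanged ...
}
```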
Re: adding jars - hive on spark cdh 5.4.3
It didn't work, assuming I did the right thing. In the properties you can see

{"key":"hive.aux.jars.path","value":"file:///data/loko/foursquare.web-hiverc/current/hadoop-hive-serde.jar,file:///data/loko/foursquare.web-hiverc/current/hadoop-hive-udf.jar","isFinal":false,"resource":"programatically"}

which includes the jar that has the class I need, but I still get

org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.foursquare.hadoop.hive.io.HiveThriftSequenceFileInputFormat

On Fri, Jan 8, 2016 at 12:24 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> You can not 'add jar' input formats and serde's. They need to be part of
> your auxlib.
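[Editor's sketch] Edward's suggestion as concrete steps; the directory and paths below are typical Cloudera-style defaults and an assumption, not taken from the thread. The point is that input formats and SerDes must be on HiveServer2's own classpath at startup rather than added per-session with ADD JAR.

```shell
# Sketch, assuming a CDH-style layout; adjust paths to your installation.
# Input formats / SerDes must be visible to HiveServer2 itself at startup.
sudo mkdir -p /usr/lib/hive/auxlib
sudo cp /data/loko/foursquare.web-hiverc/current/hadoop-hive-serde.jar \
        /data/loko/foursquare.web-hiverc/current/hadoop-hive-udf.jar \
        /usr/lib/hive/auxlib/
# Point Hive at the directory (e.g. in hive-env.sh), then restart
# HiveServer2 so the jars are loaded at startup:
export HIVE_AUX_JARS_PATH=/usr/lib/hive/auxlib
```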
Re: adding jars - hive on spark cdh 5.4.3
Thanks! In certain use cases you could, but I forgot about the aux thing; that's probably it.

On Fri, Jan 8, 2016 at 12:24 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> You can not 'add jar' input formats and serde's. They need to be part of
> your auxlib.
Re: adding jars - hive on spark cdh 5.4.3
I tried now, still getting

16/01/08 16:37:34 ERROR exec.Utilities: Failed to load plan: hdfs://hadoop-alidoro-nn-vip/tmp/hive/hive/c2af9882-38a9-42b0-8d17-3f56708383e8/hive_2016-01-08_16-36-41_370_3307331506800215903-3/-mr-10004/3c90a796-47fc-4541-bbec-b196c40aefab/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.foursquare.hadoop.hive.io.HiveThriftSequenceFileInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.foursquare.hadoop.hive.io.HiveThriftSequenceFileInputFormat

HiveThriftSequenceFileInputFormat is in one of the jars I'm trying to add.

On Thu, Jan 7, 2016 at 9:58 PM, Prem Sure <premsure...@gmail.com> wrote:
> did you try the --jars property in spark submit? if your jar is of huge size,
> you can pre-load the jar on all executors in a common available directory
> to avoid network IO.
adding jars - hive on spark cdh 5.4.3
I'm trying to add jars before running a query using Hive on Spark on CDH 5.4.3.
I've tried applying the patch in https://issues.apache.org/jira/browse/HIVE-12045 (manually, as the patch is against a different Hive version) but still haven't succeeded.

Did anyone manage to do ADD JAR successfully with CDH?

Thanks,
Ophir
last_modified_time and transient_lastDdlTime - what is transient_lastDdlTime for?
I want to know, for each of my tables, the last time it was modified. Some of my tables don't have last_modified_time in the table parameters, but all have transient_lastDdlTime. transient_lastDdlTime seems to be the same as last_modified_time in some of the tables I randomly checked.

What is the time in transient_lastDdlTime? If it is also the modified time, why is there also last_modified_time?

Thanks,
Ophir
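[Editor's note] One detail that helps when comparing the two parameters: transient_lastDdlTime is stored as a unix epoch in seconds, so it can be decoded with GNU date(1). The table name below is a placeholder.

```shell
# transient_lastDdlTime holds a unix epoch in seconds. The DESCRIBE line
# is commented out since it needs a running cluster; the table name is a
# placeholder.
# hive -e 'DESCRIBE FORMATTED mydb.mytable' | grep transient_lastDdlTime
# Decode an epoch value with GNU date, e.g.:
date -u -d @1452274800 '+%Y-%m-%d %H:%M:%S UTC'
```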
hive on spark
During spark-submit when running Hive on Spark I get:

Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.HftpFileSystem could not be instantiated
Caused by: java.lang.IllegalAccessError: tried to access method org.apache.hadoop.fs.DelegationTokenRenewer.<init>(Ljava/lang/Class;)V from class org.apache.hadoop.hdfs.HftpFileSystem

I managed to make Hive on Spark work on a staging cluster I have, and now I'm trying to do the same on a production cluster when this happened. Both are CDH 5.4.3. I read that this is due to something not being compiled against the correct hadoop version.

My main question is: what is the binary/jar/file that can cause this? I tried replacing the binaries and jars with the ones used by the staging cluster (where Hive on Spark worked) and it didn't help.

Thank you for anyone reading this, and thank you for any direction on where to look.

Ophir
Re: Hive on Spark - Error: Child process exited before connecting back
Hi, the versions are Spark 1.3.0 and Hive 1.1.0 as part of Cloudera 5.4.3. I find it weird that it would work only on the version you mentioned, as there is documentation (not good documentation, but still..) on how to do it with Cloudera, which packages different versions. Thanks for the answer though. Why would Spark 1.5.2 specifically not work with Hive?

Ophir

On Tue, Dec 15, 2015 at 5:33 PM, Mich Talebzadeh <m...@peridale.co.uk> wrote:
> Hi,
>
> The only version that I have managed to run Hive using Spark engine is
> Spark 1.3.1 on Hive 1.2.1
>
> Can you confirm the version of Spark you are running?
>
> FYI, Spark 1.5.2 will not work with Hive.
>
> HTH
>
> Mich Talebzadeh
>
> http://talebzadehmich.wordpress.com
Hive on Spark - Error: Child process exited before connecting back
Hi,

when trying to do Hive on Spark on CDH 5.4.3 I get the following error when trying to run a simple query using Spark. I've tried setting everything written here (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) as well as what the CDH docs recommend. Has anyone encountered this as well? (Searching for it didn't help much.)

the error:

ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:57)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:120)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
        at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
        at com.google.common.base.Throwables.propagate(Throwables.java:156)
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:109)
        at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:91)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:65)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
        ... 22 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
        at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:99)
        ... 26 more
Caused by: java.lang.RuntimeException: Cancel client '2b2d7314-e0cc-4933-82a1-992a3299d109'. Error: Child process exited before connecting back
        at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179)
        at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:427)
        ... 1 more
ERROR : Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:57)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114)
        at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:120)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1640)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1399)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1183)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:144)
        at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
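[Editor's sketch] When the child process dies before connecting back, the stderr of the spawned spark-submit usually ends up in the HiveServer2 log, which is a good place to look for the real reason. The helper below is generic; the log path mentioned in the comment is a Cloudera Manager-style default and an assumption.

```shell
# Sketch: pull the context around the failure out of a HiveServer2 log.
# On a CM-managed CDH node the log is typically something like
#   /var/log/hive/hadoop-cmf-hive-HIVESERVER2-<host>.log.out  (an assumption)
show_spark_client_failure() {
  grep -B 2 -A 20 'Child process exited before connecting back' "$1"
}
```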
trying to figure out number of MR jobs from explain output
Hi,

I've been trying to figure out how to know the number of MR jobs that will be run for a Hive query using the EXPLAIN output. I haven't found a consistent method for knowing that. For example (in one of my queries, a CTAS query):

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-8 depends on stages: Stage-0
  Stage-2 depends on stages: Stage-8
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5

Stage-1, Stage-3, Stage-5 are listed as map reduce steps. Eventually 2 MR jobs ran. In other cases only 1 job runs. I couldn't find a consistent rule on how to figure this out. Can anyone help?

Thank you!!

Below is the full output:

explain CREATE TABLE beekeeper_results.test3
ROW FORMAT SERDE "com.foursquare.hadoop.hive.serde.lazycsv.LazySimpleCSVSerde"
WITH SERDEPROPERTIES ('escape.delim'='\\', 'mapkey.delim'='\;', 'colelction.delim'='|')
AS SELECT * FROM beekeeper_results.test2;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-8 depends on stages: Stage-0
  Stage-2 depends on stages: Stage-8
  Stage-3
  Stage-5
  Stage-6 depends on stages: Stage-5

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test2
            Statistics: Num rows: 112 Data size: 11690 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: blasttag (type: string), actioncounts (type: array>), detailedclicks (type: array >), countsbyclient (type: array >), totalactioncounts (type: array >), actionsbydate (type: array >)
              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
              Statistics: Num rows: 112 Data size: 11690 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 112 Data size: 11690 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: com.foursquare.hadoop.hive.serde.lazycsv.LazySimpleCSVSerde
                    name: beekeeper_results.test3

  Stage: Stage-7
    Conditional Operator

  Stage: Stage-4
    Move Operator
      files:
          hdfs directory: true
          destination: hdfs://hadoop-alidoro-nn-vip/user/hive/warehouse/.hive-staging_hive_2015-12-11_21-52-35_063_8498858370292854265-1/-ext-10001

  Stage: Stage-0
    Move Operator
      files:
          hdfs directory: true
          destination: ***

  Stage: Stage-8
      Create Table Operator:
        Create Table
          columns: blasttag string, actioncounts array >, detailedclicks array >, countsbyclient array >, totalactioncounts array >, actionsbydate array >
          input format: org.apache.hadoop.mapred.TextInputFormat
          output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
          serde name: com.foursquare.hadoop.hive.serde.lazycsv.LazySimpleCSVSerde
          serde properties:
            colelction.delim |
            escape.delim \
            mapkey.delim ;
          name: beekeeper_results.test3

  Stage: Stage-2
    Stats-Aggr Operator

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: com.foursquare.hadoop.hive.serde.lazycsv.LazySimpleCSVSerde
                  name: beekeeper_results.test3

  Stage: Stage-5
    Map Reduce
      Map Operator Tree:
          TableScan
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: com.foursquare.hadoop.hive.serde.lazycsv.LazySimpleCSVSerde
                  name: beekeeper_results.test3

  Stage: Stage-6
    Move Operator
      files:
          hdfs directory: true
          destination: ***
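[Editor's sketch] A rough helper, not a complete answer: it lists the stages that EXPLAIN marks as Map Reduce, assuming the standard EXPLAIN indentation (stage headers like "  Stage: Stage-1" with the plan type nested under them). Since Stage-7 is a Conditional Operator that picks among Stage-4 / Stage-3 / Stage-5 at run time, the count it gives is only an upper bound on the MR jobs that actually run, which is consistent with seeing 2 jobs for 3 listed MR stages.

```shell
# Sketch: list stages whose plan is a Map Reduce job in saved EXPLAIN
# output. Assumes the usual layout ("  Stage: Stage-N" headers, with
# "    Map Reduce" indented beneath). Conditional stages mean only a
# subset of the printed stages is actually submitted.
list_mr_stages() {
  awk '/^  Stage: /{stage=$2} /^    Map Reduce$/{print stage}' "$1"
}
```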