Resolved. The fix is the flag shown in bold (between asterisks) below. The OutOfMemoryError was the driver's PermGen filling up while HiveContext loaded Hive's classes, not the heap, which is why raising --driver-memory alone never helped. (The repro also mixes table names, src_spark vs. src; a consistent version is in the P.S. at the bottom of this mail.)

./bin/spark-submit -v --master yarn-cluster \
  --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar \
  --num-executors 1 --driver-memory 4g \
  *--driver-java-options "-XX:MaxPermSize=2G"* \
  --executor-memory 2g --executor-cores 1 \
  --queue hdmi-express \
  --class com.ebay.ep.poc.spark.reporting.SparkApp \
  spark_reporting-1.0-SNAPSHOT.jar \
  startDate=2015-02-16 endDate=2015-02-16 \
  input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro \
  subcommand=successevents2 \
  output=/user/dvasthimal/epdatasets/successdetail2
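FWIW, you don't have to repeat the flag on every submit: the same setting can live in conf/spark-defaults.conf. A minimal sketch, assuming the stock Spark 1.3 property (spark.driver.extraJavaOptions is what --driver-java-options maps to):

    # conf/spark-defaults.conf
    # Equivalent of --driver-java-options "-XX:MaxPermSize=2G".
    # Raises the driver JVM's permanent generation; only relevant on
    # Java 7 and earlier (PermGen is gone in Java 8).
    spark.driver.extraJavaOptions  -XX:MaxPermSize=2G

In yarn-cluster mode the driver runs inside the YARN application master, so the option has to be in place before the driver JVM launches (on the command line or in spark-defaults.conf); setting it on the SparkConf from application code is too late.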
On Thu, Mar 26, 2015 at 7:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> Can someone please respond to this?
>
> On Wed, Mar 25, 2015 at 11:18 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>
>> I modified the Hive query from the programming guide
>> (http://spark.apache.org/docs/1.3.0/sql-programming-guide.html#hive-tables)
>> but ran into the same error:
>>
>> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> sqlContext.sql("CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)")
>> sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>
>> // Queries are expressed in HiveQL
>> sqlContext.sql("FROM src SELECT key, value").collect().foreach(println)
>>
>> Command
>>
>> ./bin/spark-submit -v --master yarn-cluster \
>>   --jars /home/dvasthimal/spark1.3/spark-avro_2.10-1.0.0.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar,/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar \
>>   --num-executors 3 --driver-memory 8g --executor-memory 2g --executor-cores 1 \
>>   --queue hdmi-express --class com.ebay.ep.poc.spark.reporting.SparkApp \
>>   spark_reporting-1.0-SNAPSHOT.jar \
>>   startDate=2015-02-16 endDate=2015-02-16 \
>>   input=/user/dvasthimal/epdatasets/successdetail1/part-r-00000.avro \
>>   subcommand=successevents2 output=/user/dvasthimal/epdatasets/successdetail2
>>
>> Input
>>
>> -sh-4.1$ ls -l examples/src/main/resources/kv1.txt
>> -rw-r--r-- 1 dvasthimal gid-dvasthimal 5812 Mar 5 17:31 examples/src/main/resources/kv1.txt
>> -sh-4.1$ head examples/src/main/resources/kv1.txt
>> 238val_238
>> 86val_86
>> 311val_311
>> 27val_27
>> 165val_165
>> 409val_409
>> 255val_255
>> 278val_278
>> 98val_98
>> 484val_484
>> -sh-4.1$
>>
>> Log
>>
>> /apache/hadoop/bin/yarn logs -applicationId application_1426715280024_82757
>>
>> …
>> …
>> …
>>
>> 15/03/25 07:52:44 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
>> 15/03/25 07:52:44 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
>> 15/03/25 07:52:47 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)
>> 15/03/25 07:52:47 INFO parse.ParseDriver: Parse Completed
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)
>> 15/03/25 07:52:48 INFO parse.ParseDriver: Parse Completed
>> 15/03/25 07:52:48 INFO log.PerfLogger: </PERFLOG method=parse start=1427295168392 end=1427295168393 duration=1 from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
>> 15/03/25 07:52:48 INFO parse.SemanticAnalyzer: Creating table src_spark position=27
>> 15/03/25 07:52:48 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=src_spark
>> 15/03/25 07:52:48 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=src_spark
>> 15/03/25 07:52:48 INFO metastore.HiveMetaStore: 0: get_database: default
>> 15/03/25 07:52:48 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_database: default
>> 15/03/25 07:52:48 INFO ql.Driver: Semantic Analysis Completed
>> 15/03/25 07:52:48 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1427295168393 end=1427295168595 duration=202 from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
>> 15/03/25 07:52:48 INFO log.PerfLogger: </PERFLOG method=compile start=1427295168352 end=1427295168607 duration=255 from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO ql.Driver: Starting command: CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)
>> 15/03/25 07:52:48 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1427295168349 end=1427295168625 duration=276 from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:48 INFO log.PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
>> 15/03/25 07:52:51 INFO exec.DDLTask: Default to LazySimpleSerDe for table src_spark
>> 15/03/25 07:52:52 INFO metastore.HiveMetaStore: 0: create_table: Table(tableName:src_spark, dbName:default, owner:dvasthimal, createTime:1427295171, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:key, type:int, comment:null), FieldSchema(name:value, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
>> 15/03/25 07:52:52 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=create_table: Table(tableName:src_spark, dbName:default, owner:dvasthimal, createTime:1427295171, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:key, type:int, comment:null), FieldSchema(name:value, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
>> 15/03/25 07:52:55 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1427295168607 end=1427295175852 duration=7245 from=org.apache.hadoop.hive.ql.Driver>
>> Exception in thread "Driver"
>> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Driver"
>>
>> LogType: stdout
>> LogLength: 0
>> Log Contents:
>>
>> hive> describe src_spark;
>> FAILED: SemanticException [Error 10001]: Table not found src_spark
>> hive>
>>
>> Any suggestions? I tried with 10g, 12g and 14g (max) for driver memory, but I still see the OutOfMemory exception.
>>
>> Please suggest.
>>
>> --
>> Deepak
>
> --
> Deepak

--
Deepak
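P.S. One more thing worth fixing in the repro: the CREATE uses src_spark while the LOAD and SELECT use src. A consistent, self-contained version of the guide's example (a sketch against the Spark 1.3 API, reusing the table name and path from your mail; the object name is just for illustration):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HiveTableExample {
      def main(args: Array[String]): Unit = {
        // In yarn-cluster mode spark-submit supplies the master and most
        // settings; the conf here is just enough to run the sketch.
        val sc = new SparkContext(new SparkConf().setAppName("HiveTableExample"))
        val sqlContext = new HiveContext(sc)

        // One table name throughout (the original mail created src_spark
        // but loaded into and queried src).
        sqlContext.sql("CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)")
        sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' " +
          "INTO TABLE src_spark")

        // Queries are expressed in HiveQL.
        sqlContext.sql("FROM src_spark SELECT key, value").collect().foreach(println)

        sc.stop()
      }
    }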