That works when you run it in local mode. ^^ For yarn-cluster mode, two rough sketches follow below (before the quoted thread): loading kv1.txt from HDFS instead of a local path, and bumping the driver PermGen for the OOM further down.

Thanks
Best Regards
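On the path problem: as suggested further down the thread, one way around LOAD DATA LOCAL INPATH not finding the file on the cluster is to copy kv1.txt to HDFS and load it with a non-LOCAL INPATH. This is only an untested sketch against Spark 1.3's HiveContext; the HDFS destination /user/dvasthimal/kv1.txt is a made-up placeholder, and the table mirrors the src_spark definition from the thread:

// Sketch only (not tested): assumes kv1.txt was first copied to HDFS, e.g.
//   hdfs dfs -put examples/src/main/resources/kv1.txt /user/dvasthimal/kv1.txt
// The HDFS destination path above is a placeholder.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object LoadKv1FromHdfs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LoadKv1FromHdfs"))
    val hiveContext = new HiveContext(sc)

    // Same table definition as src_spark in the thread; this mirrors what the
    // bundled HiveFromSpark example does for this file.
    hiveContext.sql("CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)")

    // A non-LOCAL INPATH is resolved against HDFS, so it does not matter which
    // node the driver or executors run on. Note that LOAD DATA INPATH moves
    // (rather than copies) the source file into the table's warehouse directory.
    hiveContext.sql("LOAD DATA INPATH '/user/dvasthimal/kv1.txt' INTO TABLE src_spark")

    sc.stop()
  }
}

This would also sidestep the second error quoted below: in yarn-cluster mode the driver runs inside a YARN container, so a relative path like examples/src/main/resources/kv1.txt is resolved against the container's working directory, not against the directory you launched spark-submit from.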
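For the PermGen OutOfMemoryError in the original yarn-cluster report at the bottom of the thread, Zhan's -XX:MaxPermSize=512m suggestion goes into $SPARK_HOME/conf/spark-defaults.conf; I believe it can also be passed per job via spark-submit. A minimal sketch (the 512m value is simply the one quoted below):

# $SPARK_HOME/conf/spark-defaults.conf
# In yarn-cluster mode this is applied to the ApplicationMaster JVM, where the driver runs.
spark.driver.extraJavaOptions  -XX:MaxPermSize=512m

Or, per submission, without editing the conf file:

spark-submit --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=512m" --class <your.main.Class> <your-app.jar>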
On Thu, Mar 26, 2015 at 2:06 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

> I don't think that's correct. LOAD DATA LOCAL should pick the input from the local directory.
>
> On Thu, Mar 26, 2015 at 1:59 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> Not sure, but you can create that path on all workers and put that file in it.
>>
>> Thanks
>> Best Regards
>>
>> On Thu, Mar 26, 2015 at 1:56 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>
>>> The Hive command:
>>>
>>> LOAD DATA LOCAL INPATH
>>> '/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'
>>> INTO TABLE src_spark
>>>
>>> 1. LOCAL INPATH: if I push the file to HDFS, how will it work?
>>>
>>> 2. I can't use sc.addFile, because I want to run Hive (Spark SQL) queries.
>>>
>>> On Thu, Mar 26, 2015 at 1:41 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>
>>>> Now it's clear that the workers do not have the file kv1.txt on their local
>>>> filesystem. You can try putting it in HDFS and using the URI to that file,
>>>> or try adding the file with sc.addFile.
>>>>
>>>> Thanks
>>>> Best Regards
>>>>
>>>> On Thu, Mar 26, 2015 at 1:38 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>
>>>>> Does not work:
>>>>>
>>>>> 15/03/26 01:07:05 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=src_spark
>>>>> 15/03/26 01:07:06 ERROR ql.Driver: FAILED: SemanticException Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>>>>> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>>>>>         at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:142)
>>>>>         at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:233)
>>>>>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>>>>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
>>>>>
>>>>> Does the input file need to be passed to the executors via --jars?
>>>>>
>>>>> On Thu, Mar 26, 2015 at 12:15 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>>>
>>>>>> Try to give the complete path to the file kv1.txt.
>>>>>>
>>>>>> On 26 Mar 2015 11:48, "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepuj...@gmail.com> wrote:
>>>>>>
>>>>>>> I am now seeing this error:
>>>>>>>
>>>>>>> 15/03/25 19:44:03 ERROR yarn.ApplicationMaster: User class threw exception: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>>>> org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>>>>         at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:312)
>>>>>>>         at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:280)
>>>>>>>
>>>>>>> -sh-4.1$ pwd
>>>>>>> /home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4
>>>>>>> -sh-4.1$ ls examples/src/main/resources/kv1.txt
>>>>>>> examples/src/main/resources/kv1.txt
>>>>>>> -sh-4.1$
>>>>>>>
>>>>>>> On Thu, Mar 26, 2015 at 8:08 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>>
>>>>>>>> You can do it in $SPARK_HOME/conf/spark-defaults.conf:
>>>>>>>>
>>>>>>>> spark.driver.extraJavaOptions -XX:MaxPermSize=512m
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Zhan Zhang
>>>>>>>>
>>>>>>>> On Mar 25, 2015, at 7:25 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Where and how do I pass this or other JVM arguments?
>>>>>>>> -XX:MaxPermSize=512m
>>>>>>>>
>>>>>>>> On Wed, Mar 25, 2015 at 11:36 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>>>
>>>>>>>>> I solved this by increasing the PermGen memory size in the driver:
>>>>>>>>>
>>>>>>>>> -XX:MaxPermSize=512m
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Zhan Zhang
>>>>>>>>>
>>>>>>>>> On Mar 25, 2015, at 10:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I am facing the same issue and have posted a new thread. Please respond.
>>>>>>>>>
>>>>>>>>> On Wed, Jan 14, 2015 at 4:38 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Folks,
>>>>>>>>>>
>>>>>>>>>> I am trying to run a HiveContext in yarn-cluster mode, but hit an error.
>>>>>>>>>> Does anybody know what causes the issue?
>>>>>>>>>>
>>>>>>>>>> I use the following cmd to build the distribution:
>>>>>>>>>>
>>>>>>>>>> ./make-distribution.sh -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4
>>>>>>>>>>
>>>>>>>>>> 15/01/13 17:59:42 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
>>>>>>>>>> 15/01/13 17:59:42 INFO storage.BlockManagerMasterActor: Registering block manager cn122-10.l42scl.hortonworks.com:56157 with 1589.8 MB RAM, BlockManagerId(2, cn122-10.l42scl.hortonworks.com, 56157)
>>>>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parse Completed
>>>>>>>>>> 15/01/13 17:59:44 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>>>>>>> 15/01/13 17:59:44 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>>>>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>>>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>>>>>>>>> 15/01/13 17:59:52 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>>>>>>>> 15/01/13 17:59:52 INFO metastore.MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
>>>>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>>> 15/01/13 18:00:00 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>>>>>>> 15/01/13 18:00:00 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
>>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added admin role in metastore
>>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added public role in metastore
>>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
>>>>>>>>>> 15/01/13 18:00:01 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
>>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:02 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
>>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parse Completed
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=parse start=1421190003030 end=1421190003031 duration=1 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
>>>>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Creating table src position=27
>>>>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=src
>>>>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=src
>>>>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_database: default
>>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Semantic Analysis Completed
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1421190003031 end=1421190003406 duration=375 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=compile start=1421190002998 end=1421190003416 duration=418 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Starting command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1421190002995 end=1421190003421 duration=426 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> 15/01/13 18:00:03 INFO exec.DDLTask: Default to LazySimpleSerDe for table src
>>>>>>>>>> 15/01/13 18:00:05 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1421190003416 end=1421190005498 duration=2082 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>>> Exception in thread "Driver"
>>>>>>>>>> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Driver"