I don't think that's correct. LOAD DATA LOCAL should pick the input from a
local directory.
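Note that in yarn-cluster mode the driver runs inside a YARN container, so a
LOCAL path is resolved against that container's filesystem, which is what the
errors below show. If the file is pushed to HDFS instead (question 1 below),
the LOCAL keyword is dropped and the path resolves against HDFS, so the driver
can find it no matter which node it runs on. A minimal sketch with
HiveContext, assuming the file has already been copied to HDFS first (the
/user/dvasthimal HDFS path is illustrative, not from this thread):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  val sc = new SparkContext(new SparkConf().setAppName("HiveLoadSketch"))
  val hiveContext = new HiveContext(sc)

  hiveContext.sql(
    "CREATE TABLE IF NOT EXISTS src_spark (key INT, value STRING)")

  // Without LOCAL, the path is resolved against the default filesystem
  // (HDFS on YARN). Copy the file up first, e.g.:
  //   hdfs dfs -put examples/src/main/resources/kv1.txt /user/dvasthimal/
  hiveContext.sql(
    "LOAD DATA INPATH '/user/dvasthimal/kv1.txt' INTO TABLE src_spark")

Keep in mind that LOAD DATA without LOCAL moves the file into the table's
warehouse directory rather than copying it.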
On Thu, Mar 26, 2015 at 1:59 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Not sure, but you can create that path on all workers and put that file
> in it.
>
> Thanks
> Best Regards
>
> On Thu, Mar 26, 2015 at 1:56 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>
>> The Hive command:
>>
>> LOAD DATA LOCAL INPATH
>> '/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'
>> INTO TABLE src_spark
>>
>> 1. LOCAL INPATH: if I push the file to HDFS, then how will it work?
>>
>> 2. I can't use sc.addFile, because I want to run Hive (Spark SQL)
>> queries.
>>
>> On Thu, Mar 26, 2015 at 1:41 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>
>>> Now it's clear that the workers don't have the file kv1.txt in their
>>> local filesystem. You can try putting it in HDFS and using the URI to
>>> that file, or try adding the file with sc.addFile.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Thu, Mar 26, 2015 at 1:38 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>
>>>> Does not work:
>>>>
>>>> 15/03/26 01:07:05 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=src_spark
>>>> 15/03/26 01:07:06 ERROR ql.Driver: FAILED: SemanticException Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>>>> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
>>>>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:142)
>>>>     at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:233)
>>>>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>>>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
>>>>
>>>> Does the input file need to be passed to the executors via --jars?
>>>>
>>>> On Thu, Mar 26, 2015 at 12:15 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> Try to give the complete path to the file kv1.txt.
>>>>>
>>>>> On 26 Mar 2015 11:48, "ÐΞ€ρ@Ҝ (๏̯͡๏)" <deepuj...@gmail.com> wrote:
>>>>>
>>>>>> I am now seeing this error:
>>>>>>
>>>>>> 15/03/25 19:44:03 ERROR yarn.ApplicationMaster: User class threw exception: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>>> org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
>>>>>>     at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:312)
>>>>>>     at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:280)
>>>>>>
>>>>>> -sh-4.1$ pwd
>>>>>> /home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4
>>>>>> -sh-4.1$ ls examples/src/main/resources/kv1.txt
>>>>>> examples/src/main/resources/kv1.txt
>>>>>> -sh-4.1$
>>>>>>
>>>>>> On Thu, Mar 26, 2015 at 8:08 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>
>>>>>>> You can do it in $SPARK_HOME/conf/spark-defaults.conf:
>>>>>>>
>>>>>>> spark.driver.extraJavaOptions -XX:MaxPermSize=512m
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Zhan Zhang
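A note on the PermGen setting above: it can go in spark-defaults.conf as
shown, or be passed when submitting the job. A rough sketch of both forms
(512m is just the value suggested in this thread; tune as needed):

  # $SPARK_HOME/conf/spark-defaults.conf
  spark.driver.extraJavaOptions  -XX:MaxPermSize=512m

  # equivalently, on the spark-submit command line
  ./bin/spark-submit \
    --conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=512m" \
    ...

In yarn-cluster mode the driver runs inside the YARN application master, so
this option takes effect in that container rather than on the submitting
machine.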
>>>>>>>
>>>>>>> On Mar 25, 2015, at 7:25 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>
>>>>>>> Where and how do I pass this or other JVM arguments?
>>>>>>> -XX:MaxPermSize=512m
>>>>>>>
>>>>>>> On Wed, Mar 25, 2015 at 11:36 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>>
>>>>>>>> I solved this by increasing the PermGen memory size in the driver:
>>>>>>>>
>>>>>>>> -XX:MaxPermSize=512m
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Zhan Zhang
>>>>>>>>
>>>>>>>> On Mar 25, 2015, at 10:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> I am facing the same issue and posted a new thread. Please respond.
>>>>>>>>
>>>>>>>> On Wed, Jan 14, 2015 at 4:38 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Folks,
>>>>>>>>>
>>>>>>>>> I am trying to run a Hive context in yarn-cluster mode, but hit an
>>>>>>>>> error. Does anybody know what causes the issue?
>>>>>>>>>
>>>>>>>>> I used the following command to build the distribution:
>>>>>>>>>
>>>>>>>>> ./make-distribution.sh -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4
>>>>>>>>>
>>>>>>>>> 15/01/13 17:59:42 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
>>>>>>>>> 15/01/13 17:59:42 INFO storage.BlockManagerMasterActor: Registering block manager cn122-10.l42scl.hortonworks.com:56157 with 1589.8 MB RAM, BlockManagerId(2, cn122-10.l42scl.hortonworks.com, 56157)
>>>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>> 15/01/13 17:59:43 INFO parse.ParseDriver: Parse Completed
>>>>>>>>> 15/01/13 17:59:44 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>>>>>> 15/01/13 17:59:44 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>>>>>>>>> 15/01/13 17:59:44 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>>>>>>>> 15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>>>>>>>>> 15/01/13 17:59:52 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>>>>>>> 15/01/13 17:59:52 INFO metastore.MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5.  Encountered: "@" (64), after : "".
>>>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>> 15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>> 15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
>>>>>>>>> 15/01/13 18:00:00 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>>>>>> 15/01/13 18:00:00 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added admin role in metastore
>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added public role in metastore
>>>>>>>>> 15/01/13 18:00:01 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
>>>>>>>>> 15/01/13 18:00:01 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:02 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
>>>>>>>>> 15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>> 15/01/13 18:00:03 INFO parse.ParseDriver: Parse Completed
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=parse start=1421190003030 end=1421190003031 duration=1 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
>>>>>>>>> 15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Creating table src position=27
>>>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=src
>>>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=src
>>>>>>>>> 15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>>>>>> 15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_database: default
>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Semantic Analysis Completed
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1421190003031 end=1421190003406 duration=375 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=compile start=1421190002998 end=1421190003416 duration=418 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO ql.Driver: Starting command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1421190002995 end=1421190003421 duration=426 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> 15/01/13 18:00:03 INFO exec.DDLTask: Default to LazySimpleSerDe for table src
>>>>>>>>> 15/01/13 18:00:05 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1421190003416 end=1421190005498 duration=2082 from=org.apache.hadoop.hive.ql.Driver>
>>>>>>>>> Exception in thread "Driver"
>>>>>>>>> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Driver"
--
Deepak