Could you try putting that file in HDFS and loading it like:

    LOAD DATA INPATH 'hdfs://sigmoid/test/kv1.txt' INTO TABLE src_spark

Thanks
Best Regards

On Thu, Mar 26, 2015 at 2:07 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

  When you run it in local mode ^^

On Thu, Mar 26, 2015 at 2:06 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  I don't think that's correct. LOAD DATA LOCAL should pick input from a
  local directory.

On Thu, Mar 26, 2015 at 1:59 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

  Not sure, but you can create that path on all workers and put that file in it.

On Thu, Mar 26, 2015 at 1:56 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  The Hive command:

    LOAD DATA LOCAL INPATH
    '/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'
    INTO TABLE src_spark

  1. It is LOCAL INPATH — if I push the file to HDFS, then how will it work?
  2. I can't use sc.addFile, because I want to run Hive (Spark SQL) queries.

On Thu, Mar 26, 2015 at 1:41 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

  Now it's clear that the workers don't have the file kv1.txt in their local
  filesystem. You can try putting it in HDFS and using the URI to that file,
  or try adding the file with sc.addFile.

On Thu, Mar 26, 2015 at 1:38 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  Does not work:

    15/03/26 01:07:05 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=src_spark
    15/03/26 01:07:06 ERROR ql.Driver: FAILED: SemanticException Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
    org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:23 Invalid path ''/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt'': No files matching path file:/home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4/examples/src/main/resources/kv1.txt
        at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraints(LoadSemanticAnalyzer.java:142)
        at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:233)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)

  Does the input file need to be passed to the executors via --jars?

On Thu, Mar 26, 2015 at 12:15 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

  Try giving the complete path to the file kv1.txt.

On 26 Mar 2015 11:48, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  I am now seeing this error:

    15/03/25 19:44:03 ERROR yarn.ApplicationMaster: User class threw exception: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
    org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException Line 1:23 Invalid path ''examples/src/main/resources/kv1.txt'': No files matching path file:/hadoop/10/scratch/local/usercache/dvasthimal/appcache/application_1426715280024_89893/container_1426715280024_89893_01_000002/examples/src/main/resources/kv1.txt
        at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:312)
        at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:280)

  The file does exist locally where I submit the job:

    -sh-4.1$ pwd
    /home/dvasthimal/spark1.3/spark-1.3.0-bin-hadoop2.4
    -sh-4.1$ ls examples/src/main/resources/kv1.txt
    examples/src/main/resources/kv1.txt
    -sh-4.1$

On Thu, Mar 26, 2015 at 8:08 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:

  You can do it in $SPARK_HOME/conf/spark-defaults.conf:

    spark.driver.extraJavaOptions  -XX:MaxPermSize=512m

  Thanks.
  Zhan Zhang

On Mar 25, 2015, at 7:25 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  Where and how do I pass this or other JVM arguments? -XX:MaxPermSize=512m

On Wed, Mar 25, 2015 at 11:36 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

  I solved this by increasing the PermGen memory size in the driver:

    -XX:MaxPermSize=512m

  Thanks.
  Zhan Zhang

On Mar 25, 2015, at 10:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:

  I am facing the same issue and have posted a new thread. Please respond.

  --
  Deepak

On Wed, Jan 14, 2015 at 4:38 AM, Zhan Zhang <zzh...@hortonworks.com> wrote:

  Hi Folks,

  I am trying to run a hive context in yarn-cluster mode, but hit an error.
  Does anybody know what causes the issue?

  I used the following command to build the distribution:

    ./make-distribution.sh -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4

    15/01/13 17:59:42 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
    15/01/13 17:59:42 INFO storage.BlockManagerMasterActor: Registering block manager cn122-10.l42scl.hortonworks.com:56157 with 1589.8 MB RAM, BlockManagerId(2, cn122-10.l42scl.hortonworks.com, 56157)
    15/01/13 17:59:43 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
    15/01/13 17:59:43 INFO parse.ParseDriver: Parse Completed
    15/01/13 17:59:44 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
    15/01/13 17:59:44 INFO metastore.ObjectStore: ObjectStore, initialize called
    15/01/13 17:59:44 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
    15/01/13 17:59:44 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
    15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    15/01/13 17:59:44 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
    15/01/13 17:59:52 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
    15/01/13 17:59:52 INFO metastore.MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5.  Encountered: "@" (64), after : "".
    15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
    15/01/13 17:59:53 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
    15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
    15/01/13 17:59:59 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
    15/01/13 18:00:00 INFO metastore.ObjectStore: Initialized ObjectStore
    15/01/13 18:00:00 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
    15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added admin role in metastore
    15/01/13 18:00:01 INFO metastore.HiveMetaStore: Added public role in metastore
    15/01/13 18:00:01 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
    15/01/13 18:00:01 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
    15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:02 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
    15/01/13 18:00:02 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO parse.ParseDriver: Parsing command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
    15/01/13 18:00:03 INFO parse.ParseDriver: Parse Completed
    15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=parse start=1421190003030 end=1421190003031 duration=1 from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
    15/01/13 18:00:03 INFO parse.SemanticAnalyzer: Creating table src position=27
    15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=src
    15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_table : db=default tbl=src
    15/01/13 18:00:03 INFO metastore.HiveMetaStore: 0: get_database: default
    15/01/13 18:00:03 INFO HiveMetaStore.audit: ugi=zzhang ip=unknown-ip-addr cmd=get_database: default
    15/01/13 18:00:03 INFO ql.Driver: Semantic Analysis Completed
    15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1421190003031 end=1421190003406 duration=375 from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
    15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=compile start=1421190002998 end=1421190003416 duration=418 from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO ql.Driver: Starting command: CREATE TABLE IF NOT EXISTS src (key INT, value STRING)
    15/01/13 18:00:03 INFO log.PerfLogger: </PERFLOG method=TimeToSubmit start=1421190002995 end=1421190003421 duration=426 from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO log.PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
    15/01/13 18:00:03 INFO exec.DDLTask: Default to LazySimpleSerDe for table src
    15/01/13 18:00:05 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1421190003416 end=1421190005498 duration=2082 from=org.apache.hadoop.hive.ql.Driver>
    Exception in thread "Driver"
    Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Driver"
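For anyone landing on this thread with the same OutOfMemoryError: the PermGen fix Zhan describes is a one-line config change. A minimal sketch, assuming a pre-Java-8 JVM (PermGen was removed in Java 8) and that 512m is merely the size that worked here, not a universal value:

```
# $SPARK_HOME/conf/spark-defaults.conf
# Raise the driver JVM's PermGen so HiveContext's class-heavy metastore
# initialization does not exhaust it (yarn-cluster mode).
spark.driver.extraJavaOptions  -XX:MaxPermSize=512m
```

The same setting can alternatively be passed per job on the spark-submit command line with `--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=512m"`.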
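For the "No files matching path" errors: in yarn-cluster mode the query is compiled inside an arbitrary YARN container, so a LOCAL INPATH (or a relative path) is resolved on that container's filesystem, not on the machine where spark-submit ran. The workaround suggested above can be sketched as follows; the `/test` directory and the `hdfs://sigmoid` URI are illustrative (taken from the example in this thread) and should be replaced with your own namenode and path:

```
# Copy the file from the submitting machine into HDFS (paths illustrative):
hdfs dfs -mkdir -p /test
hdfs dfs -put examples/src/main/resources/kv1.txt /test/kv1.txt

# Then, in the Hive (Spark SQL) query, drop the LOCAL keyword and use the HDFS URI:
#   LOAD DATA INPATH 'hdfs://sigmoid/test/kv1.txt' INTO TABLE src_spark
```

Note that LOAD DATA INPATH (without LOCAL) moves the file into the table's warehouse directory rather than copying it, so the source file disappears from `/test` after the statement succeeds.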