Re: Reporting errors from spark sql
Hi,

See https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala#L65 to learn how Spark SQL parses SQL text. It could give you a way out.

Regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Thu, Aug 18, 2016 at 3:14 PM, yael aharon wrote:
> Hello,
> I am working on a SQL editor powered by Spark SQL. When the SQL is not
> valid, I would like to give the user the line and column number where the
> first error occurred. I am having a hard time finding a mechanism that
> exposes that information programmatically.
>
> Most of the time, an erroneous SQL statement produces a RuntimeException
> whose message embeds the line and column number implicitly, but parsing the
> message text and counting the spaces before the '^' symbol is really error
> prone...
>
> Sometimes an AnalysisException is thrown instead, but when I try to extract
> line and startPosition from it, they are always empty.
>
> Any help would be greatly appreciated.
> thanks!
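To make that pointer concrete: the parser in ParseDriver.scala raises a ParseException that carries the position being asked for (it extends AnalysisException, whose line and startPosition fields are Option[Int]). A minimal sketch for Spark 2.x, assuming the usual spark SparkSession available in spark-shell:

import org.apache.spark.sql.catalyst.parser.ParseException

val badSql = "SELEC 1 FROM t"  // deliberately invalid SQL
try {
  spark.sql(badSql)
} catch {
  case e: ParseException =>
    // line and startPosition are Options; empty only when no position is known
    val line = e.line.getOrElse(-1)
    val col  = e.startPosition.getOrElse(-1)
    println(s"Syntax error at line $line, column $col: ${e.message}")
}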
Reporting errors from spark sql
Hello,

I am working on a SQL editor powered by Spark SQL. When the SQL is not valid, I would like to give the user the line and column number where the first error occurred. I am having a hard time finding a mechanism that exposes that information programmatically.

Most of the time, an erroneous SQL statement produces a RuntimeException whose message embeds the line and column number implicitly, but parsing the message text and counting the spaces before the '^' symbol is really error prone...

Sometimes an AnalysisException is thrown instead, but when I try to extract line and startPosition from it, they are always empty.

Any help would be greatly appreciated.
thanks!
How to turn off Jetty Http stack errors on Spark web
Hi,

Is it possible to disable the Jetty stack trace shown with errors on the Spark master:8080 web UI? When I trigger an HTTP 500 server error, anyone can read the details. I tried the options available in log4j.properties, but they don't help. Any hint?

Thank you for any answer,
MyCo
Re: How to turn off Jetty Http stack errors on Spark web
Have you read this?
http://stackoverflow.com/questions/2246074/how-do-i-hide-stack-traces-in-the-browser-using-jetty

On Wed, Sep 23, 2015 at 6:56 AM, Rafal Grzymkowski <m...@o2.pl> wrote:
> Hi,
>
> Is it possible to disable the Jetty stack trace shown with errors on the
> Spark master:8080 web UI? When I trigger an HTTP 500 server error, anyone
> can read the details. I tried the options available in log4j.properties,
> but they don't help. Any hint?
>
> Thank you for any answer,
> MyCo
Re: How to turn off Jetty Http stack errors on Spark web
Yes, I've seen it, but there are no web.xml or error.jsp files in the binary installation of Spark. To apply that solution I would probably have to take the Spark sources, create the missing files, and then recompile Spark. Right? I am looking for a way to turn off the error details without recompilation. /MyCo
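For reference, the underlying knob is Jetty's ErrorHandler, whose showStacks flag controls whether stack traces are rendered on error pages. A minimal sketch of what that looks like with embedded Jetty 9.x (this is the Jetty API, not a Spark configuration, so without patching Spark's web UI it only illustrates where the behaviour lives):

import org.eclipse.jetty.server.Server
import org.eclipse.jetty.server.handler.ErrorHandler

val server = new Server(8080)
val errorHandler = new ErrorHandler()
errorHandler.setShowStacks(false)  // respond with the status line only, no stack trace
server.addBean(errorHandler)       // register as the server-wide error handler
server.start()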
Getting outofmemory errors on spark
Hi,

I'm reading data stored in S3, aggregating it, and storing it in Cassandra using a Spark job. When I run the job with approx 3 million records (about 3-4 GB of data) stored in text files, I get the following error:

... (11529/14925)
15/04/10 19:32:43 INFO TaskSetManager: Starting task 11609.0 in stage 4.0 (TID 56384, spark-slaves-test-cluster-k0b6.c.silver-argon-837.internal, PROCESS_LOCAL, 134...
15/04/10 19:32:58 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: GC overhead limit exceeded
  at java.util.Arrays.copyOf(Arrays.java:2367)
  at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
  at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535)
  at java.lang.StringBuilder.append(StringBuilder.java:204)
  at java.io.ObjectInputStream$BlockDataInputStream.readUTFSpan(ObjectInputStream.java:3143)
  at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3051)
  at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2864)
  at java.io.ObjectInputStream.readUTF(ObjectInputStream.java:1072)
  at java.io.ObjectStreamClass.readNonProxy(ObjectStreamClass.java:671)
  at java.io.ObjectInputStream.readClassDescriptor(ObjectInputStream.java:830)
  at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1601)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
  at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
  at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
  at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
  at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
  at scala.util.Try$.apply(Try.scala:161)
  at akka.serialization.Serialization.deserialize(Serialization.scala:98)
  at akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:63)
  at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
  at scala.util.Try$.apply(Try.scala:161)
  at akka.serialization.Serialization.deserialize(Serialization.scala:98)
  at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
  at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:58)
  at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:58)
  at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:76)
  at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:937)
  at akka.actor.Actor$class.aroundReceive(Actor.scala:465)

This error occurs in the final step of my script, when I'm storing the processed records in Cassandra.
My memory per node is 10 GB, which means that all my records should fit on one machine. The script is in pyspark and I'm using a cluster with:

- Workers: 5
- Cores: 80 Total, 80 Used
- Memory: 506.5 GB Total, 40.0 GB Used

Here is the relevant part of the code, for reference:

def connectAndSave(partition):
    cluster = Cluster(['10.240.1.17'])
    dbsession = cluster.connect("load_test")
    ret = map(lambda x: saveUserData(x, dbsession), partition)
    dbsession.shutdown()
    cluster.shutdown()

res = sessionsRdd.foreachPartition(lambda partition: connectAndSave(partition))
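An aside for anyone hitting the same pattern: in Python 2, map() is eager, so ret = map(...) materializes the results for the whole partition in executor memory at once. That may not be the root cause of the driver-side OOM above, but iterating lazily is cheaper. A sketch under the same assumptions as the original script (Cluster, saveUserData, and sessionsRdd defined as above; the keyspace name is illustrative):

def connectAndSave(partition):
    # partition is an iterator; consume it one record at a time
    cluster = Cluster(['10.240.1.17'])
    dbsession = cluster.connect("load_test")
    try:
        for record in partition:
            saveUserData(record, dbsession)
    finally:
        # always release the connection, even if a save fails
        dbsession.shutdown()
        cluster.shutdown()

sessionsRdd.foreachPartition(connectAndSave)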
Re: Errors in SPARK
The error you're seeing typically means that you cannot connect to the Hive metastore itself. Some quick thoughts:
- If you were to run "show tables" (instead of the CREATE TABLE statement), are you still getting the same error?
- To confirm: the Hive metastore (MySQL database) is up and running?
- Did you download or build your version of Spark?

On Tue, Mar 24, 2015 at 10:48 PM sandeep vura sandeepv...@gmail.com wrote:
> Hi Denny,
> Still facing the same issue. Please find the following errors:
>
> scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@4e4f880c
> scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>
> Cheers, Sandeep.v

On Wed, Mar 25, 2015 at 11:10 AM, sandeep vura sandeepv...@gmail.com wrote:
> No, I am just running the ./spark-shell command in the terminal. I will try
> with the above command.

On Wed, Mar 25, 2015 at 11:09 AM, Denny Lee denny.g@gmail.com wrote:
> Did you include the MySQL connector jar so that spark-shell / Hive can
> connect to the metastore? For example, when I run my spark-shell instance
> in standalone mode, I use:
> ./spark-shell --master spark://servername:7077 --driver-class-path /lib/mysql-connector-java-5.1.27.jar

On Fri, Mar 13, 2015 at 8:31 AM sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> Can anyone please check the below error and give a solution? I am using
> Hive 0.13 and Spark 1.2.1.
> Step 1: Installed Hive 0.13 with a local metastore (MySQL database)
> Step 2: Hive runs without any errors; able to create tables and load data into Hive tables
> Step 3: Copied hive-site.xml into the spark/conf directory
> Step 4: Copied core-site.xml into the spark/conf directory
> Step 5: Started the Spark shell
>
> scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@2821ec0c
> scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> ...
>
> Regards, Sandeep.v
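For anyone else landing here: "Unable to instantiate HiveMetaStoreClient" usually comes down to the metastore connection settings in hive-site.xml or a missing JDBC driver. A sketch of the relevant properties for a MySQL-backed metastore (the host, database name, and credentials below are placeholders, not values from this thread):

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepassword</value>
  </property>
</configuration>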
Re: Errors in SPARK
Hi Denny,

Still facing the same issue. Please find the following errors:

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@4e4f880c
scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Cheers,
Sandeep.v

On Wed, Mar 25, 2015 at 11:10 AM, sandeep vura sandeepv...@gmail.com wrote:
> No, I am just running the ./spark-shell command in the terminal. I will try
> with the above command.

On Wed, Mar 25, 2015 at 11:09 AM, Denny Lee denny.g@gmail.com wrote:
> Did you include the MySQL connector jar so that spark-shell / Hive can
> connect to the metastore? For example, when I run my spark-shell instance
> in standalone mode, I use:
> ./spark-shell --master spark://servername:7077 --driver-class-path /lib/mysql-connector-java-5.1.27.jar

On Fri, Mar 13, 2015 at 8:31 AM sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> Can anyone please check the below error and give a solution? I am using
> Hive 0.13 and Spark 1.2.1.
> ...
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> ...
> Regards, Sandeep.v
Re: Errors in SPARK
No, I am just running the ./spark-shell command in the terminal. I will try with the above command.

On Wed, Mar 25, 2015 at 11:09 AM, Denny Lee denny.g@gmail.com wrote:
> Did you include the MySQL connector jar so that spark-shell / Hive can
> connect to the metastore? For example, when I run my spark-shell instance
> in standalone mode, I use:
> ./spark-shell --master spark://servername:7077 --driver-class-path /lib/mysql-connector-java-5.1.27.jar

On Fri, Mar 13, 2015 at 8:31 AM sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> Can anyone please check the below error and give a solution? I am using
> Hive 0.13 and Spark 1.2.1.
> ...
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> ...
> Regards, Sandeep.v
Re: Errors in SPARK
Did you include the MySQL connector jar so that spark-shell / Hive can connect to the metastore? For example, when I run my spark-shell instance in standalone mode, I use:

./spark-shell --master spark://servername:7077 --driver-class-path /lib/mysql-connector-java-5.1.27.jar

On Fri, Mar 13, 2015 at 8:31 AM sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> Can anyone please check the below error and give a solution? I am using
> Hive 0.13 and Spark 1.2.1.
> Step 1: Installed Hive 0.13 with a local metastore (MySQL database)
> Step 2: Hive runs without any errors; able to create tables and load data into Hive tables
> Step 3: Copied hive-site.xml into the spark/conf directory
> Step 4: Copied core-site.xml into the spark/conf directory
> Step 5: Started the Spark shell
>
> scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@2821ec0c
> scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
> java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> ...
>
> Regards, Sandeep.v
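A quick way to verify the connector jar actually made it onto the driver classpath is to try loading the driver class from inside spark-shell (a trivial sketch; the class name matches the Connector/J jar above):

// succeeds silently if the jar is on the classpath; throws
// ClassNotFoundException if --driver-class-path didn't take effect
Class.forName("com.mysql.jdbc.Driver")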
Errors in SPARK
Hi Sparkers,

Can anyone please check the below error and give a solution? I am using Hive 0.13 and Spark 1.2.1.

Step 1: Installed Hive 0.13 with a local metastore (MySQL database)
Step 2: Hive runs without any errors; able to create tables and load data into Hive tables
Step 3: Copied hive-site.xml into the spark/conf directory
Step 4: Copied core-site.xml into the spark/conf directory
Step 5: Started the Spark shell

Please check the below error:

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@2821ec0c
scala> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
  at org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
  at org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
  at scala.Option.orElse(Option.scala:257)
  at org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
  at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
  at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
  at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
  at org.apache.spark.sql.hive.HiveMetastoreCatalog.<init>(HiveMetastoreCatalog.scala:55)

Regards,
Sandeep.v
Re: Errors in spark
I was actually just able to reproduce the issue, and I do wonder if this is a bug. The docs say "When not configured by the hive-site.xml, the context automatically creates metastore_db and warehouse in the current directory." But as you can see from the message, warehouse is not in the current directory; it is under /user/hive. In my case this directory was owned by 'root' and no one else had write permissions. Changing the permissions works if you need to get unblocked quickly... but it does seem like a bug to me...

On Fri, Feb 27, 2015 at 11:21 AM, sandeep vura sandeepv...@gmail.com wrote:
> Hi Yana,
> I have removed hive-site.xml from the spark/conf directory but am still
> getting the same errors. Any other way to work around it?
> Regards, Sandeep

On Fri, Feb 27, 2015 at 9:38 PM, Yana Kadiyska yana.kadiy...@gmail.com wrote:
> I think you're mixing two things: the docs say "When *not* configured by the
> hive-site.xml, the context automatically creates metastore_db and warehouse
> in the current directory." AFAIK if you want a local metastore, you don't
> put hive-site.xml anywhere. You only need the file if you're going to point
> to an external metastore, and in that case, in my experience, I've also had
> to copy core-site.xml into conf in order to specify the
> <name>fs.defaultFS</name> property.

On Fri, Feb 27, 2015 at 10:39 AM, sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> I am using Hive 0.13, copied hive-site.xml into spark/conf, and am using
> the default Derby local metastore. While creating a table in the Spark
> shell I get the following error. Can anyone please take a look and suggest
> a solution?
> sqlContext.sql("CREATE TABLE IF NOT EXISTS sandeep (key INT, value STRING)")
> 15/02/27 23:06:13 ERROR RetryingHMSHandler: MetaException(message:file:/user/hive/warehouse_1/sandeep is not a directory or unable to create one)
> ...
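Concretely, the quick unblock is along these lines (the file: scheme in the error means the local filesystem, not HDFS; the path comes from the error message, and the user is a placeholder for whoever runs spark-shell):

sudo mkdir -p /user/hive/warehouse_1
sudo chown -R <spark-shell-user> /user/hive/warehouse_1
# or, on a throwaway test box only:
sudo chmod -R 777 /user/hive/warehouse_1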
Re: Errors in spark
Hi Yana,

I have removed hive-site.xml from the spark/conf directory but am still getting the same errors. Any other way to work around it?

Regards,
Sandeep

On Fri, Feb 27, 2015 at 9:38 PM, Yana Kadiyska yana.kadiy...@gmail.com wrote:
> I think you're mixing two things: the docs say "When *not* configured by the
> hive-site.xml, the context automatically creates metastore_db and warehouse
> in the current directory." AFAIK if you want a local metastore, you don't
> put hive-site.xml anywhere. You only need the file if you're going to point
> to an external metastore, and in that case, in my experience, I've also had
> to copy core-site.xml into conf in order to specify the
> <name>fs.defaultFS</name> property.

On Fri, Feb 27, 2015 at 10:39 AM, sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> I am using Hive 0.13, copied hive-site.xml into spark/conf, and am using
> the default Derby local metastore. While creating a table in the Spark
> shell I get the following error. Can anyone please take a look and suggest
> a solution?
> sqlContext.sql("CREATE TABLE IF NOT EXISTS sandeep (key INT, value STRING)")
> 15/02/27 23:06:13 ERROR RetryingHMSHandler: MetaException(message:file:/user/hive/warehouse_1/sandeep is not a directory or unable to create one)
> ...
Errors in spark
Hi Sparkers,

I am using Hive 0.13, copied hive-site.xml into spark/conf, and am using the default Derby local metastore. While creating a table in the Spark shell I get the following error. Can anyone please take a look and suggest a solution?

sqlContext.sql("CREATE TABLE IF NOT EXISTS sandeep (key INT, value STRING)")

15/02/27 23:06:13 ERROR RetryingHMSHandler: MetaException(message:file:/user/hive/warehouse_1/sandeep is not a directory or unable to create one)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1239)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1294)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:622)
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
  at com.sun.proxy.$Proxy12.create_table_with_environment_context(Unknown Source)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:558)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:547)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:622)
  at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
  at com.sun.proxy.$Proxy13.createTable(Unknown Source)
  at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:613)
  at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4189)
  at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
  at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:305)
  at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
  at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult$lzycompute(NativeCommand.scala:35)
  at org.apache.spark.sql.hive.execution.NativeCommand.sideEffectResult(NativeCommand.scala:35)
  at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
  at org.apache.spark.sql.hive.execution.NativeCommand.execute(NativeCommand.scala:30)
  at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
  at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
  at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
  at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
  at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
  at $line9.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
  at $line9.$read$$iwC$$iwC$$iwC.<init>(<console>:20)
  at $line9.$read$$iwC$$iwC.<init>(<console>:22)
  at $line9.$read$$iwC.<init>(<console>:24)
  at $line9.$read.<init>(<console>:26)
  at $line9.$read$.<init>(<console>:30)
  at $line9.$read$.<clinit>(<console>)
  at $line9.$eval$.<init>(<console>:7)
  at $line9.$eval$.<clinit>(<console>)
  at $line9.$eval.$print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:622)
  at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
  at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
  at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
  at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
  at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
  at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
  at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
  at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:628)
  at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:636)
  at ...
Re: Errors in spark
I think you're mixing two things: the docs say "When *not* configured by the hive-site.xml, the context automatically creates metastore_db and warehouse in the current directory." AFAIK if you want a local metastore, you don't put hive-site.xml anywhere. You only need the file if you're going to point to an external metastore. If you're pointing to an external metastore, in my experience I've also had to copy core-site.xml into conf in order to specify the <name>fs.defaultFS</name> property, as sketched after the quoted message below.

On Fri, Feb 27, 2015 at 10:39 AM, sandeep vura sandeepv...@gmail.com wrote:
> Hi Sparkers,
> I am using Hive 0.13, copied hive-site.xml into spark/conf, and am using
> the default Derby local metastore. While creating a table in the Spark
> shell I get the following error. Can anyone please take a look and suggest
> a solution?
> sqlContext.sql("CREATE TABLE IF NOT EXISTS sandeep (key INT, value STRING)")
> 15/02/27 23:06:13 ERROR RetryingHMSHandler: MetaException(message:file:/user/hive/warehouse_1/sandeep is not a directory or unable to create one)
> ...
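That core-site.xml entry looks along these lines (the namenode host and port are placeholders for your own cluster):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:8020</value>
</property>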
Errors in Spark streaming application due to HDFS append
Hi All,

I'm trying to write streaming processed data to HDFS (Hadoop 2). The buffer is flushed and closed after each write. The following errors occurred when opening the same file to append. I know for sure the error is caused by closing the file. Any idea?

Here is the code that writes to HDFS:

def appendToFile(id: String, text: String): Unit = {
  println("Request to write " + text.getBytes().length + " bytes, MAX_BUF_SIZE: " + LogConstants.MAX_BUF_SIZE)
  println("+++ Write to file id = " + id)
  if (bufMap == null) {
    init
  }
  var fsout: FSDataOutputStream = null
  val filename = LogConstants.FILE_PATH + id
  try {
    fsout = getFSDOS(id, filename)
    println("Write " + text.getBytes().length + " of bytes in Text to [" + filename + "]")
    fsout.writeBytes(text)
    fsout.flush()
    //fsout.sync()
  //} catch {
  //  case e: InterruptedException =>
  } finally {
    if (fsout != null) fsout.close()
  }
}

Here are the errors observed:

+++ Write to file id = 0
Wrote 129820 bytes
+++ Write to file id = 0
14/11/05 18:01:35 ERROR Executor: Exception in task ID 998
org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
  at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:5969)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:5932)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:651)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:889)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
  at org.apache.hadoop.ipc.Client.call(Client.java:1347)
  at org.apache.hadoop.ipc.Client.call(Client.java:1300)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
  at com.sun.proxy.$Proxy11.updatePipeline(Unknown Source)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
  at com.sun.proxy.$Proxy11.updatePipeline(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:791)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1047)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:520)

14/11/05 18:01:36 ERROR TaskSetManager: Task 53.0:7 failed 1 times; aborting job
14/11/05 18:01:36 ERROR JobScheduler: Error running job streaming job 1415239295000 ms.0
org.apache.spark.SparkException: Job aborted due to stage failure: Task 53.0:7 failed 1 times, most recent failure: Exception failure in TID 998 on host localhost: org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
  at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:5969)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:5932)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:651)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:889)
  at ...
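One observation that may help whoever picks this up: HDFS allows only a single writer per file, so re-opening the same file for append from parallel tasks (or overlapping micro-batches) while a previous writer has not fully released it can trigger pipeline-recovery failures like the updatePipeline error above. A minimal sketch of the create-or-append decision with the Hadoop 2 FileSystem API (the helper name and the Configuration in scope are assumptions, not from the original code):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}

// Hypothetical helper: open a file for appending, creating it on first use.
// Callers must still ensure only one writer per file at a time.
def openForAppend(pathStr: String, conf: Configuration): FSDataOutputStream = {
  val fs = FileSystem.get(conf)
  val path = new Path(pathStr)
  if (fs.exists(path)) fs.append(path)  // append requires an existing file
  else fs.create(path)                  // first write creates the file
}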