heyang wang created ZEPPELIN-2558:
-------------------------------------
Summary: Livy configuration mismatch
Key: ZEPPELIN-2558
URL: https://issues.apache.org/jira/browse/ZEPPELIN-2558
Project: Zeppelin
Issue Type: Bug
Components: livy-interpreter
Affects Versions: 0.7.1
Reporter: heyang wang
I am using Zeppelin 0.7.1 with livy-0.4-snapshot. When I edit the Livy
interpreter settings related to Spark resources in the Zeppelin web UI, I get
the following error from the YARN application master:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.xerces.dom.DeferredDocumentImpl.getNodeObject(Unknown Source)
    at org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren(Unknown Source)
    at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeChildren(Unknown Source)
    at org.apache.xerces.dom.ParentNode.hasChildNodes(Unknown Source)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2551)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2444)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:968)
    at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:987)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1388)
    at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:70)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:272)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:311)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:55)
    at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at java.lang.Class.newInstance(Class.java:442)
    at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
    at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
    at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
It turned out that the above error is caused by a mismatch between the
Zeppelin Livy interpreter configuration and the Livy server configuration.
In the Zeppelin logs, I can see Zeppelin posting the following JSON to the
Livy server:
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.182:8998/sessions"
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:841) - Writing [{
  "kind": "pyspark",
  "proxyUser": "[email protected]",
  "conf": {
    "spark.executor.memory": "2",
    "spark.driver.memory": "4",
    "spark.driver.cores": "1",
    "spark.executor.cores": "1",
    "spark.executor.instances": "10"
  }
However, according to https://github.com/cloudera/livy, the Livy server
accepts session parameters like the following:

Name            Description                                      Type
driverMemory    Amount of memory to use for the driver process   string
driverCores     Number of cores to use for the driver process    int
executorMemory  Amount of memory to use per executor process     string
executorCores   Number of cores to use for each executor         int
numExecutors    Number of executors to launch for this session   int
archives        Archives to be used in this session              List of string
There is clearly a mismatch between Zeppelin and Livy in how Spark resources
are specified. I am not sure whether Zeppelin or Livy should fix this.
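
To make the mismatch concrete, here is a minimal Python sketch of how the
posted "conf" entries could be translated into the top-level fields listed
above. This is not code from Zeppelin or Livy. The key-to-field mapping is an
assumption based only on the similarity of the names, and the helper
to_livy_session_request is hypothetical. The sketch also does not address
whether unit-less memory values such as "2" are acceptable.

```python
import json

# Assumed correspondence between the Spark property names that Zeppelin
# posts under "conf" and the top-level session fields listed in the Livy
# README. Each entry also records the type given in that list.
SPARK_TO_LIVY = {
    "spark.driver.memory": ("driverMemory", str),
    "spark.driver.cores": ("driverCores", int),
    "spark.executor.memory": ("executorMemory", str),
    "spark.executor.cores": ("executorCores", int),
    "spark.executor.instances": ("numExecutors", int),
}


def to_livy_session_request(kind, conf, proxy_user=None):
    """Hypothetical helper: build a session request body, moving known
    resource settings out of "conf" and into top-level fields."""
    body = {"kind": kind}
    if proxy_user is not None:
        body["proxyUser"] = proxy_user

    remaining = {}
    for key, value in conf.items():
        if key in SPARK_TO_LIVY:
            field, cast = SPARK_TO_LIVY[key]
            body[field] = cast(value)
        else:
            # Anything without a dedicated field stays under "conf".
            remaining[key] = value

    if remaining:
        body["conf"] = remaining
    return body


if __name__ == "__main__":
    # Values copied from the logged request; proxyUser is omitted here.
    posted_conf = {
        "spark.executor.memory": "2",
        "spark.driver.memory": "4",
        "spark.driver.cores": "1",
        "spark.executor.cores": "1",
        "spark.executor.instances": "10",
    }
    print(json.dumps(to_livy_session_request("pyspark", posted_conf), indent=2))
```

Either side could apply a translation like this: Zeppelin before posting the
request, or Livy when it reads "conf".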