Re: Re: about LIVY-424
I think it's really a mortal blow to Livy's REPL use case. What I can do, I think, is monitor Spark metrics and isolate the session once the driver's memory usage reaches a high level.
2018-11-14 lk_hadoop

From: "Harsch, Tim"
Sent: 2018-11-14 05:52
Subject: Re: about LIVY-424
To: "user"
Cc:
While it's true that LIVY-424 creates a session leak due to a REPL leak in Scala, it's not the only thing that can. I've run hundreds of simple Scala commands and the leak is only mild/moderate. However, some Scala commands can be really problematic. For instance:

import org.apache.spark.sql._

Run this import repeatedly and within only tens of executions your session's performance will degrade and it will eventually run out of memory.

From: lk_hadoop
Sent: Sunday, November 11, 2018 5:37:34 PM
To: user
Subject: about LIVY-424
[External Email]
hi, all:
I've hit this issue: https://issues.apache.org/jira/browse/LIVY-424 . Does anybody know how to resolve it?
2018-11-12 lk_hadoop
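A minimal sketch of that monitor-and-isolate workaround, assuming the driver's heap numbers are obtainable somehow (e.g. scraped from Spark's metrics) and a hypothetical Livy endpoint. The `DELETE /sessions/{id}` call is Livy's documented way to tear a session down; the 90% threshold, the endpoint URL, and the function names are illustrative assumptions, not from this thread:

```python
import urllib.request

# Hypothetical values: adjust the Livy endpoint and threshold for your cluster.
LIVY_URL = "http://livy:8998"
OLD_GEN_THRESHOLD = 0.90  # recycle once the old generation is 90% full


def should_recycle(used_bytes, capacity_bytes, threshold=OLD_GEN_THRESHOLD):
    """True if the driver's old-generation heap is dangerously full."""
    return used_bytes / capacity_bytes >= threshold


def recycle_session(session_id):
    """Tear down a leaking session via Livy's REST API (DELETE /sessions/{id})."""
    req = urllib.request.Request(
        "%s/sessions/%d" % (LIVY_URL, session_id), method="DELETE")
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

With the old-generation numbers reported later in this thread (used 2680250800 of 2863661056 bytes, about 93.6%), this check would trip and the session would be recycled before YARN kills the container.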
Re: RE: about LIVY-424
In yarn-cluster mode I've tried increasing executor memory from 4GB to 8GB; it didn't help. Driver memory keeps growing until the container is finally killed by YARN:

Attaching to process ID 10118, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.161-b12

using thread-local object allocation.
Parallel GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio         = 0
   MaxHeapFreeRatio         = 100
   MaxHeapSize              = 4294967296 (4096.0MB)
   NewSize                  = 172490752 (164.5MB)
   MaxNewSize               = 1431306240 (1365.0MB)
   OldSize                  = 345505792 (329.5MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 1092616192 (1042.0MB)
   used     = 18259168 (17.413299560546875MB)
   free     = 1074357024 (1024.5867004394531MB)
   1.67114199237494% used
From Space:
   capacity = 169345024 (161.5MB)
   used     = 0 (0.0MB)
   free     = 169345024 (161.5MB)
   0.0% used
To Space:
   capacity = 167772160 (160.0MB)
   used     = 0 (0.0MB)
   free     = 167772160 (160.0MB)
   0.0% used
PS Old Generation
   capacity = 2863661056 (2731.0MB)
   used     = 2680250800 (2556.0863494873047MB)
   free     = 183410256 (174.9136505126953MB)
   93.59525263593207% used

30723 interned Strings occupying 3189352 bytes.

Application application_1541483082023_0636 failed 1 times due to AM Container for appattempt_1541483082023_0636_01 exited with exitCode: -104
For more detailed output, check application tracking page: http://bdp-scm-03:8088/proxy/application_1541483082023_0636/ Then, click on links to logs of each attempt.
Diagnostics: Container [pid=10104,containerID=container_e39_1541483082023_0636_01_01] is running beyond physical memory limits. Current usage: 4.6 GB of 4.5 GB physical memory used; 6.5 GB of 9.4 GB virtual memory used. Killing container.
Dump of the process-tree for container_e39_1541483082023_0636_01_01:
  |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
  |- 10118 10104 10104 10104 (java) 24792 3125 6873763840 1210012
     /usr/java/jdk1.8.0_161/bin/java -server -Xmx4096m
     -Djava.io.tmpdir=/data/var/yarn/nm/usercache/devuser/appcache/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/tmp
     -Dlog4j.configuration=file:/data/spark-conf-4-livy/logs/log4j-driver.properties
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/drivergc.log
     -Dspark.yarn.app.container.log.dir=/var/yarn/container-logs/application_1541483082023_0636/container_e39_1541483082023_0636_01_01
     org.apache.spark.deploy.yarn.ApplicationMaster
     --class org.apache.livy.rsc.driver.RSCDriverBootstrapper
     --properties-file /data/var/yarn/nm/usercache/devuser/appcache/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/__spark_conf__/__spark_conf__.properties
  |- 10104 10102 10104 10104 (bash) 0 0 116027392 375
     /bin/bash -c LD_LIBRARY_PATH=/data/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/../../../CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/lib/native::/data/cloudera/parcels/CDH-5.14.0-1.cdh5.14.0.p0.24/lib/hadoop/lib/native
     /usr/java/jdk1.8.0_161/bin/java -server -Xmx4096m
     -Djava.io.tmpdir=/data/var/yarn/nm/usercache/devuser/appcache/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/tmp
     '-Dlog4j.configuration=file:/data/spark-conf-4-livy/logs/log4j-driver.properties'
     '-verbose:gc' '-XX:+PrintGCDetails' '-XX:+PrintGCDateStamps' '-Xloggc:/tmp/drivergc.log'
     -Dspark.yarn.app.container.log.dir=/var/yarn/container-logs/application_1541483082023_0636/container_e39_1541483082023_0636_01_01
     org.apache.spark.deploy.yarn.ApplicationMaster
     --class 'org.apache.livy.rsc.driver.RSCDriverBootstrapper'
     --properties-file /data/var/yarn/nm/usercache/devuser/appcache/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/__spark_conf__/__spark_conf__.properties
     1> /var/yarn/container-logs/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/stdout
     2> /var/yarn/container-logs/application_1541483082023_0636/container_e39_1541483082023_0636_01_01/stderr

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.

2018-11-12 lk_hadoop

From: "Rabe, Jens"
Sent: 2018-11-12 14:55
Subject: RE: about LIVY-424
To: "user@livy.incubator.apache.org"
Cc:
Do you run Spark in local mode or on a cluster? If on a cluster, try increasing executor memory.
From: lk_hadoop
Sent: Monday, Novembe
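An aside on the numbers in the diagnostics above: the container was killed at 4.6 GB against a 4.5 GB cap, i.e. the 4096m heap plus YARN's memory overhead allowance. Since the leak sits in the driver, raising the driver's memory (or its overhead) in the session-creation request only postpones the kill, but for completeness, a hypothetical request showing those knobs (the values are examples, not recommendations; `driverMemory` is a standard Livy session field and `spark.driver.memoryOverhead` is a standard Spark-on-YARN setting):

```
curl -H "Content-type: application/json" -X POST http://kafka02:8998/sessions \
  -d '{"kind": "spark", "driverMemory": "6g",
       "conf": {"spark.driver.memoryOverhead": "1024"}}'
```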
Re: about LIVY-424
I'm using livy-0.5.0 with spark-2.3.0. I started a session with 4GB of driver memory and ran this code several times:

var tmp1 = spark.sql("use tpcds_bin_partitioned_orc_2")
var tmp2 = spark.sql("select count(1) from tpcds_bin_partitioned_orc_2.store_sales").show

The table has 5760749 rows. After about 10 runs, the driver's physical memory goes beyond 4.5GB and it is killed by YARN. I can see the old-generation memory keep growing, and GC cannot release it.
2018-11-12 lk_hadoop

From: "lk_hadoop"
Sent: 2018-11-12 09:37
Subject: about LIVY-424
To: "user"
Cc:
hi, all:
I've hit this issue: https://issues.apache.org/jira/browse/LIVY-424 . Does anybody know how to resolve it?
2018-11-12 lk_hadoop
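The repro above can also be driven through Livy's REST API, the same way the curl examples later in this archive do. A hedged sketch (the host is the one used elsewhere in this thread; the session id and loop count are assumptions):

```python
import json
import urllib.request

LIVY_URL = "http://kafka02:8998"  # host borrowed from this thread; adjust as needed

CODE = ('var tmp1 = spark.sql("use tpcds_bin_partitioned_orc_2");'
        'var tmp2 = spark.sql("select count(1) from '
        'tpcds_bin_partitioned_orc_2.store_sales").show')


def statement_payload(code):
    """JSON body Livy expects for POST /sessions/{id}/statements."""
    return json.dumps({"code": code}).encode("utf-8")


def run_statement(session_id, code):
    """Submit one statement to a running interactive session."""
    req = urllib.request.Request(
        "%s/sessions/%d/statements" % (LIVY_URL, session_id),
        data=statement_payload(code),
        headers={"Content-Type": "application/json"},
        method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Per the report above, roughly 10 repetitions were enough to push the
# 4GB driver past its 4.5GB container limit:
# for _ in range(10):
#     run_statement(0, CODE)
```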
about LIVY-424
hi, all:
I've hit this issue: https://issues.apache.org/jira/browse/LIVY-424 . Does anybody know how to resolve it?
2018-11-12 lk_hadoop
Re: Re: livy can't list databases
Thank you j...@nanthrax.net, I have resolved it by setting livy.repl.enable-hive-context = true
2017-11-23 lk_hadoop

From: Jean-Baptiste Onofré
Sent: 2017-11-22 21:45
Subject: Re: livy can't list databases
To: "user"
Cc:
Hi,
let me take a look. Thanks for the report.
Regards
JB

On 11/22/2017 02:42 PM, lk_hadoop wrote:
> hi, all:
> I'm trying livy-0.4 with spark-2.1:
>
> curl -H "Content-type: application/json" -X POST http://kafka02:8998/sessions -d '{"kind": "spark"}' | python -m json.tool
> curl -H "Content-type: application/json" -X POST http://kafka02:8998/sessions/0/statements -d '{"code": "spark.sql(\"show databases\").show"}' | python -m json.tool
> curl http://kafka02:8998/sessions/0/statements/0 | python -m json.tool
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100   235  100   235    0     0  15959      0 --:--:-- --:--:-- --:--:-- 16785
> {
>     "code": "spark.sql(\"show databases\").show",
>     "id": 0,
>     "output": {
>         "data": {
>             "text/plain": "+------------+\n|databaseName|\n+------------+\n|     default|\n+------------+"
>         },
>         "execution_count": 0,
>         "status": "ok"
>     },
>     "progress": 1.0,
>     "state": "available"
> }
>
> It looks like the session can't read the metadata. I have configured Livy with SPARK_HOME and run under yarn mode; hive-site.xml was also copied to SPARK_HOME/conf/.
> But when I use spark-shell:
>
> scala> spark.sql("show databases").show
> +-------------+
> | databaseName|
> +-------------+
> |      default|
> | tpcds_carbon|
> |tpcds_carbon2|
> | tpcds_indexr|
> |tpcds_parquet|
> | tpcds_source|
> +-------------+
>
> 2017-11-22
> lk_hadoop

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
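For reference, the fix described at the top of this message is a one-line addition to Livy's configuration (hive-site.xml still has to be visible on Spark's conf path, as the original report already had it):

```
# conf/livy.conf
# Build each interactive session's SparkSession with Hive support
# so it can see the Hive metastore instead of an empty in-memory catalog.
livy.repl.enable-hive-context = true
```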
livy can't list databases
hi, all:
I'm trying livy-0.4 with spark-2.1:

curl -H "Content-type: application/json" -X POST http://kafka02:8998/sessions -d '{"kind": "spark"}' | python -m json.tool
curl -H "Content-type: application/json" -X POST http://kafka02:8998/sessions/0/statements -d '{"code": "spark.sql(\"show databases\").show"}' | python -m json.tool
curl http://kafka02:8998/sessions/0/statements/0 | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   235  100   235    0     0  15959      0 --:--:-- --:--:-- --:--:-- 16785
{
    "code": "spark.sql(\"show databases\").show",
    "id": 0,
    "output": {
        "data": {
            "text/plain": "+------------+\n|databaseName|\n+------------+\n|     default|\n+------------+"
        },
        "execution_count": 0,
        "status": "ok"
    },
    "progress": 1.0,
    "state": "available"
}

It looks like the session can't read the metadata. I have configured Livy with SPARK_HOME and run under yarn mode; hive-site.xml was also copied to SPARK_HOME/conf/.
But when I use spark-shell:

scala> spark.sql("show databases").show
+-------------+
| databaseName|
+-------------+
|      default|
| tpcds_carbon|
|tpcds_carbon2|
| tpcds_indexr|
|tpcds_parquet|
| tpcds_source|
+-------------+

2017-11-22 lk_hadoop