Re: Submit many spark applications

2018-05-25 Thread yncxcw
hi, please try to reduce the default heap size of the JVM on the machine you use to submit applications, for example: export _JAVA_OPTIONS="-Xmx512M". The submitter, which is itself a JVM, does not need to reserve much memory. Wei
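A minimal sketch of what this looks like when submitting many applications from one shell (the jar, class name, and loop bound are placeholders, not from the thread):

    # _JAVA_OPTIONS is read by every JVM started from this shell,
    # including each spark-submit launcher, so they all stay small.
    export _JAVA_OPTIONS="-Xmx512M"

    for i in $(seq 1 100); do
      spark-submit \
        --master yarn --deploy-mode cluster \
        --class com.example.MyApp \
        myapp.jar "$i" &
    done
    wait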

Re: cache OS memory and spark usage of it

2018-04-11 Thread yncxcw
hi, Raúl (1)&(2) yes, the OS needs some memory pressure to release it. For example, if your machine has 16GB of RAM in total and you read an 8GB file and immediately close it, the page cache will now hold the 8GB of file data. Then, when you start a program requesting memory from the OS, the OS will
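A rough way to watch this happen on Linux (the file name is a placeholder; drop_caches needs root and is only for experiments):

    free -h                      # note the buff/cache column before the read
    cat bigfile-8g > /dev/null   # read an 8GB file once
    free -h                      # buff/cache grows by ~8GB; "available" barely shrinks
    # Under pressure the kernel evicts these pages on its own; to force-drop
    # clean cached pages for testing:
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches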

Re: cache OS memory and spark usage of it

2018-04-10 Thread yncxcw
hi, Raúl First, most of the OS memory cache is the page cache, which the OS uses to cache recent read/write I/O. I think the OS memory cache should be understood from two different perspectives. From a perspective of
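For the page-cache side of this, a quick way to see its current size on Linux:

    grep -E '^(MemFree|Buffers|Cached)' /proc/meminfo   # Cached is mostly page cache
    vmstat 1 5                                          # watch the cache column live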

Re: Spark production scenario

2018-03-08 Thread yncxcw
hi, Passion I don't know an exact solution. But yes, the port each executor chooses to communicate with the driver is random. I am wondering whether you could give a node two Ethernet cards, configuring one card on the intranet for Spark and the other for the WAN. Then connect the
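If the two-card setup is workable, Spark can be pinned to the intranet card by binding to that card's address; a sketch, with 10.0.0.5 standing in for the node's intranet address:

    # conf/spark-env.sh on each node: bind Spark services to the intranet interface
    export SPARK_LOCAL_IP=10.0.0.5

    # or per application, for the driver side:
    spark-submit \
      --conf spark.driver.bindAddress=10.0.0.5 \
      --conf spark.driver.host=10.0.0.5 \
      --class com.example.MyApp \
      myapp.jar   # placeholder class and jar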

Re: Data loss in spark job

2018-02-27 Thread yncxcw
hi, Please check whether your OS allows memory overcommit. I suspect this is caused by your OS forbidding memory overcommit and killing a process when overcommitment is detected (the Spark executor is the one chosen to be killed). This is why you receive a SIGTERM, and the executor failed with the
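On Linux the relevant knob is vm.overcommit_memory (0 = heuristic overcommit, 1 = always allow, 2 = strict accounting); a quick check, assuming a stock kernel:

    sysctl vm.overcommit_memory                       # current policy
    dmesg | grep -iE 'out of memory|killed process'   # did the OOM killer take the executor?
    sudo sysctl -w vm.overcommit_memory=1             # testing only: always allow overcommit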

Re: Spark EMR executor-core vs Vcores

2018-02-26 Thread yncxcw
hi, all I also noticed this problem. The reason is that YARN accounts each executor as only 1 vcore, no matter how many cores you configure, because by default YARN uses memory as the only metric for resource allocation. It means that YARN will pack as many executors onto each node as long as the
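If you want YARN to schedule on vcores as well as memory, the CapacityScheduler can be switched from the memory-only DefaultResourceCalculator; a sketch for capacity-scheduler.xml (ResourceManager restart required):

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <!-- the default DefaultResourceCalculator considers memory only -->
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>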