Fwd: [ANNOUNCE] Apache Sedona 1.6.1 released
Dear all,

We are happy to report that we have released Apache Sedona 1.6.1. Apache Sedona is a cluster computing system for processing large-scale spatial data.

Website: http://sedona.apache.org/
Release notes: https://github.com/apache/sedona/blob/sedona-1.6.1/docs/setup/release-notes.md
Download links: https://github.com/apache/sedona/releases/tag/sedona-1.6.1

Additional resources:
Mailing list: d...@sedona.apache.org
Twitter: https://twitter.com/ApacheSedona
LinkedIn: https://www.linkedin.com/company/apache-sedona

Regards,
Apache Sedona Team
Re: how to use spark.mesos.constraints
Hi,

I am also trying to use spark.mesos.constraints, but it gives me the same error: the job has not been accepted by any resources. I suspect I need to start some additional service, such as ./sbin/start-mesos-shuffle-service.sh. Am I correct?

Thanks,
Jia

On Tue, Dec 1, 2015 at 5:14 PM, rarediel wrote:
> I am trying to add Mesos constraints to my spark-submit command in my
> marathon file. I am setting spark.mesos.coarse=true.
>
> Here is an example of a constraint I am trying to set.
>
> --conf spark.mesos.constraint=cpus:2
>
> I want to use the constraints to control the number of executors that are
> created, so I can control the total memory of my Spark job.
>
> I've tried many variations of resource constraints, but no matter which
> resource or what number, range, etc. I use, I always get the error "Initial
> job has not accepted any resources; check your cluster UI...". My cluster
> has the available resources. Are there any examples I can look at where
> people use resource constraints?
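For reference, a rough sketch (not a verified fix) of how offer constraints are usually passed to spark-submit on Mesos. The documented key is spark.mesos.constraints (plural), it was added in Spark 1.5, and it matches Mesos agent *attributes* (key:value pairs configured on the slaves), not resources such as cpus. The master URL, the "rack" attribute, and the class/jar names below are placeholders; to cap the total memory of a coarse-grained job, spark.cores.max and spark.executor.memory are usually the more direct levers than offer constraints.

    # Placeholder master URL, attribute, class, and jar; adjust for your cluster.
    spark-submit \
      --master mesos://zk://zookeeper-host:2181/mesos \
      --conf spark.mesos.coarse=true \
      --conf spark.mesos.constraints="rack:us-east-1a" \
      --conf spark.cores.max=16 \
      --conf spark.executor.memory=4g \
      --class com.example.MyJob \
      my-job.jar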
Re: What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle?
Hi Peng,

I got exactly the same error! My shuffle data is also very large. Have you figured out a way to solve it?

Thanks,
Jia

On Fri, Apr 24, 2015 at 7:59 AM, Peng Cheng wrote:
> I'm deploying a Spark data processing job on an EC2 cluster. The job is small
> for the cluster (16 cores with 120G RAM in total); the largest RDD has only
> 76k+ rows, but it is heavily skewed in the middle (thus requires repartitioning)
> and each row has around 100k of data after serialization. The job always gets
> stuck in repartitioning. Namely, the job constantly hits the following
> errors and retries:
>
> org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
> location for shuffle
>
> org.apache.spark.shuffle.FetchFailedException: Error in opening
> FileSegmentManagedBuffer
>
> org.apache.spark.shuffle.FetchFailedException:
> java.io.FileNotFoundException: /tmp/spark-...
>
> I've tried to identify the problem, but it seems like both memory and disk
> consumption of the machines throwing these errors are below 50%. I've also
> tried different configurations, including:
>
> let driver/executor memory use 60% of total memory.
> let Netty prioritize the JVM shuffle buffer.
> increase the shuffle streaming buffer to 128m.
> use KryoSerializer and max out all buffers.
> increase shuffle memoryFraction to 0.4.
>
> But none of them work. The small job always triggers the same series of
> errors and maxes out retries (up to 1000 times). How do I troubleshoot
> this in such a situation?
>
> Thanks a lot if you have any clue.
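A rough sketch of how the tuning attempts above might map onto spark-submit flags on a Spark 1.x cluster. The values are illustrative rather than recommended, the mapping of "let Netty prioritize the JVM shuffle buffer" onto spark.shuffle.io.preferDirectBufs is an interpretation, several property names shifted between 1.x releases (for example spark.reducer.maxSizeInFlight was spark.reducer.maxMbInFlight before 1.4), and the class and jar names are placeholders:

    # Illustrative values only; check the property names against your Spark version.
    spark-submit \
      --conf spark.driver.memory=7g \
      --conf spark.executor.memory=7g \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf spark.kryoserializer.buffer.max=512m \
      --conf spark.shuffle.io.preferDirectBufs=false \
      --conf spark.reducer.maxSizeInFlight=128m \
      --conf spark.shuffle.memoryFraction=0.4 \
      --class com.example.RepartitionJob \
      my-job.jar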
Help!!! Map or join large datasets, then suddenly remote Akka client disassociated
Hi folks,

Help me! I met a very weird problem and I really need some help! Here is my situation:

Case: Assign keys to two datasets (one is 96GB with 2.7 billion records and one is 1.5GB with 30k records) via MapPartitions first, then join them together on their keys.

Environment: Standalone Spark on Amazon EC2
Master * 1: 13GB, 8 cores
Worker * 16: each 13GB, 8 cores (after hitting this problem, I switched to 16 workers with 59GB, 8 cores each)
Read and write on HDFS (same cluster)

Problem:

At the beginning, the MapPartitions looks fine. But when Spark does the join of the two datasets, the console says

*"ERROR TaskSchedulerImpl: Lost executor 4 on ip-172-31-27-174.us-west-2.compute.internal: remote Akka client disassociated"*

Then I went back to this worker and checked its log. There is something like "Master said remote Akka client disassociated and asked to kill executor ***, and then the worker killed this executor." (Sorry, I deleted that log and only remember the content.) There are no other errors before the Akka client disassociated (for both master and worker).

Then I tried a 62GB dataset with the 1.5GB dataset. My job worked smoothly. *HOWEVER, I found one thing: if I set spark.shuffle.memoryFraction to zero, the same error happens on this 62GB dataset.*

Then I switched my workers to 16 workers with 59GB, 8 cores each. The error even happened when Spark does the MapPartitions.

Some metrics I found:
*When I do the MapPartitions or join with the 96GB data, its shuffle write is around 100GB. And when I cache the 96GB data, its size is around 530GB.*
*Garbage collection time for the 96GB dataset when Spark does the map or join is around 12 seconds.*

My analysis: This problem might be caused by the large shuffle write. The large shuffle write causes high disk I/O. If the shuffle write cannot finish within some timeout period, the master will think this executor is disassociated. But I don't know how to solve this problem.

Any help will be appreciated!!!

Thanks,
Jia
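A rough sketch, not a verified fix, of the Spark 1.x settings that are often raised when executors are dropped during long GC pauses or heavy shuffle writes on a standalone cluster. The timeout and fraction values are illustrative assumptions, the class/jar names are placeholders, and the Akka-based properties were removed in later Spark versions:

    # Illustrative values; check the property names against your Spark version.
    spark-submit \
      --conf spark.shuffle.memoryFraction=0.3 \
      --conf spark.storage.memoryFraction=0.4 \
      --conf spark.akka.timeout=300 \
      --conf spark.akka.frameSize=128 \
      --conf spark.core.connection.ack.wait.timeout=600 \
      --class com.example.JoinJob \
      join-job.jar

Keeping spark.shuffle.memoryFraction above zero leaves the shuffle an in-memory buffer before it spills to disk, which may be why the error reappears when it is set to zero in the experiment above.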
Cannot change the memory of workers
Hi guys,

Currently I am running a Spark program on Amazon EC2. Each worker has around (slightly less than) 2 GB of memory. By default, I can see each worker is allocated 976 MB of memory, as the table below from the Spark web UI shows. I know this value comes from (total memory minus 1 GB), but I want more than 1 GB in each of my workers.

Address    State    Cores         Memory
           ALIVE    1 (0 Used)    976.0 MB (0.0 B Used)

Based on the instructions on the Spark website, I put "export SPARK_WORKER_MEMORY=1g" in spark-env.sh, but it doesn't work. BTW, I can set "SPARK_EXECUTOR_MEMORY=1g" and it works.

Can anyone help me? Is there a requirement that each worker must keep 1 GB of memory for itself, aside from the memory for Spark?

Thanks,
Jia
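For what it's worth, a minimal sketch of conf/spark-env.sh on a worker node, assuming a roughly 2 GB machine; the 1500m and 512m values are assumptions, not recommendations. SPARK_WORKER_MEMORY caps what the worker can offer to executors (the default is total RAM minus 1 GB, which is where the 976 MB above comes from), SPARK_DAEMON_MEMORY sets the heap of the worker daemon itself, and the workers have to be restarted for the change to take effect:

    # conf/spark-env.sh on each worker node (illustrative values)
    export SPARK_WORKER_MEMORY=1500m   # total memory this worker offers to executors
    export SPARK_DAEMON_MEMORY=512m    # heap for the standalone master/worker daemons
    # Restart the standalone cluster afterwards, e.g. sbin/stop-all.sh && sbin/start-all.sh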