Re: coalesce ending up very unbalanced - but why?
Since it's pyspark it's just using the default hash partitioning, I believe. Trying a prime number (71, so that there are still enough CPUs) doesn't seem to change anything. Out of curiosity, why did you suggest that? Googling "spark coalesce prime" doesn't give me any clue :-)

Adrian

On 14/12/2016 13:58, Dirceu Semighini Filho wrote:
Hi Adrian, which kind of partitioning are you using? Have you already tried to coalesce it to a prime number?
coalesce ending up very unbalanced - but why?
I realise that coalesce() isn't guaranteed to be balanced, and adding a repartition() does indeed fix this (at the cost of a large shuffle). I'm trying to understand _why_ it's so uneven - hopefully it helps someone else too.

This is using Spark v2.0.2 (pyspark). Essentially we're just reading CSVs into a DataFrame (which we persist serialised for some calculations), then writing it back out as Parquet. To avoid too many Parquet files I've set a coalesce of 72 (9 boxes, 8 CPUs each). The writers end up with about 700-900MB each (not bad) - except for one, which was at 6GB before I killed it.

Input data is 12000 gzipped CSV files in S3 (approx 30GB), almost all about 2MB each, named like this:
s3://example-rawdata-prod/data/2016-12-13/v3.19.0/1481587209-i-da71c942-389.gz
s3://example-rawdata-prod/data/2016-12-13/v3.19.0/1481587529-i-01d3dab021b760d29-334.gz
(we're aware that this isn't an ideal naming convention from an S3 performance PoV).

The actual CSV file format is UUID\tINT\tINT\t... (wide rows - about 300 columns), e.g.:
17f9c2a7-ddf6-42d3-bada-63b845cb33a5  1481587198750  11213  ...
1d723493-5341-450d-a506-5c96ce0697f0  1481587198751  11212  ...
64cec96f-732c-44b8-a02e-098d5b63ad77  1481587198752  11211  ...

The dataframe seems to be stored evenly on all the nodes (according to the storage tab) and all the blocks are the same size. Most of the tasks are executed at NODE_LOCAL locality (although there are a few ANY); the oversized task is NODE_LOCAL though. The reading and calculations all seem evenly spread, so I'm confused why the writes aren't - I'd expect the input partitions to be even. What's causing this, and what can we do about it?

Maybe it's possible for coalesce() to be a bit smarter in terms of which partitions it coalesces - balancing the size of the final partitions rather than the number of source partitions in each final partition.

Thanks for any light you can shine!

Adrian
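In the meantime, the workaround we've got is simply to swap the final coalesce(72) for repartition(72) and pay for the shuffle. A minimal PySpark sketch (Spark 2.0 DataFrame reader; the paths, separator and partition count are placeholders rather than our real job):

from pyspark.sql import SparkSession

# Hedged sketch: repartition(72) costs a full shuffle but rebalances the 72 output
# files, whereas coalesce(72) only merges source partitions and can leave one output
# partition far larger than the rest.
spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

df = (spark.read
      .option("sep", "\t")
      .csv("s3://example-rawdata-prod/data/2016-12-13/v3.19.0/"))  # ~12000 gzipped CSVs

df = df.persist()
df.count()        # materialise the cache; the intermediate calculations happen here

df.repartition(72).write.parquet("s3://example-rawdata-prod/parquet/2016-12-13/")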
Re: mesos in spark 2.0.1 - must call stop() otherwise app hangs
Fab, thanks all - I'll ensure we fix our code :-)

On 05/10/2016 18:10, Sean Owen wrote:
Being discussed as we speak at https://issues.apache.org/jira/browse/SPARK-17707
Calling stop() is definitely the right thing to do and always has been (see examples), but it may be possible to get rid of the new non-daemon thread that prevents shutdown, to make it possible to still get away without it.
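For anyone else tripping over this, the change on our side is just making sure stop() is always called. A minimal PySpark sketch (the app name and job body are placeholders, not our real code):

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("example-job")   # placeholder name
sc = SparkContext(conf=conf)
try:
    # stand-in for the real job
    print(sc.parallelize(range(1000)).count())
finally:
    sc.stop()   # without this, 2.0.1 on Mesos leaves a non-daemon thread running and the app hangs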
Re: Issue with rogue data in csv file used in Spark application
We use spark-csv (a successor of which is built into Spark 2.0) for this. It doesn't cause crashes; failed parsing is logged. We run on Mesos, so I have to pull back all the logs from all the executors and search for failed lines (so that we can ensure that the failure rate isn't too high).

Hope this helps.

Adrian
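For reference, a sketch of how we'd set that up with spark-csv 1.x (assumes the spark-csv package is on the classpath; the path and header option are placeholders):

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

# Hedged sketch: DROPMALFORMED drops (and logs) lines it can't parse, so a handful of
# rogue rows don't kill the job; PERMISSIVE (the default) nulls out bad fields instead.
df = (sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("mode", "DROPMALFORMED")   # alternatives: PERMISSIVE, FAILFAST
      .load("hdfs:///data/input/*.csv")) # placeholder path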
Re: very high maxresults setting (no collect())
Hi Michael,

No Spark upgrade, but we've been changing some of our data pipelines, so the data volumes have probably been getting a bit larger. Just in the last few weeks we've seen quite a few jobs needing a larger maxResultSize - some have gone from "fine with the 1GB default" to 3GB. I'm wondering what besides a collect could cause this, as there's certainly no explicit collect() in the code.

Mesos, parquet source data, a broadcast of a small table earlier which is joined, then just a few aggregations, a select, a coalesce and a spark-csv write. The executors go along nicely (as does the driver), and then we start to hit memory pressure on the driver in the output loop and the job grinds to a halt (we eventually have to kill it and restart with more memory).

Adrian
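One theory I want to test (purely a guess on my part, not a confirmed cause) is whether the automatic broadcast join is what's pulling data back through the driver, since the broadcast table is collected there first. Turning the threshold off should at least show whether it's involved - a sketch:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext()
sqlContext = SQLContext(sc)

# Hedged sketch: -1 disables automatic broadcast joins entirely (the default threshold
# is 10MB), purely as a diagnostic to see whether they are what's accumulating on the driver.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")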
very high maxresults setting (no collect())
Hi,

We've recently started seeing a huge increase in spark.driver.maxResultSize - we are starting to set it at 3GB (and increasing our driver memory a lot, to 12GB or so). This is on v1.6.1 with the Mesos scheduler. All the docs I can find say this is to do with .collect() being called on a large RDD, which isn't the case AFAIK - there's certainly nothing in the code - so it's rather puzzling me as to what's going on. I thought that the number of tasks was coming into it (about 14000 tasks in each of about a dozen stages); adding a coalesce seemed to help, but now we are hitting the problem again after a few minor code tweaks. What else could be contributing to this?

Thoughts I've had:
- number of tasks
- metrics?
- um, a bit stuck!

The code looks like this:

df = df.persist()
val rows = df.count()   // actually we loop over this a few times
val output = df.
  groupBy("id").agg(
    avg($"score").as("avg_score"),
    count($"id").as("rows")
  ).
  select(
    $"id",
    $"avg_score",
    $"rows"
  ).sort($"id")
output.coalesce(1000).write.format("com.databricks.spark.csv").save("/tmp/...")

Cheers for any help/pointers! There are a couple of memory-leak tickets fixed in v1.6.2 that may affect the driver, so I may try an upgrade (the executors are fine).

Adrian
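For anyone else hitting this, the setting we're experimenting with in the meantime (shown as a PySpark sketch since the config key is the same either way; the value is a guess rather than a tuned number):

from pyspark import SparkConf, SparkContext

# Hedged sketch: every task ships a serialized result/status back to the driver, and
# spark.driver.maxResultSize caps that total per action - so ~14000 tasks across a
# dozen stages can trip it even without an explicit collect(). It must be set before
# the context is created.
conf = SparkConf().set("spark.driver.maxResultSize", "3g")
sc = SparkContext(conf=conf)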
2.0.1/2.1.x release dates
Just wondering if there were any rumoured release dates for either of the above - I'm seeing some odd hangs with 2.0.0 and Mesos (and I know that the Mesos integration has had a bit of updating in 2.1.x).

Looking at JIRA, there's no suggested release date, and issues seem to be added to a release version only once resolved, so the usual trick of looking at the resolved/unresolved ratio isn't helping :-) The wiki only mentions 2.0.0, so no joy there either. I'm still doing testing, but I don't want to test with 2.1.x if it's going to be under heavy development for a while longer.

Thanks for any info,

Adrian
coalesce serialising earlier work
In short: df.coalesce(1).write seems to make all the earlier calculations about the dataframe go through a single task (rather than running on multiple tasks and then sending the final dataframe through a single worker). Any idea how we can force the job to run in parallel?

In more detail: we have a job that we wish to write out as a single CSV file. We have two approaches (code below):

df = (filtering, calculations)
df.coalesce(num).write.
  format("com.databricks.spark.csv").
  option("codec", "org.apache.hadoop.io.compress.GzipCodec").
  save(output_path)

Option A (num=100):
- dataframe calculated in parallel
- upload in parallel
- gzip in parallel
- but we then have to run "hdfs dfs -getmerge" to download all the data and then write it back again.

Option B (num=1):
- single gzip (but gzip is pretty quick)
- uploads go through a single machine
- no HDFS commands
- dataframe is _not_ calculated in parallel (we can see filters getting just a single task)

What I'm not sure about is why Spark (1.6.1) is deciding to run just a single task for the calculation - and what can we do about it? We really want the df to be calculated in parallel and only _then_ coalesced before being written. (It may be that the -getmerge approach will still be faster.) df.coalesce(100).coalesce(1).write doesn't look very likely to help!

Adrian
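Two workarounds I'm looking at (sketches only - I haven't yet measured whether either beats the -getmerge approach): materialise the dataframe before the coalesce, or use repartition(1) so the shuffle boundary keeps the upstream stage parallel.

from pyspark import StorageLevel

# Hedged sketch (PySpark 1.6): 'df' is the filtered/calculated dataframe from above;
# output_path is a placeholder. Materialising the cache first means the expensive work
# runs with full parallelism; only the cached rows then funnel through the single
# coalesce(1) write task.
output_path = "hdfs:///tmp/output"            # placeholder

df = df.persist(StorageLevel.MEMORY_AND_DISK)
df.count()                                    # forces the filtering/calculations to run in parallel

(df.coalesce(1)
   .write
   .format("com.databricks.spark.csv")
   .option("codec", "org.apache.hadoop.io.compress.GzipCodec")
   .save(output_path))

# Alternative: df.repartition(1).write... - the shuffle keeps the upstream stage
# parallel, at the cost of moving all the data onto one partition for the write.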
odd python.PythonRunner Times values?
I'm seeing output like this on our Mesos Spark slaves:

16/05/23 11:44:04 INFO python.PythonRunner: Times: total = 1137, boot = -590, init = 593, finish = 1134
16/05/23 11:44:04 INFO python.PythonRunner: Times: total = 1652, boot = -446, init = 481, finish = 1617

This seems to be coming from pyspark/worker.py; however, it looks like the times should be printed as milliseconds (as longs) - and the negative boot values above certainly don't look that way!
Re: Worker's BlockManager Folder not getting cleared
Hi Abhi - are you running on Mesos perchance?

If so, then with Spark <1.6 you will be hitting https://issues.apache.org/jira/browse/SPARK-10975
With Spark >= 1.6: https://issues.apache.org/jira/browse/SPARK-12430
and also be aware of: https://issues.apache.org/jira/browse/SPARK-12583

On 25/01/2016 07:14, Abhishek Anand wrote:
Hi All,

How long are the shuffle files and data files stored in the block manager folder of the workers? I have a Spark Streaming job with a window duration of 2 hours and a slide interval of 15 minutes. When I execute the following command in my block manager path

find . -type f -cmin +150 -name "shuffle*" -exec ls {} \;

I see a lot of files, which means that they are not getting cleared, although I was expecting that they would be. Subsequently this size keeps on increasing and takes up space on the disk. Please suggest how to get rid of this and help me understand this behaviour.

Thanks !!!
Abhi
Re: Executor deregistered after 2mins (mesos, 1.6.0-rc4)
I've worked around this by setting spark.shuffle.io.connectionTimeout=3600s, uploading the Spark tarball to HDFS again and restarting the shuffle service (not 100% sure that last step is needed).

I attempted (with my newbie Scala skills) to allow ExternalShuffleClient() to accept an optional closeIdleConnections parameter (defaulting to "true") so that the MesosExternalShuffleClient can set it to "false", and then passed this into the TransportContext call. However this didn't seem to work (maybe it's using the config from the tarball in HDFS rather than the local Spark install, which I thought the driver used). Anyhow, I'll do more testing and then raise a JIRA.

Adrian
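For reference, this is how I'm applying the workaround from the driver side (just a sketch - per the above I'm still not certain whether the shuffle service end needs the same setting too):

from pyspark import SparkConf, SparkContext

# Hedged sketch of the workaround: raise the idle-connection timeout well past the
# 120s default so the external shuffle service doesn't drop quiet executor connections.
conf = (SparkConf()
        .set("spark.shuffle.service.enabled", "true")
        .set("spark.shuffle.io.connectionTimeout", "3600s"))
sc = SparkContext(conf=conf)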
Re: Executor deregistered after 2mins (mesos, 1.6.0-rc4)
Hi Ted, sorry I should have been a bit more consistent in my cut and paste (there are nine nodes + driver) - I was concentrating on S9/6 (these logs are from that box - 10.1.201.165). The S1/4 lines are:

15/12/29 18:49:45 INFO CoarseMesosSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-1-202-114.ec2.internal:19891) with ID f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4
15/12/29 18:49:45 INFO ExecutorAllocationManager: New executor f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4 has registered (new total is 6)
15/12/29 18:49:45 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-1-202-114.ec2.internal:14257 with 13.8 GB RAM, BlockManagerId(f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4, ip-10-1-202-114.ec2.internal, 14257)
15/12/29 18:58:07 WARN TaskSetManager: Lost task 21.0 in stage 1.0 (TID 2149, ip-10-1-200-232.ec2.internal): FetchFailed(BlockManagerId(f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4, ip-10-1-202-114.ec2.internal, 7337), shuffleId=1, mapId=5, reduceId=21, message= org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Executor is not registered (appId=a9344e17-f767-4b1e-a32e-e98922d6ca43-0014, execId=f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:183)

I've tried to see how I can increase the idle timeout of mesosExternalShuffleClient.registerDriverWithShuffleService, as that seems to be the core issue.

On 29/12/2015 21:17, Ted Yu wrote:
Have you searched the log for 'f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4'? In the snippet you posted, I don't see registration of this executor.
Cheers

On Tue, Dec 29, 2015 at 12:43 PM, Adrian Bridgett <adr...@opensignal.com> wrote:
We're seeing an "Executor is not registered" error on a Spark (1.6.0rc4, mesos-0.26) cluster. It seems as if the logic in MesosExternalShuffleService.scala isn't working for some reason (new in 1.6 I believe). The Spark application sees this:

...
15/12/29 18:49:41 INFO MesosExternalShuffleClient: Successfully registered app a9344e17-f767-4b1e-a32e-e98922d6ca43-0014 with external shuffle service.
15/12/29 18:49:41 INFO MesosExternalShuffleClient: Successfully registered app a9344e17-f767-4b1e-a32e-e98922d6ca43-0014 with external shuffle service.
15/12/29 18:49:43 INFO CoarseMesosSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-1-201-165.ec2.internal:37660) with ID f02cb67a-3519-4655-b23a-edc0dd082bf1-S9/6
15/12/29 18:49:43 INFO ExecutorAllocationManager: New executor f02cb67a-3519-4655-b23a-edc0dd082bf1-S3/1 has registered (new total is 1)
15/12/29 18:49:43 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-1-201-165.ec2.internal:53854 with 13.8 GB RAM, BlockManagerId(f02cb67a-3519-4655-b23a-edc0dd082bf1-S9/6, ip-10-1-201-165.ec2.internal, 53854)
15/12/29 18:49:43 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-1-201-132.ec2.internal:12793 with 13.8 GB RAM, BlockManagerId(f02cb67a-3519-4655-b23a-edc0dd082bf1-S3/1, ip-10-1-201-132.ec2.internal, 12793)
15/12/29 18:49:43 INFO ExecutorAllocationManager: New executor f02cb67a-3519-4655-b23a-edc0dd082bf1-S9/6 has registered (new total is 2)
...
15/12/29 18:58:06 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on ip-10-1-201-165.ec2.internal:53854 (size: 5.2KB, free: 13.8GB) 15/12/29 18:58:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to ip-10-1-202-121.ec2.internal:59734 15/12/29 18:58:06 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 1671814 bytes 15/12/29 18:58:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to ip-10-1-201-165.ec2.internal:37660 ... 15/12/29 18:58:07 INFO TaskSetManager: Starting task 63.0 in stage 1.0 (TID 2191, ip-10-1-200-232.ec2.internal, partition 63,PROCESS_LOCAL, 2171 bytes) 15/12/29 18:58:07 WARN TaskSetManager: Lost task 21.0 in stage 1.0 (TID 2149, ip-10-1-200-232.ec2.internal): FetchFailed(BlockManagerId(f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4, ip-10-1-202-114.ec2.internal, 7337), shuffleId=1, mapId=5, reduceId=21, message= org.apache.spark.shuffle.FetchFailedException: java.lang.RuntimeException: Executor is not registered (appId=a9344e17-f767-4b1e-a32e-e98922d6ca43-0014, execId=f02cb67a-3519-4655-b23a-edc0dd082bf1-S1/4) at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.getBlockData(ExternalShuffleBlockResolver.java:183) ... 15/12/29 18:58:07 INFO DAGScheduler: Resubmitting ShuffleMapStage 0 (reduceByKey at /home/ubuntu/ajay/name-mapper/kpis/namemap_kpi_processor.py:48) and ShuffleM
Re: Executor deregistered after 2mins (mesos, 1.6.0-rc4)
To wrap this up: it's the shuffle manager sending the FIN, so setting spark.shuffle.io.connectionTimeout to 3600s is the only workaround right now. SPARK-12583 raised.

Adrian
Executor deregistered after 2mins (mesos, 1.6.0-rc4)
I haven't been able to increase the time from 120s to anything higher, even when I set this on the driver (will retry setting that on the shuffle service):

spark.network.timeout 180s
spark.shuffle.io.connectionTimeout 240s

Adrian
Re: default parallelism and mesos executors
Thanks Iulian, I'll retest with 1.6.x once it's released (I probably won't have enough spare time to test with the RC).

On 11/12/2015 15:00, Iulian Dragoș wrote:

On Wed, Dec 9, 2015 at 4:29 PM, Adrian Bridgett <adr...@opensignal.com> wrote:
(resending, text only as first post on 2nd never seemed to make it) Using parallelize() on a dataset I'm only seeing two tasks rather than the number of cores in the Mesos cluster. This is with Spark 1.5.1 and the Mesos coarse-grained scheduler. Running pyspark in a console seems to show that it's taking a while before the Mesos executors come online (at which point the default parallelism changes). If I add "sleep 30" after initialising the SparkContext I get the "right" number (42, by coincidence!). I've just tried increasing minRegisteredResourcesRatio to 0.5 but this doesn't affect either the test case below nor my code.

This limit seems to be implemented only in the coarse-grained Mesos scheduler, but the fix will be available starting with Spark 1.6.0 (1.5.2 doesn't have it).

iulian
default parallelism and mesos executors
(resending, text only as the first post on the 2nd never seemed to make it)

Using parallelize() on a dataset I'm only seeing two tasks rather than the number of cores in the Mesos cluster. This is with Spark 1.5.1 and the Mesos coarse-grained scheduler. Running pyspark in a console seems to show that it's taking a while before the Mesos executors come online (at which point the default parallelism changes) - if I add "sleep 30" after initialising the SparkContext I get the "right" number (42, by coincidence!). I've just tried increasing minRegisteredResourcesRatio to 0.5 but this doesn't affect either the test case below nor my code. Is there something else I can do instead? Perhaps it should be looking at how many tasks _should_ be available rather than how many are (I'm also using dynamicAllocation).

15/12/02 14:34:09 INFO mesos.CoarseMesosSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
>>> print (sc.defaultParallelism)
2
>>> 15/12/02 14:34:12 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 5 is now TASK_RUNNING
15/12/02 14:34:13 INFO mesos.MesosExternalShuffleClient: Successfully registered app 20151117-115458-164233482-5050-24333-0126 with external shuffle service.
15/12/02 14:34:15 INFO mesos.CoarseMesosSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-1-200-147.ec2.internal:41194/user/Executor#-1021429650]) with ID 20151117-115458-164233482-5050-24333-S22/5
15/12/02 14:34:15 INFO spark.ExecutorAllocationManager: New executor 20151117-115458-164233482-5050-24333-S22/5 has registered (new total is 1)
....
>>> print (sc.defaultParallelism)
42

Thanks

Adrian Bridgett
default parallelism and mesos executors
Using parallelize() on a dataset I'm only seeing two tasks rather than the number of cores in the Mesos cluster. This is with Spark 1.5.1 and the Mesos coarse-grained scheduler. Running pyspark in a console seems to show that it's taking a while before the Mesos executors come online (at which point the default parallelism changes) - if I add "sleep 30" after initialising the SparkContext I get the "right" number (42, by coincidence!). I've just tried increasing minRegisteredResourcesRatio to 0.5 but this doesn't affect either the test case below nor my code. Is there something else I can do instead? Perhaps it should be looking at how many tasks _should_ be available rather than how many are (I'm also using dynamicAllocation).

15/12/02 14:34:09 INFO mesos.CoarseMesosSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
>>> print (sc.defaultParallelism)
2
>>> 15/12/02 14:34:12 INFO mesos.CoarseMesosSchedulerBackend: Mesos task 5 is now TASK_RUNNING
15/12/02 14:34:13 INFO mesos.MesosExternalShuffleClient: Successfully registered app 20151117-115458-164233482-5050-24333-0126 with external shuffle service.
15/12/02 14:34:15 INFO mesos.CoarseMesosSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@ip-10-1-200-147.ec2.internal:41194/user/Executor#-1021429650]) with ID 20151117-115458-164233482-5050-24333-S22/5
15/12/02 14:34:15 INFO spark.ExecutorAllocationManager: New executor 20151117-115458-164233482-5050-24333-S22/5 has registered (new total is 1)
>>> print (sc.defaultParallelism)
42

Adrian Bridgett
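For what it's worth, the "sleep 30" can be made slightly less crude by polling until the executors have registered. A sketch (the target parallelism and the timeout are arbitrary numbers, not anything official):

import time
from pyspark import SparkContext

sc = SparkContext()

# sc.defaultParallelism grows as Mesos executors register, so wait (with a deadline)
# until it reaches what we expect rather than sleeping for a fixed 30 seconds.
expected = 42                      # placeholder: the cores we expect once executors are up
deadline = time.time() + 60
while sc.defaultParallelism < expected and time.time() < deadline:
    time.sleep(1)

print(sc.defaultParallelism)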
Re: hdfs-ha on mesos - odd bug
Hi Sam,

In short, no, it's a traditional install, as we plan to use spot instances and didn't want price spikes to kill off HDFS. We're actually doing a bit of a hybrid: spot instances for the Mesos slaves, on-demand for the Mesos masters. So for the time being we're putting HDFS on the masters (we'll probably move to multiple slave instance types to avoid losing too many when the spot price spikes, but for now this is acceptable). The masters are running CDH5.

Using hdfs://current-hdfs-master:8020 works fine; however, using hdfs://nameservice1 fails in the rather odd way described (well, more that the workaround actually works!). I think there's some underlying bug here that's being exposed.

On 14/09/2015 22:27, Sam Bessalah wrote:
I don't know about the broken url. But are you running HDFS as a mesos framework? If so is it using mesos-dns? Then you should resolve the namenode via hdfs:///

On Mon, Sep 14, 2015 at 3:55 PM, Adrian Bridgett <adr...@opensignal.com> wrote:
I'm hitting an odd issue with running Spark on Mesos together with HA-HDFS, with an even odder workaround. In particular I get an error that it can't find the HDFS nameservice unless I put in a _broken_ url (I discovered that workaround by mistake!). core-site.xml and hdfs-site.xml are distributed to the slave node - and those files are being read, since if I deliberately break them I get an error as you'd expect. NB: this is a bit different to http://mail-archives.us.apache.org/mod_mbox/spark-user/201402.mbox/%3c1392442185079-1549.p...@n3.nabble.com%3E

Spark 1.5.0:

t=sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
(fails)

t=sc.textFile("file://etc/passwd")
t.count()
(errors about bad url - should have an extra / of course)

t=sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
then it works!!!

I should say that using file:///etc/passwd or hdfs:///tmp/issue both fail as well, unless preceded by a broken url. I've tried setting spark.hadoop.cloneConf to true - no change.

Sample (broken) run:

15/09/14 13:00:14 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
15/09/14 13:00:14 DEBUG : address: ip-10-1-200-165/10.1.200.165 isLoopbackAddress: false, with host 10.1.200.165 ip-10-1-200-165
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG HAUtil: No HA service delegation token found for logical URI hdfs://nameservice1
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:00:14 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@6245f50b
15/09/14 13:00:14 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@267f0fd3
15/09/14 13:00:14 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/09/14 13:00:14 DEBUG NativeCodeLoader: Loaded the native-hadoop library ... 15/09/14 13:00:14 DEBUG Client: Connecting to mesos-1.example.com/10.1.200.165:8020 <http://mesos-1.example.com/10.1.200.165:8020> 15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 <http://mesos-1.example.com/10.1.200.165:8020> from ubuntu: starting, having connections 1 15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 <http://mesos-1.example.com/10.1.200.165:8020> from ubuntu sending #0 15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 <http://mesos-1.example.com/10.1.200.165:8020> from ubuntu got value #0 15/09/14 13:00:14 DEBUG ProtobufRpcEngine: Call: getFileInfo took 36ms 15/09/14 13:00:14 DEBUG FileInputFormat: Time taken to get FileStatuses: 69 15/09/14 13:00:14 INFO FileInputFormat: Total input paths to process : 1 15/09/14 13:00:14 DEB
Re: hdfs-ha on mesos - odd bug
Thanks Steve - we are already taking the safe route: NN and datanodes on the central mesos-masters, which are on-demand instances. Later (much later!) we _may_ put some datanodes on spot instances (using several spot instance types, as the spikes seem to only affect one type - and worst case we can rebuild the data as well). OTOH this would mainly only be beneficial if spark/mesos understood the data locality, which is probably some time off (we don't need this ability now).

Indeed, the error we are seeing is orthogonal to the setup - however, my understanding of HA-HDFS is that it should be resolved via the hdfs-site.xml file and doesn't use DNS whatsoever (and indeed, it _does_ work - but only after we initialise the driver with a bad hdfs url). I therefore think there's some (missing) HDFS initialisation when running Spark on Mesos - my suspicion is on the Spark side (or my Spark config).
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Configuration_details

On 15/09/2015 10:24, Steve Loughran wrote:

On 15 Sep 2015, at 08:55, Adrian Bridgett <adr...@opensignal.com> wrote:
Hi Sam, in short, no, it's a traditional install as we plan to use spot instances and didn't want price spikes to kill off HDFS. We're actually doing a bit of a hybrid, using spot instances for the mesos slaves, on-demand for the mesos masters. So for the time being, putting hdfs on the masters (we'll probably move to multiple slave instance types to avoid losing too many when the spot price spikes, but for now this is acceptable). Masters running CDH5.

It's incredibly dangerous using HDFS NNs on spot VMs; a significant enough spike will lose all of them in one go, and there goes your entire filesystem. Have a static VM, maybe even backed by EBS. If you look at Hadoop architectures from Hortonworks, Cloudera and Amazon themselves, the usual stance is HDFS on static nodes, spot instances for compute only.

Using hdfs://current-hdfs-master:8020 works fine, however using hdfs://nameservice1 fails in the rather odd way described (well, more that the workaround actually works!). I think there's some underlying bug here that's being exposed.

this sounds like an issue orthogonal to spot instances. Maybe related to how JVMs cache DNS entries forever?
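For completeness, these are the client-side HA settings that hdfs-site.xml should already be supplying, shown here as spark.hadoop.* overrides purely to illustrate what the driver needs in order to resolve the nameservice (the nn1/nn2 labels and hostnames are placeholders, per the Hadoop docs linked above):

from pyspark import SparkConf

# Hedged sketch: the standard HDFS-HA client settings expressed as spark.hadoop.*
# properties (Spark copies these into the Hadoop Configuration). Normally hdfs-site.xml
# provides all of this; hostnames and namenode labels below are placeholders.
conf = (SparkConf()
        .set("spark.hadoop.fs.defaultFS", "hdfs://nameservice1")
        .set("spark.hadoop.dfs.nameservices", "nameservice1")
        .set("spark.hadoop.dfs.ha.namenodes.nameservice1", "nn1,nn2")
        .set("spark.hadoop.dfs.namenode.rpc-address.nameservice1.nn1", "mesos-1.example.com:8020")
        .set("spark.hadoop.dfs.namenode.rpc-address.nameservice1.nn2", "mesos-2.example.com:8020")
        .set("spark.hadoop.dfs.client.failover.proxy.provider.nameservice1",
             "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"))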
hdfs-ha on mesos - odd bug
DEBUG : address: ip-10-1-200-245/10.1.200.245 isLoopbackAddress: false, with host 10.1.200.245 ip-10-1-200-245 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn 15/09/14 13:47:17 DEBUG HAUtil: No HA service delegation token found for logical URI hdfs://nameservice1 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false 15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn 15/09/14 13:47:17 DEBUG RetryUtils: multipleLinearRandomRetry = null 15/09/14 13:47:17 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@30b68416 15/09/14 13:47:17 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@4599b420 15/09/14 13:47:18 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library... 15/09/14 13:47:18 DEBUG NativeCodeLoader: Loaded the native-hadoop library 15/09/14 13:47:18 DEBUG DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@4ed189cf: starting with interruptCheckPeriodMs = 6 15/09/14 13:47:18 DEBUG PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled. 15/09/14 13:47:18 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection 15/09/14 13:47:18 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id 15/09/14 13:47:18 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id 15/09/14 13:47:18 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap 15/09/14 13:47:18 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition 15/09/14 13:47:18 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id 15/09/14 13:47:18 DEBUG Client: The ping interval is 6 ms. 15/09/14 13:47:18 DEBUG Client: Connecting to mesos-1.example.com/10.1.200.165:8020 15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having connections 1 15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0 15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0 15/09/14 13:47:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 28ms 15/09/14 13:47:18 DEBUG DFSClient: newInfo = LocatedBlocks{ -- *Adrian Bridgett* | Sysadmin Engineer, OpenSignal <http://www.opensignal.com> _ Office: First Floor, Scriptor Court, 155-157 Farringdon Road, Clerkenwell, London, EC1R 3AD Phone #: +44 777-377-8251 Skype: abridgett |@adrianbridgett <http://twitter.com/adrianbridgett>| LinkedIn link <https://uk.linkedin.com/in/abridgett> _
JNI issues with mesos
I'm trying to run Spark (1.4.1) on top of Mesos (0.23). I've followed the instructions (uploaded the Spark tarball to HDFS, set the executor uri in both places etc) and yet on the slaves it's failing to launch even the SparkPi example with a JNI error. It does run with a local master. A day of debugging later and it's time to ask for help!

bin/spark-submit --master mesos://10.1.201.191:5050 --class org.apache.spark.examples.SparkPi /tmp/examples.jar

(I'm putting the jar outside HDFS - on both the client box and the slave (other slaves turned off for debugging) - due to http://apache-spark-user-list.1001560.n3.nabble.com/Remote-jar-file-td20649.html. I should note that I had the same JNI errors when using the Mesos cluster dispatcher.) I'm using Oracle Java 8 (no other java - even openjdk - is installed).

As you can see, the slave is downloading the framework fine (you can even see it extracted on the slave). Can anyone shed some light on what's going on - e.g. how is it attempting to run the executor? I'm going to try a different JVM (and try a custom Spark distribution), but I suspect that the problem is much more basic - maybe it can't find the hadoop native libs? Any light would be much appreciated :)

I've included the slave's stderr below:

I0909 14:14:01.405185 32132 logging.cpp:177] Logging to STDERR
I0909 14:14:01.405256 32132 fetcher.cpp:409] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20150826-133446-3217621258-5050-4064-S0\/ubuntu","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"hdfs:\/\/\/apps\/spark\/spark.tgz"}}],"sandbox_directory":"\/tmp\/mesos\/slaves\/20150826-133446-3217621258-5050-4064-S0\/frameworks\/20150826-133446-3217621258-5050-4064-211198\/executors\/20150826-133446-3217621258-5050-4064-S0\/runs\/38077da2-553e-4888-bfa3-ece2ab2119f3","user":"ubuntu"}
I0909 14:14:01.406332 32132 fetcher.cpp:364] Fetching URI 'hdfs:///apps/spark/spark.tgz'
I0909 14:14:01.406344 32132 fetcher.cpp:238] Fetching directly into the sandbox directory
I0909 14:14:01.406358 32132 fetcher.cpp:176] Fetching URI 'hdfs:///apps/spark/spark.tgz'
I0909 14:14:01.679055 32132 fetcher.cpp:104] Downloading resource with Hadoop client from 'hdfs:///apps/spark/spark.tgz' to '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3/spark.tgz'
I0909 14:14:05.492626 32132 fetcher.cpp:76] Extracting with command: tar -C '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3' -xf '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3/spark.tgz'
I0909 14:14:07.489753 32132 fetcher.cpp:84] Extracted '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3/spark.tgz' into '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3'
W0909 14:14:07.489784 32132 fetcher.cpp:260] Copying instead of extracting resource from URI with 'extract' flag, because it does not seem
to be an archive: hdfs:///apps/spark/spark.tgz I0909 14:14:07.489791 32132 fetcher.cpp:441] Fetched 'hdfs:///apps/spark/spark.tgz' to '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S0/frameworks/20150826-133446-3217621258-5050-4064-211198/executors/20150826-133446-3217621258-5050-4064-S0/runs/38077da2-553e-4888-bfa3-ece2ab2119f3/spark.tgz' Error: A JNI error has occurred, please check your installation and try again Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger at java.lang.Class.getDeclaredMethods0(Native Method) at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) at java.lang.Class.privateGetMethodRecursive(Class.java:3048) at java.lang.Class.getMethod0(Class.java:3018) at java.lang.Class.getMethod(Class.java:1784) at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544) at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526) Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 7 more
Re: JNI issues with mesos
5 mins later... Trying 1.5 with a fairly plain build:

./make-distribution.sh --tgz --name os1 -Phadoop-2.6

and on my first attempt stderr showed:

I0909 15:16:49.392144 1619 fetcher.cpp:441] Fetched 'hdfs:///apps/spark/spark15.tgz' to '/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S1/frameworks/20150826-133446-3217621258-5050-4064-211204/executors/20150826-133446-3217621258-5050-4064-S1/runs/43026ba8-6624-4817-912c-3d7573433102/spark15.tgz'
sh: 1: cd: can't cd to spark15.tgz
sh: 1: ./bin/spark-class: not found

Aha - let's rename the file in HDFS (and in the two configs) from spark15.tgz to spark-1.5.0-bin-os1.tgz... Success!!! The same trick with 1.4 doesn't work, but now that I have something that does, I can make progress. Hopefully this helps someone else :-)

Adrian
Re: JNI issues with mesos
Thanks Tim,

There's a little more to it, in fact - if I use the pre-built-with-hadoop-2.6 binaries, all is good (with correctly named tarballs in HDFS). Using the pre-built-with-user-provided-hadoop binaries (including setting SPARK_DIST_CLASSPATH in spark-env.sh), I get the JNI exception.

Aha - I've found the minimal set of changes that fixes it. I can use the user-provided-hadoop tarballs, but I _have_ to add spark-env.sh to them (which I wasn't expecting - I don't recall seeing this anywhere in the docs, so I was expecting everything to be set up by spark/mesos from the client config). FWIW, spark-env.sh:

export SPARK_DIST_CLASSPATH=$(/opt/hadoop/bin/hadoop classpath)
#export MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos.so
export SPARK_EXECUTOR_URI=hdfs:///apps/spark/spark15.tgz

Leaving out SPARK_DIST_CLASSPATH leads to org.apache.hadoop.fs.FSDataInputStream class errors (as you'd expect). Leaving out MESOS_NATIVE_JAVA_LIBRARY seems to have no consequences ATM (it is set on the client). Leaving out SPARK_EXECUTOR_URI stops the job starting at all. spark-defaults.conf isn't required to be in the tarball; on the client it's set to:

spark.master mesos://zk://mesos-1.example.net:2181,mesos-2.example.net:2181,mesos-3.example.net:2181/mesos
spark.executor.uri hdfs:///apps/spark/spark15.tgz

I guess this is the way forward for us right now - a bit uncomfortable, as I like to understand why :-)

On 09/09/2015 18:43, Tim Chen wrote:
Hi Adrian,

Spark is expecting a specific naming of the tgz and also of the folder name inside, as this is generated by running make-distribution.sh --tgz in the Spark source folder. If you use a Spark 1.4 tgz generated with that script with the same name, upload it to HDFS again and fix the URI, then it should work.

Tim
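To recap for anyone hitting this later, the client-side settings that ended up working for us, as a PySpark sketch of the same spark-defaults.conf keys (the ZooKeeper string and HDFS path are our values from above; yours will differ):

from pyspark import SparkConf, SparkContext

# Hedged sketch of the client-side config: point Spark at the Mesos masters via
# ZooKeeper and tell the slaves where to fetch the (correctly named) tarball from.
conf = (SparkConf()
        .setMaster("mesos://zk://mesos-1.example.net:2181,mesos-2.example.net:2181,mesos-3.example.net:2181/mesos")
        .set("spark.executor.uri", "hdfs:///apps/spark/spark-1.5.0-bin-os1.tgz"))
sc = SparkContext(conf=conf)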