Hello,
I am testing Ozone and facing some problems. The part of Ozone I am interested
in is OzoneFS, since I work with Hadoop.
I installed the latest Cloudera distribution, which includes Ozone 0.5, followed
the instructions, and I am now able to run any hdfs dfs command against Ozone.
It looks great.
However, when I try to take it a step further and run YARN applications on top
of it (MapReduce or Spark jobs), I get errors.
I added this configuration in Spark:
spark.yarn.access.hadoopFileSystems=o3fs://[bucket].[volume].[hostname]:[port]
I tried googling the error messages but did not find any useful information on
how to make this work.
The documentation does not mention any additional configuration needed in YARN
or Spark, so I assumed it would work seamlessly.
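For completeness, this is roughly what my Spark configuration looks like (the bucket, volume, hostname, and port are placeholders, and the classpath entries are my assumption of how the Ozone filesystem client jar gets onto the driver and executors; the exact jar path depends on the installation):

```properties
# Ask YARN to obtain delegation tokens for the Ozone filesystem
spark.yarn.access.hadoopFileSystems=o3fs://[bucket].[volume].[hostname]:[port]
# Assumption: the Ozone filesystem client jar must also be visible to the
# driver and executors; adjust the path to wherever your distribution puts it.
spark.driver.extraClassPath=/path/to/hadoop-ozone-filesystem-lib-current.jar
spark.executor.extraClassPath=/path/to/hadoop-ozone-filesystem-lib-current.jar
```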
What am I missing? Can YARN work on top of OzoneFS?
Thank you,
Guy Shilo
Here are the errors:
20/08/04 13:49:50 WARN io.KeyOutputStream: Encountered exception
java.io.IOException: Unexpected Storage Container Exception:
java.util.concurrent.CompletionException:
java.util.concurrent.CompletionException:
org.apache.ratis.protocol.GroupMismatchException:
219b808a-b83b-459b-936f-3c57e9a9aa0e: group-62B2B3BD903A not found. on the
pipeline Pipeline[ Id: 3a7640a2-e590-4cf0-8a6a-62b2b3bd903a, Nodes:
0adf0299-2fff-4d8f-acb1-f3eec172a33f{ip: 192.168.171.132, host: cloudera4.lan,
networkLocation: /default-rack, certSerialId:
null}5df9cf32-f15b-423a-86e1-d704f518422e{ip: 192.168.171.128, host:
cloudera2.lan, networkLocation: /default-rack, certSerialId:
null}219b808a-b83b-459b-936f-3c57e9a9aa0e{ip: 192.168.171.129, host:
cloudera3.lan, networkLocation: /default-rack, certSerialId: null}, Type:RATIS,
Factor:THREE, State:OPEN, leaderId:0adf0299-2fff-4d8f-acb1-f3eec172a33f,
CreationTimestamp2020-08-04T10:37:10.893Z]. The last committed block length is
0, uncommitted data length is 576985 retry count 0
20/08/04 13:49:50 INFO io.BlockOutputStreamEntryPool: Allocating block with
ExcludeList {datanodes = [], containerIds = [], pipelineIds =
[PipelineID=3a7640a2-e590-4cf0-8a6a-62b2b3bd903a]}
20/08/04 13:51:13 WARN io.KeyOutputStream: Encountered exception
java.io.IOException: Unexpected Storage Container Exception:
java.util.concurrent.CompletionException:
java.util.concurrent.CompletionException:
org.apache.ratis.protocol.GroupMismatchException:
219b808a-b83b-459b-936f-3c57e9a9aa0e: group-BD900690DB5D not found. on the
pipeline Pipeline[ Id: 64900222-0e69-43c4-99ce-bd900690db5d, Nodes:
0adf0299-2fff-4d8f-acb1-f3eec172a33f{ip: 192.168.171.132, host: cloudera4.lan,
networkLocation: /default-rack, certSerialId:
null}219b808a-b83b-459b-936f-3c57e9a9aa0e{ip: 192.168.171.129, host:
cloudera3.lan, networkLocation: /default-rack, certSerialId:
null}5df9cf32-f15b-423a-86e1-d704f518422e{ip: 192.168.171.128, host:
cloudera2.lan, networkLocation: /default-rack, certSerialId: null}, Type:RATIS,
Factor:THREE, State:OPEN, leaderId:219b808a-b83b-459b-936f-3c57e9a9aa0e,
CreationTimestamp2020-08-04T10:49:13.381Z]. The last committed block length is
0, uncommitted data length is 576985 retry count 0
20/08/04 13:51:13 INFO io.BlockOutputStreamEntryPool: Allocating block with
ExcludeList {datanodes = [], containerIds = [], pipelineIds =
[PipelineID=3a7640a2-e590-4cf0-8a6a-62b2b3bd903a,
PipelineID=64900222-0e69-43c4-99ce-bd900690db5d]}
20/08/04 13:51:13 INFO yarn.Client: Deleted staging directory
o3fs://bucket1.tests/user/root/.sparkStaging/application_1596537475392_0001
20/08/04 13:51:13 ERROR spark.SparkContext: Error initializing SparkContext.
java.io.IOException: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:229)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:402)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:347)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:458)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:509)
    at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:70)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:415)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:387)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
    at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:372)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:687)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:905)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:180)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:60)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:185)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:505)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2527)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:944)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:935)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:922)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:931)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
    at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:816)
    at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:848)
    at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateNewBlock(BlockOutputStreamEntryPool.java:281)
    at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateBlockIfNeeded(BlockOutputStreamEntryPool.java:327)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:208)
    ... 38 more
20/08/04 13:51:13 INFO server.AbstractConnector: Stopped
Spark@10ad20cb{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/08/04 13:51:13 INFO ui.SparkUI: Stopped Spark web UI at
http://cloudera2.lan:4040
20/08/04 13:51:13 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
Attempted to request executors before the AM has registered!
20/08/04 13:51:13 INFO cluster.YarnClientSchedulerBackend: Shutting down all
executors
20/08/04 13:51:13 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking
each executor to shut down
20/08/04 13:51:13 INFO cluster.YarnClientSchedulerBackend: Stopped
20/08/04 13:51:13 INFO spark.MapOutputTrackerMasterEndpoint:
MapOutputTrackerMasterEndpoint stopped!
20/08/04 13:51:13 INFO memory.MemoryStore: MemoryStore cleared
20/08/04 13:51:13 INFO storage.BlockManager: BlockManager stopped
20/08/04 13:51:13 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
20/08/04 13:51:13 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is
not running
20/08/04 13:51:13 INFO
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
OutputCommitCoordinator stopped!
20/08/04 13:51:13 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.io.IOException: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:229)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:402)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:347)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:458)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:509)
    at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:70)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:415)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:387)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
    at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:372)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:687)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:905)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:180)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:60)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:185)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:505)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2527)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:944)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:935)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:922)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:931)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Allocated 0 blocks. Requested 1 blocks
    at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:816)
    at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:848)
    at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateNewBlock(BlockOutputStreamEntryPool.java:281)
    at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntryPool.allocateBlockIfNeeded(BlockOutputStreamEntryPool.java:327)
    at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:208)
    ... 38 more
20/08/04 13:51:13 INFO util.ShutdownHookManager: Shutdown hook called