Hi,
I guess you've loaded the S3 filesystem using the s3 FS plugin.

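You need to put the jar file that contains the SAX2 driver class into the
same plugin directory as the S3 filesystem plugin. As far as I know, plugins
are loaded by their own class loader (the PluginClassLoader in your second
stack trace), which would explain why the jar in flink/lib/ is not picked up.
You can probably find the name of the right SAX2 jar file in your local
setup, where everything works.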
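
To find that jar on the local setup, something like the following might
help. It is only a sketch: the class name JarLocator and the method locate
are made up for illustration and are not part of Flink. The only thing taken
from the stack trace is the driver class name
org.apache.xerces.parsers.SAXParser. It prints the jar (code source) a class
was loaded from, or reports that the class is not on the classpath.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Flink): reports which jar a class comes from.
class JarLocator {

    // Returns the jar or directory the class was loaded from,
    // "JDK (no code source)" for classes without a code source,
    // or "not found" if the class cannot be loaded.
    static String locate(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = JarLocator.class.getClassLoader();
        }
        try {
            // initialize=false: resolve the class without running static initializers
            Class<?> clazz = Class.forName(className, false, loader);
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "JDK (no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Default to the driver class named in the stack trace
        String name = args.length > 0 ? args[0] : "org.apache.xerces.parsers.SAXParser";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Run from the IDE with the job's classpath, the printed location should point
at the jar to copy next to the S3 filesystem plugin jar. If it prints "not
found" locally as well, the class is probably not what the local run uses.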

I hope that helps!

Best,
Robert

On Thu, Aug 27, 2020 at 1:38 PM Averell <lvhu...@gmail.com> wrote:

> Hello,
>
> I have a Flink 1.10 job which runs in AWS EMR, checkpointing to S3a as well
> as writing output to S3a using StreamingFileSink. The job ran well until I
> added the Java Hadoop property
> /-Dfs.s3a.acl.default=BucketOwnerFullControl/.
> Since then, the checkpoint process fails to complete.
>
> /Caused by: org.xml.sax.SAXException: SAX2 driver class
> org.apache.xerces.parsers.SAXParser not found/
> I tried adding a jar file with that class
> (https://mvnrepository.com/artifact/xerces/xercesImpl/2.12.0) to my
> flink/lib/ directory, but got the same error with a different stack trace:
> /Caused by: org.apache.flink.util.SerializedThrowable: SAX2 driver class
> org.apache.xerces.parsers.SAXParser not found/
>
> This seems to be a dependency conflict, but I couldn't track down its root
> cause. In my IDE I didn't have any dependency issues, although I couldn't
> find SAXParser in the dependency tree.
>
> *Here is the stacktrace when the jar file is not there:*
> /Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://mybucket/checkpoint/a9502b1c81ced10dfcbb21ac43f03e61/chk-2/41f51c24-60fd-474b-9f89-3d65d87037c7: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader: Couldn't initialize a SAX driver to create an XMLReader
>         at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
>         at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2251)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:749)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
>         at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:141)
>         at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:37)
>         at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.create(PluginFileSystemFactory.java:164)
>         at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.create(SafetyNetWrapperFileSystem.java:126)
>         at org.apache.flink.core.fs.EntropyInjector.createEntropyAware(EntropyInjector.java:61)
>         at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:356)
>         ... 17 more
> Caused by: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader
>         at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:118)
>         at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:87)
>         at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:77)
>         at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
>         at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
>         at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1554)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>         at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4266)
>         at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:876)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$5(S3AFileSystem.java:1262)
>         at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
>         at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:280)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:1255)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223)
>         ... 29 more
> Caused by: org.xml.sax.SAXException: SAX2 driver class org.apache.xerces.parsers.SAXParser not found
> java.lang.ClassNotFoundException: org.apache.xerces.parsers.SAXParser
>         at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:230)
>         at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:191)
>         at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:115)
>         ... 52 more/
>
> *And here is the stacktrace when that jar file is added to the /lib/ folder:*
>
> /Could not materialize checkpoint 1 for operator Source: <my_operators_chain> (1/2).
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:1238)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1180)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.SerializedThrowable: java.io.IOException: Could not open output stream for state backend
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:461)
>         at org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.<init>(OperatorSnapshotFinalizer.java:53)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:1143)
>         ... 3 common frames omitted
> Caused by: org.apache.flink.util.SerializedThrowable: Could not open output stream for state backend
>         at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:367)
>         at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.flush(FsCheckpointStreamFactory.java:234)
>         at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.write(FsCheckpointStreamFactory.java:209)
>         at java.io.DataOutputStream.write(DataOutputStream.java:107)
>         at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
>         at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.serialize(BytePrimitiveArraySerializer.java:78)
>         at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.serialize(BytePrimitiveArraySerializer.java:33)
>         at org.apache.flink.runtime.state.PartitionableListState.write(PartitionableListState.java:116)
>         at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy$1.callInternal(DefaultOperatorStateBackendSnapshotStrategy.java:155)
>         at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy$1.callInternal(DefaultOperatorStateBackendSnapshotStrategy.java:108)
>         at org.apache.flink.runtime.state.AsyncSnapshotCallable.call(AsyncSnapshotCallable.java:75)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at org.apache.flink.runtime.concurrent.FutureUtils.runIfNotDoneAndGet(FutureUtils.java:458)
>         ... 5 common frames omitted
> Caused by: org.apache.flink.util.SerializedThrowable: getFileStatus on s3a://mybucket/checkpoint/d8ed6d1524169c942bbc455d2c519a39/chk-1/7f2d8fd6-4f3f-4da7-9ffd-5a7e3ea8e7e3: com.amazonaws.SdkClientException: Couldn't initialize a SAX driver to create an XMLReader: Couldn't initialize a SAX driver to create an XMLReader
>         at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
>         at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2251)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:749)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1169)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1149)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1038)
>         at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:141)
>         at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.create(HadoopFileSystem.java:37)
>         at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.create(PluginFileSystemFactory.java:164)
>         at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.create(SafetyNetWrapperFileSystem.java:126)
>         at org.apache.flink.core.fs.EntropyInjector.createEntropyAware(EntropyInjector.java:61)
>         at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.createStream(FsCheckpointStreamFactory.java:356)
>         ... 17 common frames omitted
> Caused by: org.apache.flink.util.SerializedThrowable: Couldn't initialize a SAX driver to create an XMLReader
>         at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:118)
>         at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:87)
>         at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsV2Unmarshaller.unmarshall(Unmarshallers.java:77)
>         at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
>         at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
>         at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1554)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>         at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>         at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
>         at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4266)
>         at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:876)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$5(S3AFileSystem.java:1262)
>         at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
>         at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:280)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:1255)
>         at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223)
>         ... 29 common frames omitted
> Caused by: org.apache.flink.util.SerializedThrowable: SAX2 driver class org.apache.xerces.parsers.SAXParser not found
>         at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:230)
>         at org.xml.sax.helpers.XMLReaderFactory.createXMLReader(XMLReaderFactory.java:191)
>         at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.<init>(XmlResponsesSaxParser.java:115)
>         ... 52 common frames omitted
> Caused by: org.apache.flink.util.SerializedThrowable: org.apache.xerces.parsers.SAXParser
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>         at org.apache.flink.core.plugin.PluginLoader$PluginClassLoader.loadClass(PluginLoader.java:149)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>         at org.xml.sax.helpers.NewInstance.newInstance(NewInstance.java:82)
>         at org.xml.sax.helpers.XMLReaderFactory.loadClass(XMLReaderFactory.java:228)
>         ... 54 common frames omitted
> /
