Re: Flink write ADLS show error: No FileSystem for scheme "file"
Hi Gabor,

I'm curious why this happened with the Azure filesystem and not with other filesystems (I tried S3 and it works fine).

Gabor Somogyi wrote on Tue, Jul 2, 2024 at 16:59:
> [...]
Re: Flink write ADLS show error: No FileSystem for scheme "file"
I see, thanks for sharing.

The change you've made makes sense. Let me explain the details. Each and every plugin has its own class loader; the reason behind that is to avoid dependency collisions with Flink's main class loader.

I think if the mentioned change also works when the jar is added as a normal lib and not as a plugin, then the code can be merged to main as-is.

G

On Thu, Jun 27, 2024 at 5:30 AM Xiao Xu wrote:
> [...]
Re: Flink write ADLS show error: No FileSystem for scheme "file"
Hi Gabor,

Thanks for the reply. I made sure there is no conflict in Maven: all the Hadoop dependencies are in provided scope, and I checked that my resulting jar contains no service files (src/main/resources/META-INF/services).

This is my pom:

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
             http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>

        <groupId>com.test.flink</groupId>
        <artifactId>flink-sync</artifactId>
        <version>1.0-SNAPSHOT</version>
        <packaging>jar</packaging>

        <name>Flink Quickstart Job</name>

        <properties>
            <!-- Tag names below are partly inferred; the list archive stripped
                 them and kept only the values: 1.8 1.8 1.18.1 1.8 2.12 3.2.0
                 3.3.4 2.16.0 3.2.0. The two 3.2.0 values could not be
                 attributed to a property. -->
            <maven.compiler.source>1.8</maven.compiler.source>
            <maven.compiler.target>1.8</maven.compiler.target>
            <flink.version>1.18.1</flink.version>
            <target.java.version>1.8</target.java.version>
            <scala.binary.version>2.12</scala.binary.version>
            <hadoop.version>3.3.4</hadoop.version>
            <log4j.version>2.16.0</log4j.version>
        </properties>

        <dependencies>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-java</artifactId>
                <version>${flink.version}</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-streaming-java</artifactId>
                <version>${flink.version}</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-clients</artifactId>
                <version>${flink.version}</version>
                <scope>provided</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-connector-files</artifactId>
                <version>${flink.version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-connector-kafka</artifactId>
                <version>3.1.0-1.18</version>
            </dependency>
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-slf4j-impl</artifactId>
                <version>${log4j.version}</version>
                <scope>runtime</scope>
                <exclusions>
                    <exclusion>
                        <artifactId>slf4j-api</artifactId>
                        <groupId>org.slf4j</groupId>
                    </exclusion>
                </exclusions>
            </dependency>
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-api</artifactId>
                <version>${log4j.version}</version>
                <scope>runtime</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.logging.log4j</groupId>
                <artifactId>log4j-core</artifactId>
                <version>${log4j.version}</version>
                <scope>runtime</scope>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-azure-fs-hadoop</artifactId>
                <version>${flink.version}</version>
                <scope>provided</scope>
            </dependency>
        </dependencies>

        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <version>3.0.0</version>
                    <configuration>
                        <appendAssemblyId>false</appendAssemblyId>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id>
                            <phase>package</phase>
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>

And as in my reply on Stack Overflow, I found that this hadoop-common code:
https://github.com/apache/hadoop/blob/release-3.3.4-RC1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3374
does not load any filesystem. Digging into the ServiceLoader.load(FileSystem.class) source code, it looks like a different class loader is in effect, which makes it load no filesystems at all.
I changed ServiceLoader.load(FileSystem.class) to ServiceLoader.load(FileSystem.class, FileSystem.class.getClassLoader()), replaced the jar in the flink-azure-fs-hadoop plugin, and it works now (see the sketch below). So I'm not sure why that works.

Gabor Somogyi wrote on Wed, Jun 26, 2024 at 16:52:
> [...]
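A minimal sketch of the change described above (the standalone class and the printing are illustrative; only the two ServiceLoader calls come from the thread):

    import java.util.ServiceLoader;
    import org.apache.hadoop.fs.FileSystem;

    public class FileSystemLoadingSketch {
        public static void main(String[] args) {
            // One-argument overload: resolves providers against the thread
            // context class loader. Inside a Flink plugin that loader is not
            // the plugin's own, so hadoop-common's META-INF/services entries
            // (including scheme "file") are not visible and nothing loads.
            ServiceLoader<FileSystem> viaContextLoader =
                    ServiceLoader.load(FileSystem.class);

            // Two-argument overload (the applied change): resolves against
            // the class loader that loaded FileSystem itself, i.e. the plugin
            // class loader, so the registered filesystems are found again.
            ServiceLoader<FileSystem> viaOwnLoader =
                    ServiceLoader.load(FileSystem.class, FileSystem.class.getClassLoader());

            viaContextLoader.forEach(fs -> System.out.println("context: " + fs.getClass().getName()));
            viaOwnLoader.forEach(fs -> System.out.println("owner:   " + fs.getClass().getName()));
        }
    }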
Re: Flink write ADLS show error: No FileSystem for scheme "file"
Hi Xiao,

I'm not quite convinced that the Azure plugin broke your workload; I would take a look at the dependency graph in your pom. Adding multiple dependencies can lead to conflicts in class loader service files (src/main/resources/META-INF/services).

As an example, say you have two dependencies whose jars each contain an org.apache.flink.core.fs.FileSystemFactory service file. Hadoop core's contains "file" and the other one something different. Let's say you don't use a service-merging plugin in your Maven project. In that case Hadoop core's `file` entry will simply be overwritten by the second one.

Solution: either avoid dependencies with conflicting services, or add the ServicesResourceTransformer to your Maven project (see the snippet below).

G

On Wed, Jun 26, 2024 at 10:16 AM Xiao Xu wrote:
> [...]
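For reference, a minimal sketch of the ServicesResourceTransformer wiring Gabor mentions. It belongs to the maven-shade-plugin (the pom above uses the maven-assembly-plugin, which has no such merging step), and the version shown is illustrative:

    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.0</version>
        <executions>
            <execution>
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <transformers>
                        <!-- Merges META-INF/services files from all
                             dependencies instead of letting one silently
                             overwrite another. -->
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    </transformers>
                </configuration>
            </execution>
        </executions>
    </plugin>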
Flink write ADLS show error: No FileSystem for scheme "file"
Hi all,

I'm trying to use Flink to write to Azure Blob Storage (ADLS). I put the flink-azure-fs-hadoop jar in the plugins directory, and when I start my write job it shows:

Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "file"
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3443) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:496) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:316) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:393) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.store.DataBlocks$DiskBlockFactory.createTmpFileForWrite(DataBlocks.java:980) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.store.DataBlocks$DiskBlockFactory.create(DataBlocks.java:960) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream.createBlockIfNeeded(AbfsOutputStream.java:262) ~[hadoop-azure-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.azurebfs.services.AbfsOutputStream.<init>(AbfsOutputStream.java:173) ~[hadoop-azure-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.createFile(AzureBlobFileSystemStore.java:580) ~[hadoop-azure-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.create(AzureBlobFileSystem.java:301) ~[hadoop-azure-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1175) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1064) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1052) ~[hadoop-common-3.3.4.5.1.5.3.jar:?]

I searched, and the issue looks like this one:
https://stackoverflow.com/questions/77238642/apache-flink-azure-abfs-file-sink-error-streaming-unsupportedfilesystemexcep

My env:
Flink: 1.18.1
JDK: 1.8

Does anyone else have the same problem?
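For reference, a sketch of the layout Flink expects, with each filesystem plugin in its own subdirectory under plugins/ (directory and jar names illustrative):

    flink-1.18.1/
    ├── lib/
    └── plugins/
        └── azure-fs-hadoop/
            └── flink-azure-fs-hadoop-1.18.1.jar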