Thanks, I took a look and left some comments. I saw that you are proposing azfs as the scheme, but I see wasb/wasbs/abfss used in other data processing systems. I'm not sure which is the most common, but wasb/wasbs/abfss show up on the Microsoft site, so it might be best to use those instead of going with what TFX chooses, unless someone on dev@ can give a concrete answer as to what should be used and why.

On Thu, Jul 16, 2020 at 2:20 PM Etta Rapp <[email protected]> wrote:

> Hi Ashwin,
>
> Thanks for the suggestion, I hadn't considered that. For now I plan to go
> ahead using the Azure client.
>
> Etta
>
> On Thu, Jul 16, 2020 at 3:36 PM Ashwin Ramaswami <[email protected]>
> wrote:
>
>> Hi Etta,
>>
>> Have you thought about reusing the HadoopFileSystem to access Azure Blob
>> Storage instead? It appears that Azure Blob Storage comes with a
>> hdfs-compatible API with the wasb:// protocol. See
>> https://issues.apache.org/jira/browse/BEAM-10103
>>
>> Ashwin Ramaswami
>> Student
>> *Find me on my:* LinkedIn <https://www.linkedin.com/in/ashwin-r> |
>> Website <https://epicfaace.github.io/> | GitHub
>> <https://github.com/epicfaace>
>>
>>
>> On Thu, Jul 16, 2020 at 3:29 PM Etta Rapp <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am working on a project adding Azure Blobstore IO to Apache Beam. The
>>> design document is available at http://s.apache.org/beam-azfs-java and
>>> the JIRA issue is at https://issues.apache.org/jira/browse/BEAM-10378.
>>> Can you please provide any feedback or suggestions?
>>>
>>> Thank you,
>>> Etta Rapp
>>>
>>
