Thanks, I took a look and left some comments.

I saw that you are proposing azfs as the scheme, but I see
wasb/wasbs/abfss used in other data processing systems. I'm not sure
which is the most common, but wasb/wasbs/abfss show up on the Microsoft
site, so it might be best to use those instead of going with what TFX
chooses, unless someone on dev@ can give a concrete answer as to what
should be used and why.

On Thu, Jul 16, 2020 at 2:20 PM Etta Rapp <[email protected]> wrote:

> Hi Ashwin,
>
> Thanks for the suggestion, I hadn't considered that.  For now I plan to go
> ahead using the Azure client.
>
> Etta
>
> On Thu, Jul 16, 2020 at 3:36 PM Ashwin Ramaswami <[email protected]>
> wrote:
>
>> Hi Etta,
>>
>> Have you thought about reusing the HadoopFileSystem to access Azure Blob
>> Storage instead? It appears that Azure Blob Storage comes with an
>> HDFS-compatible API with the wasb:// protocol. See
>> https://issues.apache.org/jira/browse/BEAM-10103
>>
>> Ashwin Ramaswami
>> Student
>> *Find me on my:* LinkedIn <https://www.linkedin.com/in/ashwin-r> |
>> Website <https://epicfaace.github.io/> | GitHub
>> <https://github.com/epicfaace>
>>
>>
>> On Thu, Jul 16, 2020 at 3:29 PM Etta Rapp <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am working on a project adding Azure Blobstore IO to Apache Beam.  The
>>> design document is available at http://s.apache.org/beam-azfs-java and
>>> the JIRA issue is at https://issues.apache.org/jira/browse/BEAM-10378.
>>> Can you please provide any feedback or suggestions?
>>>
>>> Thank you,
>>> Etta Rapp
>>>
>>
