Implementation an S3 file system for python SDK - Updated
Hi,

I have updated the project proposal according to the feedback I received. Could you please check the proposal again and give me your feedback on the corrections I have made?

Here is the link to the updated project proposal:
https://docs.google.com/document/d/1i_PoIrbmhNgwKCS1TYWC28A9RsyZQFsQCJic3aCXO-8/edit?usp=sharing

Thank you
Pasan Kamburugamuwa
Re: Implementation an S3 file system for python SDK - Updated
+dev +Pablo Estrada +Chamikara Jayalath +Udi Meiri

Thank you Pasan. I quickly looked at the proposal and it looks good. Added a few folks who could offer additional feedback.

On Mon, Apr 8, 2019 at 12:13 AM Pasan Kamburugamuwa <pasankamburugamu...@gmail.com> wrote:

> Hi,
>
> I have updated the project proposal according to the given feedback. So can you guys check my proposal again and give me your feedback about corrections I have done.
>
> Here is the link to the updated project proposal
>
> https://docs.google.com/document/d/1i_PoIrbmhNgwKCS1TYWC28A9RsyZQFsQCJic3aCXO-8/edit?usp=sharing
>
> Thank you
> Pasan Kamburugamuwa
Re: Implementation an S3 file system for python SDK - Updated
Currently, Pasan is working on a design for adding a couple of implementations to the Filesystem interface in Python, and IMHO it's not necessary to consider SDF here.

On the other hand, Python's fileio[1] could probably use SDF-based improvements to split when many files are being matched.

Best
-P.

On Mon, Apr 8, 2019 at 10:00 AM Alex Amato wrote:

> +Lukasz Cwik, +Boyuan Zhang, +Lara Schmidt
>
> Should splittable DoFn be considered in this design? In order to split and scale the source step properly?
>
> On Mon, Apr 8, 2019 at 9:11 AM Ahmet Altay wrote:
>
>> +dev +Pablo Estrada +Chamikara Jayalath +Udi Meiri
>>
>> Thank you Pasan. I quickly looked at the proposal and it looks good. Added a few folks who could offer additional feedback.
Re: Implementation an S3 file system for python SDK - Updated
A filesystem is a lower-level abstraction that a PTransform can use, so there is no need to consider SDF when creating the S3 filesystem. If we were redesigning the interface to all filesystems, then SDF should be considered.

On Mon, Apr 8, 2019 at 10:54 AM Lara Schmidt wrote:

> I'd push towards waiting until SDF is working end to end to begin converting things. Unless it's something like Text.ReadAll batch API that gets benefits without a SDF implementation. I don't have a lot of context on what file APIs python already supports.
>
> On Mon, Apr 8, 2019 at 10:06 AM Pablo Estrada wrote:
>
>> Currently, Pasan is working on a design for adding a couple implementations to the Filesystem interface in Python, and it's not necessary to consider SDF here. IMHO.
>>
>> On the other hand, Python's fileio[1] could probably use SDF-based improvements to split when many files are being matched.
>> Best
>> -P.
Re: Implementation an S3 file system for python SDK - Updated
Thanks for the proposal Pasan. Added some comments.

As others mentioned, the FileSystem interface is orthogonal to SDF (storage system instead of source format), so there is no need to wait for SDF.

- Cham

On Mon, Apr 8, 2019 at 10:57 AM Lukasz Cwik wrote:

> A filesystem is a lower level abstraction that a PTransform can use thus there is no need to consider SDF when creating the S3 filesystem.
> If we were redesigning the interface to all filesystems, then SDF should be considered.
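
For readers following the thread, the layering described above (a storage-level filesystem that higher-level read logic simply calls into) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `FileSystem` base class, `InMemoryS3FileSystem`, `get_filesystem`, and `read_lines` are hypothetical names invented for this sketch, not the Python SDK's actual classes, and a dictionary stands in for a real S3 bucket.

```python
import abc
import io
from typing import Dict, Iterator, List


class FileSystem(abc.ABC):
    """Hypothetical storage-level interface (not the SDK's real class)."""

    @classmethod
    @abc.abstractmethod
    def scheme(cls) -> str:
        """URI scheme this filesystem handles, e.g. 's3'."""

    @abc.abstractmethod
    def match(self, prefix: str) -> List[str]:
        """Return all stored paths that start with the given prefix."""

    @abc.abstractmethod
    def open(self, path: str) -> io.BytesIO:
        """Return a readable byte stream for the given path."""


class InMemoryS3FileSystem(FileSystem):
    """Stand-in for an S3-backed implementation; objects live in a dict."""

    def __init__(self, objects: Dict[str, bytes]):
        self._objects = dict(objects)

    @classmethod
    def scheme(cls) -> str:
        return "s3"

    def match(self, prefix: str) -> List[str]:
        return sorted(p for p in self._objects if p.startswith(prefix))

    def open(self, path: str) -> io.BytesIO:
        return io.BytesIO(self._objects[path])


def get_filesystem(path: str, registry: Dict[str, FileSystem]) -> FileSystem:
    """Pick a filesystem by the scheme in front of '://'."""
    scheme = path.split("://", 1)[0]
    return registry[scheme]


def read_lines(prefix: str, registry: Dict[str, FileSystem]) -> Iterator[str]:
    """Higher-level read logic: it only talks to the FileSystem interface,
    so it does not care which storage system is behind it."""
    fs = get_filesystem(prefix, registry)
    for path in fs.match(prefix):
        with fs.open(path) as handle:
            for line in handle.read().decode("utf-8").splitlines():
                yield line


s3 = InMemoryS3FileSystem({
    "s3://bucket/a.txt": b"one\ntwo\n",
    "s3://bucket/b.txt": b"three\n",
})
registry = {s3.scheme(): s3}
lines = list(read_lines("s3://bucket/", registry))
```

In this sketch, how the read step is split or scaled would be decided inside something like `read_lines`, not inside the filesystem class, which is the sense in which the two concerns are described as orthogonal in this thread.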