Hi Tomek,

Thank you for the pointers and the description in OAK-6922. It all makes sense
and seems like a reasonable approach. I assume the description is up to date.
How does it perform compared to TarMK a) when the entire repo doesn't fit into
the RAM allocated to the container, and b) when the working set doesn't fit
into the RAM allocated to the container?

Since you mentioned cost, have you done a cost-based analysis of RAM vs
attached disk, given that TarMK has already been highly optimised to cope with
deployments where the working set may only just fit into RAM?

IIRC the Azure attached disks mount Azure Blobs behind a kernel block device
driver and use local SSD to optimise caching (in read and write-through
modes). Since they are kernel block devices, they also benefit from the Linux
kernel VFS disk cache and support memory mapping via the page cache. So an
Azure attached disk often behaves like a local SSD (IIUC).

I realise that some containerisation frameworks in Azure don't yet support
easy native mounting of Azure disks (e.g. Mesos), but others do (e.g. AKS [1]).

Best regards
Ian

1 https://azure.microsoft.com/en-us/services/container-service/
https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv

On 1 March 2018 at 18:40, Matt Ryan <o...@mvryan.org> wrote:

> Hi Tomek,
>
> Some time ago (November 2016 Oakathon IIRC) some people explored a similar
> concept using AWS (S3) instead of Azure. If you haven't discussed it with
> them already, it may be worth doing so. IIRC Stefan Egli and, I believe,
> Michael Duerig were involved, and probably some others as well.
>
> -MR
>
>
> On March 1, 2018 at 5:42:07 AM, Tomek Rekawek (reka...@adobe.com.invalid)
> wrote:
>
> Hi Tommaso,
>
> so, the goal is to run Oak in a cloud, in this case Azure. In order to do
> this in a scalable way (e.g. multiple instances on a single VM,
> containerized), we need to take care of provisioning a sufficient amount of
> space for the segmentstore.
> Mounting the physical SSD/HDD disks (in Azure they're called "Managed
> Disks", the equivalent of EBS in Amazon) has two drawbacks:
>
> * it's expensive,
> * it's complex (each disk is a separate /dev/sdX that has to be formatted,
> mounted, etc.)
>
> The point of the Azure Segment Store is to deal with these two issues by
> replacing the need for local file system space with a remote service that
> will be (a) cheaper and (b) easier to provision (as it'll be configured on
> the application layer rather than the VM layer).
>
> Another option would be using the Azure File Storage (which mounts an SMB
> file system, not a "physical" disk). However, in this case we'd have a
> remote storage that emulates a local one, and SegmentMK doesn't really
> expect this. Instead, it's better to create a full-fledged remote storage
> implementation, so we can work out the issues caused by the higher latency,
> etc.
>
> Regards,
> Tomek
>
> --
> Tomek Rękawek | Adobe Research | www.adobe.com
> reka...@adobe.com
>
> > On 1 Mar 2018, at 11:16, Tommaso Teofili <tommaso.teof...@gmail.com>
> > wrote:
> >
> > Hi Tomek,
> >
> > While I think it's an interesting feature, I'd also be interested to hear
> > about the user story behind your prototype.
> >
> > Regards,
> > Tommaso
> >
> >
> > On Thu 1 Mar 2018 at 10:31 Tomek Rękawek <tom...@apache.org>
> > wrote:
> >
> >> Hello,
> >>
> >> I prepared a prototype for the Azure-based Segment Store, which allows
> >> persisting all the SegmentMK-related resources (segments, journal,
> >> manifest, etc.) on a remote service, namely the Azure Blob Storage [1].
> >> The whole description of the approach, data structure, etc., as well as
> >> the patch, can be found in OAK-6922. It uses the extension points
> >> introduced in OAK-6921.
> >>
> >> While it's still experimental code, I'd like to commit it to trunk
> >> rather sooner than later.
> >> The patch is already pretty big and I'd like to avoid developing it
> >> "privately" on my own branch. It's a new, optional Maven module, which
> >> doesn't change any existing behaviour of Oak or SegmentMK. The only
> >> change it makes externally is adding a few exports to oak-segment-tar,
> >> so it can use the SPI introduced in OAK-6921. We may narrow these
> >> exports to a single package if you think it'd be good for the
> >> encapsulation.
> >>
> >> There's a related issue, OAK-7297, which introduces a new fixture for
> >> benchmarks and ITs. After merging it, all the Oak integration tests
> >> pass on the Azure Segment Store.
> >>
> >> Looking forward to the feedback.
> >>
> >> Regards,
> >> Tomek
> >>
> >> [1] https://azure.microsoft.com/en-us/services/storage/blobs/
> >>
> >> --
> >> Tomek Rękawek | Adobe Research | www.adobe.com
> >> reka...@adobe.com
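For readers following the thread, the remote-persistence idea Tomek describes
(replacing offsets in a local tar file with named blobs in a remote store,
configured at the application layer) can be sketched roughly as below. The
interface and names here are illustrative assumptions, not the actual
OAK-6921 SPI, and a HashMap stands in for Azure Blob Storage:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class RemoteSegmentStoreSketch {

    // Illustrative interface only -- NOT the real OAK-6921 SPI. The actual
    // extension point lets oak-segment-tar plug in alternative persistence
    // backends; this just shows the shape of the idea.
    interface SegmentPersistence {
        void writeSegment(UUID id, byte[] data);
        byte[] readSegment(UUID id);
        boolean containsSegment(UUID id);
    }

    // Stand-in for a remote backend such as Azure Blob Storage: segments are
    // addressed by name in a flat keyspace rather than by offset in a local
    // tar file, so no block device has to be provisioned per VM.
    static class InMemoryRemotePersistence implements SegmentPersistence {
        private final Map<String, byte[]> blobs = new HashMap<>();

        private static String blobName(UUID id) {
            // Hypothetical naming scheme, for illustration only.
            return "oak/data00000a.tar/" + id;
        }

        @Override
        public void writeSegment(UUID id, byte[] data) {
            blobs.put(blobName(id), data.clone());
        }

        @Override
        public byte[] readSegment(UUID id) {
            byte[] data = blobs.get(blobName(id));
            return data == null ? null : data.clone();
        }

        @Override
        public boolean containsSegment(UUID id) {
            return blobs.containsKey(blobName(id));
        }
    }

    public static void main(String[] args) {
        SegmentPersistence store = new InMemoryRemotePersistence();
        UUID id = UUID.randomUUID();
        store.writeSegment(id, new byte[] {1, 2, 3});
        System.out.println(store.containsSegment(id));    // prints "true"
        System.out.println(store.readSegment(id).length); // prints "3"
    }
}
```

A real implementation would of course have to deal with the higher latency
Tomek mentions, e.g. via caching, which is exactly why a first-class remote
backend is preferable to emulating a local file system over SMB.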