Do you already have a storage layer to persist these views, or do you only need ephemeral views? This sounds interesting, and I'm curious to find out more about your use case.

On Wed, Oct 25, 2023 at 2:00 PM Lee, David (PAG) <david....@blackrock.com> wrote:

> Here's my ideal use case scenario:
>
> 1. Create multiple datasets mapped to different file directories.
> 2. Create more datasets by logically generating additional computed
>    columns using expressions.
> 3. Create a joined dataset by joining datasets.
> 4. Finally, run a Scanner on the joined dataset to start materialization.
>
> Pyarrow.Dataset.filter supports adding a @filter, but it doesn't have a
> @columns argument.
> Pyarrow.Dataset.Scanner supports both @filter and @columns, but I don't
> want to create interim copies of data in memory.
>
> Simplified example:
> Given a table that captures locale values like 'en-US', 'en-GB', 'fr-CA',
> etc., I want to use a pyarrow logical expression to split this into
> language and country so I end up with:
>   Language: 'en', 'en', 'fr', ..
>   Country: 'US', 'GB', 'CA', ..
> I then want to join Country to a Country dataset which contains Country
> and Country_Name:
>   Language: 'en', 'en', 'fr', ..
>   Country: 'US', 'GB', 'CA', ..
>   Country_Name: 'USA', 'Great Britain', 'Canada', ..
>
> Basically, can a dataset handle "logical" column projection to avoid
> physical materialization in memory?