Do you already have a storage layer to persist these views, or do you only
need ephemeral views? This sounds interesting, and I'm curious to find out
more about your use case.

On Wed, Oct 25, 2023 at 2:00 PM Lee, David (PAG) <david....@blackrock.com>
wrote:

> Here's my ideal use case scenario..
>
> Create multiple datasets mapped to different file directories.
> Create more datasets by logically generating additional computed columns
> using expressions
> Create joined dataset by joining datasets
> Finally run a Scanner on the joined dataset to start materialization..
>
> Pyarrow.Dataset.filter supports adding a @filter, but it doesn't have a
> @columns argument.
> Pyarrow.Dataset.Scanner supports both @filter and @columns, but I don't
> want to create interim copies of data in memory.
>
> Simplified example:
> Given a table that captures locale values like 'en-US', 'en-GB', 'fr-CA',
> etc..
> I want to use a pyarrow logical expression to split this into language and
> country so I end up with:
> Language: 'en', 'en', 'fr', ..
> Country: 'US', 'GB', 'CA', ..
> I then want to join Country to a Country dataset which contains Country
> and Country_Name
> Language: 'en', 'en', 'fr', ..
> Country: 'US', 'GB', 'CA', ..
> Country_Name: 'USA', 'Great Britain', 'Canada', ..
>
> Basically can a dataset handle "logical" column projection to avoid
> physical materialization in memory?
