I think Substrait fits the current requirement, so I'd like to give it a
try.

Thanks,
Surya


On Thu, Aug 29, 2024 at 11:59 AM Kevin Liu <[email protected]> wrote:

> If you're using open table formats, Delta Lake has the "generated column"
> feature which supports specifying a formula using other table columns.
>
> https://docs.databricks.com/en/delta/generated-columns.html
> https://delta.io/blog/2023-04-12-delta-lake-generated-columns/
>
> Cheers,
> Kevin
>
> On Thu, Aug 29, 2024 at 2:04 PM Jacek Pliszka <[email protected]>
> wrote:
>
>> Hi!
>>
>> Another option would be converting to an arrow-backed pandas table and
>> using a dataframe query method. Other libraries like DuckDB most
>> likely offer similar options.
>>
>> BR
>>
>> J
>>
>> On Thu, Aug 29, 2024 at 02:54 Felipe Oliveira Carvalho
>> <[email protected]> wrote:
>> >
>> > You can build `compute::Expression` instances [1] and use them in
>> different contexts like scanning datasets [2] and producing Substrait plans
>> [3] that you can execute.
>> >
>> > But you have to write your own parser and define the scope and
>> semantics of the operations you would support.
>> >
>> > [1]
>> https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/expression.h#L45
>> > [2]
>> https://github.com/apache/arrow/blob/main/cpp/examples/arrow/dataset_documentation_example.cc#L266
>> > [3]
>> https://github.com/apache/arrow/blob/main/cpp/src/arrow/engine/substrait/relation.h#L55
>> >
>> > --
>> > Felipe
>> >
>> > On Wed, Aug 28, 2024 at 1:11 AM Surya Kiran Gullapalli <
>> [email protected]> wrote:
>> >>
>> >> Hello all,
>> >> Let's say I've a table containing 3 columns 'A', 'B', and 'C'. Is it
>> possible to create a 4th column 'D' using a formula (like (A+B)/C) ?
>> >>
>> >> I know I can manually create them using compute functions, but is it
>> possible to parse a formula like the above and compute the column on the
>> fly at runtime ?
>> >>
>> >> Any pointers are greatly appreciated.
>> >>
>> >> Thanks,
>> >> Surya
>>
>
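
To illustrate Felipe's point about writing your own parser and defining which operations you support, here is a minimal sketch in Python. It borrows the standard-library `ast` module to parse the formula and only accepts a small whitelist of arithmetic operators. Plain lists stand in for Arrow arrays, and `eval_formula` and `_BIN_OPS` are hypothetical names for this example, not part of any Arrow API. In a real setup each whitelisted operator would presumably map to an Arrow compute function or expression instead of a Python operator.

```python
import ast
import operator

# Whitelisted binary operators; any other syntax in the formula is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_formula(formula, columns):
    """Evaluate an arithmetic formula element-wise over equal-length columns.

    `columns` maps column names to lists of numbers (a stand-in for Arrow
    arrays in this sketch).
    """
    tree = ast.parse(formula, mode="eval")
    lengths = {len(values) for values in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same length")
    n = lengths.pop()

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            op = _BIN_OPS[type(node.op)]
            left, right = visit(node.left), visit(node.right)
            return [op(a, b) for a, b in zip(left, right)]
        if isinstance(node, ast.Name):
            if node.id not in columns:
                raise KeyError(f"unknown column: {node.id}")
            return list(columns[node.id])
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            # Broadcast a numeric literal to the column length.
            return [node.value] * n
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return visit(tree)


# Columns 'A', 'B' and 'C' from the original question, with made-up values.
table = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
table["D"] = eval_formula("(A + B) / C", table)
```

Anything outside the whitelist (function calls, attribute access, unary minus, comparisons) raises `ValueError`, which is one way of fixing the scope and semantics up front rather than evaluating arbitrary input.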
