Makes sense, Nadav. I have been toying with the idea of structuring it
like this instead. I am still trying to make it work with konf (argggh!!), though.
Does this sound reasonable?
datasets:
  hive:
    transactions:
      uri: /user/somepath
      format: parquet
      database: transations_daily
      table: transx
    second_transactions:
      uri: /seconduser/somepath
      format: avro
      database: transations_monthly
      table: avro_table
  file:
    users:
      uri: s3://filestore
      format: parquet
      mode: overwrite
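
To make the comparison concrete, here is a minimal Python sketch of how the keyed layout above could be turned into the flat, `type`-tagged list from Nadav's proposal. It is only an illustration: `KEYED` is a plain dict standing in for the parsed YAML, and `flatten_datasets` and `index_by_key` are hypothetical helper names, not part of the ConfigManager in the PR or of konf.

```python
from typing import Any, Dict, List

# Stand-in for the parsed YAML of the keyed layout (assumption: the parser
# yields nested dicts shaped as {type: {key: properties}}).
KEYED: Dict[str, Dict[str, Dict[str, Any]]] = {
    "hive": {
        "transactions": {
            "uri": "/user/somepath",
            "format": "parquet",
            "database": "transations_daily",
            "table": "transx",
        },
        "second_transactions": {
            "uri": "/seconduser/somepath",
            "format": "avro",
            "database": "transations_monthly",
            "table": "avro_table",
        },
    },
    "file": {
        "users": {
            "uri": "s3://filestore",
            "format": "parquet",
            "mode": "overwrite",
        },
    },
}


def flatten_datasets(keyed: Dict[str, Dict[str, Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Convert {type: {key: props}} into a flat list of records with 'key' and 'type'."""
    records = []
    for ds_type, entries in keyed.items():
        for key, props in entries.items():
            records.append({"key": key, **props, "type": ds_type})
    return records


def index_by_key(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Build a key -> record lookup, rejecting keys reused across types."""
    index: Dict[str, Dict[str, Any]] = {}
    for record in records:
        if record["key"] in index:
            raise ValueError(f"duplicate dataset key: {record['key']}")
        index[record["key"]] = record
    return index
```

Both layouts carry the same information. The keyed form lets the YAML itself prevent duplicate keys within one type, while the flat form needs an explicit check such as the one in `index_by_key`.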
Cheers,
Arun
On Tue, Jan 29, 2019 at 1:45 PM Nadav Har Tzvi <[email protected]>
wrote:
> Hey Arun,
>
> I kinda feel like the datastores yaml is somewhat obscure. I propose the
> following structure.
>
> Instead of
>
> datasets:
>   hive:
>     - key: transactions
>       uri: /user/somepath
>       format: parquet
>       database: transations_daily
>       table: transx
>
>     - key: second_transactions
>       uri: /seconduser/somepath
>       format: avro
>       database: transations_monthly
>       table: avro_table
>   file:
>     - key: users
>       uri: s3://filestore
>       format: parquet
>       mode: overwrite
>
> I would have
>
> datasets:
>   - key: transactions
>     uri: /user/somepath
>     format: parquet
>     database: transations_daily
>     table: transx
>     type: hive
>   - key: second_transactions
>     uri: /seconduser/somepath
>     format: avro
>     database: transations_monthly
>     table: avro_table
>     type: hive
>   - key: users
>     uri: s3://filestore
>     format: parquet
>     mode: overwrite
>     type: file
>
> In my opinion it is more straightforward and uniform. I think it is also
> more straightforward code-wise.
> What do you think?
>
> Cheers,
> Nadav
>
>
>
> On Mon, 14 Jan 2019 at 00:57, Yaniv Rodenski <[email protected]> wrote:
>
> > Hi Arun,
> >
> > I've added my comments to the PR, but good call, I agree @Nadav Har Tzvi
> > <[email protected]> should at least review as you both need to
> > maintain compatible APIs.
> >
> > Cheers,
> > Yaniv
> >
> > On Sun, Jan 13, 2019 at 10:21 PM Arun Manivannan <[email protected]> wrote:
> >
> >> Hi Guy, Yaniv and Nadav,
> >>
> >> This PR <https://github.com/apache/incubator-amaterasu/pull/39> captures
> >> only part of the issue: the datasets.yaml, the ConfigManager and the
> >> test cases. The integration with the AmaContext is yet to be done, but I
> >> would like to get your thoughts on the implementation.
> >>
> >> Guy - Could you help shed some light on the Kotlin syntax and idioms?
> >> I'm a newbie here.
> >>
> >> Cheers,
> >> Arun
> >>
> >> On Fri, Oct 12, 2018 at 7:15 PM Yaniv Rodenski (JIRA) <[email protected]>
> >> wrote:
> >>
> >> > Yaniv Rodenski created AMATERASU-52:
> >> > ---------------------------------------
> >> >
> >> > Summary: Implement AmaContext.datastores
> >> > Key: AMATERASU-52
> >> > URL: https://issues.apache.org/jira/browse/AMATERASU-52
> >> > Project: AMATERASU
> >> > Issue Type: Task
> >> > Reporter: Yaniv Rodenski
> >> > Assignee: Arun Manivannan
> >> > Fix For: 0.2.1-incubating
> >> >
> >> >
> >> > AmaContext.datastores should contain the data from datastores.yaml
> >> >
> >> >
> >> >
> >> > --
> >> > This message was sent by Atlassian JIRA
> >> > (v7.6.3#76005)
> >> >
> >>
> >
> >
> > --
> > Yaniv Rodenski
> >
> > +61 477 778 405
> > [email protected]
> >
> >
>