Makes sense, Nadav. I have been toying with the idea of structuring it like
this, with the dataset key as the map key instead of a "key" field. I am
trying to make it work on konf (argggh!!) though.
Do you think this sounds reasonable?


datasets:
  hive:
    transactions:
      uri: /user/somepath
      format: parquet
      database: transations_daily
      table: transx

    second_transactions:
      uri: /seconduser/somepath
      format: avro
      database: transations_monthly
      table: avro_table
  file:
    users:
      uri: s3://filestore
      format: parquet
      mode: overwrite
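

To make the comparison with your flat list concrete, here is a rough Python
sketch. It is not Amaterasu code: NESTED is just the parsed form of the yaml
above written out by hand, and flatten and find_dataset are hypothetical
helper names I made up for illustration. It shows that the nested map can be
turned into the flat key/type list mechanically.

```python
from typing import Any, Dict, List

# Parsed form of the nested-map layout above, i.e. roughly what a YAML
# loader would hand back. Values are copied from the example as-is.
NESTED: Dict[str, Dict[str, Dict[str, Any]]] = {
    "hive": {
        "transactions": {
            "uri": "/user/somepath",
            "format": "parquet",
            "database": "transations_daily",
            "table": "transx",
        },
        "second_transactions": {
            "uri": "/seconduser/somepath",
            "format": "avro",
            "database": "transations_monthly",
            "table": "avro_table",
        },
    },
    "file": {
        "users": {
            "uri": "s3://filestore",
            "format": "parquet",
            "mode": "overwrite",
        },
    },
}


def flatten(datasets: Dict[str, Dict[str, Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Turn type -> key -> properties into a flat list with key/type fields."""
    flat = []
    for ds_type, entries in datasets.items():
        for key, props in entries.items():
            flat.append({"key": key, **props, "type": ds_type})
    return flat


def find_dataset(datasets: Dict[str, Dict[str, Dict[str, Any]]], key: str) -> Dict[str, Any]:
    """Look a dataset up by key across all types (hypothetical helper)."""
    matches = [d for d in flatten(datasets) if d["key"] == key]
    if len(matches) != 1:
        raise KeyError(f"expected exactly one dataset named {key!r}, found {len(matches)}")
    return matches[0]
```

For example, find_dataset(NESTED, "users") returns the users entry with its
type set to "file". One thing the sketch makes visible: with the type as the
outer level, the same key could appear under both hive and file, so a lookup
by key alone needs the ambiguity check.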



Cheers,
Arun


On Tue, Jan 29, 2019 at 1:45 PM Nadav Har Tzvi <nadavhart...@gmail.com>
wrote:

> Hey Arun,
>
> I kinda feel like the datastores yaml is somewhat obscure. I propose the
> following structure.
>
> Instead of
>
> datasets:
>   hive:
>     - key: transactions
>       uri: /user/somepath
>       format: parquet
>       database: transations_daily
>       table: transx
>
>     - key: second_transactions
>       uri: /seconduser/somepath
>       format: avro
>       database: transations_monthly
>       table: avro_table
>   file:
>     - key: users
>       uri: s3://filestore
>       format: parquet
>       mode: overwrite
>
> I would have
>
> datasets:
>   - key: transactions
>     uri: /user/somepath
>     format: parquet
>     database: transations_daily
>     table: transx
>     type: hive
>   - key: second_transactions
>     uri: /seconduser/somepath
>     format: avro
>     database: transations_monthly
>     table: avro_table
>     type: hive
>   - key: users
>     uri: s3://filestore
>     format: parquet
>     mode: overwrite
>     type: file
>
> In my opinion it is more straightforward and uniform. I think it is also
> more straightforward code-wise.
> What do you think?
>
> Cheers,
> Nadav
>
>
>
> On Mon, 14 Jan 2019 at 00:57, Yaniv Rodenski <ya...@shinto.io> wrote:
>
> > Hi Arun,
> >
> > I've added my comments to the PR, but good call, I agree @Nadav Har Tzvi
> > <nadavhart...@gmail.com> should at least review as you both need to
> > maintain compatible APIs.
> >
> > Cheers,
> > Yaniv
> >
> > On Sun, Jan 13, 2019 at 10:21 PM Arun Manivannan <a...@arunma.com> wrote:
> >
> >> Hi Guy, Yaniv and Nadav,
> >>
> >> This PR <https://github.com/apache/incubator-amaterasu/pull/39> just
> >> captures part of the issue - the datasets.yaml, ConfigManager and the
> >> test cases. The integration with the AmaContext is yet to be done, but I
> >> would like to get your thoughts on the implementation.
> >>
> >> Guy - would you be able to shed some light on the syntax and the
> >> idiomatic side of Kotlin itself? Newbie here.
> >>
> >> Cheers,
> >> Arun
> >>
> >> On Fri, Oct 12, 2018 at 7:15 PM Yaniv Rodenski (JIRA) <j...@apache.org> wrote:
> >>
> >> > Yaniv Rodenski created AMATERASU-52:
> >> > ---------------------------------------
> >> >
> >> >              Summary: Implement AmaContext.datastores
> >> >                  Key: AMATERASU-52
> >> >                  URL: https://issues.apache.org/jira/browse/AMATERASU-52
> >> >              Project: AMATERASU
> >> >           Issue Type: Task
> >> >             Reporter: Yaniv Rodenski
> >> >             Assignee: Arun Manivannan
> >> >              Fix For: 0.2.1-incubating
> >> >
> >> >
> >> > AmaContext.datastores should contain the data from datastores.yaml
> >> >
> >> >
> >> >
> >> > --
> >> > This message was sent by Atlassian JIRA
> >> > (v7.6.3#76005)
> >> >
> >>
> >
> >
> > --
> > Yaniv Rodenski
> >
> > +61 477 778 405
> > ya...@shinto.io
> >
> >
>
