Hi Andrey,

> This will force us to bother about interfaces/contracts and compatibility
> aspects in the future

Schemas and versions are part of thin client wire protocol.
This protocol is a public API - we'll have to care about compatibility
anyway.

Schema evolution is an important feature that users should understand.
I see no reason to hide it from the API.


> We already have a ticket [1]...
> Will 'idx -> column' mapping only be enough for your purposes?

The ticket sounds reasonable, but there are no API details.
At the very least we need public access to column names, types, and indexes.
I propose something like this:

interface Tuple {
    TupleSchema getSchema();
}

class TupleSchema {
    int getVersion();
    List<TupleSchemaColumn> getColumns();
}

class TupleSchemaColumn {
    int index;
    String name;
    String typeName;
    bool isKey;
}

On Wed, Jun 30, 2021 at 11:05 AM Andrey Mashenkov <
andrey.mashen...@gmail.com> wrote:

> Hi Pavel,
>
> 2. Schema and its version are internal things and shouldn't be exposed,
> otherwise, eventually, it will lead to the user will manage schemas
> manually on his side for some purposes.
> This will force us to bother about interfaces/contracts and compatibility
> aspects in the future with uncertain benefits for a user.
>
> As SchemaDescriptor is an internal API and the user expects a public API.
> I don't think we want to convert the descriptor back into public API terms
> and it may be impossible for some data.
>
> We already have a ticket [1] to support accessing a column by an index and
> exposing some metadata.
> Will 'idx -> column' mapping only be enough for your purposes?
>
> > int Tuple.columnIndex(String)
>
>
>
>  [1] https://issues.apache.org/jira/browse/IGNITE-14342
>
> On Wed, Jun 30, 2021 at 9:34 AM Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
> > Hi Val,
> >
> > Thanks for the comments, please see below:
> >
> >
> > > Users will identify tables using names, they will never use IDs
> >
> > Ok, let's keep it this way.
> >
> >
> > > Sounds like the Tuple should implement Iterable.
> >
> > I don't think Iterable is enough.
> > We should have a way to get values by column index: Tuple.value(int
> index),
> > where index is according to column order in schema.
> >
> > Combined with Tuple.schema(), it will provide an efficient way to consume
> > data -
> > users will be able to read columns in any order, skip some of them, etc.
> >
> > This is a common pattern in APIs like ODBC or System.Data [1],
> > which we'll need to implement on top of our thin clients later.
> >
> >
> > > However, whether a user might
> > > need to get a table schema for a particular version, I'm not sure. Do
> you
> > > have a use case in mind for this? If not, I would keep this internal
> >
> > Ok, we can keep the Table.schema(ver) method internal, as long as
> > Table.schemas() is public and includes schema versions.
> >
> >
> > > We already have async counterparts for all the methods where this is
> > applicable
> >
> > IgniteTables.createTable(), dropTable(), tables(), table() do not have
> > async counterparts,
> > while internally they perform blocking wait on some futures.
> >
> >
> > [1]
> >
> >
> https://docs.microsoft.com/en-us/dotnet/api/system.data.idatareader?view=net-5.0
> >
> > On Wed, Jun 30, 2021 at 4:51 AM Valentin Kulichenko <
> > valentin.kuliche...@gmail.com> wrote:
> >
> > > Hi Pavel,
> > >
> > > Please see my comments below.
> > >
> > > -Val
> > >
> > > On Tue, Jun 29, 2021 at 2:23 PM Pavel Tupitsyn <ptupit...@apache.org>
> > > wrote:
> > >
> > > > Igniters,
> > > >
> > > > While working on "IEP-76 Thin Client Protocol for Ignite 3.0" [1] (to
> > be
> > > > discussed separately), the following suggestions for the Table API
> came
> > > up:
> > > >
> > > > 1. Expose table IDs: sending table name with every operation (e.g.
> GET)
> > > is
> > > > inefficient, string serialization is expensive by itself and names
> can
> > be
> > > > long.
> > > >     - Table.id()
> > > >     - IgniteTables.table(UUID)
> > > >     - IgniteTables.dropTable(UUID)
> > > >
> > >
> > > I don't think this should be a part of the public API. Users will
> > identify
> > > tables using names, they will never use IDs. As an internal
> optimization
> > > though - sure, we can have that. Sounds similar to the cache ID in 2.x.
> > >
> > >
> > > >
> > > > 2. Expose tuple schemas: to reduce payload size when sending tuples
> to
> > > the
> > > > client, we'll write only the schema version and column values, then
> the
> > > > client can retrieve and cache schemas (ordered set of columns per
> > > version).
> > > >     - Tuple.schema()
> > > >     - Table.schemas()
> > > >     - Table.schema(ver)
> > > >
> > >
> > > Exposing the schema of a tuple makes sense. However, whether a user
> might
> > > need to get a table schema for a particular version, I'm not sure. Do
> you
> > > have a use case in mind for this? If not, I would keep this internal as
> > > well.
> > >
> > >
> > > >
> > > > 3. Expose tuple values as a collection: to serialize tuples
> efficiently
> > > > (see p2) we need an API to get all values at once. Right now the only
> > API
> > > > is to get values by column name, which involves a HashMap lookup on
> > every
> > > > call.
> > > >     - Tuple.values()
> > > >
> > >
> > > Sounds like the Tuple should implement Iterable.
> > >
> > >
> > > >
> > > > 4. Simplify createTable API: use POJO-based configuration.
> > > >     Creating a Consumer when some properties are optional seems to be
> > > > non-trivial.
> > > >
> > >
> > > Yes, it's currently convoluted for sure. To my knowledge, there are
> plans
> > > to improve this, but I will let other folks chime in with more details.
> > >
> > >
> > > >
> > > > 5. Add async methods to IgniteTables API (all methods are async
> inside
> > > > anyway)
> > > >
> > >
> > > We already have async counterparts for all the methods where this is
> > > applicable. E.g., for TableView#get, there is TableView#getAsync. Is
> > there
> > > anything else that you propose to add?
> > >
> > >
> > > >
> > > >
> > > > Thoughts, objections?
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-76+Thin+Client+Protocol+for+Ignite+3.0
> > > >
> > > > [2]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> > > >
> > >
> >
>
>
> --
> Best regards,
> Andrey V. Mashenkov
>

Reply via email to