Re: [DISCUSS] Support for complex record keys with TimestampBasedKeyGenerator

2020-03-21 Thread Pratyaksh Sharma
@Balaji @Vinoth Chandar ,

Here is a small attempt at a generic implementation:
https://github.com/apache/incubator-hudi/pull/1433/files. Please have a
look; I am happy to hear feedback from everyone on this.

This is just a sample. If we agree on the implementation, I will add more
test cases and improve it further.

On Thu, Feb 27, 2020 at 9:43 PM Vinoth Chandar wrote:

> +1 for adding a new composite KeyGenerator, which can combine both...
>
> Workaround: for DeltaStreamer, you can also use the Transformer API to do
> more flexible key generation as you wish.
>
> On Tue, Feb 25, 2020 at 9:37 AM Balaji Varadarajan wrote:
>
> >
> > See if you can have a generic implementation where individual fields in
> > the partition-path can be configured with their own key-generator class.
> > Currently, TimestampBasedKeyGenerator is the only type-specific custom
> > generator. If we are anticipating more such classes for specialized
> > types, you can use a generic way to support overriding the key-generator
> > for individual partition-fields once and for all.
> >
> > Balaji.V
> >
> > On Monday, February 24, 2020, 03:09:02 AM PST, Pratyaksh Sharma wrote:
> >
> > Hi,
> >
> > We have TimestampBasedKeyGenerator for defining custom partition paths,
> > and we have ComplexKeyGenerator for using a combination of fields as the
> > record key or partition key.
> >
> > However, we do not support the case where one wants a combination of
> > fields as the record key while also defining custom partition paths.
> > This use case recently came up at my organisation.
> >
> > How about adding a CustomTimestampBasedKeyGenerator that supports the
> > above use case? This class can simply extend TimestampBasedKeyGenerator
> > and allow users to have a combination of fields as the record key.
> >
> > Open to hearing others' opinions.
> >
>


Re: [DISCUSS] Support for complex record keys with TimestampBasedKeyGenerator

2020-02-27 Thread Vinoth Chandar
+1 for adding a new composite KeyGenerator, which can combine both...

Workaround: for DeltaStreamer, you can also use the Transformer API to do
more flexible key generation as you wish.

On Tue, Feb 25, 2020 at 9:37 AM Balaji Varadarajan wrote:

>
> See if you can have a generic implementation where individual fields in
> the partition-path can be configured with their own key-generator class.
> Currently, TimestampBasedKeyGenerator is the only type-specific custom
> generator. If we are anticipating more such classes for specialized types,
> you can use a generic way to support overriding the key-generator for
> individual partition-fields once and for all.
>
> Balaji.V
>
> On Monday, February 24, 2020, 03:09:02 AM PST, Pratyaksh Sharma wrote:
>
> Hi,
>
> We have TimestampBasedKeyGenerator for defining custom partition paths, and
> we have ComplexKeyGenerator for using a combination of fields as the
> record key or partition key.
>
> However, we do not support the case where one wants a combination of
> fields as the record key while also defining custom partition paths.
> This use case recently came up at my organisation.
>
> How about adding a CustomTimestampBasedKeyGenerator that supports the above
> use case? This class can simply extend TimestampBasedKeyGenerator and allow
> users to have a combination of fields as the record key.
>
> Open to hearing others' opinions.
>


Re: [DISCUSS] Support for complex record keys with TimestampBasedKeyGenerator

2020-02-25 Thread Balaji Varadarajan
 
See if you can have a generic implementation where individual fields in the
partition-path can be configured with their own key-generator class. Currently,
TimestampBasedKeyGenerator is the only type-specific custom generator. If we
are anticipating more such classes for specialized types, you can use a generic
way to support overriding the key-generator for individual partition-fields once
and for all.
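
As a rough illustration of the per-field idea, here is a self-contained
sketch. It does not use any Hudi classes; the class name, the
"field:TYPE" spec format, the SIMPLE/TIMESTAMP type names, and the
formatter registry are all hypothetical and only show how each
partition field could be mapped to its own formatter.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: not a Hudi class and not the implementation in any PR.
final class PerFieldPartitionPathSketch {
    // Assumed output format for timestamp fields (epoch seconds, rendered in UTC).
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    // Hypothetical registry: one formatter per partition-field type.
    private static final Map<String, Function<Object, String>> FORMATTERS =
        new LinkedHashMap<>();
    static {
        FORMATTERS.put("SIMPLE", String::valueOf);
        FORMATTERS.put("TIMESTAMP",
            v -> DAY.format(Instant.ofEpochSecond(((Number) v).longValue())));
    }

    // spec looks like "country:SIMPLE,ts:TIMESTAMP" (an assumed format).
    static String partitionPath(String spec, Map<String, Object> record) {
        List<String> parts = new ArrayList<>();
        for (String entry : spec.split(",")) {
            String[] fieldAndType = entry.trim().split(":");
            if (fieldAndType.length != 2) {
                throw new IllegalArgumentException("Bad partition spec entry: " + entry);
            }
            Function<Object, String> formatter = FORMATTERS.get(fieldAndType[1]);
            if (formatter == null) {
                throw new IllegalArgumentException("Unknown type: " + fieldAndType[1]);
            }
            // Each field is rendered by the formatter configured for it.
            parts.add(formatter.apply(record.get(fieldAndType[0])));
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("country", "IN");
        record.put("ts", 1582542542L); // 2020-02-24 11:09:02 UTC
        // prints "IN/2020-02-24"
        System.out.println(partitionPath("country:SIMPLE,ts:TIMESTAMP", record));
    }
}
```

Supporting a new specialized type would then mean registering one more
formatter rather than writing another key-generator subclass.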

Balaji.V

On Monday, February 24, 2020, 03:09:02 AM PST, Pratyaksh Sharma wrote:
 
Hi,

We have TimestampBasedKeyGenerator for defining custom partition paths, and
we have ComplexKeyGenerator for using a combination of fields as the
record key or partition key.

However, we do not support the case where one wants a combination of
fields as the record key while also defining custom partition paths.
This use case recently came up at my organisation.

How about adding a CustomTimestampBasedKeyGenerator that supports the above
use case? This class can simply extend TimestampBasedKeyGenerator and allow
users to have a combination of fields as the record key.

Open to hearing others' opinions.
  

[DISCUSS] Support for complex record keys with TimestampBasedKeyGenerator

2020-02-24 Thread Pratyaksh Sharma
Hi,

We have TimestampBasedKeyGenerator for defining custom partition paths, and
we have ComplexKeyGenerator for using a combination of fields as the
record key or partition key.

However, we do not support the case where one wants a combination of
fields as the record key while also defining custom partition paths.
This use case recently came up at my organisation.

How about adding a CustomTimestampBasedKeyGenerator that supports the above
use case? This class can simply extend TimestampBasedKeyGenerator and allow
users to have a combination of fields as the record key.

Open to hearing others' opinions.
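
To make the use case concrete, here is a minimal, self-contained sketch
of the behaviour I have in mind. It is only an illustration: it does
not extend the real TimestampBasedKeyGenerator, and the class name, the
"field:value" record key layout, and the assumption that the timestamp
field holds epoch seconds are all hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the proposed generator; not a Hudi class.
final class CompositeTimestampKeySketch {
    private final List<String> recordKeyFields;
    private final String timestampField;
    private final DateTimeFormatter partitionFormat;

    CompositeTimestampKeySketch(List<String> recordKeyFields,
                                String timestampField,
                                String outputPattern) {
        this.recordKeyFields = recordKeyFields;
        this.timestampField = timestampField;
        // Assumption: partition paths are rendered in UTC.
        this.partitionFormat =
            DateTimeFormatter.ofPattern(outputPattern).withZone(ZoneOffset.UTC);
    }

    // Record key built from several fields, as "field:value" pairs joined by commas.
    String recordKey(Map<String, Object> record) {
        return recordKeyFields.stream()
            .map(field -> field + ":" + record.get(field))
            .collect(Collectors.joining(","));
    }

    // Partition path derived from a timestamp field (assumed to be epoch seconds).
    String partitionPath(Map<String, Object> record) {
        long epochSeconds = ((Number) record.get(timestampField)).longValue();
        return partitionFormat.format(Instant.ofEpochSecond(epochSeconds));
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42);
        record.put("region", "eu");
        record.put("ts", 1582542542L); // 2020-02-24 11:09:02 UTC

        CompositeTimestampKeySketch generator = new CompositeTimestampKeySketch(
            Arrays.asList("id", "region"), "ts", "yyyy/MM/dd");
        System.out.println(generator.recordKey(record));     // prints "id:42,region:eu"
        System.out.println(generator.partitionPath(record)); // prints "2020/02/24"
    }
}
```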