Got it. I'll look into implementation choices for creating a new data
source. Appreciate all the feedback.
On Mon, Jun 1, 2020 at 7:53 PM Vinoth Chandar wrote:
> >Is it to separate data and metadata access?
> Correct. We already have modes for querying data using format("hudi"). I
> feel it
>Is it to separate data and metadata access?
Correct. We already have modes for querying data using format("hudi"). I
feel it will get very confusing to mix data and metadata in the same
source. For example, a lot of the options we support for data may not even
make sense for the TimelineRelation.
>This
Thanks for the feedback.
What is the advantage of doing
spark.read.format("hudi-timeline").load(basepath) as opposed to doing a new
relation? Is it to separate data and metadata access?
Are you looking for similar functionality as HoodieDatasourceHelpers?
>
This class seems like a list of static
Great! I left some comments on the PR around licensing and maintenance
overhead.
On Sun, May 31, 2020 at 11:51 PM Lamber Ken wrote:
> Hi folks,
>
> Learned from travis and github actions api docs these days, I used my
> project as a demo[1],
> the demo pull request will always fail, please use
Also, please take a look at https://issues.apache.org/jira/browse/HUDI-309.
This was an effort to make the timeline more generalized for querying (for
a different purpose), but it is good to revisit now.
On Sun, May 31, 2020 at 11:04 PM vbal...@apache.org
wrote:
>
> I strongly recommend using a
Hi Mario,
Thanks for the detailed explanation. Hudi already allows extra metadata to
be written atomically with each commit, i.e., each write operation. In fact,
that is how we track checkpoints for our delta streamer tool. It may not
solve the need for querying the data together with this information.
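
To make the idea of per-commit extra metadata concrete, here is a minimal,
self-contained sketch in plain Python. It is a toy model and not Hudi's API:
the Commit and Timeline classes, the latest_value helper, and the
"checkpoint" and "cob_date" keys are all hypothetical names chosen for
illustration. It only shows the pattern of attaching key/value metadata to a
commit in one step and later reading back the most recent value for a key.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Commit:
    """One completed write on a toy timeline (hypothetical, not a Hudi class)."""
    instant_time: str
    extra_metadata: Dict[str, str] = field(default_factory=dict)


class Timeline:
    """Ordered list of commits; each carries its own extra metadata."""

    def __init__(self) -> None:
        self._commits: List[Commit] = []

    def commit(self, instant_time: str,
               extra_metadata: Optional[Dict[str, str]] = None) -> Commit:
        # The commit and its metadata are added in a single append, so a
        # reader sees both or neither (the toy version of "atomically").
        entry = Commit(instant_time, dict(extra_metadata or {}))
        self._commits.append(entry)
        return entry

    def latest_value(self, key: str) -> Optional[str]:
        # Walk from the newest commit backwards until the key is found.
        for entry in reversed(self._commits):
            if key in entry.extra_metadata:
                return entry.extra_metadata[key]
        return None


timeline = Timeline()
timeline.commit("20200530131845",
                {"checkpoint": "offset-100", "cob_date": "2020-05-29"})
timeline.commit("20200601195300", {"checkpoint": "offset-250"})

print(timeline.latest_value("checkpoint"))
print(timeline.latest_value("cob_date"))
```

In this sketch the metadata can only be looked up by walking the commits, not
joined with the rows written by each commit, which mirrors the limitation
noted above.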
Hi Balaji,
Business metadata is any kind of information related to the business where
the Hudi solution is being used, from a COB (i.e., close of business) date
related to that commit to any qualifier that might be useful to associate
with that commit id. If we enable the
Hi folks,
I have been reading the Travis and GitHub Actions API docs these days, and I
used my project as a demo[1].
The demo pull request will always fail; please use the "rerun tests" command
and it will rerun the tests automatically.
If you are interested, try it.
Best,
Lamber-Ken
[1]
I strongly recommend using a separate datasource relation (option 1) to query
the timeline. It is elegant and fits well with the Spark APIs.
Thanks.
Balaji.V
On Saturday, May 30, 2020, 01:18:45 PM PDT, Vinoth Chandar wrote:
Hi Satish,
Are you looking for similar functionality as