Re: Branch for HIVE-4160

Ashutosh Chauhan Tue, 09 Apr 2013 20:51:15 -0700

Sounds good. I will create a branch soon.

Thanks,
Ashutosh



On Mon, Apr 8, 2013 at 7:31 PM, Namit Jain <[email protected]> wrote:

> Sounds good to me
>
>
> On 4/9/13 12:04 AM, "Jitendra Pandey" <[email protected]> wrote:
>
> >I agree that we shouldn't wait too long before merging the branch.
> >We are targeting to have basic queries working within a month from now and
> >will definitely propose to merge the branch back into trunk at that point.
> >We will limit the scope of the work on the branch to just a few operators
> >and primitive datatypes. Does that sound reasonable?
> >
> >regards
> >jitendra
> >
> >On Wed, Apr 3, 2013 at 9:03 PM, Namit Jain <[email protected]> wrote:
> >
> >> There is no right answer, but I feel that if you go a long way down this
> >> path, it will be very difficult to merge back. Given that this is not new
> >> functionality but an improvement to existing code (which will also
> >> evolve), it will become difficult to maintain and review a big diff in
> >> the future.
> >>
> >> I haven't thought much about it, but you can start by creating the
> >> high-level interfaces first, and then go from there. For example, create
> >> interfaces for operators which take in an array of rows instead of a
> >> single row - initially the array size can always be 1. Then proceed from
> >> there.
> >>
> >> What makes you think merging a branch 6 months or a year from now will be
> >> easier than working on the current branch?
> >>
> >> Having said that, both approaches can be made to work - but I think you
> >> are just delaying the merging work instead of taking the hit upfront.
> >>
> >> Thanks,
> >> -namit
> >>
> >>
> >>
> >> On 4/4/13 2:40 AM, "Jitendra Pandey" <[email protected]> wrote:
> >>
> >> >   We did consider implementing these changes on the trunk. But it would
> >> >take several patches in various parts of the code before a simple
> >> >end-to-end query could be executed on the vectorized path. For example, a
> >> >patch for vectorized expressions will be a significant amount of code,
> >> >but it will not be used in a query until a vectorized operator is
> >> >implemented and the query plan is modified to use the vectorized path.
> >> >Vectorizing even basic expressions becomes non-trivial because we need to
> >> >optimize for various cases, such as chains of expressions, non-null
> >> >columns, and repeating values, and also handle nullable columns,
> >> >short-circuit optimization, etc. Careful handling of these is important
> >> >for performance gains.
> >> >
> >> > Committing those intermediate patches to trunk without stabilizing them
> >> >in a branch first might be a cause for concern.
> >> >
> >> >  A separate branch will let us make incremental changes to the system so
> >> >that each patch addresses a single feature or functionality and is small
> >> >enough to review.
> >> >   We will make sure that the branch is frequently updated with the
> >> >changes in the trunk to avoid conflicts at the time of the merge.
> >> >  Also, we plan to propose merging the branch as soon as a basic
> >> >end-to-end query begins to work and is sufficiently tested, instead of
> >> >waiting for all operators to get vectorized. Initially our target is to
> >> >make the select and filter operators work with vectorized expressions for
> >> >primitive types.
> >> >
> >> >   We will have a single global configuration flag that can be used to
> >> >turn off the entire vectorization code path, and we will specifically
> >> >test to make sure that when this flag is off there is no regression on
> >> >the current system. When vectorization is turned on, we will have a
> >> >validation step to make sure the given query is supported on the
> >> >vectorized path; otherwise it will fall back to the current code path.
> >> >
> >> >  Although we intend to follow a commit-then-review policy on the branch
> >> >for speed of development, each patch will have an associated jira and
> >> >will be available for review and feedback.
> >> >
> >> >thanks
> >> >jitendra
> >> >
> >> >On Tue, Apr 2, 2013 at 8:37 PM, Namit Jain <[email protected]> wrote:
> >> >
> >> >> It will be difficult to merge back the branch.
> >> >> Can you stage your changes incrementally?
> >> >>
> >> >> I mean, start with making the operators vectorized - it can be a for
> >> >> loop to start with? I think it will be very difficult to merge it back
> >> >> if we diverge on this. I would recommend starting with simple
> >> >> interfaces for operators and then plugging them in slowly instead of
> >> >> creating a new branch, unless this approach is extremely difficult.
> >> >>
> >> >>
> >> >> Thanks,
> >> >> -namit
> >> >>
> >> >> On 4/3/13 1:52 AM, "Jitendra Pandey" <[email protected]> wrote:
> >> >>
> >> >> >Hi Folks,
> >> >> >     I want to propose creating a separate branch for the HIVE-4160
> >> >> >work. This is a significant amount of work, and support for very basic
> >> >> >functionality will need big chunks of code. It will also take some
> >> >> >time to stabilize and test. A separate dev branch will allow us to do
> >> >> >this work incrementally and collaboratively. We have already uploaded
> >> >> >a design document on the jira for comments/feedback.
> >> >> >
> >> >> >thanks
> >> >> >jitendra
> >> >> >
> >> >> >
> >> >> >--
> >> >> ><http://hortonworks.com/download/>
> >> >>
> >> >>
> >> >
> >> >
> >> >--
> >> ><http://hortonworks.com/download/>
> >>
> >>
> >
> >
> >--
> ><http://hortonworks.com/download/>
>
>
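
A rough sketch of the staging idea discussed in the thread above, written in Python only for brevity (Hive itself is Java). The first function shows a batch-shaped interface that simply loops over rows, along the lines Namit suggests; the second shows the kind of fast paths Jitendra mentions for repeating values, non-null columns, and nullable columns. The names LongColumn, rows_via_loop, and filter_greater_than are hypothetical and are not Hive's actual classes or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LongColumn:
    """Hypothetical integer column of a row batch (not a Hive class)."""
    values: List[int]
    is_null: Optional[List[bool]] = None  # None means the column has no nulls
    is_repeating: bool = False            # True means values[0] stands for every row


def rows_via_loop(process_row: Callable[[dict], dict], rows: List[dict]) -> List[dict]:
    """Stage 1: batch-shaped interface that just loops over a per-row function."""
    return [process_row(row) for row in rows]


def filter_greater_than(col: LongColumn, size: int, threshold: int) -> List[int]:
    """Stage 2: indices of rows where col > threshold, with fast paths."""
    if col.is_repeating:
        # Repeating value: a single comparison decides the whole batch.
        first_is_null = col.is_null is not None and col.is_null[0]
        keep = (not first_is_null) and col.values[0] > threshold
        return list(range(size)) if keep else []
    if col.is_null is None:
        # Non-null column: no per-row null check is needed.
        return [i for i in range(size) if col.values[i] > threshold]
    # General case: nullable column; null rows never pass the filter.
    return [i for i in range(size)
            if not col.is_null[i] and col.values[i] > threshold]
```

With a batch size of 1, rows_via_loop behaves exactly like the existing row-at-a-time call, which is what makes it usable as a first step before any real vectorized expression exists.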

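The global flag and validation step that Jitendra describes could look roughly like the following. This is a minimal sketch under assumptions: the QueryPlan shape, the supported operator and type sets, and the function names are all made up for illustration (the thread only says the initial scope is select and filter operators over primitive types), and Python is used for brevity rather than Hive's Java.

```python
from dataclasses import dataclass
from typing import List

# Assumed initial scope, for illustration only.
SUPPORTED_OPERATORS = {"select", "filter"}
SUPPORTED_TYPES = {"int", "long", "double"}


@dataclass
class QueryPlan:
    """Hypothetical, heavily simplified query plan."""
    operators: List[str]
    column_types: List[str]


def is_vectorizable(plan: QueryPlan) -> bool:
    """Validation step: every operator and column type must be supported."""
    return (all(op in SUPPORTED_OPERATORS for op in plan.operators)
            and all(t in SUPPORTED_TYPES for t in plan.column_types))


def choose_path(plan: QueryPlan, vectorization_enabled: bool) -> str:
    """Pick the execution path; the flag turns the new code path off entirely."""
    if not vectorization_enabled:
        return "row"
    # Flag is on: use the vectorized path only if validation passes,
    # otherwise fall back to the existing row-at-a-time path.
    return "vectorized" if is_vectorizable(plan) else "row"
```

Checking the flag before validation means that with the flag off no vectorization code runs at all, which is the property the regression testing mentioned in the thread would need to confirm.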