Sounds good. I will create a branch soon.

Thanks,
Ashutosh

On Mon, Apr 8, 2013 at 7:31 PM, Namit Jain <nj...@fb.com> wrote:

> Sounds good to me
>
> On 4/9/13 12:04 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:
>
>> I agree that we shouldn't wait too long before merging the branch.
>> We are targeting to have basic queries working within a month from now
>> and will definitely propose to merge the branch back into trunk at that
>> point. We will limit the scope of the work on the branch to just a few
>> operators and primitive datatypes. Does that sound reasonable?
>>
>> regards
>> jitendra
>>
>> On Wed, Apr 3, 2013 at 9:03 PM, Namit Jain <nj...@fb.com> wrote:
>>
>>> There is no right answer, but I feel if you go this path a long way,
>>> it will be very difficult to merge back. Given that this is not new
>>> functionality but an improvement to existing code (which will also
>>> evolve), it will become difficult to maintain/review a big diff in
>>> the future.
>>>
>>> I haven't thought much about it, but you can start by creating the
>>> high-level interfaces first and go from there. For example: create
>>> interfaces for operators which take in an array of rows instead of a
>>> single row - initially the array size can always be 1. Now, proceed
>>> from there.
>>>
>>> What makes you think merging a branch 6 months/1 year from now will
>>> be easier than working on the current branch?
>>>
>>> Having said that, both approaches can be made to work - but I think
>>> you are just delaying the merging work instead of taking the hit
>>> upfront.
>>>
>>> Thanks,
>>> -namit
>>>
>>> On 4/4/13 2:40 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:
>>>
>>>> We did consider implementing these changes on the trunk. But it
>>>> would take several patches in various parts of the code before a
>>>> simple end-to-end query can be executed on the vectorized path. For
>>>> example, a patch for vectorized expressions will be a significant
>>>> amount of code, but will not be used in a query until a vectorized
>>>> operator is implemented and the query plan is modified to use the
>>>> vectorized path. Vectorization of even basic expressions becomes
>>>> non-trivial because we need to optimize for various cases such as
>>>> chains of expressions, non-null columns, or repeating values, and
>>>> also handle nullable columns, short-circuit optimization, etc.
>>>> Careful handling of these is important for performance gains.
>>>>
>>>> Committing those intermediate patches in trunk without stabilizing
>>>> them in a branch first might be a cause of concern.
>>>>
>>>> A separate branch will let us make incremental changes to the system
>>>> so that each patch addresses a single feature or functionality and
>>>> is small enough to review.
>>>>
>>>> We will make sure that the branch is frequently updated with the
>>>> changes in the trunk to avoid conflicts at the time of the merge.
>>>>
>>>> Also, we plan to propose merging the branch as soon as a basic
>>>> end-to-end query begins to work and is sufficiently tested, instead
>>>> of waiting for all operators to get vectorized. Initially our target
>>>> is to make select and filter operators work with vectorized
>>>> expressions for primitive types.
>>>>
>>>> We will have a single global configuration flag that can be used to
>>>> turn off the entire vectorization code path, and we will
>>>> specifically test to make sure that when this flag is off there is
>>>> no regression on the current system. When vectorization is turned
>>>> on, we will have a validation step to make sure the given query is
>>>> supported on the vectorization path; otherwise it will fall back to
>>>> the current code path.
>>>>
>>>> Although we intend to follow a commit-then-review policy on the
>>>> branch for speed of development, each patch will have an associated
>>>> JIRA and will be available for review and feedback.
>>>>
>>>> thanks
>>>> jitendra
>>>>
>>>> On Tue, Apr 2, 2013 at 8:37 PM, Namit Jain <nj...@fb.com> wrote:
>>>>
>>>>> It will be difficult to merge back the branch.
>>>>> Can you stage your changes incrementally?
>>>>>
>>>>> I mean, start with making the operators vectorized - it can be a
>>>>> for loop to start with? I think it will be very difficult to merge
>>>>> it back if we diverge on this. I would recommend starting with
>>>>> simple interfaces for operators and then plugging them in slowly
>>>>> instead of a new branch, unless this approach is extremely
>>>>> difficult.
>>>>>
>>>>> Thanks,
>>>>> -namit
>>>>>
>>>>> On 4/3/13 1:52 AM, "Jitendra Pandey" <jiten...@hortonworks.com> wrote:
>>>>>
>>>>>> Hi Folks,
>>>>>>   I want to propose creating a separate branch for HIVE-4160
>>>>>> work. This is a significant amount of work, and support for very
>>>>>> basic functionality will need big chunks of code. It will also
>>>>>> take some time to stabilize and test. A separate dev branch will
>>>>>> allow us to do this work incrementally and collaboratively. We
>>>>>> have already uploaded a design document on the JIRA for
>>>>>> comments/feedback.
>>>>>>
>>>>>> thanks
>>>>>> jitendra
>>>>>>
>>>>>> --
>>>>>> <http://hortonworks.com/download/>
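
Two technical ideas come up in the thread above: Namit's suggestion of an operator interface that takes an array of rows (initially of size 1, driven by a plain for loop), and Jitendra's global flag with a validation step that falls back to the current code path. A rough sketch of how those might fit together is below. It is illustrative Python, not Hive code, and every name in it (`Row`, `FilterOperator`, `BatchFilterOperator`, `SUPPORTED_VECTORIZED_OPS`, `use_vectorized_path`) is hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]  # hypothetical row type, for illustration only


class FilterOperator:
    """Existing-style operator: handles one row per call."""

    def __init__(self, predicate: Callable[[Row], bool]) -> None:
        self.predicate = predicate

    def process(self, row: Row) -> Optional[Row]:
        # Return the row if it passes the predicate, otherwise drop it.
        return row if self.predicate(row) else None


class BatchFilterOperator:
    """Batch interface that initially just loops over the row-at-a-time operator."""

    def __init__(self, inner: FilterOperator) -> None:
        self.inner = inner

    def process_batch(self, rows: List[Row]) -> List[Row]:
        out: List[Row] = []
        for row in rows:  # a plain for loop to start with; the batch may hold 1 row
            kept = self.inner.process(row)
            if kept is not None:
                out.append(kept)
        return out


# Hypothetical set of operator names the vectorized path supports so far.
SUPPORTED_VECTORIZED_OPS = {"select", "filter"}


def use_vectorized_path(vectorization_enabled: bool, plan_ops: Iterable[str]) -> bool:
    """Check the global flag, then validate the plan; False means fall back."""
    if not vectorization_enabled:
        return False
    return all(op in SUPPORTED_VECTORIZED_OPS for op in plan_ops)
```

With the flag off, `use_vectorized_path` always returns `False`, so the existing path stays untouched. With it on, a plan containing any unsupported operator still falls back.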