Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Siddharth Teotia
Okay, sounds good.
On Fri, Oct 13, 2017 at 2:50 PM, Wes McKinney wrote:
> It is fine to have not-completely-working states in the refactor
> branch. I recommend doing whatever is the most expedient thing to help
> with making progress.
>
> - Wes
>
> On Fri, Oct 13, 2017 at

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Wes McKinney
It is fine to have not-completely-working states in the refactor branch. I recommend doing whatever is the most expedient thing to help with making progress.
- Wes
On Fri, Oct 13, 2017 at 5:42 PM, Siddharth Teotia wrote:
> Li,
>
> I think there is some confusion. Are you

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Siddharth Teotia
Li, I think there is some confusion. Are you suggesting merging into the "java vector refactor" branch or into master? Is it fine to merge changes into the former branch even though a few things are broken (around 10 tests)? If this is allowed, I can do some cleanup (some documentation, some TODOs

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Li Jin
Siddharth, Regarding rename: Yes, this can be done later. Tests: I agree that having code like https://github.com/apache/arrow/pull/1164/files#diff-0876c9a0005d1dbaea321ea8d39d79ae is hard to maintain even temporarily. I am not sure what's the best way to resolve the test failures with respect to the removal of the

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Siddharth Teotia
I am not quite sure of the need to rename the vectors. Why do we need to rename? This would first require us to remove all the vectors generated by FixedValueVectors.java as they are non-nullable scalar vectors. Removing non-nullable vectors is one of the goals, but it can be done once the new

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Li Jin
Siddharth, Thanks for the update. I think it's fine to move forward with more vectors, but in the meantime, I think we should also prioritize merging https://github.com/apache/arrow/pull/1164. Here are a few comments that need to be addressed. (1) Backward-compatibility: I think there is no way to

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-13 Thread Siddharth Teotia
The patch that I have put up (https://github.com/apache/arrow/pull/1198) seems to be in a reasonable state. We are now working off a different branch, "java vector refactor". Now that we have the basic structure, in order to make quick forward progress, I would like to go ahead and do for other

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-12 Thread Siddharth Teotia
Yes, that is the intention. Good that we are all on the same page. I will move the PR https://github.com/apache/arrow/pull/1164 to the new branch.
On Thu, Oct 12, 2017 at 11:20 AM, Li Jin wrote:
> To make clear, I think it's fine to have Legacy Vectors in 0.8 as a
>

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-12 Thread Li Jin
To make clear, I think it's fine to have Legacy Vectors in 0.8 as a deprecated API.
On Thu, Oct 12, 2017 at 2:19 PM, Li Jin wrote:
> Siddharth,
>
> For working off a branch, Wes has created https://github.com/apache/arrow/tree/java-vector-refactor that we can submit PR

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-12 Thread Li Jin
Siddharth, For working off a branch, Wes has created https://github.com/apache/arrow/tree/java-vector-refactor that we can submit PRs to. For Legacy vectors, I think it's fine because it's really just a migration path to help Dremio migrate to the new vectors. I don't think other users, i.e.,

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-12 Thread Siddharth Teotia
Thanks Bryan and Li. Yes, the goal is to get this (and the subsequent patches) merged to the new branch. Once it is stabilized from different aspects, we can move to master. I am not sure of the exact mechanics when we work off a different project branch and not master. Does that sound good?

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-12 Thread Bryan Cutler
Thanks for the update, Siddharth. From the Spark side of this, I definitely want to try to upgrade to the latest Arrow before the Spark 2.3 release, but if the refactor is too disruptive then others might get squeamish about upgrading. On the other hand, I don't think we should hold back on

Re: Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-11 Thread Li Jin
Hi Siddharth, Thanks for the update. This looks good. A few thoughts: *Compatibility:* It sounds like we will introduce some back-compatibility with the new Vector class. At this point I think our main Java users should be Spark and Dremio, is this right? - For Spark: It seems fine since

Update on ARROW-1463, related subtasks and plan for testing and merging

2017-10-10 Thread Siddharth Teotia
Hi All, I wanted to update everyone on the state of this mini-project: - The requirements document and initial design proposal were sent out to the community for review, and we have received some good feedback. All required docs are attached to the corresponding JIRAs. - The initial