Re: Bulk JIRA email, or how to find good newbie tasks

2016-10-17 Thread Marcel Kornacker
Please don't add that comment. :)

What's currently labelled ramp-up is often not a good newbie task (and
maybe not even a good ramp-up task). The best way to identify newbie
tasks is for a few senior engineers to sift through the ramp-up tasks
and pick out maybe a few dozen that truly qualify as newbie tasks.

I'm happy to help out with that when I get back.

On Mon, Oct 17, 2016 at 6:50 PM, Jim Apple wrote:
> The Impala JIRA has 129 tasks that have no assignee, are still open,
> and are labelled ramp* (ramp-up, ramp-up-introductory, etc.).
>
> I'd like to find which of those tasks are good tasks for someone who
> is making their first Impala patch. I intend to promote those on one
> or more of: the blog, the Twitter account, this list, the user list,
> helpwanted.apache.org, and so on.
>
> The tasks should be the kind of thing that someone won't need too much
> hand-holding on, once they have their dev environment up and working.
>
> To do this, I was thinking of adding a comment to all 129 tasks to ask
> the watchers of each issue if it should be labelled "newbie". This
> will send hundreds of emails, which is a bummer, but it seems to me
> like the best way to track the discussions and decisions.
>
> What does everyone think?


Re: Bulk JIRA email, or how to find good newbie tasks

2016-10-17 Thread Lars Volker
Cool, it seems very helpful to me to have such a list of easy tasks.

I think that watchers of these issues could often be interested in a fix,
but might not necessarily know how much effort these issues require. Would
an alternative be to ask people familiar with various parts of the codebase
for their feedback? You could make a list of Jira searches (or a dashboard
with a pie chart) for each "component" (be, tests, fe, infra), and send
those around, so people just need to click and can go through the list,
identifying issues they think are newbie-friendly.

We could also tag those with the language involved, so people who want to
work in a particular programming language (Python, Java, C++) can find them
easily.
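
The per-component searches suggested above could be generated along these
lines. This is only a sketch in Python: the project key, the Jira base URL,
and the component names (copied from the shorthand be, tests, fe, infra) are
assumptions, and the real component and label names would need to be
substituted before sending the links around.

```python
from urllib.parse import urlencode

# All four values below are placeholders (assumptions), not confirmed names.
JIRA_BASE = "https://jira.example.org"
PROJECT = "IMPALA"
RAMP_LABELS = ["ramp-up", "ramp-up-introductory"]
COMPONENTS = ["be", "tests", "fe", "infra"]


def rampup_jql(component):
    """JQL for open, unassigned ramp-up issues in one component."""
    labels = ", ".join(f'"{label}"' for label in RAMP_LABELS)
    return (
        f"project = {PROJECT} AND labels in ({labels}) "
        f'AND component = "{component}" '
        "AND assignee is EMPTY AND resolution = Unresolved"
    )


def search_url(component):
    """Clickable issue-navigator link for the query above."""
    return f"{JIRA_BASE}/issues/?{urlencode({'jql': rampup_jql(component)})}"


def all_search_urls():
    """One link per component, ready to paste into an email."""
    return {component: search_url(component) for component in COMPONENTS}


if __name__ == "__main__":
    for component, url in all_search_urls().items():
        print(f"{component}: {url}")
```

JQL has no wildcard match on labels, so the ramp* labels are listed
explicitly.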

On Mon, Oct 17, 2016 at 9:50 AM, Jim Apple wrote:

> The Impala JIRA has 129 tasks that have no assignee, are still open,
> and are labelled ramp* (ramp-up, ramp-up-introductory, etc.).
>
> I'd like to find which of those tasks are good tasks for someone who
> is making their first Impala patch. I intend to promote those on one
> or more of: the blog, the Twitter account, this list, the user list,
> helpwanted.apache.org, and so on.
>
> The tasks should be the kind of thing that someone won't need too much
> hand-holding on, once they have their dev environment up and working.
>
> To do this, I was thinking of adding a comment to all 129 tasks to ask
> the watchers of each issue if it should be labelled "newbie". This
> will send hundreds of emails, which is a bummer, but it seems to me
> like the best way to track the discussions and decisions.
>
> What does everyone think?
>


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4742/1/functions.sh
File functions.sh:

PS1, Line 425: git rev-parse
How about getting a short hash (I think --short) to make this a bit more 
human-friendly?

We could even start using tags in the future if we want.
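
For reference, a minimal sketch of what the short-hash lookup does, written
in Python rather than in functions.sh. The function name and the injectable
`run` parameter are illustrative assumptions; the only part taken from the
review is the `git rev-parse --short` call.

```python
import subprocess


def short_git_hash(repo_dir=".", run=subprocess.check_output):
    """Return the abbreviated hash of HEAD via `git rev-parse --short HEAD`.

    `run` is injectable only so the function can be exercised without a
    real repository; by default it shells out to git.
    """
    out = run(["git", "rev-parse", "--short", "HEAD"], cwd=repo_dir)
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(short_git_hash())
```

Git picks the abbreviation length itself, lengthening it when needed to keep
the prefix unambiguous within the repository.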


-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

Oh, never mind, the Jenkins build # does give it the right ordering.

-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

> My feeling was that we would generally be referencing a fixed version of
 > the toolchain artifacts in most cases (rather than using "latest").
 > I think it could be useful to have that pointer, but I'm not sure
 > that it's the right thing to use for many use cases (beyond people
 > experimenting with a new toolchain build).
 > 
 > E.g. I was thinking we would probably bump toolchain versions along
 > with component versions in most cases.

Hm OK, that seems fine, just kind of a hassle if you want to look at the 
published toolchain since the build id doesn't indicate any ordering. I've done 
that a few times but I get that's probably not very common (e.g. to download 
and inspect particular bits manually).
Either way, this isn't a big deal -- this looks much better.

-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

My feeling was that we would generally be referencing a fixed version of the 
toolchain artifacts in most cases (rather than using "latest"). I think it 
could be useful to have that pointer, but I'm not sure that it's the right 
thing to use for many use cases (beyond people experimenting with a new 
toolchain build).

E.g. I was thinking we would probably bump toolchain versions along with 
component versions in most cases.

-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

Nice!

I think we'll need a good way to find/download the latest builds as well, since 
it'll be hard or impossible to tell from build-names generated via 
rand+githash. What if, for now, we duplicate the build dir this run is 
uploading and store it as 'latest/' after all the projects build successfully? 
Obviously concurrent jobs could clobber each other, but that doesn't seem too 
concerning yet and it seems valuable to have an easy path to get the latest 
toolchain bits.

We also might want to create a small text file with the build number in that 
directory and have it pulled down with the bits so it can be stored in the 
Impala build and it'd be easy to tie bits back to a toolchain build, even if 
it's coming from "latest/".
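
A rough sketch of the 'latest/' idea, in Python and on a local filesystem
for simplicity. The file name toolchain-build-id.txt and both function names
are hypothetical, and nothing here guards against the concurrent-job
clobbering mentioned above.

```python
import shutil
from pathlib import Path

# Hypothetical name for the small text file holding the build number.
BUILD_ID_FILE = "toolchain-build-id.txt"


def publish_latest(build_dir, publish_root, build_id):
    """Copy a finished build directory to <publish_root>/latest.

    The build id is written into the build directory first, so the copy
    under latest/ carries the same marker file.
    """
    build_dir = Path(build_dir)
    latest = Path(publish_root) / "latest"
    (build_dir / BUILD_ID_FILE).write_text(f"{build_id}\n")
    if latest.exists():
        shutil.rmtree(latest)  # replace the previous 'latest' wholesale
    shutil.copytree(build_dir, latest)
    return latest


def read_build_id(toolchain_dir):
    """Recover which toolchain build a downloaded directory came from."""
    return (Path(toolchain_dir) / BUILD_ID_FILE).read_text().strip()
```

A consumer that downloads latest/ can call read_build_id() and record the
result in its own build output, which ties the bits back to a specific
toolchain build.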

-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Versioning of build artifacts
..


Patch Set 1:

This solves a couple of pressing issues around build times and reproducibility 
of builds. Hoping this can start a discussion even if it turns out to not be 
exactly the solution we want (although it's been working well so far in 
testing).

-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Toolchain-CR] Versioning of build artifacts

2016-10-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4742

Change subject: Versioning of build artifacts
..

Versioning of build artifacts

Previously publishing a new version of toolchain artifacts clobbered the
previous version. This is problematic when changing build options since
the new artifacts may not work in all circumstances. E.g. if an
older version of Impala depends on some particular detail of how the
older artifacts were built, we have no way of avoiding breakage.

Now a unique build ID is generated (including the git hash and the
jenkins job ID) and all build artifacts are uploaded into a directory
based on this unique id.
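
As an illustration only (the real logic lives in the shell scripts listed
below), a build ID of this kind could be assembled as follows. The
"<number>-<hash prefix>" layout, the function names, and the example publish
root are assumptions, not the format used by this change.

```python
import re


def make_build_id(jenkins_build_number, git_hash, hash_len=10):
    """Combine a Jenkins build number and a git hash into one identifier."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_hash):
        raise ValueError(f"not a git hash: {git_hash!r}")
    return f"{int(jenkins_build_number)}-{git_hash[:hash_len]}"


def upload_dir(publish_root, build_id):
    """Directory that holds every artifact of one toolchain build."""
    return f"{publish_root.rstrip('/')}/{build_id}"


if __name__ == "__main__":
    build_id = make_build_id(42, "0123456789abcdef0123456789abcdef01234567")
    # "/srv/toolchain" is a made-up publish root for the example.
    print(upload_dir("/srv/toolchain", build_id))
```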

Also switches to building only the current artifacts by default (all
historical artifacts can be built with BUILD_HISTORICAL) to speed up
builds and reduce resource requirements.

Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
---
M buildall.sh
M functions.sh
M init.sh
3 files changed, 65 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Toolchain refs/changes/42/4742/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifb8774a7bd4bcae0c135684078f5ce89a28f6bc2
Gerrit-PatchSet: 1
Gerrit-Project: Toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


Re: Impala build error with libgflags-dev installed

2016-10-17 Thread Jim Apple
Thanks! Can you file a bug for this?

On Mon, Oct 17, 2016 at 12:01 PM, Yonghyun Hwang wrote:
> The Impala build fails if the "libgflags-dev" package is already installed
> on the machine. Here is the error.
>
> $ source bin/impala-config.sh && buildall.sh -skiptests -notests
> ...
> ...
> Linking CXX shared library ../../build/debug/gutil/libgutil.so
> /home/impala/Impala/toolchain/binutils-2.26-p1/bin/ld:
> /usr/lib/x86_64-linux-gnu/libgflags.a(libgflags_la-gflags.o): relocation
> R_X86_64_32 against `.rodata' can not be used when making a shared object;
> recompile with -fPIC
> /usr/lib/x86_64-linux-gnu/libgflags.a: error adding symbols: Bad value
> collect2: error: ld returned 1 exit status
> make[3]: *** [be/build/debug/gutil/libgutil.so] Error 1
>
> $ apt search libgflags-dev | grep installed
>
> libgflags-dev/trusty,now 2.0-1.1ubuntu1 amd64 [installed]
>
> $ apt-file list libgflags-dev | grep libgflags.a
> libgflags-dev: /usr/lib/x86_64-linux-gnu/libgflags.a
>
>
> I think cmake picks up libgflags.a from "libgflags-dev" instead of that
> from ${IMPALA_HOME}/toolchain/gflags-2.0, which causes the build failure.
>
>
> -Yonghyun


Impala build error with libgflags-dev installed

2016-10-17 Thread Yonghyun Hwang
The Impala build fails if the "libgflags-dev" package is already installed
on the machine. Here is the error.

$ source bin/impala-config.sh && buildall.sh -skiptests -notests
...
...
Linking CXX shared library ../../build/debug/gutil/libgutil.so
/home/impala/Impala/toolchain/binutils-2.26-p1/bin/ld:
/usr/lib/x86_64-linux-gnu/libgflags.a(libgflags_la-gflags.o): relocation
R_X86_64_32 against `.rodata' can not be used when making a shared object;
recompile with -fPIC
/usr/lib/x86_64-linux-gnu/libgflags.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make[3]: *** [be/build/debug/gutil/libgutil.so] Error 1

$ apt search libgflags-dev | grep installed

libgflags-dev/trusty,now 2.0-1.1ubuntu1 amd64 [installed]

$ apt-file list libgflags-dev | grep libgflags.a
libgflags-dev: /usr/lib/x86_64-linux-gnu/libgflags.a


I think cmake picks up libgflags.a from "libgflags-dev" instead of that
from ${IMPALA_HOME}/toolchain/gflags-2.0, which causes the build failure.
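
A quick way to check whether both copies are present on a machine is
sketched below in Python. It only reports where libgflags.a exists; it does
not reproduce cmake's actual search order. The lib/ subdirectory under the
toolchain path is an assumption, as are the function names.

```python
import os
from pathlib import Path


def find_library(name, search_dirs):
    """Return the first path in search_dirs that contains `name`, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def report(name, system_dirs, toolchain_dir):
    """Locate the system and toolchain copies of a static library."""
    system_copy = find_library(name, system_dirs)
    toolchain_copy = find_library(name, [toolchain_dir])
    return {
        "system": system_copy,
        "toolchain": toolchain_copy,
        # Both present means the linker could be handed the wrong one.
        "conflict": system_copy is not None and toolchain_copy is not None,
    }


if __name__ == "__main__":
    impala_home = os.environ.get("IMPALA_HOME", ".")
    # The lib/ subdirectory is a guess at the toolchain layout.
    toolchain = Path(impala_home) / "toolchain" / "gflags-2.0" / "lib"
    print(report("libgflags.a", ["/usr/lib/x86_64-linux-gnu"], toolchain))
```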


-Yonghyun


Re: Bootstrapping an Impala Development Environment From Scratch

2016-10-17 Thread Jim Apple
I think this takes at least 80GB of disk space and 16GB of RAM.
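
To see whether a VM is in that range before starting, something like the
following Python sketch could be used. The thresholds are just the estimates
above, not hard requirements, and the function names are made up for the
example.

```python
import os
import shutil

GB = 10 ** 9
MIN_DISK_GB = 80  # estimate from this thread
MIN_RAM_GB = 16   # estimate from this thread


def meets_estimate(free_disk_bytes, total_ram_bytes):
    """True if both figures reach the rough estimates above."""
    return (free_disk_bytes >= MIN_DISK_GB * GB
            and total_ram_bytes >= MIN_RAM_GB * GB)


def probe(path="."):
    """Free disk space at `path` and physical RAM, both in bytes (Linux)."""
    free_disk = shutil.disk_usage(path).free
    total_ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return free_disk, total_ram


if __name__ == "__main__":
    disk, ram = probe()
    print(f"free disk: {disk / GB:.1f} GB, RAM: {ram / GB:.1f} GB")
    print("looks sufficient" if meets_estimate(disk, ram) else "may be too small")
```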

On Mon, Oct 17, 2016 at 11:50 AM, Laszlo Gaal wrote:
> Running in a VM is a good idea. Do you have a recommendation on how much
> memory the VM should be configured with?
>
> On Mon, Oct 17, 2016 at 7:58 PM, Jim Apple wrote:
>
>> If you are running Ubuntu 14.04, you can bootstrap a development
>> environment using the script bin/bootstrap_development.sh[0]. It will
>> alter your environment, including ~/.ssh/config and /etc/hosts, so
>> consider running it in a VM or container.
>>
>> It takes 6-7 hours in total to load all of the testdata and run all of
>> the tests. See the comments in the file for more information.
>>
>> [0]: https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=blob;f=bin/bootstrap_development.sh;h=8c4f742ae058f8017858d2a749e8824be58bd410;hb=HEAD
>>


Re: Bootstrapping an Impala Development Environment From Scratch

2016-10-17 Thread Laszlo Gaal
Running in a VM is a good idea. Do you have a recommendation on how much
memory the VM should be configured with?

On Mon, Oct 17, 2016 at 7:58 PM, Jim Apple wrote:

> If you are running Ubuntu 14.04, you can bootstrap a development
> environment using the script bin/bootstrap_development.sh[0]. It will
> alter your environment, including ~/.ssh/config and /etc/hosts, so
> consider running it in a VM or container.
>
> It takes 6-7 hours in total to load all of the testdata and run all of
> the tests. See the comments in the file for more information.
>
> [0]: https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=blob;f=bin/bootstrap_development.sh;h=8c4f742ae058f8017858d2a749e8824be58bd410;hb=HEAD
>


Re: problems for the debuging the frontend in Eclipse IDE

2016-10-17 Thread Alex Behm
1. You need to generate an Eclipse project via "mvn eclipse:eclipse" from
the "fe" directory.
2. Before starting Eclipse, you need to "source bin/impala-config.sh" and
"source bin/set-classpath.sh".

If that still does not work, do you have a stack trace?

On Mon, Oct 17, 2016 at 10:41 AM, Jim Apple wrote:

> Do the planner tests pass when you run them outside of the Eclipse IDE?
>
> On Sun, Oct 16, 2016 at 6:47 PM, Zhangjun (Jerry) wrote:
> > Hi,
> > I downloaded the source code from the master branch, built it, and ran
> > all services with testdata successfully. But when I debug the planner
> > test case (PlannerTest.java) for the frontend in the Eclipse IDE, the
> > application shuts down and no exception is printed (I have set the log
> > level to "all" in the Log4j.properties file).
> > I traced the case and found that this occurs after libfesupport.so is
> > loaded during initialization of the Frontend object.
> >
> > Thanks!
>


Bootstrapping an Impala Development Environment From Scratch

2016-10-17 Thread Jim Apple
If you are running Ubuntu 14.04, you can bootstrap a development
environment using the script bin/bootstrap_development.sh[0]. It will
alter your environment, including ~/.ssh/config and /etc/hosts, so
consider running it in a VM or container.

It takes 6-7 hours in total to load all of the testdata and run all of
the tests. See the comments in the file for more information.

[0]: 
https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=blob;f=bin/bootstrap_development.sh;h=8c4f742ae058f8017858d2a749e8824be58bd410;hb=HEAD


Re: problems for the debuging the frontend in Eclipse IDE

2016-10-17 Thread Jim Apple
Do the planner tests pass when you run them outside of the Eclipse IDE?

On Sun, Oct 16, 2016 at 6:47 PM, Zhangjun (Jerry) wrote:
> Hi,
> I downloaded the source code from the master branch, built it, and ran
> all services with testdata successfully. But when I debug the planner
> test case (PlannerTest.java) for the frontend in the Eclipse IDE, the
> application shuts down and no exception is printed (I have set the log
> level to "all" in the Log4j.properties file).
> I traced the case and found that this occurs after libfesupport.so is
> loaded during initialization of the Frontend object.
>
> Thanks!


problems for the debuging the frontend in Eclipse IDE

2016-10-17 Thread Zhangjun (Jerry)
Hi,
I downloaded the source code from the master branch, built it, and ran all
services with testdata successfully. But when I debug the planner test case
(PlannerTest.java) for the frontend in the Eclipse IDE, the application shuts
down and no exception is printed (I have set the log level to "all" in the
Log4j.properties file).
I traced the case and found that this occurs after libfesupport.so is loaded
during initialization of the Frontend object.

Thanks!


Bulk JIRA email, or how to find good newbie tasks

2016-10-17 Thread Jim Apple
The Impala JIRA has 129 tasks that have no assignee, are still open,
and are labelled ramp* (ramp-up, ramp-up-introductory, etc.).

I'd like to find which of those tasks are good tasks for someone who
is making their first Impala patch. I intend to promote those on one
or more of: the blog, the Twitter account, this list, the user list,
helpwanted.apache.org, and so on.

The tasks should be the kind of thing that someone won't need too much
hand-holding on, once they have their dev environment up and working.

To do this, I was thinking of adding a comment to all 129 tasks to ask
the watchers of each issue if it should be labelled "newbie". This
will send hundreds of emails, which is a bummer, but it seems to me
like the best way to track the discussions and decisions.

What does everyone think?