Notes from the Pig Contributor Meeting held at Yahoo!, 4 November 2010.

We discussed the Turing Complete proposal posted at
http://wiki.apache.org/pig/TuringCompletePig

Dmitriy Ryaboy noted that he found the syntax for defining macros
confusing.  In particular, he did not like the way in and out
relations were defined.  He proposed instead that the syntax look
something like:

define macro_name(A:relation, B:relation, user:chararray) returns C,D
{ ... }

and the invocation would look something like:

(X, Y) = macro_name(U, V, 'name');

There were concerns about assigning types to other fields, as we may
want to allow things like:

B = filter A by $predicate

Dmitriy pointed out that if we did assign types we could do more
semantic checks up front.

Concern was also expressed about adding 'relation' as a new type in
Pig in this way, since it could not be used anywhere else.

No solid conclusion was reached on the right syntax, though no one
voted for the currently proposed syntax.

Dmitriy also noted that if we picked the right scripting language
(such as JRuby) we could do the functions in that language without a
need for adding macros to Pig Latin.  Alan Gates responded that we did
not want to bind tightly to one language because different Pig users
preferred different scripting languages, and because we had found many
Pig users who wanted functions but not other elements of control flow.

Dmitriy gave an update on moving Piggybank to github.  A github
repository has been secured, committers assigned, a layout for the
code developed, and one function committed.  He agreed to write a wiki
page on it so others could start to use it.  Yahoo still needs to
propose a committer for this.

Jeff Zhang discussed work he would like to do to make PigServer thread
safe.  Everyone agreed this was a good idea, and no one else was
planning on working in that area in the near future.  Gerrit Jansen
van Vuuren brought up an issue he had encountered with re-using
instances of PigServer.  Every MapReduceLauncher instance brings up 2
monitor threads which it does not kill.  Thus a re-used instance of
PigServer eventually runs out of memory.
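
To illustrate the pattern Gerrit described, here is a minimal sketch in
plain Java.  It is not Pig code: SketchLauncher, LauncherLeakDemo, and
liveAfterRuns are hypothetical names made up for this example, and the
close() method is only one possible way such threads could be stopped.
Each launcher starts 2 monitor threads; unless something stops them,
every run leaves 2 more threads behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a launcher; NOT Pig's MapReduceLauncher.
class SketchLauncher implements AutoCloseable {
    // Monitor threads started by any launcher that have not yet exited.
    static final AtomicInteger LIVE_MONITORS = new AtomicInteger();

    private final List<Thread> monitors = new ArrayList<>();

    SketchLauncher() {
        for (int i = 0; i < 2; i++) {
            Thread t = new Thread(SketchLauncher::monitorLoop, "sketch-monitor-" + i);
            t.setDaemon(true); // lets the demo JVM exit even if threads leak
            LIVE_MONITORS.incrementAndGet();
            monitors.add(t);
            t.start();
        }
    }

    // Stand-in for periodic monitoring work; runs until interrupted.
    private static void monitorLoop() {
        try {
            while (true) {
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            // Interrupted by close(): fall through and exit.
        } finally {
            LIVE_MONITORS.decrementAndGet();
        }
    }

    // Assumed fix: interrupt the monitor threads and wait for them to exit.
    @Override
    public void close() {
        for (Thread t : monitors) {
            t.interrupt();
        }
        for (Thread t : monitors) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

class LauncherLeakDemo {
    // Returns how many monitor threads were left running by `runs` launchers.
    static int liveAfterRuns(int runs, boolean closeEachLauncher) {
        int before = SketchLauncher.LIVE_MONITORS.get();
        for (int i = 0; i < runs; i++) {
            SketchLauncher launcher = new SketchLauncher();
            if (closeEachLauncher) {
                launcher.close();
            }
        }
        return SketchLauncher.LIVE_MONITORS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("left running when closed:     " + liveAfterRuns(5, true));
        System.out.println("left running when not closed: " + liveAfterRuns(5, false));
    }
}
```

With close() called after each run no monitor threads remain; without
it the count grows by 2 per run, which is the growth a long-lived,
re-used server would see.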

Alan.
