[jira] Updated: (PIG-1843) NPE in schema generation

2011-02-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1843:


Attachment: PIG-1843-1.patch

> NPE in schema generation
> 
>
> Key: PIG-1843
> URL: https://issues.apache.org/jira/browse/PIG-1843
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.0, 0.9.0
>
> Attachments: PIG-1843-1.patch
>
>
> Hit NPE in following script:
> {code}
> a = load 'table_testBagDereferenceInMiddle2' as (a0:chararray);
> b = foreach a generate MapGenerate(STRSPLIT(a0).$0);
> {code}
> {code}
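> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
>
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.DataType;
> import org.apache.pig.data.Tuple;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
>
> public class MapGenerate extends EvalFunc<Map> {
>     @Override
>     public Map exec(Tuple input) throws IOException {
>         Map m = new HashMap();
>         m.put("key", new Integer(input.size()));
>         return m;
>     }
>
>     // Infers the output alias from the input schema; this is where the NPE surfaces.
>     @Override
>     public Schema outputSchema(Schema input) {
>         return new Schema(new Schema.FieldSchema(getSchemaName("parselong", input), DataType.MAP));
>     }
> }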
> {code}
> Error message:
> Caused by: java.lang.NullPointerException
> at org.apache.pig.EvalFunc.getSchemaName(EvalFunc.java:76)
> at string.PARSELONG.outputSchema(PARSELONG.java:63)
> at org.apache.pig.newplan.logical.expression.UserFuncExpression.getFieldSchema(UserFuncExpression.java:154)
> at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:192)
> at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:143)
> at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:71)
> at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:104)
> at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
> at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:93)
> at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
> at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:279)
> at org.apache.pig.PigServer.compilePp(PigServer.java:1480)
> at org.apache.pig.PigServer.explain(PigServer.java:1042)

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (PIG-1843) NPE in schema generation

2011-02-04 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990886#comment-12990886
 ] 

Daniel Dai commented on PIG-1843:
-

The problem happens when we have nested UDF:
1. Inner UDF does not define complete outputSchema
2. Outer UDF does not define getArgToFuncMapping
3. outputSchema in outer UDF uses inner schema to infer alias
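
The three conditions above can be illustrated with a small standalone sketch. It does not use the Pig classes: `SchemaNameSketch` and `schemaName` are hypothetical stand-ins, and the sketch assumes the NPE comes from dereferencing a missing inner schema or alias while the outer UDF builds its own alias. The attached patch may well fix the problem differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for alias inference; not the Pig EvalFunc API. */
class SchemaNameSketch {
    /**
     * Builds "name_firstAlias" when the inner schema offers a usable alias,
     * and falls back to the bare name when the inner schema is missing or incomplete.
     *
     * @param name         base name chosen by the outer UDF
     * @param inputAliases aliases of the inner schema; may be null or contain nulls
     */
    static String schemaName(String name, List<String> inputAliases) {
        if (inputAliases != null) {
            for (String alias : inputAliases) {
                if (alias != null) {
                    return name + "_" + alias;
                }
            }
        }
        // No usable inner alias: avoid dereferencing it and use the base name alone.
        return name;
    }

    public static void main(String[] args) {
        List<String> incomplete = new ArrayList<>();
        incomplete.add(null); // inner UDF declared a field without an alias
        System.out.println(schemaName("parselong", incomplete));
        System.out.println(schemaName("parselong", null));
        System.out.println(schemaName("parselong", Arrays.asList("a0")));
    }
}
```

The point of the guard is only that an incomplete inner schema degrades to a less descriptive alias instead of failing plan compilation.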






[jira] Created: (PIG-1843) NPE in schema generation

2011-02-04 Thread Daniel Dai (JIRA)
NPE in schema generation


 Key: PIG-1843
 URL: https://issues.apache.org/jira/browse/PIG-1843
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0, 0.8.0


Hit NPE in following script:
{code}
a = load 'table_testBagDereferenceInMiddle2' as (a0:chararray);
b = foreach a generate MapGenerate(STRSPLIT(a0).$0);
{code}
{code}
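import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MapGenerate extends EvalFunc<Map> {
    @Override
    public Map exec(Tuple input) throws IOException {
        Map m = new HashMap();
        m.put("key", new Integer(input.size()));
        return m;
    }

    // Infers the output alias from the input schema; this is where the NPE surfaces.
    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema(getSchemaName("parselong", input), DataType.MAP));
    }
}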
{code}

Error message:
Caused by: java.lang.NullPointerException
at org.apache.pig.EvalFunc.getSchemaName(EvalFunc.java:76)
at string.PARSELONG.outputSchema(PARSELONG.java:63)
at org.apache.pig.newplan.logical.expression.UserFuncExpression.getFieldSchema(UserFuncExpression.java:154)
at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:192)
at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:143)
at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:71)
at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:104)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:93)
at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:279)
at org.apache.pig.PigServer.compilePp(PigServer.java:1480)
at org.apache.pig.PigServer.explain(PigServer.java:1042)






[jira] Commented: (PIG-1793) Add macro expansion to Pig Latin

2011-02-04 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990842#comment-12990842
 ] 

Richard Ding commented on PIG-1793:
---

Unit tests pass. The output of test-patch:

{code}
 [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] -1 javac.  The applied patch generated 973 javac compiler warnings (more than the trunk's current 962 warnings).
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec] 
 [exec] -1 release audit.  The applied patch generated 514 release audit warnings (more than the trunk's current 513 warnings).
{code}

The release audit warning is HTML-related.

> Add macro expansion to Pig Latin
> 
>
> Key: PIG-1793
> URL: https://issues.apache.org/jira/browse/PIG-1793
> Project: Pig
>  Issue Type: New Feature
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.9.0
>
> Attachments: PIG-1793.patch
>
>
> As production Pig scripts grow longer and longer, Pig Latin has a need to 
> integrate standard programming techniques of separation and code sharing 
> offered by functions and modules.  A proposal of adding macro expansion to 
> Pig Latin is posted here: http://wiki.apache.org/pig/TuringCompletePig
> Below is a brief summary of the proposed syntax (and examples):
>* Macro Definition 
> The existing DEFINE keyword will be expanded to allow definitions of Pig 
> macros. 
> *Syntax*
> {code}
> define  () returns  {
> 
> };
> {code}
> *Example*
> {code}
> define my_macro(A, sortkey) returns C {
> B = filter $A by my_filter(*);
> $C = order B by $sortkey;
> }
> {code}
>* Macro Expansion 
> *Syntax*
> {code}
>  =  ();
> {code}
> *Example:* Use above macro in a Pig script:
> {code}
> X = load 'foo' as (user, address, phone);
> Y = my_macro(X, user);
> store Y into 'bar';
> {code}
> This script is expanded into the following Pig Latin statements: 
> {code}
> X = load 'foo' as (user, address, phone);
> macro_my_macro_B_1 = filter X by my_filter(*);
> Y = order macro_my_macro_B_1 by user;
> store Y into 'bar';
> {code}
> *Notes*
> 1. Any alias in the macro which isn't visible from outside will be prefixed 
> with macro name and suffixed with instance id to avoid namespace collision. 
> 2. Macro expansion is not a complete replacement for function calls. 
> Recursive expansions are not supported.  
>* Macro Import 
> The new IMPORT keyword can be used to add macros defined in another Pig Latin 
> file.
> *Syntax*
> {code}
> import ;
> {code}
> *Example*
> {code}
> import my_macro.pig;
> {code}
> *Note:* All macro names are in the global namespace. 
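
The renaming rule in note 1 of the quoted proposal can be sketched in a few lines. This is a minimal illustration only, assuming the `macro_<name>_<alias>_<id>` pattern seen in the expanded script; `MacroAliasSketch` and `rename` are hypothetical names, not code from the attached patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the alias renaming rule; not the patch's implementation. */
class MacroAliasSketch {
    /**
     * Renames an alias that appears inside a macro body.
     * Aliases visible from outside (parameters and the return alias) are returned
     * unchanged so the caller can substitute them; every other alias is prefixed
     * with the macro name and suffixed with the instance id.
     */
    static String rename(String macroName, String alias, int instanceId, Set<String> visible) {
        if (visible.contains(alias)) {
            return alias;
        }
        return "macro_" + macroName + "_" + alias + "_" + instanceId;
    }

    public static void main(String[] args) {
        Set<String> visible = new HashSet<>(Arrays.asList("A", "C", "sortkey"));
        // B is internal to my_macro, so it is renamed for the first expansion.
        System.out.println(rename("my_macro", "B", 1, visible)); // macro_my_macro_B_1
    }
}
```

Because the instance id changes with each expansion, two calls to the same macro in one script would get distinct internal aliases.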





[jira] Commented: (PIG-1793) Add macro expansion to Pig Latin

2011-02-04 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990818#comment-12990818
 ] 

Richard Ding commented on PIG-1793:
---

Attaching the patch that implements the above proposed Pig syntax. The only 
change is the IMPORT statement which now requires the file name be a quoted 
string:

{code}
import 'my_macro.pig';
{code}






[jira] Updated: (PIG-1793) Add macro expansion to Pig Latin

2011-02-04 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1793:
--

Attachment: PIG-1793.patch






[jira] Commented: (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage

2011-02-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990759#comment-12990759
 ] 

Alan Gates commented on PIG-1825:
-

Unit tests pass.  The output of test-patch:

 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] -1 tests included.  The patch doesn't appear to include any new or modified tests.
 [exec] Please justify why no tests are needed for this patch.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.

As this points out, the functionality isn't tested.  Before we can check it in 
we'll need a test added to the hbase unit tests that shows that you can write 
to hbase with this option set.

> ability to turn off the write ahead log for pig's HBaseStorage
> --
>
> Key: PIG-1825
> URL: https://issues.apache.org/jira/browse/PIG-1825
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Priority: Minor
> Fix For: 0.8.0
>
> Attachments: HBaseStorage_noWAL.patch
>
>
> Added an option to allow a caller of HBaseStorage to turn off the 
> WriteAheadLog feature while doing bulk loads into hbase.
> From the performance tuning wikipage: 
> http://wiki.apache.org/hadoop/PerformanceTuning
> "To speed up the inserts in a non critical job (like an import job), you can 
> use Put.writeToWAL(false) to bypass writing to the write ahead log."
> We've tested this on HBase 0.20.6 and it helps dramatically.  
> The -noWAL option is passed in just like other options for hbase storage:
> STORE myalias INTO 'MyTable' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('mycolumnfamily:field1 
> mycolumnfamily:field2','-noWAL');
> This would be my first patch so please educate me with any steps I need to 
> do.  





Re: Pig developer meeting in February

2011-02-04 Thread Romain Rigaux
Me too, I am interested in coming,

Romain

On Fri, Jan 28, 2011 at 3:35 PM, Santhosh Srinivasan wrote:

> I am planning to attend.
>
> -Original Message-
> From: Olga Natkovich [mailto:ol...@yahoo-inc.com]
> Sent: Friday, January 28, 2011 12:58 PM
> To: dev@pig.apache.org
> Subject: RE: Pig developer meeting in February
>
> I believe we have critical mass so the meeting is on!
>
> If you have not responded yet but are planning to attend, please let me know.
>
> Thanks,
>
> Olga
>
> -Original Message-
> From: Julien Le Dem [mailto:led...@yahoo-inc.com]
> Sent: Thursday, January 27, 2011 5:21 PM
> To: dev@pig.apache.org
> Subject: Re: Pig developer meeting in February
>
> Me too.
> Julien
>
>
> On 1/27/11 4:09 PM, "Dmitriy Ryaboy"  wrote:
>
> Ok yeah I'll come :).
>
>
>
> On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich wrote:
>
> > While there is a lively discussion on this thread, I have not actually
> > gotten any responses to having the meeting with exception of 1 person :).
> >
> > Please, let me know by the end of the week if you are planning to attend.
> > If we don't get at least a few more responses I suggest we postpone
> > the meeting.
> >
> > Thanks,
> >
> > Olga
> >
> > -Original Message-
> > From: Dmitriy Ryaboy [mailto:dvrya...@gmail.com]
> > Sent: Wednesday, January 26, 2011 6:04 PM
> > To: dev@pig.apache.org
> > Subject: Re: Pig developer meeting in February
> >
> > Right, we do partition filtering, but not true predicate pushdown.
> >
> > On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai wrote:
> >
> > > Are you talking about LoadMetadata.setPartitionFilter?
> > > PartitionFilterOptimizer will do that.
> > >
> > > Daniel
> > >
> > >
> > > Dmitriy Ryaboy wrote:
> > >
> > >> I may be wrong but I think predicate pushdown is designed for, but
> > >> not actually implemented in the current LoadPushdown interface (you
> > >> can only push projections). If I am wrong, that's great.. but if
> > >> not, that would be an important feature to add, as people are trying
> > >> to connect Pig to "smart" storage systems like rdbmses, HBase, and
> > >> Cassandra more and more. I think we only kind of simulate this with
> > >> partition keys info, which is not always sufficient
> > >>
> > >> D
> > >>
> > >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem wrote:
> > >>
> > >>
> > >>
> > >>> If making Pig thread safe (i.e., two threads each running a
> > >>> different pig script) is important then we need to change some of
> > >>> the APIs from static singleton access to a dependency injection
> > >>> pattern.
> > >>> In that case, this should probably be done before 1.0. For example:
> > >>> UDFContext should be passed to the UDF after construction (similar
> > >>> to the ServletContext in Servlet or the way Hadoop passes the
> > >>> context to tasks). Also a clearly separated API that does not
> > >>> depend on the Pig implementation would help.
> > >>> For example UDFContext is in org.apache.pig.impl.util when it
> > >>> would be better in org.apache.pig.api (or at least an interface
> > >>> defining it)
> > >>>
> > >>> Julien
> > >>>
> > >>> On 1/24/11 10:14 AM, "Olga Natkovich"  wrote:
> > >>>
> > >>> Hi Guys,
> > >>>
> > >>> I think it is time for us to have another meeting. Yahoo would be
> > >>> happy to host if this works for everybody. How about Wednesday,
> > >>> 2/9 4-6 pm. Please, let us know if you are planning to attend and
> > >>> if the date/time works for you.
> > >>>
> > >>> Things that come to mind to discuss and as always feel free to
> > >>> suggest others:
> > >>>
> > >>> -  Error handling proposal - this might be easier to finalize
> > >>>    face-to-face
> > >>> -  Pig 0.9 plan
> > >>> -  Pig Roadmap beyond 0.9
> > >>>    o  What do we want to do in Pig.next?
> > >>>    o  Are we ready for Pig 1.0
> > >>>
> > >>> Olga
> > >>>
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
>
>
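
Julien's suggestion in the thread above, handing the context to a UDF after construction instead of reading a static singleton, can be sketched as follows. This is only an illustration under stated assumptions: `ScriptContextSketch`, `InjectedUdfSketch` and `setContext` are hypothetical names and are not part of the Pig API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-script context; a stand-in for the role UDFContext plays. */
class ScriptContextSketch {
    private final Map<String, String> properties = new HashMap<>();

    ScriptContextSketch set(String key, String value) {
        properties.put(key, value);
        return this;
    }

    String get(String key) {
        return properties.get(key);
    }
}

/** Hypothetical UDF that is handed its context instead of looking up a static singleton. */
class InjectedUdfSketch {
    private ScriptContextSketch context;

    /** Called by the (hypothetical) runtime right after construction. */
    void setContext(ScriptContextSketch context) {
        this.context = context;
    }

    String exec(String input) {
        return context.get("script.name") + ":" + input;
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        // Two scripts in one JVM, each with its own context and UDF instance.
        InjectedUdfSketch first = new InjectedUdfSketch();
        first.setContext(new ScriptContextSketch().set("script.name", "script1"));
        InjectedUdfSketch second = new InjectedUdfSketch();
        second.setContext(new ScriptContextSketch().set("script.name", "script2"));
        System.out.println(first.exec("x"));
        System.out.println(second.exec("x"));
    }
}
```

Since no state is shared through a static field, two scripts running in different threads would not see each other's settings.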


[jira] Commented: (PIG-1794) Javascript support for Pig embedding and UDFs in scripting languages

2011-02-04 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990661#comment-12990661
 ] 

Richard Ding commented on PIG-1794:
---

Review comments are here https://reviews.apache.org/r/321/.

Since the review board hasn't been linked with Jira, please upload the new 
patch to the jira.

As for the 'include' statement, a related jira is PIG-1824 where the idea is to 
add SHIP clause so Pig would ship 'import/include' scripts to the backend.

> Javascript support for Pig embedding and UDFs in scripting languages
> 
>
> Key: PIG-1794
> URL: https://issues.apache.org/jira/browse/PIG-1794
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
> Fix For: 0.9.0
>
> Attachments: jsScripting.patch
>
>
> The attached patch proposes a javascript implementation for Pig embedding and 
> UDFs in scripting languages.
> It is similar to the Jython implementation and uses Rhino provided in the JDK.
> some differences:
>  - output schema is provided by: .outSchema="" as 
> javascript does not have annotations or decorators but functions are first 
> class objects
>  - tuples are converted to objects using the input schema (the other way 
> around using the output schema)
> The attached patch is not final yet. In particular it lacks unit tests.
> See test/org/apache/pig/test/data/tc.js for the "transitive closure" example
> See the following JIRAs for more context:
> https://issues.apache.org/jira/browse/PIG-928
> https://issues.apache.org/jira/browse/PIG-1479





[jira] Commented: (PIG-1782) Add ability to load data by column family in HBaseStorage

2011-02-04 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990653#comment-12990653
 ] 

Dmitriy V. Ryaboy commented on PIG-1782:


That seems reasonable to me.

The only reason I suggest deprecating the current HBaseStorage is that it's 
awkwardly placed in backend.hadoop.hbase which is not where anyone really 
expects to find it. But I guess we can do that in a different ticket.

> Add ability to load data by column family in HBaseStorage
> -
>
> Key: PIG-1782
> URL: https://issues.apache.org/jira/browse/PIG-1782
> Project: Pig
>  Issue Type: New Feature
> Environment: Java 6, Mac OS X 10.6
>Reporter: Eric Yang
>Assignee: Bill Graham
>
> It would be nice to load all columns in the column family by using short hand 
> syntax like:
> {noformat}
> CpuMetrics = load 'hbase://SystemMetrics' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cpu:','-loadKey');
> {noformat}
> Assuming there are columns cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1 in the 
> cpu column family.
> CpuMetrics would contain something like:
> {noformat}
> (rowKey, cpu:sys.0, cpu:sys.1, cpu:user.0, cpu:user.1)
> {noformat}
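
The shorthand requested above amounts to treating a descriptor that ends in a colon as "every column in this family". A minimal sketch of that matching rule follows; `ColumnFamilySketch` and its methods are hypothetical, and a real loader would presumably ask HBase for the whole family rather than filter a list of names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of family-prefix matching; not the HBaseStorage code. */
class ColumnFamilySketch {
    /** True when the descriptor names a whole family, e.g. "cpu:". */
    static boolean isWholeFamily(String descriptor) {
        return descriptor.endsWith(":");
    }

    /** Returns the columns matched by one descriptor, preserving input order. */
    static List<String> select(String descriptor, List<String> available) {
        List<String> matched = new ArrayList<>();
        for (String column : available) {
            boolean match = isWholeFamily(descriptor)
                    ? column.startsWith(descriptor) // "cpu:" matches cpu:sys.0, cpu:user.1, ...
                    : column.equals(descriptor);    // "cpu:sys.0" matches only itself
            if (match) {
                matched.add(column);
            }
        }
        return matched;
    }
}
```

For example, `select("cpu:", columns)` keeps every `cpu:` column and drops columns from other families, while `select("cpu:sys.0", columns)` keeps a single column.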





[jira] Commented: (PIG-1717) pig needs to call setPartitionFilter if schema is null but getPartitionKeys is not

2011-02-04 Thread Gerrit Jansen van Vuuren (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990496#comment-12990496
 ] 

Gerrit Jansen van Vuuren commented on PIG-1717:
---

Thanks :)

> pig needs to call setPartitionFilter if schema is null but getPartitionKeys 
> is not
> --
>
> Key: PIG-1717
> URL: https://issues.apache.org/jira/browse/PIG-1717
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Gerrit Jansen van Vuuren
>Assignee: Gerrit Jansen van Vuuren
>Priority: Minor
> Fix For: 0.9.0
>
> Attachments: PIG-1717.patch, PIG-1717.v1.patch, PIG-1717.v2.patch, 
> patchReleaseAuditWarnings.txt.gz, testlog.tgz, 
> trunkReleaseAuditWarnings.txt.gz
>
>
> I'm writing a loader that works with hive style partitioning e.g. 
> /logs/type1/daydate=2010-11-01
> The loader does not know the schema upfront and this is something that the 
> user adds in the script using the AS clause.
> The problem is that this user-defined schema is not available to the loader, 
> so the loader cannot return any schema. The loader does know what the 
> partition keys are, and Pig needs some way to learn about these partition 
> keys. 
> Currently, if the schema is null, Pig never calls the 
> LoadMetadata.getPartitionKeys method or the setPartitionFilter method.
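
How a loader can know the partition keys without knowing the schema is easy to see from the path layout in the description. The sketch below extracts `key=value` segments from a Hive-style path; `PartitionPathSketch` and `partitionKeys` are hypothetical names and this is not the loader discussed in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: derive partition keys from a Hive-style path. */
class PartitionPathSketch {
    /**
     * Collects every path segment of the form key=value, in path order.
     * Segments without '=' (plain directories) are ignored.
     */
    static Map<String, String> partitionKeys(String path) {
        Map<String, String> keys = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) { // require a non-empty key before '='
                keys.put(segment.substring(0, eq), segment.substring(eq + 1));
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(partitionKeys("/logs/type1/daydate=2010-11-01"));
    }
}
```

The key names come purely from the directory names, which is why they are available even when the loader returns a null schema.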
