[jira] Commented: (PIG-1285) Allow SingleTupleBag to be serialized

2010-03-20 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847721#action_12847721
 ] 

Pradeep Kamath commented on PIG-1285:
-

yes

 Allow SingleTupleBag to be serialized
 -

 Key: PIG-1285
 URL: https://issues.apache.org/jira/browse/PIG-1285
 Project: Pig
  Issue Type: Improvement
Reporter: Dmitriy V. Ryaboy
Assignee: Dmitriy V. Ryaboy
 Fix For: 0.7.0

 Attachments: PIG-1285.patch


 Currently, Pig uses a SingleTupleBag for efficiency when a full-blown 
 spillable bag implementation is not needed in the Combiner optimization.
 Unfortunately this can create problems. The Initial.exec() code below fails 
 at run time with a message that a SingleTupleBag cannot be serialized:
 {code}
 @Override
 public Tuple exec(Tuple in) throws IOException {
   // single record. just copy.
   if (in == null) return null;
   try {
     Tuple resTuple = tupleFactory_.newTuple(in.size());
     for (int i = 0; i < in.size(); i++) {
       resTuple.set(i, in.get(i));
     }
     return resTuple;
   } catch (IOException e) {
     log.warn(e);
     return null;
   }
 }
 {code}
 The code below can fix the problem in the UDF, but it seems like something 
 that should be handled transparently, not requiring UDF authors to know about 
 SingleTupleBags.
 {code}
 @Override
 public Tuple exec(Tuple in) throws IOException {
   // single record. just copy.
   if (in == null) return null;

   /*
    * Unfortunately SingleTupleBags are not serializable. We cache whether
    * a given index contains a bag in the map below, and copy all bags into
    * DefaultBags before returning to avoid serialization exceptions.
    */
   Map<Integer, Boolean> isBagAtIndex = Maps.newHashMap();

   try {
     Tuple resTuple = tupleFactory_.newTuple(in.size());
     for (int i = 0; i < in.size(); i++) {
       Object obj = in.get(i);
       if (!isBagAtIndex.containsKey(i)) {
         isBagAtIndex.put(i, obj instanceof SingleTupleBag);
       }
       if (isBagAtIndex.get(i)) {
         DataBag newBag = bagFactory_.newDefaultBag();
         newBag.addAll((DataBag) obj);
         obj = newBag;
       }
       resTuple.set(i, obj);
     }
     return resTuple;
   } catch (IOException e) {
     log.warn(e);
     return null;
   }
 }
 {code}
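 For UDF authors hitting this today, the defensive copy above can also be 
 pulled out into a small reusable helper. The sketch below is hypothetical 
 (the class and method names are made up, and it implements the workaround, 
 not the transparent fix this issue asks for):
 {code}
 import java.io.IOException;

 import org.apache.pig.data.BagFactory;
 import org.apache.pig.data.DataBag;
 import org.apache.pig.data.SingleTupleBag;
 import org.apache.pig.data.Tuple;
 import org.apache.pig.data.TupleFactory;

 // Hypothetical helper, not part of the attached patch: returns a copy of
 // the input tuple with any SingleTupleBag fields replaced by DefaultBags,
 // which can be serialized between the combine and reduce stages.
 public final class BagCopyUtil {
     private static final TupleFactory TUPLE_FACTORY = TupleFactory.getInstance();
     private static final BagFactory BAG_FACTORY = BagFactory.getInstance();

     private BagCopyUtil() {}

     public static Tuple copyWithSerializableBags(Tuple in) throws IOException {
         Tuple out = TUPLE_FACTORY.newTuple(in.size());
         for (int i = 0; i < in.size(); i++) {
             Object field = in.get(i);
             if (field instanceof SingleTupleBag) {
                 DataBag copy = BAG_FACTORY.newDefaultBag();
                 copy.addAll((DataBag) field);
                 field = copy;
             }
             out.set(i, field);
         }
         return out;
     }
 }
 {code}
 With a helper like that, Initial.exec() reduces to a null check followed by 
 return BagCopyUtil.copyWithSerializableBags(in);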

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1308) Infinite loop in JobClient when reading from BinStorage Message: [org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2]

2010-03-20 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1308:


Attachment: PIG-1308.patch

The root cause is that OpLimitOptimizer has a relaxed check() implementation: 
it only checks whether the node matched by the RuleMatcher is an LOLimit, 
which is true any time the plan contains an LOLimit at all. As a result the 
optimizer runs all rules for 500 iterations (the current maximum), since 
OpLimitOptimizer matches on every pass.

The attached patch fixes the issue by tightening OpLimitOptimizer.check() to 
return false in the cases where the LOLimit cannot be pushed up.
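
To illustrate the shape of the fix, here is a hedged sketch of a tightened 
check(). It is not the attached patch: the check() signature follows the 
optimizer framework's Transformer, the package names are inferred from the 
stack traces in this thread, and canPushLimitUp() is a hypothetical helper.
{code}
import java.util.List;

import org.apache.pig.impl.logicalLayer.LOLimit;
import org.apache.pig.impl.logicalLayer.LogicalOperator;
import org.apache.pig.impl.plan.optimizer.OptimizerException;

// Hedged sketch: only report a match when the LOLimit can actually be
// pushed up, so the rule stops firing and the optimizer loop terminates
// before hitting the 500-iteration cap.
@Override
public boolean check(List<LogicalOperator> nodes) throws OptimizerException {
    if (nodes == null || nodes.isEmpty() || !(nodes.get(0) instanceof LOLimit)) {
        return false;
    }
    LOLimit limit = (LOLimit) nodes.get(0);
    // canPushLimitUp() is a hypothetical stand-in for the patch's real logic:
    // it would inspect the limit's predecessor and return false for operators
    // the limit cannot legally be pushed above (e.g. its own loader).
    return canPushLimitUp(limit);
}
{code}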

 Infinite loop in JobClient when reading from BinStorage Message: 
 [org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2]
 

 Key: PIG-1308
 URL: https://issues.apache.org/jira/browse/PIG-1308
 Project: Pig
  Issue Type: Bug
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
 Fix For: 0.7.0

 Attachments: PIG-1308.patch


 A simple script fails to read files written by BinStorage() and never submits 
 jobs to the JobTracker. This occurs on trunk but not on the Pig 0.6 branch.
 {code}
 data = load 'binstoragesample' using BinStorage() as (s, m, l);
 A = foreach ULT generate   s#'key' as value;
 X = limit A 20;
 dump X;
 {code}
 When this script is submitted to the JobTracker, the following message 
 repeats indefinitely in the client log:
 2010-03-18 22:31:22,296 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:32:01,574 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:32:43,276 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:33:21,743 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:34:02,004 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:34:43,442 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:35:25,907 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:36:07,402 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:36:48,596 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:37:28,014 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:38:04,823 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:38:38,981 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 2010-03-18 22:39:12,220 [main] INFO  
 org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
 process : 2
 The stack trace revealed:
 at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:144)
 at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:115)
 at org.apache.pig.builtin.BinStorage.getSchema(BinStorage.java:404)
 at org.apache.pig.impl.logicalLayer.LOLoad.determineSchema(LOLoad.java:167)
 at org.apache.pig.impl.logicalLayer.LOLoad.getProjectionMap(LOLoad.java:263)
 at org.apache.pig.impl.logicalLayer.ProjectionMapCalculator.visit(ProjectionMapCalculator.java:112)
 at org.apache.pig.impl.logicalLayer.LOLoad.visit(LOLoad.java:210)
 at org.apache.pig.impl.logicalLayer.LOLoad.visit(LOLoad.java:52)
 at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:69)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.impl.logicalLayer.optimizer.LogicalTransformer.rebuildProjectionMaps(LogicalTransformer.java:76)
 at org.apache.pig.impl.logicalLayer.optimizer.LogicalOptimizer.optimize(LogicalOptimizer.java:216)
 at org.apache.pig.PigServer.compileLp(PigServer.java:883)
 at org.apache.pig.PigServer.store(PigServer.java:564)
 The binstorage data was generated from 2 datasets using limit and union:
 {code}
 Large1 = load 'input1'  using PigStorage();
 Large2 = load 'input2' using PigStorage();
 V = limit Large1 1;
 C = limit Large2 1;
 U = union V, C;
 store U into 'binstoragesample' using BinStorage();
 {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-1308) Infinite loop in JobClient when reading from BinStorage Message: [org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2]

2010-03-20 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1308:


Status: Patch Available  (was: Open)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Pig-trunk build 713

2010-03-20 Thread Giridharan Kesavan
The Pig-trunk build was stuck on the h7 machine for more than a day; I've 
killed the stuck job and restarted the build on Hudson.

http://hudson.zones.apache.org/hudson/view/Pig/job/Pig-trunk/713/


-Giri


[jira] Commented: (PIG-1282) [zebra] make Zebra's pig test cases run on real cluster

2010-03-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847859#action_12847859
 ] 

Hadoop QA commented on PIG-1282:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12439224/PIG-1282.patch
  against trunk revision 925513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 254 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 523 release audit warnings 
(more than the trunk's current 522 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/257/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/257/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/257/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/257/console

This message is automatically generated.

 [zebra] make Zebra's pig test cases run on real cluster
 ---

 Key: PIG-1282
 URL: https://issues.apache.org/jira/browse/PIG-1282
 Project: Pig
  Issue Type: Task
Affects Versions: 0.6.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.7.0

 Attachments: PIG-1282.patch, PIG-1282.patch, PIG-1282.patch


 The goal of this task is to make Zebra's Pig test cases run on a real cluster.
 Currently Zebra's Pig test cases mostly run against MiniCluster; we want to 
 test them against a real Hadoop cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1285) Allow SingleTupleBag to be serialized

2010-03-20 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1285:
---

Status: Open  (was: Patch Available)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1285) Allow SingleTupleBag to be serialized

2010-03-20 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1285:
---

Attachment: PIG-1285.2.patch


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1285) Allow SingleTupleBag to be serialized

2010-03-20 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1285:
---

Status: Patch Available  (was: Open)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1310) ISO Date UDFs: Conversion, Rounding and Date Math

2010-03-20 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847874#action_12847874
 ] 

Russell Jurney commented on PIG-1310:
-

I'm thinking it would be good if DateTime were a Pig primitive. Can someone 
give me an idea of how much work it would be to add a Pig primitive, and 
whether this patch would be accepted for 0.8?

 ISO Date UDFs: Conversion, Rounding and Date Math
 -

 Key: PIG-1310
 URL: https://issues.apache.org/jira/browse/PIG-1310
 Project: Pig
  Issue Type: New Feature
  Components: impl
Reporter: Russell Jurney
 Fix For: 0.8.0

   Original Estimate: 168h
  Remaining Estimate: 168h

 I've written UDFs to handle loading unix times, datemonth values, and ISO 8601 
 formatted date strings, and working with them as ISO datetimes using Joda-Time.
 The working code is here: 
 http://github.com/rjurney/oink/tree/master/src/java/oink/udf/isodate/
 It still needs documentation and tests, and a couple of UDFs are missing, but 
 these work if you REGISTER the Joda-Time jar in your script.  Hopefully I can 
 get this stuff into piggybank before someone else writes it this time :)  The 
 rounding may not be performant yet, but the code works.
 Ultimately I'd also like to add support for ISO 8601 durations.  Someone slap 
 me if this isn't done soon; it is not much work, and it should help everyone 
 working with time series.
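 As a rough illustration of the kind of conversion and rounding these UDFs 
 perform, here is a self-contained Joda-Time sketch (not code from the linked 
 repository; the class name and timestamp are made up):
 {code}
 import org.joda.time.DateTime;
 import org.joda.time.DateTimeZone;
 import org.joda.time.format.ISODateTimeFormat;

 // Standalone Joda-Time sketch: convert a unix timestamp (seconds) to an
 // ISO 8601 string, then round it down to the hour.
 public class IsoDateSketch {
     public static void main(String[] args) {
         long unixSeconds = 1269043200L; // 2010-03-20T00:00:00Z
         DateTime dt = new DateTime(unixSeconds * 1000L, DateTimeZone.UTC);
         System.out.println(ISODateTimeFormat.dateTime().print(dt));
         // DateTime.Property.roundFloorCopy() does the rounding:
         System.out.println(dt.hourOfDay().roundFloorCopy());
     }
 }
 {code}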

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1258) [zebra] Number of sorted input splits is unusually high

2010-03-20 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847875#action_12847875
 ] 

Yan Zhou commented on PIG-1258:
---

Hudson's rerun appears to be hanging. Here is the result from my private run:

 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

 [zebra] Number of sorted input splits is unusually high
 ---

 Key: PIG-1258
 URL: https://issues.apache.org/jira/browse/PIG-1258
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
 Attachments: PIG-1258.patch


 The number of sorted input splits is unusually high if the projections are on 
 multiple column groups, a union of tables, or column group(s) that hold many 
 small tfiles. In one test, the number was about 100 times bigger than that 
 from unsorted input splits on the same input tables.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.