[jira] [Updated] (PIG-3623) HBaseStorage: setting loadKey and noWAL to false doesn't have any affect
[ https://issues.apache.org/jira/browse/PIG-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3623:
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

+1. Committed to trunk. Thanks Nezih

The TestHBaseStorage still fails in the trunk. Passes fine when reverting to the old revision with this patch. Will address that in a separate jira.

> HBaseStorage: setting loadKey and noWAL to false doesn't have any affect
>
> Key: PIG-3623
> URL: https://issues.apache.org/jira/browse/PIG-3623
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Michael Stefaniak
> Assignee: Nezih Yigitbasi
> Fix For: 0.13.0
>
> Attachments: PIG-3623.1.patch, PIG-3623.2.patch, PIG-3623.3.patch, PIG-3623.patch
>
> The documentation for HBaseStorage
> (http://pig.apache.org/docs/r0.12.0/func.html#HBaseStorage) says
>     -loadKey=(true|false) Load the row key as the first value in every tuple returned from HBase (default=false)
> However, looking at the source
> (http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java)
> it is just doing a check for the existence of this option
>     loadRowKey_ = configuredOptions_.hasOption("loadKey");
> So setting -loadKey=false in the options string, still results in a true value

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
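The difference between checking that an option exists and reading its value can be sketched in plain Java. This is only an illustration under stated assumptions: the tiny parser, the class name BooleanOptionSketch, and the rule that a bare "-name" flag counts as true are all hypothetical, and none of it is the HBaseStorage code or the committed patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BooleanOptionSketch {

    /** Splits "-name" and "-name=value" tokens into a map; a bare "-name" maps to null. */
    public static Map<String, String> parse(String optString) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (String token : optString.trim().split("\\s+")) {
            if (!token.startsWith("-")) {
                continue; // not an option token in this simplified parser
            }
            String body = token.substring(1);
            int eq = body.indexOf('=');
            if (eq < 0) {
                opts.put(body, null);
            } else {
                opts.put(body.substring(0, eq), body.substring(eq + 1));
            }
        }
        return opts;
    }

    /** Mirrors the reported behavior: true whenever the option is present at all. */
    public static boolean presenceOnly(Map<String, String> opts, String name) {
        return opts.containsKey(name);
    }

    /** Reads the value: absent means false, bare flag means true (an assumption), else parse it. */
    public static boolean valueAware(Map<String, String> opts, String name) {
        if (!opts.containsKey(name)) {
            return false;
        }
        String value = opts.get(name);
        return value == null || Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Map<String, String> opts = parse("-loadKey=false -noWAL");
        System.out.println(presenceOnly(opts, "loadKey")); // true, although the user wrote false
        System.out.println(valueAware(opts, "loadKey"));   // false
        System.out.println(valueAware(opts, "noWAL"));     // true
    }
}
```

With the presence-only check, "-loadKey=false" is indistinguishable from "-loadKey=true", which matches the symptom in the report.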
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (17 issues)
Subscriber: pigdaily

Key         Summary
PIG-3741    Utils.setTmpFileCompressionOnConf can cause side effect for SequenceFileInterStorage
            https://issues.apache.org/jira/browse/PIG-3741
PIG-3737    Bundle dependent jars in distribution in %PIG_HOME%/lib folder
            https://issues.apache.org/jira/browse/PIG-3737
PIG-3735    UDF to data cleanse the dirty data with expected pattern
            https://issues.apache.org/jira/browse/PIG-3735
PIG-3724    pig e2e tests dont have hadoop libs on classpath
            https://issues.apache.org/jira/browse/PIG-3724
PIG-3679    e2e StreamingPythonUDFs_10 fails in trunk
            https://issues.apache.org/jira/browse/PIG-3679
PIG-3670    Fix assert in Pig script
            https://issues.apache.org/jira/browse/PIG-3670
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3635    Fix e2e tests for Hadoop 2.X on Windows
            https://issues.apache.org/jira/browse/PIG-3635
PIG-3623    HBaseStorage: setting loadKey and noWAL to false doesn't have any affect
            https://issues.apache.org/jira/browse/PIG-3623
PIG-3615    Update the way that JsonLoader/JsonStorage deal with BigDecimal
            https://issues.apache.org/jira/browse/PIG-3615
PIG-3613    UDF for SimilarityMatching between strings with matching scores
            https://issues.apache.org/jira/browse/PIG-3613
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-3456    Reduce threadlocal conf access in backend for each record
            https://issues.apache.org/jira/browse/PIG-3456
PIG-3447    Compiler warning message dropped for CastLineageSetter and others with no enum kind
            https://issues.apache.org/jira/browse/PIG-3447
PIG-3441    Allow Pig to use default resources from Configuration objects
            https://issues.apache.org/jira/browse/PIG-3441
PIG-3373    XMLLoader returns non-matching nodes when a tag name spans through the block boundary
            https://issues.apache.org/jira/browse/PIG-3373
PIG-3347    Store invocation brings side effect
            https://issues.apache.org/jira/browse/PIG-3347

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891440#comment-13891440 ]

Daniel Dai commented on PIG-259:

PIG-259.9.patch committed. Thanks Nezih!

> allow store to overwrite existing directroy
> ---
>
> Key: PIG-259
> URL: https://issues.apache.org/jira/browse/PIG-259
> Project: Pig
> Issue Type: Sub-task
> Reporter: Olga Natkovich
> Assignee: Nezih Yigitbasi
> Fix For: 0.13.0
>
> Attachments: PIG-259.5.patch, PIG-259.6.patch, PIG-259.7.patch, PIG-259.8.patch, PIG-259.9.patch, Pig_259.patch, Pig_259_2.patch, Pig_259_3.patch, Pig_259_4.patch
>
> we have users who are asking for a flag to overwrite existing directory
[jira] [Updated] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nezih Yigitbasi updated PIG-259:
    Attachment: PIG-259.9.patch

Daniel, delta patch added. Thanks for the review.
[jira] [Commented] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891303#comment-13891303 ]

Daniel Dai commented on PIG-259:

Sounds good. Since the patch is committed, can you upload the delta patch?
Re: Review Request 17681: [PIG-3742] Set MR runtime settings on tez runtime
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17681/
---

(Updated Feb. 4, 2014, 10:08 p.m.)

Review request for pig, Cheolsoo Park and Daniel Dai.

Changes
---

Patch that deletes and newly adds util classes in util package instead of svn mv.

Bugs: PIG-3742
    https://issues.apache.org/jira/browse/PIG-3742

Repository: pig

Description
---

Changes made:
1) Converted the relevant MR settings to equivalent Tez settings and set them on AM, Vertex and Edge.
2) Moved the util and helper classes (SecurityHelper and TezCompilerUtil) to a util package. Does not show up cleanly in review board. Will be doing a svn mv while committing.
3) Fixed an issue with 1-1 edge in orderby while running pigmix, where parallelism was not reflected in the second edge when the parallelism of the first vertex changed after input split calculation. Also made POIdentityOutTez work with shuffle input as well, when trying to test performance with 1-1 edge or shuffle edge with round robin partitioner. Shuffle edge with round robin partitioner or hash partitioner performed very badly compared to MR. Even with 1-1 edge, performance is bad for L10.pig, which orders by multiple columns. Still need to work on order by performance. Hoping unsorted shuffle with TEZ-661 might make it better.

Diffs (updated)
---

  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecurityHelper.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerUtil.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezSessionManager.java 1563492
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/MRToTezHelper.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/SecurityHelper.java PRE-CREATION
  http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java PRE-CREATION

Diff: https://reviews.apache.org/r/17681/diff/

Testing
---

Unit and tez.conf e2e tests pass.

Thanks,

Rohini Palaniswamy
Re: Review Request 17681: [PIG-3742] Set MR runtime settings on tez runtime
> On Feb. 4, 2014, 8:05 p.m., Daniel Dai wrote:
> > Seems the patch cannot apply cleanly. Can you rebase?

This is because of the svn mv of the util classes. Uploaded a patch after deleting the moved files and creating them newly. Will do svn mv during the commit.

- Rohini

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17681/#review33632
---
[jira] [Commented] (PIG-3441) Allow Pig to use default resources from Configuration objects
[ https://issues.apache.org/jira/browse/PIG-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891280#comment-13891280 ]

Bhooshan Mogal commented on PIG-3441:

[~daijy], yes, I agree. There are a lot of places where Configuration objects are re-created in Pig. I tried a bunch of them, but changing this particular instance, {{ConfigurationUtil.toConfiguration()}}, seemed to fix the problem for me. This method is also called in multiple places.

> Allow Pig to use default resources from Configuration objects
> ---
>
> Key: PIG-3441
> URL: https://issues.apache.org/jira/browse/PIG-3441
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.11.1
> Reporter: Bhooshan Mogal
> Attachments: PIG-3441.patch, PIG-3441_1.patch
>
> Pig currently ignores parameters from configuration files added statically to
> Configuration objects via Configuration.addDefaultResource(filename.xml).
> Consider the following scenario -
> In a hadoop FileSystem driver for a non-HDFS filesystem you load properties
> specific to that FileSystem in a static initializer block in the class that
> extends org.apache.hadoop.fs.FileSystem for your FileSystem like below -
> {code}
> class MyFileSystem extends FileSystem {
>     static {
>         Configuration.addDefaultResource("myfs-default.xml");
>         Configuration.addDefaultResource("myfs-site.xml");
>     }
> }
> {code}
> Interfaces like the Hadoop CLI, Hive, Hadoop M/R can find configuration
> parameters defined in these configuration files as long as they are on the
> classpath.
> However, Pig cannot find parameters from these files, because it ignores
> configuration files added statically.
> Pig should allow users to specify if they would like pig to read parameters
> from resources loaded statically.
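The mechanism behind this report can be illustrated without Hadoop on the classpath. The sketch below is a simplified stand-in and not Hadoop's Configuration: the Conf class, its map-based addDefaultResource, the toConf helper and the property names are all hypothetical. It assumes only what the thread describes, namely that a configuration object re-created without loading default resources does not see values registered statically.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DefaultResourceSketch {

    /** Simplified stand-in for a configuration class; this is NOT Hadoop's Configuration. */
    public static final class Conf {
        // Resources registered statically, shared by every instance that asks for defaults.
        private static final List<Map<String, String>> DEFAULT_RESOURCES = new ArrayList<>();
        private final Map<String, String> props = new HashMap<>();

        public static synchronized void addDefaultResource(Map<String, String> resource) {
            DEFAULT_RESOURCES.add(resource);
        }

        public Conf(boolean loadDefaults) {
            if (loadDefaults) {
                synchronized (Conf.class) {
                    for (Map<String, String> resource : DEFAULT_RESOURCES) {
                        props.putAll(resource);
                    }
                }
            }
        }

        public void set(String key, String value) { props.put(key, value); }

        public String get(String key) { return props.get(key); }
    }

    /** Rebuilds a Conf from explicit properties, optionally layering the static defaults underneath. */
    public static Conf toConf(Map<String, String> explicit, boolean loadDefaults) {
        Conf conf = new Conf(loadDefaults);
        for (Map.Entry<String, String> e : explicit.entrySet()) {
            conf.set(e.getKey(), e.getValue());
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> myfsSite = new HashMap<>();
        myfsSite.put("fs.myfs.endpoint", "example-host"); // hypothetical property
        Conf.addDefaultResource(myfsSite);

        Map<String, String> jobProps = new HashMap<>();
        jobProps.put("job.name", "demo"); // hypothetical property

        System.out.println(toConf(jobProps, false).get("fs.myfs.endpoint")); // null
        System.out.println(toConf(jobProps, true).get("fs.myfs.endpoint"));  // example-host
    }
}
```

Every place that rebuilds the object with defaults disabled drops the statically registered values, which is consistent with the observation that the same pattern may need attention in more than one place.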
[jira] [Updated] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nezih Yigitbasi updated PIG-259:
    Attachment: PIG-259.8.patch

When I removed the isOverwrite method, tests failed. The user tells us whether to overwrite or not, and we use that flag to decide whether to catch file not found problems during validation. That is, implementing the interface is not enough to catch file not found problems during validation (the user says "-overwrite false", but we would only check whether PigStorage implements OverwritingStoreFunc and ignore the user's input), so we need a flag that tells us the user's input. To make the intent clearer, I renamed OverwritingStoreFunc to OverwritableStoreFunc and the method from "isOverwrite" to "shouldOverwrite", and also added some javadoc.
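The distinction drawn in this comment, between a store function that merely implements the interface and a user who actually asked for an overwrite, can be sketched as follows. The interface and method names come from the comment, but the signatures, the FlaggedStore and PlainStore classes, and the outputAllowed check are hypothetical. In particular, it is an assumption here that validation rejects an output location that already exists unless an overwrite was requested.

```java
public class OverwriteSketch {

    /** Hypothetical minimal store function marker. */
    public interface StoreFunc { }

    /** Name taken from the comment; the signature is an assumption. */
    public interface OverwritableStoreFunc extends StoreFunc {
        boolean shouldOverwrite();
    }

    /** A store function that can overwrite, but only does so when the user asked for it. */
    public static final class FlaggedStore implements OverwritableStoreFunc {
        private final boolean overwrite;

        public FlaggedStore(boolean overwrite) { this.overwrite = overwrite; }

        @Override
        public boolean shouldOverwrite() { return overwrite; }
    }

    /** A store function with no overwrite support at all. */
    public static final class PlainStore implements StoreFunc { }

    /** Validation sketch: an existing output is acceptable only if the user asked to overwrite. */
    public static boolean outputAllowed(StoreFunc func, boolean outputExists) {
        if (!outputExists) {
            return true;
        }
        // Checking "instanceof" alone would ignore the user's "-overwrite false".
        return func instanceof OverwritableStoreFunc
                && ((OverwritableStoreFunc) func).shouldOverwrite();
    }

    public static void main(String[] args) {
        System.out.println(outputAllowed(new FlaggedStore(true), true));  // true
        System.out.println(outputAllowed(new FlaggedStore(false), true)); // false
        System.out.println(outputAllowed(new PlainStore(), true));        // false
    }
}
```

Dropping the method and relying on the type check would make the second case return true, which is the failure the comment describes.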
[jira] [Commented] (PIG-3441) Allow Pig to use default resources from Configuration objects
[ https://issues.apache.org/jira/browse/PIG-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891228#comment-13891228 ]

Daniel Dai commented on PIG-3441:

[~bdmogal] I see several more places where we instantiate Configuration without default config files (e.g., HExecutionEngine:111); not sure if we need to change those as well. We need to dig into the configuration propagation process more; it is quite complicated right now.
[jira] [Commented] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891178#comment-13891178 ]

Daniel Dai commented on PIG-3347:

It could, but it would introduce a lot of complications. Currently only LOForEach/LOSplitOutput deals with dup-uid; otherwise it would sprawl to all operators and all optimizer rules.

> Store invocation brings side effect
> ---
>
> Key: PIG-3347
> URL: https://issues.apache.org/jira/browse/PIG-3347
> Project: Pig
> Issue Type: Bug
> Components: grunt
> Affects Versions: 0.11
> Environment: local mode
> Reporter: Sergey
> Assignee: Daniel Dai
> Priority: Critical
> Fix For: 0.12.1
>
> Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, PIG-3347-3.patch, PIG-3347-4-testonly.patch, PIG-3347-5.patch
>
> The problem is that an intermediate 'store' invocation "changes" the final store
> output. It looks like it brings some kind of side effect. We used 'local'
> mode to run the script.
> Here is the input data:
> 1
> 1
> Here is the script:
> {code}
> a = load 'test';
> a_group = group a by $0;
> b = foreach a_group {
>   a_distinct = distinct a.$0;
>   generate group, a_distinct;
> }
> --store b into 'b';
> c = filter b by SIZE(a_distinct) == 1;
> store c into 'out';
> {code}
> We expect the output to be:
> 1 1
> The output is an empty file.
> Uncomment the {code}--store b into 'b';{code} line and see the difference.
> You would get the expected output.
[jira] [Updated] (PIG-3744) SequenceFileLoader does not support BytesWritable
[ https://issues.apache.org/jira/browse/PIG-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3744:
    Attachment: PIG-3744-2.patch

Test failed while running before commit as a single quote in the LOAD statement got accidentally deleted before generating the patch. Fixed that in PIG-3744-2.patch

> SequenceFileLoader does not support BytesWritable
> ---
>
> Key: PIG-3744
> URL: https://issues.apache.org/jira/browse/PIG-3744
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.13.0
>
> Attachments: PIG-3744-1.patch, PIG-3744-2.patch
>
> SequenceFileLoader should be referring to BytesWritable for bytearray type,
> but it refers to pig's DataByteArray which does not even implement Writable.
[jira] [Updated] (PIG-3744) SequenceFileLoader does not support BytesWritable
[ https://issues.apache.org/jira/browse/PIG-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3744:
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Committed to trunk. Thanks Daniel for the review.
[jira] [Commented] (PIG-3744) SequenceFileLoader does not support BytesWritable
[ https://issues.apache.org/jira/browse/PIG-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891160#comment-13891160 ]

Daniel Dai commented on PIG-3744:

+1
[jira] [Commented] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891158#comment-13891158 ]

Daniel Dai commented on PIG-259:

Yes, good point. Let's remove the method to keep the interface simpler.
[jira] [Updated] (PIG-3744) SequenceFileLoader does not support BytesWritable
[ https://issues.apache.org/jira/browse/PIG-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3744:
    Status: Patch Available (was: Open)
[jira] [Updated] (PIG-3744) SequenceFileLoader does not support BytesWritable
[ https://issues.apache.org/jira/browse/PIG-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3744:
    Attachment: PIG-3744-1.patch
[jira] [Commented] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891145#comment-13891145 ]

Nezih Yigitbasi commented on PIG-259:

Daniel, one question. Do you think the isOverwrite() method in the OverwritingStoreFunc interface is necessary? If a store func implements this interface, it is very likely that it will return true in isOverwrite(). Maybe we should remove that method; what do you think?
[jira] [Commented] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891127#comment-13891127 ]

Koji Noguchi commented on PIG-3347:

bq. we will need to generate a new uid for col2 to avoid uid conflict (using a UDF IdentityColumn)

Daniel, I think I understand how it is being used, but my confusion is: for the pure purpose of tracking column lineage, shouldn't the redundant uid inside the relation be allowed? Isn't the requirement of no-conflict-uid coming from using the same uid for ProjectionPatcher, which serves a different purpose than the lineage tracking?
[jira] [Created] (PIG-3744) SequenceFileLoader does not support BytesWritable
Rohini Palaniswamy created PIG-3744:
---

Summary: SequenceFileLoader does not support BytesWritable
Key: PIG-3744
URL: https://issues.apache.org/jira/browse/PIG-3744
Project: Pig
Issue Type: Bug
Affects Versions: 0.11.1
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
Fix For: 0.13.0

SequenceFileLoader should be referring to BytesWritable for bytearray type, but it refers to pig's DataByteArray which does not even implement Writable.
Re: Review Request 17681: [PIG-3742] Set MR runtime settings on tez runtime
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17681/#review33632
---

Seems the patch cannot apply cleanly. Can you rebase?

- Daniel Dai
[jira] [Resolved] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-259.
    Resolution: Fixed
    Fix Version/s: 0.13.0
    Hadoop Flags: Reviewed

Also added some comments to OverwritingStoreFunc.

+1. Patch committed to trunk. Thanks Nezih!
[jira] [Updated] (PIG-3567) LogicalPlanPrinter throws OOM for large scripts
[ https://issues.apache.org/jira/browse/PIG-3567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-3567:
    Fix Version/s: 0.13.0
                   0.12.1

> LogicalPlanPrinter throws OOM for large scripts
> ---
>
> Key: PIG-3567
> URL: https://issues.apache.org/jira/browse/PIG-3567
> Project: Pig
> Issue Type: Bug
> Reporter: Aniket Mokashi
> Assignee: Aniket Mokashi
> Fix For: 0.12.1, 0.13.0
>
> Attachments: PIG-3567.patch
>
> As mentioned in PIG-3455, LogicalPlanPrinter throws OOM for large scripts.
> The problem is that LogicalPlanPrinter's visit method generates a large string before
> it is written to the PrintStream.
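The memory problem described here is a general one: rendering a whole tree into a single string keeps the full output on the heap, while writing each node to the stream as it is visited does not. The sketch below illustrates that contrast with a hypothetical Node type and hypothetical method names. It is not the LogicalPlanPrinter code or the attached patch.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

public class PlanPrintSketch {

    /** Hypothetical plan node: a name plus child nodes. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        public Node(String name) { this.name = name; }

        public Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    /** Buffers the entire rendering in memory before anything is written. */
    public static String renderToString(Node node, int depth) {
        StringBuilder sb = new StringBuilder(indent(depth)).append(node.name).append('\n');
        for (Node child : node.children) {
            sb.append(renderToString(child, depth + 1));
        }
        return sb.toString();
    }

    /** Writes each node as it is visited, so only one line is held at a time. */
    public static void streamTo(Node node, int depth, PrintStream out) {
        out.print(indent(depth));
        out.print(node.name);
        out.print('\n');
        for (Node child : node.children) {
            streamTo(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        Node plan = new Node("store").add(new Node("filter").add(new Node("load")));
        streamTo(plan, 0, System.out);
    }
}
```

Both methods produce the same text; the streaming version simply never materializes it as one string, which matters once the plan is large.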
[jira] [Updated] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated PIG-259: Attachment: PIG-259.7.patch Daniel, Thanks for the comments. 1. Good catch. 2. Updated PigOutputFormat to be consistent with InputOutputFileValidator. Both do checks now. 3. Fixed. > allow store to overwrite existing directroy > --- > > Key: PIG-259 > URL: https://issues.apache.org/jira/browse/PIG-259 > Project: Pig > Issue Type: Sub-task >Reporter: Olga Natkovich >Assignee: Nezih Yigitbasi > Attachments: PIG-259.5.patch, PIG-259.6.patch, PIG-259.7.patch, > Pig_259.patch, Pig_259_2.patch, Pig_259_3.patch, Pig_259_4.patch > > > we have users who are asking for a flag to overwrite existing directory -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891032#comment-13891032 ] Daniel Dai commented on PIG-3347: - [~knoguchi], in "B = foreach A generate a as col1, a as col2;", we need to generate a new uid for col2 to avoid a uid conflict (using the UDF IdentityColumn). The downside is that this breaks the lineage chain. The uid is mostly used in the optimizer, and there are several holes when we use it for pure lineage. Optimizer rules are expected to live with these holes by skipping the optimization (e.g., PushUpFilter skips a foreach with a UDF, which includes IdentityColumn, the UDF added to fix the uid conflict). > Store invocation brings side effect > --- > > Key: PIG-3347 > URL: https://issues.apache.org/jira/browse/PIG-3347 > Project: Pig > Issue Type: Bug > Components: grunt >Affects Versions: 0.11 > Environment: local mode >Reporter: Sergey >Assignee: Daniel Dai >Priority: Critical > Fix For: 0.12.1 > > Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, > PIG-3347-3.patch, PIG-3347-4-testonly.patch, PIG-3347-5.patch > > > The problem is that intermediate 'store' invocation "changes" the final store > output. Looks like it brings some kind of side effect. We did use 'local' > mode to run script > here is the input data: > 1 > 1 > Here is the script: > {code} > a = load 'test'; > a_group = group a by $0; > b = foreach a_group { > a_distinct = distinct a.$0; > generate group, a_distinct; > } > --store b into 'b'; > c = filter b by SIZE(a_distinct) == 1; > store c into 'out'; > {code} > We expect output to be: > 1 1 > The output is empty file. > Uncomment {code}--store b into 'b';{code} line and see the diffrence. > Yuo would get expected output. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PIG-3741) Utils.setTmpFileCompressionOnConf can cause side effect for SequenceFileInterStorage
[ https://issues.apache.org/jira/browse/PIG-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891030#comment-13891030 ] Julien Le Dem commented on PIG-3741: Ideally each store would get its own config object, but that would be a major refactoring. In the meantime, this looks like a good improvement to me. +1 > Utils.setTmpFileCompressionOnConf can cause side effect for > SequenceFileInterStorage > > > Key: PIG-3741 > URL: https://issues.apache.org/jira/browse/PIG-3741 > Project: Pig > Issue Type: Bug >Reporter: Aniket Mokashi >Assignee: Aniket Mokashi > Fix For: 0.12.1 > > Attachments: PIG-3741.patch > > > Currently, Utils.setTmpFileCompressionOnConf(pigContext, conf); is invoked > for every job. In case of Seqfile, this api sets mapreduce params on conf to > assist SequenceFileInterStorage. However, as a side effect, this might change > the behavior of other storers due to these mapred properties. This api should > only be called for jobs with intermediate storage. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
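The change proposed in PIG-3741 (apply the temp-file compression settings only to jobs that have intermediate storage) might look roughly like the sketch below. Everything here is an assumption made for illustration: the conf is modeled as a plain Map, and COMPRESS_KEY and buildJobConf are hypothetical names, not Pig or Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

final class TmpCompressionSketch {
    // Hypothetical key standing in for the MapReduce compression properties.
    static final String COMPRESS_KEY = "sketch.tmpfile.compress";

    // Builds a per-job conf; the compression setting is added only when the
    // job writes intermediate (temporary) output.
    static Map<String, String> buildJobConf(Map<String, String> base,
                                            boolean hasIntermediateStore) {
        // Copy the base conf so one job's settings cannot leak into another.
        Map<String, String> conf = new HashMap<>(base);
        if (hasIntermediateStore) {
            conf.put(COMPRESS_KEY, "true");
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        System.out.println(buildJobConf(base, true));
        System.out.println(buildJobConf(base, false));
    }
}
```

Copying the base map per job is a small step toward the "each store gets its own config object" idea from the comment, without the larger refactoring.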
[jira] [Updated] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3347: Attachment: PIG-3347-5.patch Attach another patch which also address Koji's new case. > Store invocation brings side effect > --- > > Key: PIG-3347 > URL: https://issues.apache.org/jira/browse/PIG-3347 > Project: Pig > Issue Type: Bug > Components: grunt >Affects Versions: 0.11 > Environment: local mode >Reporter: Sergey >Assignee: Daniel Dai >Priority: Critical > Fix For: 0.12.1 > > Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, > PIG-3347-3.patch, PIG-3347-4-testonly.patch, PIG-3347-5.patch > > > The problem is that intermediate 'store' invocation "changes" the final store > output. Looks like it brings some kind of side effect. We did use 'local' > mode to run script > here is the input data: > 1 > 1 > Here is the script: > {code} > a = load 'test'; > a_group = group a by $0; > b = foreach a_group { > a_distinct = distinct a.$0; > generate group, a_distinct; > } > --store b into 'b'; > c = filter b by SIZE(a_distinct) == 1; > store c into 'out'; > {code} > We expect output to be: > 1 1 > The output is empty file. > Uncomment {code}--store b into 'b';{code} line and see the diffrence. > Yuo would get expected output. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PIG-259) allow store to overwrite existing directroy
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890971#comment-13890971 ] Daniel Dai commented on PIG-259: Thanks for the update. Another several comments: 1. PigOutputFormat.java: Remove "PigStorage ps = (PigStorage) sFunc;", we cannot assume sFunc is PigStorage 2. InputOutputFileValidator.java: Shall we skip checkOutputSpecs when overwrite happens? There is nothing wrong to capture FileAlreadyExistsException exception, but since you skip checkOutputSpecs in PigOutputFormat, it seems better to do it consistently 3. Another tab in PigStorage.java: "protected ResourceSchema schema" > allow store to overwrite existing directroy > --- > > Key: PIG-259 > URL: https://issues.apache.org/jira/browse/PIG-259 > Project: Pig > Issue Type: Sub-task >Reporter: Olga Natkovich >Assignee: Nezih Yigitbasi > Attachments: PIG-259.5.patch, PIG-259.6.patch, Pig_259.patch, > Pig_259_2.patch, Pig_259_3.patch, Pig_259_4.patch > > > we have users who are asking for a flag to overwrite existing directory -- This message was sent by Atlassian JIRA (v6.1.5#6160)
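Review comments 1 and 2 above (do not cast sFunc to PigStorage, and skip the output check consistently when overwrite is requested) can be sketched as follows. This is not the patch itself: StoreFuncSketch, OverwritingStoreFuncSketch, shouldOverwrite and skipOutputCheck are hypothetical names that only mirror the idea of testing for an overwrite capability instead of assuming a concrete class.

```java
final class OverwriteCheckSketch {
    // Hypothetical marker for any store function.
    interface StoreFuncSketch {
    }

    // Hypothetical capability interface: a store function that may overwrite.
    interface OverwritingStoreFuncSketch extends StoreFuncSketch {
        boolean shouldOverwrite();
    }

    // Decides whether the "output already exists" check can be skipped.
    // The instanceof test avoids assuming a concrete store function class.
    static boolean skipOutputCheck(StoreFuncSketch sFunc) {
        if (sFunc instanceof OverwritingStoreFuncSketch) {
            return ((OverwritingStoreFuncSketch) sFunc).shouldOverwrite();
        }
        return false;
    }

    public static void main(String[] args) {
        OverwritingStoreFuncSketch overwriting = () -> true;
        StoreFuncSketch plain = new StoreFuncSketch() { };
        System.out.println(skipOutputCheck(overwriting));
        System.out.println(skipOutputCheck(plain));
    }
}
```

Calling the same helper from both places that validate the output location would keep their behavior consistent, which is the point of comment 2.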
[jira] [Commented] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890942#comment-13890942 ] Koji Noguchi commented on PIG-3347: --- bq. UID is to track column lineage so in logical optimizer, so that we can freely move operate up and down, ProjectionPatcher will reposition the column according to uid I think part of my confusion comes from these two. UID is used for (1) tracking column lineage. (2) UID is also used for ProjectionPatcher to reposition therefore requiring UID to be unique within each relation. Because of (2), we're seeing new uid being created whenever column is referenced multiple times. Like A = load 'a.txt' as (a:int); B = foreach A generate a as col1, a as col2; This would create a schema like {noformat} 1-2: (Name: LOStore Schema: col1#1:int,col2#2:int) ... |---A: (Name: LOLoad Schema: a#1:int)RequiredFields:null {noformat} So without traversing the lineage, I cannot connect 'col2' to original 'a'. However, optimizer like PushUpFilter&FilterAboveForeach seems to be using just UID to determine the field usages... But this is outside of this jira. I need to spend more time learning how the pig compiler works. > Store invocation brings side effect > --- > > Key: PIG-3347 > URL: https://issues.apache.org/jira/browse/PIG-3347 > Project: Pig > Issue Type: Bug > Components: grunt >Affects Versions: 0.11 > Environment: local mode >Reporter: Sergey >Assignee: Daniel Dai >Priority: Critical > Fix For: 0.12.1 > > Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, > PIG-3347-3.patch, PIG-3347-4-testonly.patch > > > The problem is that intermediate 'store' invocation "changes" the final store > output. Looks like it brings some kind of side effect. 
We did use 'local' > mode to run script > here is the input data: > 1 > 1 > Here is the script: > {code} > a = load 'test'; > a_group = group a by $0; > b = foreach a_group { > a_distinct = distinct a.$0; > generate group, a_distinct; > } > --store b into 'b'; > c = filter b by SIZE(a_distinct) == 1; > store c into 'out'; > {code} > We expect output to be: > 1 1 > The output is empty file. > Uncomment {code}--store b into 'b';{code} line and see the diffrence. > Yuo would get expected output. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
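The uid behavior discussed in the comment above (a column referenced twice in one foreach gets a fresh uid for the second reference, so uids stay unique within the relation) can be modeled with a toy sketch. It is an assumption-laden illustration, not the Pig compiler's logic: UidSketch and project are hypothetical names, and the uid counter is passed in by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UidSketch {
    // Projects one input column (with uid inputUid) under several output
    // aliases. The first reference keeps the input uid; later references get
    // a fresh uid so that uids stay unique within the output relation.
    static List<String> project(long inputUid, List<String> outputAliases,
                                long nextFreeUid) {
        Set<Long> used = new HashSet<>();
        List<String> schema = new ArrayList<>();
        for (String alias : outputAliases) {
            long uid = inputUid;
            if (!used.add(uid)) {
                // uid already taken in this relation: allocate a new one.
                // From here on, the link back to the input column is lost.
                uid = nextFreeUid++;
                used.add(uid);
            }
            schema.add(alias + "#" + uid);
        }
        return schema;
    }

    public static void main(String[] args) {
        // Models: A = load ... as (a:int);  -- a has uid 1
        //         B = foreach A generate a as col1, a as col2;
        System.out.println(project(1L, Arrays.asList("col1", "col2"), 2L));
    }
}
```

In this model col2 ends up with a uid that no longer matches `a`, which is the lineage break described in the thread: anything that matches fields by uid alone cannot tell that col2 came from `a`.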
[jira] [Updated] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi updated PIG-3347: -- Attachment: PIG-3347-4-testonly.patch Thanks [~daijy]. Adding one more testcase that I believe should push the filter before foreach. This one succeeds without the patch but fails with the patch. > Store invocation brings side effect > --- > > Key: PIG-3347 > URL: https://issues.apache.org/jira/browse/PIG-3347 > Project: Pig > Issue Type: Bug > Components: grunt >Affects Versions: 0.11 > Environment: local mode >Reporter: Sergey >Assignee: Daniel Dai >Priority: Critical > Fix For: 0.12.1 > > Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, > PIG-3347-3.patch, PIG-3347-4-testonly.patch > > > The problem is that intermediate 'store' invocation "changes" the final store > output. Looks like it brings some kind of side effect. We did use 'local' > mode to run script > here is the input data: > 1 > 1 > Here is the script: > {code} > a = load 'test'; > a_group = group a by $0; > b = foreach a_group { > a_distinct = distinct a.$0; > generate group, a_distinct; > } > --store b into 'b'; > c = filter b by SIZE(a_distinct) == 1; > store c into 'out'; > {code} > We expect output to be: > 1 1 > The output is empty file. > Uncomment {code}--store b into 'b';{code} line and see the diffrence. > Yuo would get expected output. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PIG-3347) Store invocation brings side effect
[ https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890903#comment-13890903 ] Daniel Dai commented on PIG-3347: - All unit tests pass with the patch. > Store invocation brings side effect > --- > > Key: PIG-3347 > URL: https://issues.apache.org/jira/browse/PIG-3347 > Project: Pig > Issue Type: Bug > Components: grunt >Affects Versions: 0.11 > Environment: local mode >Reporter: Sergey >Assignee: Daniel Dai >Priority: Critical > Fix For: 0.12.1 > > Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, > PIG-3347-3.patch > > > The problem is that intermediate 'store' invocation "changes" the final store > output. Looks like it brings some kind of side effect. We did use 'local' > mode to run script > here is the input data: > 1 > 1 > Here is the script: > {code} > a = load 'test'; > a_group = group a by $0; > b = foreach a_group { > a_distinct = distinct a.$0; > generate group, a_distinct; > } > --store b into 'b'; > c = filter b by SIZE(a_distinct) == 1; > store c into 'out'; > {code} > We expect output to be: > 1 1 > The output is empty file. > Uncomment {code}--store b into 'b';{code} line and see the diffrence. > Yuo would get expected output. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17681: [PIG-3742] Set MR runtime settings on tez runtime
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17681/ --- Review request for pig, Cheolsoo Park and Daniel Dai. Bugs: PIG-3742 https://issues.apache.org/jira/browse/PIG-3742 Repository: pig Description --- Changes made: 1) Converted the relevant MR settings to equivalent Tez settings and set them on AM, Vertex and Edge. 2) Moved the util and helper classes (SecurityHelper and TezCompilerUtil) to a util package. This does not show up cleanly in Review Board; I will do an svn mv while committing. 3) Fixed an issue with the 1-1 edge in order-by while running pigmix, where parallelism was not reflected in the second edge when the parallelism of the first vertex changed after input split calculation. Also made POIdentityOutTez work with shuffle input as well, for testing performance with a 1-1 edge or a shuffle edge with a round robin partitioner. A shuffle edge with a round robin partitioner or hash partitioner performed very badly compared to MR. Even with a 1-1 edge, performance is bad for L10.pig, which orders by multiple columns. Still need to work on order by performance. Hoping unsorted shuffle with TEZ-661 might make it better.
Diffs - http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecurityHelper.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerUtil.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezSessionManager.java 1563492 http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/MRToTezHelper.java PRE-CREATION http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/SecurityHelper.java PRE-CREATION http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java PRE-CREATION Diff: https://reviews.apache.org/r/17681/diff/ Testing --- Unit and tez.conf e2e tests pass. Thanks, Rohini Palaniswamy
Re: pig load function
Hi Krishna, By default it uses the PigStorage LoadFunc. You can change that behavior, though, by setting "pig.default.load.func" to your LoadFunc. -Mark On Mon, Feb 3, 2014 at 8:37 PM, Krishna Prasad Ambaripeta wrote: > Hi. I am new to Pig and have a basic doubt. When we write " a = load 'a/y.txt' > as(a,b)" , which Pig function will it call? Is it LoadFunc? > Thanks for the support. > Thanks, Krishna Prasad
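The lookup order described in the reply above can be sketched like this: an explicit loader wins, otherwise the "pig.default.load.func" property is used, otherwise PigStorage. This is only a model of that order under stated assumptions, not Pig's actual resolution code; DefaultLoadFuncSketch, resolveLoadFunc, and com.example.MyLoader are hypothetical names.

```java
import java.util.Properties;

final class DefaultLoadFuncSketch {
    // Property name taken from the reply above.
    static final String DEFAULT_LOAD_FUNC_KEY = "pig.default.load.func";

    // Returns the loader name for a load statement.
    // usingClause is the loader named explicitly in the script, or null when
    // the statement has no "using" part (as in: a = load 'a/y.txt' as (a,b);).
    static String resolveLoadFunc(String usingClause, Properties props) {
        if (usingClause != null) {
            return usingClause;
        }
        return props.getProperty(DEFAULT_LOAD_FUNC_KEY, "PigStorage");
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolveLoadFunc(null, props));
        // com.example.MyLoader is a made-up class name for the demo.
        props.setProperty(DEFAULT_LOAD_FUNC_KEY, "com.example.MyLoader");
        System.out.println(resolveLoadFunc(null, props));
    }
}
```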
pig load function
Hi. I am new to Pig and have a basic doubt. When we write " a = load 'a/y.txt' as(a,b)" , which Pig function will it call? Is it LoadFunc? Thanks for the support. Thanks, Krishna Prasad