[jira] [Assigned] (PIG-3154) TestPackage.testOperator fails in trunk
[ https://issues.apache.org/jira/browse/PIG-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johnny Zhang reassigned PIG-3154:
---------------------------------

    Assignee: Johnny Zhang

> TestPackage.testOperator fails in trunk
> ---------------------------------------
>
>                 Key: PIG-3154
>                 URL: https://issues.apache.org/jira/browse/PIG-3154
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12
>            Reporter: Cheolsoo Park
>            Assignee: Johnny Zhang
>              Labels: newbie
>             Fix For: 0.12
>
> To reproduce the issue, do:
> {code}
> ant clean test -Dtestcase=TestPackage
> {code}
> The test fails with the following error:
> {code}
> No test case for type biginteger
> junit.framework.AssertionFailedError: No test case for type biginteger
>     at org.apache.pig.test.TestPackage.pickTest(TestPackage.java:153)
>     at org.apache.pig.test.TestPackage.testOperator(TestPackage.java:171)
> {code}
> Apparently, this is broken by PIG-2764.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3138) Decouple PigServer.executeBatch() from compilation of batch
[ https://issues.apache.org/jira/browse/PIG-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prashant Kommireddi updated PIG-3138:
-------------------------------------

    Attachment: PIG-3138_1.patch

[~cheolsoo] corrected the comment in this new patch.

> Decouple PigServer.executeBatch() from compilation of batch
> -----------------------------------------------------------
>
>                 Key: PIG-3138
>                 URL: https://issues.apache.org/jira/browse/PIG-3138
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3138_1.patch, PIG-3138.patch
>
> executeBatch() currently does parsing and building of LogicalPlan in addition
> to the actual execution. It will be beneficial to separate out
> parsing/building from execution - that will allow us to get a handle on
> load/store and other operators before execution of batch. Useful for folks
> using PigServer API.
[jira] [Commented] (PIG-3138) Decouple PigServer.executeBatch() from compilation of batch
[ https://issues.apache.org/jira/browse/PIG-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570871#comment-13570871 ]

Prashant Kommireddi commented on PIG-3138:
-------------------------------------------

Nice catch! I will update it. Thanks Cheolsoo, again.

> Decouple PigServer.executeBatch() from compilation of batch
> -----------------------------------------------------------
>
>                 Key: PIG-3138
>                 URL: https://issues.apache.org/jira/browse/PIG-3138
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3138.patch
>
> executeBatch() currently does parsing and building of LogicalPlan in addition
> to the actual execution. It will be beneficial to separate out
> parsing/building from execution - that will allow us to get a handle on
> load/store and other operators before execution of batch. Useful for folks
> using PigServer API.
[jira] [Updated] (PIG-3138) Decouple PigServer.executeBatch() from compilation of batch
[ https://issues.apache.org/jira/browse/PIG-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3138:
-------------------------------

    Status: Patch Available  (was: Open)

Hi Prashant,

LGTM. I have one question:
{quote}
This method should be followed by \{@link PigServer#executeBatch(boolean)\} with argument as true.
{quote}
In the comment of {{parseAndBuild()}}, shouldn't you say "with argument as *false*" instead of "with argument as *true*"?

> Decouple PigServer.executeBatch() from compilation of batch
> -----------------------------------------------------------
>
>                 Key: PIG-3138
>                 URL: https://issues.apache.org/jira/browse/PIG-3138
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3138.patch
>
> executeBatch() currently does parsing and building of LogicalPlan in addition
> to the actual execution. It will be beneficial to separate out
> parsing/building from execution - that will allow us to get a handle on
> load/store and other operators before execution of batch. Useful for folks
> using PigServer API.
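
For readers following the parseAndBuild() / executeBatch(boolean) discussion, here is a minimal sketch of the two-phase call order, assuming the boolean argument means "parse and build before executing". BatchSketch, registerQuery, and buildCount are hypothetical stand-ins written for this illustration; this is not Pig's PigServer implementation or the contents of the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a batch-executing server; not Pig's real code.
class BatchSketch {
    private final List<String> queued = new ArrayList<>(); // statements registered so far
    private List<String> plan = null;                      // stands in for the logical plan
    int buildCount = 0;                                    // how many times the plan was built

    void registerQuery(String statement) {
        queued.add(statement);
    }

    // Phase 1: parse and build only; nothing is executed here.
    List<String> parseAndBuild() {
        plan = new ArrayList<>(queued);
        buildCount++;
        return plan;
    }

    // Phase 2: execute. Pass false when parseAndBuild() has already run,
    // so the plan is not built a second time.
    int executeBatch(boolean parseAndBuild) {
        if (parseAndBuild) {
            parseAndBuild();
        }
        if (plan == null) {
            throw new IllegalStateException("plan not built; call parseAndBuild() first");
        }
        return plan.size(); // pretend every statement in the plan ran
    }
}
```

With this shape a caller can inspect the plan returned by parseAndBuild() (for example, to look at load/store statements) and then call executeBatch(false), which is why "false" is the argument that pairs with an explicit parseAndBuild() call.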
[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir
[ https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3137:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Johnny!

> fix Piggybank test to not using /tmp dir
> ----------------------------------------
>
>                 Key: PIG-3137
>                 URL: https://issues.apache.org/jira/browse/PIG-3137
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.11
>            Reporter: Johnny Zhang
>            Assignee: Johnny Zhang
>             Fix For: 0.12
>         Attachments: PIG-3137.nows.patch.txt, PIG-3137.patch.txt, PIG-3137.patch.txt
>
> right now several Piggybank tests create directory under /tmp to store test
> data, the test could fail because user doesn't have permission to create
> directory under /tmp. It is better to move test data dir under build dir to
> avoid this problem.
> I will submit a patch soon.
[jira] [Commented] (PIG-3137) fix Piggybank test to not using /tmp dir
[ https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570827#comment-13570827 ]

Cheolsoo Park commented on PIG-3137:
-------------------------------------

+1. Looks good. I will commit it after running test.

> fix Piggybank test to not using /tmp dir
> ----------------------------------------
>
>                 Key: PIG-3137
>                 URL: https://issues.apache.org/jira/browse/PIG-3137
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.11
>            Reporter: Johnny Zhang
>            Assignee: Johnny Zhang
>             Fix For: 0.12
>         Attachments: PIG-3137.nows.patch.txt, PIG-3137.patch.txt, PIG-3137.patch.txt
>
> right now several Piggybank tests create directory under /tmp to store test
> data, the test could fail because user doesn't have permission to create
> directory under /tmp. It is better to move test data dir under build dir to
> avoid this problem.
> I will submit a patch soon.
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (22 issues)

Subscriber: pigdaily

Key         Summary
PIG-3158    Errors in the document "Control Structures"
            https://issues.apache.org/jira/browse/PIG-3158
PIG-3142    Fixed-width load and store functions for the Piggybank
            https://issues.apache.org/jira/browse/PIG-3142
PIG-3137    fix Piggybank test to not using /tmp dir
            https://issues.apache.org/jira/browse/PIG-3137
PIG-3136    Introduce a syntax making declared aliases optional
            https://issues.apache.org/jira/browse/PIG-3136
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3114    Duplicated macro name error when using pigunit
            https://issues.apache.org/jira/browse/PIG-3114
PIG-3108    HBaseStorage returns empty maps when mixing wildcard- with other columns
            https://issues.apache.org/jira/browse/PIG-3108
PIG-3105    Fix TestJobSubmission unit test failure.
            https://issues.apache.org/jira/browse/PIG-3105
PIG-3098    Add another test for the self join case
            https://issues.apache.org/jira/browse/PIG-3098
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3069    Native Windows Compatibility for Pig E2E Tests and Harness
            https://issues.apache.org/jira/browse/PIG-3069
PIG-3028    testGrunt dev test needs some command filters to run correctly without cygwin
            https://issues.apache.org/jira/browse/PIG-3028
PIG-3027    pigTest unit test needs a newline filter for comparisons of golden multi-line
            https://issues.apache.org/jira/browse/PIG-3027
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-3010    Allow UDF's to flatten themselves
            https://issues.apache.org/jira/browse/PIG-3010
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2880    Pig current releases lack a UDF charAt. This UDF returns the char value at the specified index.
            https://issues.apache.org/jira/browse/PIG-2880
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570739#comment-13570739 ]

Prashant Kommireddi commented on PIG-3135:
-------------------------------------------

Thanks for the review and commit, Cheolsoo.

> HExecutionEngine should look for resources in user passed Properties
> --------------------------------------------------------------------
>
>                 Key: PIG-3135
>                 URL: https://issues.apache.org/jira/browse/PIG-3135
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3135_1.patch, PIG-3135.patch
>
> Looking at this snippet:
> {code}
> private void init(Properties properties) throws ExecException {
>     .
>     .
>     .
>     // Check existence of hadoop-site.xml or core-site.xml
>     Configuration testConf = new Configuration();
>     ClassLoader cl = testConf.getClassLoader();
>     URL hadoop_site = cl.getResource( HADOOP_SITE );
>     URL core_site = cl.getResource( CORE_SITE );
>
>     if( hadoop_site == null && core_site == null ) {
>         throw new ExecException("Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath)." +
>                 " If you plan to use local mode, please put -x local option in command line",
>                 4010);
>     }
> {code}
> This assumes the resources (*-site.xml) are set on the classpath, but this
> will not always be the case when run with Pig's Java APIs. One could want to
> programmatically set the resources, and the code here should additionally check
> whether they are available there.
> Example: When a Configuration object is created and resources are added
> before passing it on to Pig.
> {code}
> Configuration conf = new Configuration(false);
> conf.addResource("foo/core-site.xml");
> conf.addResource("bar/hadoop-site.xml");
> PigServer pServer = new PigServer(ExecType.MAPREDUCE, conf);
> {code}
> The above conf is not used right now to obtain resources.
[jira] [Resolved] (PIG-3135) HExecutionEngine should look for resources in user passed Properties
[ https://issues.apache.org/jira/browse/PIG-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park resolved PIG-3135.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.12

Committed to trunk. Thanks Prashant!

> HExecutionEngine should look for resources in user passed Properties
> --------------------------------------------------------------------
>
>                 Key: PIG-3135
>                 URL: https://issues.apache.org/jira/browse/PIG-3135
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3135_1.patch, PIG-3135.patch
>
> Looking at this snippet:
> {code}
> private void init(Properties properties) throws ExecException {
>     .
>     .
>     .
>     // Check existence of hadoop-site.xml or core-site.xml
>     Configuration testConf = new Configuration();
>     ClassLoader cl = testConf.getClassLoader();
>     URL hadoop_site = cl.getResource( HADOOP_SITE );
>     URL core_site = cl.getResource( CORE_SITE );
>
>     if( hadoop_site == null && core_site == null ) {
>         throw new ExecException("Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath)." +
>                 " If you plan to use local mode, please put -x local option in command line",
>                 4010);
>     }
> {code}
> This assumes the resources (*-site.xml) are set on the classpath, but this
> will not always be the case when run with Pig's Java APIs. One could want to
> programmatically set the resources, and the code here should additionally check
> whether they are available there.
> Example: When a Configuration object is created and resources are added
> before passing it on to Pig.
> {code}
> Configuration conf = new Configuration(false);
> conf.addResource("foo/core-site.xml");
> conf.addResource("bar/hadoop-site.xml");
> PigServer pServer = new PigServer(ExecType.MAPREDUCE, conf);
> {code}
> The above conf is not used right now to obtain resources.
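
As a rough illustration of the check PIG-3135 asks for, the sketch below accepts a configuration as "found" when either a *-site.xml file is on the classpath or the caller-supplied Properties already carry cluster settings. It is not the committed patch: the class HadoopConfigCheck, its method names, and the choice of fs.default.name and mapred.job.tracker as the keys that signal user-supplied settings are all assumptions made for this example.

```java
import java.util.Properties;

// Illustrative only; the committed fix for PIG-3135 may work differently.
class HadoopConfigCheck {
    static final String HADOOP_SITE = "hadoop-site.xml";
    static final String CORE_SITE = "core-site.xml";

    // Assumption: presence of either key means the caller set up the cluster
    // configuration programmatically instead of via classpath resources.
    static final String[] CLUSTER_KEYS = {"fs.default.name", "mapred.job.tracker"};

    // Original behaviour: look for the site files on the classpath.
    static boolean onClasspath(ClassLoader cl) {
        return cl.getResource(HADOOP_SITE) != null || cl.getResource(CORE_SITE) != null;
    }

    // Additional check: look in the Properties the user passed in.
    static boolean inProperties(Properties props) {
        for (String key : CLUSTER_KEYS) {
            if (props.getProperty(key) != null) {
                return true;
            }
        }
        return false;
    }

    // Fail only when neither source provides a configuration.
    static boolean configAvailable(Properties props, ClassLoader cl) {
        return onClasspath(cl) || inProperties(props);
    }
}
```

Taking the ClassLoader as a parameter keeps the classpath lookup separate from the Properties lookup, so each half can be exercised on its own.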
Re: Move pig project from svn to git repository
Sure,
I'm more interested in how our patch-file-based contribution process would
be improved by moving to git.
I'm not against it but I'd like us to think this through.
So my question is:
How would someone contribute once we move to git?
How do they attach their work to a jira? (patch file, pointer to a branch?)
Only committers would be able to create branches in the apache git repo.
Would we save the contribution in a branch for that jira, or do we still ask
for file patches?

On Mon, Feb 4, 2013 at 11:13 AM, Russell Jurney wrote:

> Others (Cassandra?) have trail blazed already.
>
> On Mon, Feb 4, 2013 at 10:44 AM, Julien Le Dem wrote:
>
> > There was a thread about this last year:
> > http://mail-archives.apache.org/mod_mbox/pig-dev/201203.mbox/thread?5
> > As I recall there was no opposition but also no strong motivation to be
> > trailblazers in this area.
> > I do see lag on the git mirror sometimes (hours).
> > I'd like to see a better process than the file based patch contribution but
> > it is unclear using git primarily will allow that. If you have suggestions
> > I'd be happy to hear about it.
> > Julien
> >
> > On Sun, Feb 3, 2013 at 1:54 PM, Jarek Jarcec Cecho wrote:
> >
> > > Hi Bill,
> > > thank you very much for your feedback. Mostly in the past the
> > > synchronization did not seem reliable and sometimes it took a couple of
> > > hours or days to get latest updates from svn to the git mirror. Also
> > > even recently, sometimes the mirrors seem unresponsive. At least for me.
> > > I'm not experiencing such issues when I'm working with the projects that
> > > are primarily running on git. Thus I was wondering whether pig community
> > > would be open to such migration.
> > >
> > > Jarcec
> > >
> > > On Fri, Feb 01, 2013 at 10:38:00PM -0800, Bill Graham wrote:
> > > > I'm a huge fan of git and use it exclusively for Pig with the exception of
> > > > committing patches. I haven't personally experienced any reliability issues
> > > > with the git mirror. What are the reliability issues you've seen?
> > > >
> > > > On Fri, Feb 1, 2013 at 6:46 PM, Jarek Jarcec Cecho <jar...@apache.org> wrote:
> > > >
> > > > > Hi pig developers,
> > > > > I personally prefer git over svn, so I'm using the git mirrors that Apache
> > > > > provides. As those mirrors do not seem entirely reliable I was wondering
> > > > > whether there are other pig developers that also prefer git over svn as
> > > > > myself. Apache Infrastructure Team is supporting projects that are
> > > > > primarily working with git, so my question is - would pig developer
> > > > > community be interested in migrating the repository from svn to git?
> > > > >
> > > > > I've recently participated in three projects that made this change, namely
> > > > > Sqoop, Flume and MRunit, and it's not a big deal. The process is rather
> > > > > simple, it just takes some time as most of the job is done by Infrastructure
> > > > > team. I would be more than happy to help or even drive the process in case
> > > > > that this change would be desirable by community.
> > > > >
> > > > > Jarcec
> > > >
> > > > --
> > > > *Note that I'm no longer using my Yahoo! email address. Please email me at
> > > > billgra...@gmail.com going forward.*
>
> --
> Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
[jira] [Commented] (PIG-3138) Decouple PigServer.executeBatch() from compilation of batch
[ https://issues.apache.org/jira/browse/PIG-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570559#comment-13570559 ]

Prashant Kommireddi commented on PIG-3138:
-------------------------------------------

Ping! Can a committer please take a look?

> Decouple PigServer.executeBatch() from compilation of batch
> -----------------------------------------------------------
>
>                 Key: PIG-3138
>                 URL: https://issues.apache.org/jira/browse/PIG-3138
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Prashant Kommireddi
>            Assignee: Prashant Kommireddi
>             Fix For: 0.12
>         Attachments: PIG-3138.patch
>
> executeBatch() currently does parsing and building of LogicalPlan in addition
> to the actual execution. It will be beneficial to separate out
> parsing/building from execution - that will allow us to get a handle on
> load/store and other operators before execution of batch. Useful for folks
> using PigServer API.
[jira] [Commented] (PIG-3157) Move LENGTH from Piggybank to builtin, make LENGTH work for multiple types similar to SIZE
[ https://issues.apache.org/jira/browse/PIG-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570557#comment-13570557 ]

Russell Jurney commented on PIG-3157:
--------------------------------------

It could be an alias for SIZE. Two differences:
1) Occasionally SIZE() will complain of not being able to work on some types. Not sure what's up, but LENGTH(chararray) works in these situations.
2) atm LENGTH only works for chararrays

> Move LENGTH from Piggybank to builtin, make LENGTH work for multiple types similar to SIZE
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-3157
>                 URL: https://issues.apache.org/jira/browse/PIG-3157
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs, piggybank
>    Affects Versions: 0.11
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>             Fix For: 0.12
>
> LENGTH needs to be a builtin.
[jira] [Commented] (PIG-3157) Move LENGTH from Piggybank to builtin, make LENGTH work for multiple types similar to SIZE
[ https://issues.apache.org/jira/browse/PIG-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570549#comment-13570549 ]

Alan Gates commented on PIG-3157:
----------------------------------

How does LENGTH differ from SIZE?

> Move LENGTH from Piggybank to builtin, make LENGTH work for multiple types similar to SIZE
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-3157
>                 URL: https://issues.apache.org/jira/browse/PIG-3157
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs, piggybank
>    Affects Versions: 0.11
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>             Fix For: 0.12
>
> LENGTH needs to be a builtin.
Re: Move pig project from svn to git repository
Others (Cassandra?) have trail blazed already.

On Mon, Feb 4, 2013 at 10:44 AM, Julien Le Dem wrote:

> There was a thread about this last year:
> http://mail-archives.apache.org/mod_mbox/pig-dev/201203.mbox/thread?5
> As I recall there was no opposition but also no strong motivation to be
> trailblazers in this area.
> I do see lag on the git mirror sometimes (hours).
> I'd like to see a better process than the file based patch contribution but
> it is unclear using git primarily will allow that. If you have suggestions
> I'd be happy to hear about it.
> Julien
>
> On Sun, Feb 3, 2013 at 1:54 PM, Jarek Jarcec Cecho wrote:
>
> > Hi Bill,
> > thank you very much for your feedback. Mostly in the past the
> > synchronization did not seem reliable and sometimes it took a couple of
> > hours or days to get latest updates from svn to the git mirror. Also
> > even recently, sometimes the mirrors seem unresponsive. At least for me.
> > I'm not experiencing such issues when I'm working with the projects that
> > are primarily running on git. Thus I was wondering whether pig community
> > would be open to such migration.
> >
> > Jarcec
> >
> > On Fri, Feb 01, 2013 at 10:38:00PM -0800, Bill Graham wrote:
> > > I'm a huge fan of git and use it exclusively for Pig with the exception of
> > > committing patches. I haven't personally experienced any reliability issues
> > > with the git mirror. What are the reliability issues you've seen?
> > >
> > > On Fri, Feb 1, 2013 at 6:46 PM, Jarek Jarcec Cecho wrote:
> > >
> > > > Hi pig developers,
> > > > I personally prefer git over svn, so I'm using the git mirrors that Apache
> > > > provides. As those mirrors do not seem entirely reliable I was wondering
> > > > whether there are other pig developers that also prefer git over svn as
> > > > myself. Apache Infrastructure Team is supporting projects that are
> > > > primarily working with git, so my question is - would pig developer
> > > > community be interested in migrating the repository from svn to git?
> > > >
> > > > I've recently participated in three projects that made this change, namely
> > > > Sqoop, Flume and MRunit, and it's not a big deal. The process is rather
> > > > simple, it just takes some time as most of the job is done by Infrastructure
> > > > team. I would be more than happy to help or even drive the process in case
> > > > that this change would be desirable by community.
> > > >
> > > > Jarcec
> > >
> > > --
> > > *Note that I'm no longer using my Yahoo! email address. Please email me at
> > > billgra...@gmail.com going forward.*

--
Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com
[jira] [Commented] (PIG-3098) Add another test for the self join case
[ https://issues.apache.org/jira/browse/PIG-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570502#comment-13570502 ]

Alan Gates commented on PIG-3098:
----------------------------------

+1, patch looks good, new test passes.

> Add another test for the self join case
> ---------------------------------------
>
>                 Key: PIG-3098
>                 URL: https://issues.apache.org/jira/browse/PIG-3098
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>             Fix For: 0.12
>         Attachments: PIG-3098-0.patch, PIG-3098-1.patch
>
> This adds a test to TestJoin that doesn't just make sure that self joins work
> semantically in the parser, but also that it pulls the right data through.
> Thought it'd be easier to just make a new JIRA than to reopen PIG-3020.
Re: Move pig project from svn to git repository
There was a thread about this last year:
http://mail-archives.apache.org/mod_mbox/pig-dev/201203.mbox/thread?5
As I recall there was no opposition but also no strong motivation to be
trailblazers in this area.
I do see lag on the git mirror sometimes (hours).
I'd like to see a better process than the file-based patch contribution, but
it is unclear whether using git primarily will allow that. If you have
suggestions I'd be happy to hear about them.
Julien

On Sun, Feb 3, 2013 at 1:54 PM, Jarek Jarcec Cecho wrote:

> Hi Bill,
> thank you very much for your feedback. Mostly in the past the
> synchronization did not seem reliable and sometimes it took a couple of
> hours or days to get latest updates from svn to the git mirror. Also
> even recently, sometimes the mirrors seem unresponsive. At least for me.
> I'm not experiencing such issues when I'm working with the projects that
> are primarily running on git. Thus I was wondering whether pig community
> would be open to such migration.
>
> Jarcec
>
> On Fri, Feb 01, 2013 at 10:38:00PM -0800, Bill Graham wrote:
> > I'm a huge fan of git and use it exclusively for Pig with the exception of
> > committing patches. I haven't personally experienced any reliability issues
> > with the git mirror. What are the reliability issues you've seen?
> >
> > On Fri, Feb 1, 2013 at 6:46 PM, Jarek Jarcec Cecho wrote:
> >
> > > Hi pig developers,
> > > I personally prefer git over svn, so I'm using the git mirrors that Apache
> > > provides. As those mirrors do not seem entirely reliable I was wondering
> > > whether there are other pig developers that also prefer git over svn as
> > > myself. Apache Infrastructure Team is supporting projects that are
> > > primarily working with git, so my question is - would pig developer
> > > community be interested in migrating the repository from svn to git?
> > >
> > > I've recently participated in three projects that made this change, namely
> > > Sqoop, Flume and MRunit, and it's not a big deal. The process is rather
> > > simple, it just takes some time as most of the job is done by Infrastructure
> > > team. I would be more than happy to help or even drive the process in case
> > > that this change would be desirable by community.
> > >
> > > Jarcec
> >
> > --
> > *Note that I'm no longer using my Yahoo! email address. Please email me at
> > billgra...@gmail.com going forward.*
[jira] [Updated] (PIG-3122) Operators should not implicitly become reserved keywords
[ https://issues.apache.org/jira/browse/PIG-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-3122:
----------------------------

    Status: Open  (was: Patch Available)

Sorry Jonathan, but I think the checkin of the big decimal stuff totally broke this patch. It fails all over the place in QueryParser.g and I'm not sure I'm putting it back together correctly. Marking this as open pending a new patch being uploaded.

> Operators should not implicitly become reserved keywords
> --------------------------------------------------------
>
>                 Key: PIG-3122
>                 URL: https://issues.apache.org/jira/browse/PIG-3122
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>             Fix For: 0.12
>         Attachments: PIG-3122-0.patch
>
> As a byproduct of how ANTLR lexes things, whenever we introduce a new
> operator (RANK, CUBE, and any special keyword really) we are implicitly
> introducing a reserved word that can't be used for relations, columns, etc
> (unless given to us by the framework, as in the case of group).
> The following, for example, fails:
> {code}
> a = load 'foo' as (x:int);
> a = foreach a generate x as rank;
> {code}
> I'll include a patch to fix this essentially by whitelisting tokens. I
> currently just whitelist cube, rank, and group. We can add more as people
> want them? Can anyone think of reasonable ones they'd like to add?
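
The whitelisting idea in PIG-3122 can be pictured with a small sketch: a word that lexes as a keyword is still accepted as a name if it is on an explicit whitelist. The actual patch changes the ANTLR grammar (QueryParser.g); the Java class below is only a hypothetical model of the rule. IdentifierWhitelist and its keyword set are invented for this illustration, the keyword list is a small subset, and the case-insensitive matching is an assumption. Only the three whitelisted words come from the issue text.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the rule; the real change lives in the ANTLR grammar.
class IdentifierWhitelist {
    // Illustrative subset of words the lexer turns into keyword tokens.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "load", "foreach", "generate", "as", "cube", "rank", "group"));

    // Keywords that may still be used as names: the three listed in the issue.
    static final Set<String> WHITELIST = new HashSet<>(Arrays.asList(
            "cube", "rank", "group"));

    // Assumes 'word' is already a syntactically valid identifier token.
    static boolean usableAsIdentifier(String word) {
        String w = word.toLowerCase(Locale.ROOT); // assumption: keywords match case-insensitively
        return !KEYWORDS.contains(w) || WHITELIST.contains(w);
    }
}
```

Under this rule the alias in "generate x as rank" would be accepted, while a non-whitelisted keyword such as foreach would still be rejected as a name.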
[jira] [Commented] (PIG-3122) Operators should not implicitly become reserved keywords
[ https://issues.apache.org/jira/browse/PIG-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570440#comment-13570440 ]

Alan Gates commented on PIG-3122:
----------------------------------

Reviewing this.

> Operators should not implicitly become reserved keywords
> --------------------------------------------------------
>
>                 Key: PIG-3122
>                 URL: https://issues.apache.org/jira/browse/PIG-3122
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>             Fix For: 0.12
>         Attachments: PIG-3122-0.patch
>
> As a byproduct of how ANTLR lexes things, whenever we introduce a new
> operator (RANK, CUBE, and any special keyword really) we are implicitly
> introducing a reserved word that can't be used for relations, columns, etc
> (unless given to us by the framework, as in the case of group).
> The following, for example, fails:
> {code}
> a = load 'foo' as (x:int);
> a = foreach a generate x as rank;
> {code}
> I'll include a patch to fix this essentially by whitelisting tokens. I
> currently just whitelist cube, rank, and group. We can add more as people
> want them? Can anyone think of reasonable ones they'd like to add?
[jira] [Updated] (PIG-2661) Pig uses an extra job for loading data in Pigmix L9
[ https://issues.apache.org/jira/browse/PIG-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2661:
----------------------------

    Status: Open  (was: Patch Available)

Canceling patch as we still seem to be debating the best route forward for this.

> Pig uses an extra job for loading data in Pigmix L9
> ---------------------------------------------------
>
>                 Key: PIG-2661
>                 URL: https://issues.apache.org/jira/browse/PIG-2661
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>            Reporter: Jie Li
>            Assignee: Jie Li
>         Attachments: PIG-2661.0.patch, PIG-2661.1.patch, PIG-2661.2.patch, PIG-2661.3.patch, PIG-2661.4.patch, PIG-2661.5.patch, PIG-2661.6.patch, PIG-2661.7.patch, PIG-2661.8.patch, PIG-2661.plan.txt
>
> See
> https://issues.apache.org/jira/browse/PIG-200?focusedCommentId=13260155&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13260155
[jira] [Updated] (PIG-2834) MultiStorage requires unused constructor argument
[ https://issues.apache.org/jira/browse/PIG-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2834:
----------------------------

    Status: Open  (was: Patch Available)

These changes break backward compatibility for users of MultiStorage. I agree the parentPathStr is unused and not required, but you need to deprecate the existing constructors without removing them and add new ones that don't take parentPathStr. This allows current users a path forward without breaking their code.

> MultiStorage requires unused constructor argument
> -------------------------------------------------
>
>                 Key: PIG-2834
>                 URL: https://issues.apache.org/jira/browse/PIG-2834
>             Project: Pig
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 0.10.0, 0.11
>         Environment: Linux
>            Reporter: Danny Antonetti
>            Priority: Trivial
>              Labels: newbie
>             Fix For: 0.12
>         Attachments: MultiStorage.patch
>
> Each constructor in
> org.apache.pig.piggybank.storage.MultiStorage
> requires a constructor argument 'parentPathStr' that has no meaningful usage.
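
The deprecate-then-add approach Alan describes could look roughly like the sketch below. MultiStorageSketch is a hypothetical, heavily simplified class written for this illustration; it does not reproduce the real MultiStorage constructors or fields, and only the parameter name parentPathStr comes from the issue.

```java
// Hypothetical, simplified stand-in; not the real piggybank MultiStorage.
class MultiStorageSketch {
    private final String splitFieldIndex;

    // New constructor: takes only the argument that is actually used.
    MultiStorageSketch(String splitFieldIndex) {
        this.splitFieldIndex = splitFieldIndex;
    }

    /**
     * Old signature, kept so existing callers still compile.
     *
     * @deprecated parentPathStr is ignored; use {@link #MultiStorageSketch(String)}.
     */
    @Deprecated
    MultiStorageSketch(String parentPathStr, String splitFieldIndex) {
        this(splitFieldIndex); // delegate, dropping the unused argument
    }

    String getSplitFieldIndex() {
        return splitFieldIndex;
    }
}
```

Existing callers keep working and get a deprecation warning at compile time, while new code can use the shorter constructor.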
[jira] [Updated] (PIG-1942) script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
[ https://issues.apache.org/jira/browse/PIG-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1942:
    Status: Open (was: Patch Available)

Marking open pending response to Thejas' comments.

> script UDF (jython) should utilize the intended output schema to more
> directly convert Py objects to Pig objects
>
>
> Key: PIG-1942
> URL: https://issues.apache.org/jira/browse/PIG-1942
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.9.0, 0.8.0
> Reporter: Woody Anderson
> Assignee: Woody Anderson
> Priority: Minor
> Labels: python, schema, udf
> Attachments: 1942.patch, 1942_with_junit.patch
>
>
> From https://issues.apache.org/jira/browse/PIG-1824
> {code}
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>     return re.compile(regex).split(content)
> {code}
> does not work because split returns a list of strings. However, the output
> schema is known, and it would be quite simple to implicitly promote each
> string element to a tupled element.
> Also, a list/array/tuple/set etc. are all equally convertible to a bag, and
> list/array/tuple are equally convertible to a Tuple; this conversion can be
> done in a much less rigid way with the use of the schema.
> This allows much easier re-use of existing Python code and less memory
> overhead, since no intermediate objects are created just to re-convert types.
> I wrote the code to do this a while back as part of my version of the
> jython script framework; I'll isolate that and attach it.
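The promotion described in PIG-1942 can be sketched in plain Python. This is only an illustration under stated assumptions: `to_bag` is a hypothetical helper, not part of Pig, and the attached patches may well do the conversion differently (in Pig it would sit in the Jython bridge rather than in the UDF itself).

```python
import re


def to_bag(value):
    """Promote any iterable to a bag-like list of tuples.

    Hypothetical helper for illustration only; not part of Pig's API.
    Elements that are already a list or tuple become tuples; any other
    element (for example a string) is wrapped in a single-field tuple.
    """
    bag = []
    for item in value:
        if isinstance(item, (list, tuple)):
            bag.append(tuple(item))
        else:
            bag.append((item,))
    return bag


def strsplittobag(content, regex):
    # split() returns a plain list of strings; the declared schema
    # y:bag{t:tuple(word:chararray)} says each one belongs in a tuple.
    return to_bag(re.compile(regex).split(content))
```

With a helper like this driven by the declared schema, the UDF author could keep returning the plain list and leave the wrapping to the framework.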
[jira] [Updated] (PIG-2873) Converting bin/pig shell script to python
[ https://issues.apache.org/jira/browse/PIG-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2873:
    Status: Open (was: Patch Available)

Vikram,
The patch looks reasonable, but we need tests to ensure that pig.py behaves the same way as the current pig bash script. These could easily be written as a new driver in the e2e framework.

> Converting bin/pig shell script to python
> -
>
> Key: PIG-2873
> URL: https://issues.apache.org/jira/browse/PIG-2873
> Project: Pig
> Issue Type: Bug
> Components: tools
> Affects Versions: 0.10.0
> Reporter: Vikram Dixit K
> Assignee: Vikram Dixit K
> Priority: Minor
> Attachments: PIG-2873_2.patch, PIG-2873_3.patch, PIG-2873.patch
>
>
> Converted the shell script to Python in a platform-independent way. Should
> work with Python 2.7.x.
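The kind of equivalence check asked for in PIG-2873 might look roughly like the sketch below. It is not an e2e driver and does not use the e2e framework; `run` and `same_behavior` are hypothetical names, and it is written for Python 3 (`subprocess.run`), not the 2.7.x that the patch targets.

```python
import subprocess
import sys


def run(cmd):
    """Run a command and return (exit code, stdout bytes)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout


def same_behavior(cmd_a, cmd_b):
    """True when both commands exit with the same code and the same stdout."""
    return run(cmd_a) == run(cmd_b)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. A real check would pass
    # the bash launcher and the Python launcher with identical arguments.
    a = [sys.executable, "-c", "print('ok')"]
    b = [sys.executable, "-c", "print('ok')"]
    print(same_behavior(a, b))
```

A real driver would likely also compare stderr and the generated Java command line, since the launcher's main job is assembling the classpath and options.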
[jira] [Updated] (PIG-1237) Piggybank MutliStorage - specify field to write in output
[ https://issues.apache.org/jira/browse/PIG-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1237:
    Status: Open (was: Patch Available)

Returning patch to open pending response to Dmitriy's comments.

> Piggybank MutliStorage - specify field to write in output
> -
>
> Key: PIG-1237
> URL: https://issues.apache.org/jira/browse/PIG-1237
> Project: Pig
> Issue Type: Improvement
> Reporter: Gerrit Jansen van Vuuren
> Assignee: Gerrit Jansen van Vuuren
> Priority: Minor
> Attachments: PIG-1237.patch
>
>
> I've made a modification to the piggybank MultiStorage class that allows
> optionally specifying the index of the field in each tuple to write to output.
> This feature makes it possible to have records with metadata such as seqno,
> time of upload, etc., and then to combine files from these records into one,
> but without the metadata.
> e.g.
> 1: date type seq1 data
> 2: date type seq2 data
> then write output grouped by type and ordered by sequence:
> data
> data
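The example in PIG-1237 can be read as "group by one field, order by another, write only a third". A minimal Python sketch of that reading follows; `group_and_project` and its parameters are hypothetical names, the sample values are placeholders, and this is not how MultiStorage or the attached patch is implemented.

```python
from collections import defaultdict


def group_and_project(records, group_index, order_index, output_index):
    """Group records by one field, order each group by another field,
    and keep only the field at output_index for writing."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_index]].append(record)
    return {
        key: [row[output_index]
              for row in sorted(rows, key=lambda r: r[order_index])]
        for key, rows in groups.items()
    }


# Records laid out as (date, type, seq, data), as in the example above.
records = [
    ("date", "typeA", 2, "data2"),
    ("date", "typeA", 1, "data1"),
    ("date", "typeB", 1, "data3"),
]
output = group_and_project(records, group_index=1, order_index=2, output_index=3)
```

Here `output["typeA"]` holds only the data fields in sequence order, with the date, type and sequence metadata dropped.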