Re: Our release process
Hi Julien, You are making most of the points that I did on this thread (CI for e2e, not requiring a clean e2e run prior to every commit to a release branch). The only point on which there is no clear agreement is the definition of a bug that can be included in a previously released branch. I am fine with a case by case inclusion. Hi Olga, Are you fine with Julien's proposal as it stands - bugs that are included will be determined at the time of inclusion instead of doing it now? Santhosh From: Julien Le Dem To: dev@pig.apache.org; Santhosh M S Cc: "billgra...@gmail.com" Sent: Friday, November 30, 2012 5:37 PM Subject: Re: Our release process Proposed criteria: - it makes the tests fail. Targets: test-commit + test + e2e tests - a critical bug is reported in a short time frame (definition of critical not needed as it is rare and can be decided on a case by case basis) That raises another question: what are the existing CI servers running the tests? - the Apache CI runs test-commit and test (is it more stable now?) and not e2e. It would be great if it did. - we have a Jenkins build at Twitter where we run test-commit and test; we could not run e2e easily in our environment. - I understand there's a Yahoo/Hortonworks build (test-commit + test + e2e ???) Whenever those builds fail we should open or reopen JIRAs and fix them. The time it takes to run the full test suite makes it impractical to run on a desktop/laptop. For the release Pig-0.11.0 we need to get this list of JIRAs down to 0 and publish the jar. https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+PIG+AND+fixVersion+%3D+%220.11%22+AND+resolution+%3D+Unresolved+ORDER+BY+updated+DESC%2C+due+ASC%2C+priority+DESC Julien On Thu, Nov 29, 2012 at 11:16 PM, Santhosh M S wrote: > Looks like everyone is interested in having frequent releases - I don't see > anyone disagreeing with that. > > Regarding "If a patch makes the release branch unstable, we revert it" - what are the criteria? 
If we can't decide on the criteria on this thread (already pretty long) then let's get the release trains going. We can revisit the criteria for inclusion of bug fixes when that happens. > > Santhosh > > > > From: Julien Le Dem > To: dev@pig.apache.org; Santhosh M S > Cc: "billgra...@gmail.com" > Sent: Thursday, November 29, 2012 9:45 AM > Subject: Re: Our release process > > The release branch receives only bug fixes. Patch level releases (3rd > version number) are issued out of the release branch and introduce > only bug fixes and no new features. > Deciding whether a patch is applied to the release branch is based on > preserving stability (as Bill said). If a patch makes the release > branch unstable, we revert it. > New features are added to trunk where new major and minor releases will > happen. > If we need a new feature out then we make a new minor release. > Doing frequent releases is the industry standard and will resolve > conflicts around what should go in a release branch. > > Making a new release is currently painful *because* we wait so long in > between two releases. Let's fix that. > > Julien > > On Wed, Nov 28, 2012 at 10:09 PM, Santhosh M S > wrote: >> Since releasing a major version once a month is aggressive and we have not >> released on a quarterly basis, we should allow commits to a released branch >> to facilitate dot releases. >> >> If we are allowing commits to a released branch, the criteria for inclusion >> can be created anew or we use the industry standards for severity (or >> priority). It could be painful for a few folks but I don't see better >> alternatives. >> >> Regarding reverting commits based on e2e tests breaking: >> 1. Who is running the tests? >> 2. How often are they run? >> If we have nightly e2e runs then it's easier to catch these errors early. If >> not, the barrier for inclusion is pretty high and time consuming, making it harder to develop. 
>> >> Santhosh >> >> >> >> From: Bill Graham >> To: dev@pig.apache.org >> Sent: Wednesday, November 28, 2012 11:39 AM >> Subject: Re: Our release process >> >> I agree releasing often is ideal, but releasing major versions once a month >> would be a bit aggressive. >> >> +1 to Olga's initial definition of how Yahoo! determines what goes into a >> released branch. Basically: is something broken without a workaround, or is >> there potential silent data loss? Trying to get a more granular definition >> than that (e.g. P1, P2, severity, etc) will be painful. The reality in that >> case is that whoever is blocked by the bug will consider it a P1. >> >> Fixes need to be relatively low-risk though to keep stability, but this is >> also subjective. For this I'm in favor of relying on developer and reviewer >> judgement to make that call and I'm +1 to Alan's proposal of rolling back >> patches that break the e2e
[jira] [Updated] (PIG-2907) Publish pig 0.23 jars to maven
[ https://issues.apache.org/jira/browse/PIG-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-2907: Resolution: Fixed Fix Version/s: 0.12 Status: Resolved (was: Patch Available) Thanks Julien. Committed to 0.11 and trunk. > Publish pig 0.23 jars to maven > -- > > Key: PIG-2907 > URL: https://issues.apache.org/jira/browse/PIG-2907 > Project: Pig > Issue Type: New Feature >Reporter: Francis Liu >Assignee: Rohini Palaniswamy > Fix For: 0.11, 0.12 > > Attachments: PIG-2907-1.patch, PIG-2907-2.patch, PIG-2907.patch > > > HCatalog would like our unit tests to be able to run against 0.23; part of > that would require pulling the pig 0.23 dependency from maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507825#comment-13507825 ] Rohini Palaniswamy commented on PIG-3014: - Ah. I had forgotten about that question. Agree with Julien. > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
Re: Our release process
Proposed criteria: - it makes the tests fail. Targets: test-commit + test + e2e tests - a critical bug is reported in a short time frame (definition of critical not needed as it is rare and can be decided on a case by case basis) That raises another question: what are the existing CI servers running the tests? - the Apache CI runs test-commit and test (is it more stable now?) and not e2e. It would be great if it did. - we have a Jenkins build at Twitter where we run test-commit and test; we could not run e2e easily in our environment. - I understand there's a Yahoo/Hortonworks build (test-commit + test + e2e ???) Whenever those builds fail we should open or reopen JIRAs and fix them. The time it takes to run the full test suite makes it impractical to run on a desktop/laptop. For the release Pig-0.11.0 we need to get this list of JIRAs down to 0 and publish the jar. https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+PIG+AND+fixVersion+%3D+%220.11%22+AND+resolution+%3D+Unresolved+ORDER+BY+updated+DESC%2C+due+ASC%2C+priority+DESC Julien On Thu, Nov 29, 2012 at 11:16 PM, Santhosh M S wrote: > Looks like everyone is interested in having frequent releases - I don't see > anyone disagreeing with that. > > Regarding "If a patch makes the release branch unstable, we revert it" - what > are the criteria? If we can't decide on the criteria on this thread (already > pretty long) then let's get the release trains going. We can revisit the > criteria for inclusion of bug fixes when that happens. > > Santhosh > > > > From: Julien Le Dem > To: dev@pig.apache.org; Santhosh M S > Cc: "billgra...@gmail.com" > Sent: Thursday, November 29, 2012 9:45 AM > Subject: Re: Our release process > > The release branch receives only bug fixes. Patch level releases (3rd > version number) are issued out of the release branch and introduce > only bug fixes and no new features. 
> Deciding whether a patch is applied to the release branch is based on > preserving stability (as Bill said). If a patch makes the release > branch unstable, we revert it. > New features are added to trunk where new major and minor releases will > happen. > If we need a new feature out then we make a new minor release. > Doing frequent releases is the industry standard and will resolve > conflicts around what should go in a release branch. > > Making a new release is currently painful *because* we wait so long in > between two releases. Let's fix that. > > Julien > > On Wed, Nov 28, 2012 at 10:09 PM, Santhosh M S > wrote: >> Since releasing a major version once a month is aggressive and we have not >> released on a quarterly basis, we should allow commits to a released branch >> to facilitate dot releases. >> >> If we are allowing commits to a released branch, the criteria for inclusion >> can be created anew or we use the industry standards for severity (or >> priority). It could be painful for a few folks but I don't see better >> alternatives. >> >> Regarding reverting commits based on e2e tests breaking: >> 1. Who is running the tests? >> 2. How often are they run? >> If we have nightly e2e runs then it's easier to catch these errors early. If >> not, the barrier for inclusion is pretty high and time consuming, making it >> harder to develop. >> >> Santhosh >> >> >> >> From: Bill Graham >> To: dev@pig.apache.org >> Sent: Wednesday, November 28, 2012 11:39 AM >> Subject: Re: Our release process >> >> I agree releasing often is ideal, but releasing major versions once a month >> would be a bit aggressive. >> >> +1 to Olga's initial definition of how Yahoo! determines what goes into a >> released branch. Basically: is something broken without a workaround, or is >> there potential silent data loss? Trying to get a more granular definition >> than that (e.g. P1, P2, severity, etc) will be painful. 
The reality in that >> case is that whoever is blocked by the bug will consider it a P1. >> >> Fixes need to be relatively low-risk though to keep stability, but this is >> also subjective. For this I'm in favor of relying on developer and reviewer >> judgement to make that call and I'm +1 to Alan's proposal of rolling back >> patches that break the e2e tests or anything else. >> >> I think our policy should avoid time-based consideration on how many >> quarters away are we from the next major release since that's also >> impossible to quantify. Plus, if the answer to the question of whether we're >> more than 1-2 quarters from the next release is "yes" then we should be >> fixing that release problem. >> >> >> On Wed, Nov 28, 2012 at 10:22 AM, Julien Le Dem wrote: >> >>> I would really like to see us doing frequent releases (at least once >>> per quarter if not once a month). >>> I think the whole notion of priority or being a "blocker" is subjective. >>> Releasing infrequently pressures us to push more changes than we would >>>
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (32 issues)
Subscriber: pigdaily

Key         Summary
PIG-3069    Native Windows Compatibility for Pig E2E Tests and Harness
            https://issues.apache.org/jira/browse/PIG-3069
PIG-3067    HBaseStorage should be split up to become more managable
            https://issues.apache.org/jira/browse/PIG-3067
PIG-3066    Fix TestPigRunner in trunk
            https://issues.apache.org/jira/browse/PIG-3066
PIG-3058    Upgrade junit to at least 4.8
            https://issues.apache.org/jira/browse/PIG-3058
PIG-3057    make readField protected to be able to override it if we extend PigStorage
            https://issues.apache.org/jira/browse/PIG-3057
PIG-3051    java.lang.IndexOutOfBoundsException failure with LimitOptimizer + ColumnPruning
            https://issues.apache.org/jira/browse/PIG-3051
PIG-3033    test-patch failed with javadoc warnings
            https://issues.apache.org/jira/browse/PIG-3033
PIG-3029    TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution
            https://issues.apache.org/jira/browse/PIG-3029
PIG-3028    testGrunt dev test needs some command filters to run correctly without cygwin
            https://issues.apache.org/jira/browse/PIG-3028
PIG-3027    pigTest unit test needs a newline filter for comparisons of golden multi-line
            https://issues.apache.org/jira/browse/PIG-3027
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-3010    Allow UDF's to flatten themselves
            https://issues.apache.org/jira/browse/PIG-3010
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2957    TetsScriptUDF fail due to volume prefix in jar
            https://issues.apache.org/jira/browse/PIG-2957
PIG-2956    Invalid cache specification for some streaming statement
            https://issues.apache.org/jira/browse/PIG-2956
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2907    Publish pig 0.23 jars to maven
            https://issues.apache.org/jira/browse/PIG-2907
PIG-2873    Converting bin/pig shell script to python
            https://issues.apache.org/jira/browse/PIG-2873
PIG-2834    MultiStorage requires unused constructor argument
            https://issues.apache.org/jira/browse/PIG-2834
PIG-2824    Pushing checking number of fields into LoadFunc
            https://issues.apache.org/jira/browse/PIG-2824
PIG-2661    Pig uses an extra job for loading data in Pigmix L9
            https://issues.apache.org/jira/browse/PIG-2661
PIG-2614    AvroStorage crashes on LOADING a single bad error
            https://issues.apache.org/jira/browse/PIG-2614
PIG-2507    Semicolon in paramenters for UDF results in parsing error
            https://issues.apache.org/jira/browse/PIG-2507
PIG-2433    Jython import module not working if module path is in classpath
            https://issues.apache.org/jira/browse/PIG-2433
PIG-2417    Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.
            https://issues.apache.org/jira/browse/PIG-2417
PIG-2362    Rework Ant build.xml to use macrodef instead of antcall
            https://issues.apache.org/jira/browse/PIG-2362
PIG-2312    NPE when relation and column share the same name and used in Nested Foreach
            https://issues.apache.org/jira/browse/PIG-2312
PIG-1942    script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
            https://issues.apache.org/jira/browse/PIG-1942
PIG-1237    Piggybank MutliStorage - specify field to write in output
            https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at: https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Updated] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3014: --- Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk. > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Commented] (PIG-2907) Publish pig 0.23 jars to maven
[ https://issues.apache.org/jira/browse/PIG-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507752#comment-13507752 ] Julien Le Dem commented on PIG-2907: +1 > Publish pig 0.23 jars to maven > -- > > Key: PIG-2907 > URL: https://issues.apache.org/jira/browse/PIG-2907 > Project: Pig > Issue Type: New Feature >Reporter: Francis Liu >Assignee: Rohini Palaniswamy > Fix For: 0.11 > > Attachments: PIG-2907-1.patch, PIG-2907-2.patch, PIG-2907.patch > > > HCatalog would like our unit tests to be able to run against 0.23; part of > that would require pulling the pig 0.23 dependency from maven.
[jira] [Commented] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507750#comment-13507750 ] Julien Le Dem commented on PIG-3014: I think it's better to have one test class per UDF. Usually tests are grouped per class or functional group of classes. All builtin UDFs do not make a functional group as they have various different purposes. It just makes a huge Test class which is undesirable. > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Commented] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507731#comment-13507731 ] Cheolsoo Park commented on PIG-3014: Thanks Rohini. In fact, I asked that question on the dev mailing list a while ago: http://search-hadoop.com/m/OVyoR1Ktpcy/Adding+new+test+cases+to+TestBuiltin.java&subj=Adding+new+test+cases+to+TestBuiltin+java Julien said that each built-in UDF should have its own test suite, so I followed it in PIG-2881. I guess that the same applies to CurrentTime(). Please anyone correct me if I am wrong. > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Commented] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507725#comment-13507725 ] Rohini Palaniswamy commented on PIG-3014: - bq. Since the test case is not valid, I simply removed it. +1. TestCurrentTime covers CurrentTime udf adequately. Just an observation though. All builtin udf tests are in TestBuiltin, but CurrentTime alone has a separate test class with just one test. Should we move that to TestBuiltin? > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Updated] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3014: --- Attachment: PIG-3014-3.patch Attached a patch that fixes {{TestBuiltin}}. The CurrentTime() must get called only in the back-end because it reads the value of "pig.job.submitted.timestamp" out of JobConf. But the unit test case was calling it in the front-end, resulting in a NullPointerException. Since the test case is not valid, I simply removed it. Thanks! > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
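The back-end-only behaviour described in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the class CurrentTimeSketch, its exec() signature, and the plain Map standing in for JobConf are hypothetical; only the property name "pig.job.submitted.timestamp" comes from the thread. The sketch also swaps the NullPointerException seen in TestBuiltin for an explicit error, which the real UDF may or may not do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the UDF discussed above; NOT org.apache.pig.builtin.CurrentTime.
class CurrentTimeSketch {
    // Property name taken from the JIRA comment.
    static final String TIMESTAMP_KEY = "pig.job.submitted.timestamp";

    // Assumption: a Map models the job configuration; null models the front-end,
    // where no job configuration is available.
    private final Map<String, String> jobConf;

    CurrentTimeSketch(Map<String, String> jobConf) {
        this.jobConf = jobConf;
    }

    // Returns the job submission time, so every invocation in one job sees the same value.
    long exec() {
        if (jobConf == null || jobConf.get(TIMESTAMP_KEY) == null) {
            // Explicit failure instead of the NullPointerException reported in this issue.
            throw new IllegalStateException(
                TIMESTAMP_KEY + " is not set; exec() is only meaningful in the back-end");
        }
        return Long.parseLong(jobConf.get(TIMESTAMP_KEY));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TIMESTAMP_KEY, "1354300000000"); // arbitrary example value
        CurrentTimeSketch udf = new CurrentTimeSketch(conf);
        // Two calls return the same timestamp, unlike a per-invocation clock read.
        System.out.println(udf.exec() == udf.exec());
    }
}
```

Reading the timestamp from the job configuration rather than the system clock is what makes repeated calls agree, and it is also why a unit test that calls exec() without a configured job fails.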
[jira] [Updated] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3014: --- Status: Patch Available (was: Reopened) > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch, > PIG-3014-3.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Commented] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507654#comment-13507654 ] Cheolsoo Park commented on PIG-3014: Hi Julien, Sorry for that. It is failing because {{TestBuiltin}} does not set {{pig.job.submitted.timestamp}}. I will get it fixed now. > CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Reopened] (PIG-3014) CurrentTime() UDF has undesirable characteristics
[ https://issues.apache.org/jira/browse/PIG-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem reopened PIG-3014: I see a failing test:
org.apache.pig.test.TestBuiltin.testConversionBetweenDateTimeAndString
java.lang.NullPointerException
    at org.apache.pig.builtin.CurrentTime.exec(CurrentTime.java:41)
    at org.apache.pig.test.TestBuiltin.testConversionBetweenDateTimeAndString(TestBuiltin.java:450)
> CurrentTime() UDF has undesirable characteristics > - > > Key: PIG-3014 > URL: https://issues.apache.org/jira/browse/PIG-3014 > Project: Pig > Issue Type: Bug >Reporter: Jonathan Coveney >Assignee: Jonathan Coveney > Fix For: 0.12 > > Attachments: PIG-3014-0.patch, PIG-3014-1.patch, PIG-3014-2.patch > > > As part of the explanation of the new DateTime datatype I noticed that we had > added a CurrentTime() UDF. The issue with this UDF is that it returns the > current time _of every exec invocation_, which can lead to confusing results. > In PIG-1431 I proposed a way such that every instance of the same NOW() will > return the same time, which I think is better. Would enjoy thoughts.
[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error
[ https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-2614: --- Status: Patch Available (was: Open) > AvroStorage crashes on LOADING a single bad error > - > > Key: PIG-2614 > URL: https://issues.apache.org/jira/browse/PIG-2614 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.10.0, 0.11 >Reporter: Russell Jurney >Assignee: Jonathan Coveney > Labels: avro, avrostorage, bad, book, cutting, doug, for, my, > pig, sadism > Fix For: 0.11, 0.10.1 > > Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, > test_avro_files.tar.gz > > > AvroStorage dies when a single bad record exists, such as one with missing > fields. This is very bad on 'big data,' where bad records are inevitable. > See discussion at > http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss > for more theory.
[jira] [Commented] (PIG-2614) AvroStorage crashes on LOADING a single bad error
[ https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507564#comment-13507564 ] Cheolsoo Park commented on PIG-2614: In addition to applying the patch, the following commands should also be executed to run the unit test cases:
{code}
wget https://issues.apache.org/jira/secure/attachment/1246/test_avro_files.tar.gz
tar -xf test_avro_files.tar.gz
svn rm contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_corrupted_file.avro
svn add contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_corrupted_file
svn rm contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile.avro
svn add contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile2.avro
svn add contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile3.avro
{code}
Thanks! > AvroStorage crashes on LOADING a single bad error > - > > Key: PIG-2614 > URL: https://issues.apache.org/jira/browse/PIG-2614 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.10.0, 0.11 >Reporter: Russell Jurney >Assignee: Jonathan Coveney > Labels: avro, avrostorage, bad, book, cutting, doug, for, my, > pig, sadism > Fix For: 0.11, 0.10.1 > > Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, > test_avro_files.tar.gz > > > AvroStorage dies when a single bad record exists, such as one with missing > fields. This is very bad on 'big data,' where bad records are inevitable. > See discussion at > http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss > for more theory.
[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error
[ https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-2614: --- Attachment: test_avro_files.tar.gz PIG-2614_2.patch Hi all, I rebased the patch to trunk. Hopefully, this will make things clearer: - Removed PIG-2551 code since it's already committed to trunk. - Replaced the {{ignore_bad_file}} option that was committed in PIG-2909 with the {{bad.record.threshold}} and {{bad.record.min}} properties. - Added unit test cases {{testCorruptedFile1,2,3}}. @Joe, I am not sure if I fully understand your question. Please correct me if I am wrong. You're right that {{InputErrorTracker}} can be used by any LoadFunc. What storages need to do is create an {{InputErrorTracker}} and increment its counters. Do you have a better suggestion? Thanks! > AvroStorage crashes on LOADING a single bad error > - > > Key: PIG-2614 > URL: https://issues.apache.org/jira/browse/PIG-2614 > Project: Pig > Issue Type: Bug > Components: piggybank >Affects Versions: 0.10.0, 0.11 >Reporter: Russell Jurney >Assignee: Jonathan Coveney > Labels: avro, avrostorage, bad, book, cutting, doug, for, my, > pig, sadism > Fix For: 0.11, 0.10.1 > > Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, > test_avro_files.tar.gz > > > AvroStorage dies when a single bad record exists, such as one with missing > fields. This is very bad on 'big data,' where bad records are inevitable. > See discussion at > http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss > for more theory.
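For readers unfamiliar with the "create a tracker and bump counters" pattern mentioned above, here is a rough sketch of how a load function might use the two properties. It is a guess at the semantics, not the patch's implementation: the class BadRecordTrackerSketch, its method names, and the rule "tolerate errors until at least minErrors have occurred and the error rate exceeds threshold" are all assumptions; the thread only names {{InputErrorTracker}}, {{bad.record.threshold}} and {{bad.record.min}}.

```java
// Hypothetical sketch; the InputErrorTracker in the patch may behave differently.
class BadRecordTrackerSketch {
    private final double threshold; // assumed meaning of bad.record.threshold: max tolerated error rate
    private final long minErrors;   // assumed meaning of bad.record.min: errors tolerated before the rate is checked
    private long records;
    private long errors;

    BadRecordTrackerSketch(double threshold, long minErrors) {
        this.threshold = threshold;
        this.minErrors = minErrors;
    }

    // Call once per record read, whether it parsed or not.
    void incRecords() {
        records++;
    }

    // Call when a record fails to parse; throws once too many records are bad.
    void incErrors(Throwable cause) {
        errors++;
        double errorRate = (double) errors / records;
        if (errors >= minErrors && errorRate > threshold) {
            throw new RuntimeException(
                "error rate " + errorRate + " exceeds threshold " + threshold, cause);
        }
    }

    long errorCount() {
        return errors;
    }

    public static void main(String[] args) {
        // Tolerate up to 1% bad records, and never fail on fewer than 2 errors.
        BadRecordTrackerSketch tracker = new BadRecordTrackerSketch(0.01, 2);
        for (int i = 0; i < 1000; i++) {
            tracker.incRecords();
            if (i == 500) {
                // A single corrupt record stays below minErrors, so loading continues.
                tracker.incErrors(new RuntimeException("corrupt record"));
            }
        }
        System.out.println("errors tolerated: " + tracker.errorCount());
    }
}
```

Under these assumed semantics a single bad record no longer kills the job, while a file that is mostly corrupt still fails fast once both limits are crossed.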
[jira] [Updated] (PIG-2907) Publish pig 0.23 jars to maven
[ https://issues.apache.org/jira/browse/PIG-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-2907:
---
    Attachment: PIG-2907-2.patch

Updated patch with a comment explaining the move, based on Julien's comments in reviewboard.

Julien,
   Need a +1 from you for the h23 to h2 change.

> Publish pig 0.23 jars to maven
> ------------------------------
>
>                 Key: PIG-2907
>                 URL: https://issues.apache.org/jira/browse/PIG-2907
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Francis Liu
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.11
>
>         Attachments: PIG-2907-1.patch, PIG-2907-2.patch, PIG-2907.patch
>
>
> HCatalog would like to get our unit tests be able to run against 0.23 part of
> it would require pulling the pig 0.23 dependency from maven.
[jira] [Updated] (PIG-2907) Publish pig 0.23 jars to maven
[ https://issues.apache.org/jira/browse/PIG-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-2907:
---
    Attachment: PIG-2907-1.patch

Updated patch changing classifier from h23 to h2 based on Alejandro's review comment

> Publish pig 0.23 jars to maven
> ------------------------------
>
>                 Key: PIG-2907
>                 URL: https://issues.apache.org/jira/browse/PIG-2907
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Francis Liu
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.11
>
>         Attachments: PIG-2907-1.patch, PIG-2907.patch
>
>
> HCatalog would like to get our unit tests be able to run against 0.23 part of
> it would require pulling the pig 0.23 dependency from maven.
Re: Review Request: [PIG-2907] Publish pig 0.23 jars to maven
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8157/
---

(Updated Nov. 30, 2012, 3:49 p.m.)

Review request for pig.

Changes
---
Changed classifier from h23 to h2 based on Alejandro's review comment

Description
---
Publishing h23 compiled pig jar with classifier h23.

This addresses bug PIG-2907.
    https://issues.apache.org/jira/browse/PIG-2907

Diffs (updated)
-
  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1415689

Diff: https://reviews.apache.org/r/8157/diff/

Testing
---
Tested with a local nexus repository using the command
    ant clean mvn-deploy -Dasfrepo=http://localhost:8089/nexus

Thanks,
Rohini Palaniswamy
[jira] [Commented] (PIG-3058) Upgrade junit to at least 4.8
[ https://issues.apache.org/jira/browse/PIG-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507365#comment-13507365 ]

Cheolsoo Park commented on PIG-3058:
---
+1.

I ran the full unit test suite with hadoop 20/23 and don't see any additional test failures.

> Upgrade junit to at least 4.8
> -----------------------------
>
>                 Key: PIG-3058
>                 URL: https://issues.apache.org/jira/browse/PIG-3058
>             Project: Pig
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.11
>            Reporter: fang fang chen
>            Assignee: fang fang chen
>             Fix For: 0.11, 0.12
>
>         Attachments: PIG-3058.patch
>
>
> Pig needs to upgrade junit version to at least 4.8. Otherwise, one gets
> following warnings.
> [javadoc]
> org/apache/hadoop/hbase/mapreduce/TestWALPlayer.class(org/apache/hadoop/hbase/mapreduce:TestWALPlayer.class):
> warning: Cannot find annotation method 'value()' in type
> 'org.junit.experimental.categories.Category': class file for
> org.junit.experimental.categories.Category not found