[jira] Updated: (HIVE-1600) need to sort hook input/output lists for test result determinism
[ https://issues.apache.org/jira/browse/HIVE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1600:
----------------------------
    Status: Resolved (was: Patch Available)
    Hadoop Flags: [Reviewed]
    Resolution: Fixed

Committed. Thanks John

> need to sort hook input/output lists for test result determinism
>
> Key: HIVE-1600
> URL: https://issues.apache.org/jira/browse/HIVE-1600
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Testing Infrastructure
> Affects Versions: 0.6.0
> Reporter: John Sichi
> Assignee: John Sichi
> Fix For: 0.7.0
>
> Attachments: HIVE-1600.1.patch, HIVE-1600.2.patch
>
> Begin forwarded message:
> From: Ning Zhang
> Date: August 26, 2010 2:47:26 PM PDT
> To: John Sichi
> Cc: "hive-dev@hadoop.apache.org"
> Subject: Re: failure in load_dyn_part1.q
> Yes I saw this error before but if it does not repro. So it's probably an
> ordering issue in POSTHOOK.
> On Aug 26, 2010, at 2:39 PM, John Sichi wrote:
> I'm seeing this failure due to a result diff when running tests on latest
> trunk:
> POSTHOOK: Input: defa...@srcpart@ds=2008-04-08/hr=12
> POSTHOOK: Input: defa...@srcpart@ds=2008-04-09/hr=11
> POSTHOOK: Input: defa...@srcpart@ds=2008-04-09/hr=12
> -POSTHOOK: Output: defa...@nzhang_part2@ds=2008-12-31/hr=11
> -POSTHOOK: Output: defa...@nzhang_part2@ds=2008-12-31/hr=12
> POSTHOOK: Output: defa...@nzhang_part1@ds=2008-04-08/hr=11
> POSTHOOK: Output: defa...@nzhang_part1@ds=2008-04-08/hr=12
> +POSTHOOK: Output: defa...@nzhang_part2@ds=2008-12-31/hr=11
> +POSTHOOK: Output: defa...@nzhang_part2@ds=2008-12-31/hr=12
> Did something change recently? Or are we missing a Java-level sort on the
> input/output list for determinism?
> JVS

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
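Editor's illustration: the diff quoted in HIVE-1600 shows the same four Output lines appearing in a different order between runs, which is what one would expect if the hook prints entries from an unordered collection. The sketch below shows the general idea of sorting before printing. It is written in Python rather than Hive's Java, it is not the committed patch, and the function name `format_hook_lines` and the `db@part...` entity names are hypothetical.

```python
# Hypothetical sketch: none of these names come from the Hive code base.

def format_hook_lines(prefix, kind, entities):
    """Return hook log lines for `entities` in a stable, sorted order."""
    # sorted() accepts any iterable, so an unordered set is fine as input.
    return [f"{prefix}: {kind}: {name}" for name in sorted(entities)]


# Outputs collected in a set: iteration order is not guaranteed.
outputs = {
    "db@part2@ds=2008-12-31/hr=11",
    "db@part1@ds=2008-04-08/hr=12",
    "db@part2@ds=2008-12-31/hr=12",
    "db@part1@ds=2008-04-08/hr=11",
}

for line in format_hook_lines("POSTHOOK", "Output", outputs):
    print(line)
```

Sorting at print time keeps the expected test output stable no matter how the underlying collection happens to iterate.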
[jira] Updated: (HIVE-471) A UDF for simple reflection
[ https://issues.apache.org/jira/browse/HIVE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-471:
--------------------------------
    Status: Patch Available (was: Open)

> A UDF for simple reflection
> ---
>
> Key: HIVE-471
> URL: https://issues.apache.org/jira/browse/HIVE-471
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Affects Versions: 0.6.0
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
> Priority: Minor
> Fix For: 0.7.0
>
> Attachments: hive-471-gen.diff, HIVE-471.1.patch, HIVE-471.2.patch,
> HIVE-471.3.patch, HIVE-471.4.patch, HIVE-471.5.patch, HIVE-471.6.patch.txt,
> hive-471.diff
>
> There are many methods in java that are static and have no arguments or can
> be invoked with one simple parameter. More complicated functions will require
> a UDF but one generic one can work as a poor-mans UDF.
> {noformat}
> SELECT reflect("java.lang.String", "valueOf", 1), reflect("java.lang.String",
> "isEmpty")
> FROM src LIMIT 1;
> {noformat}
[jira] Updated: (HIVE-471) A UDF for simple reflection
[ https://issues.apache.org/jira/browse/HIVE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-471:
--------------------------------
    Attachment: HIVE-471.6.patch.txt
[jira] Updated: (HIVE-1130) Create argmin and argmax
[ https://issues.apache.org/jira/browse/HIVE-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1130:
-----------------------------
    Status: Patch Available (was: Open)
    Affects Version/s: 0.7.0
    Fix Version/s: 0.7.0

Please review code.

> Create argmin and argmax
>
> Key: HIVE-1130
> URL: https://issues.apache.org/jira/browse/HIVE-1130
> Project: Hadoop Hive
> Issue Type: Improvement
> Affects Versions: 0.7.0
> Reporter: Zheng Shao
> Assignee: Pierre Huyn
> Fix For: 0.7.0
>
> Attachments: HIVE-1130.1.patch
>
> With HIVE-1128, users can already do what argmax and argmin does.
> But it will be helpful if we provide these functions explicitly so people
> from maths/stats background can use it more easily.
[jira] Updated: (HIVE-1130) Create argmin and argmax
[ https://issues.apache.org/jira/browse/HIVE-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Huyn updated HIVE-1130:
-----------------------------
    Attachment: HIVE-1130.1.patch

Initial release of generic user-defined aggregate function: ARGMAX (X, Y) which takes a set of pairs (X,Y) and returns a Y that maximizes X. X must be of a comparable type. Any pair (NULL,Y) is ignored. If the function is applied to an empty set, NULL will be returned. If more than one Y value maximizes X, one of them will be returned arbitrarily.

The current implementation tested out fine with the rebuilt Hive in my working copy of the SVN tree. However, it fails with ANT TEST and I could not figure out why.

The code is ready for review. Also, any help to figure out why ANT TEST fails is appreciated.

> Create argmin and argmax
>
> Key: HIVE-1130
> URL: https://issues.apache.org/jira/browse/HIVE-1130
> Project: Hadoop Hive
> Issue Type: Improvement
> Reporter: Zheng Shao
> Assignee: Pierre Huyn
> Attachments: HIVE-1130.1.patch
>
> With HIVE-1128, users can already do what argmax and argmin does.
> But it will be helpful if we provide these functions explicitly so people
> from maths/stats background can use it more easily.
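Editor's illustration: the ARGMAX semantics described in the HIVE-1130 comment can be sketched in a few lines. This is a plain Python model of the stated rules only (skip pairs whose X is NULL, return NULL for an empty input, return any one Y on ties). It is not the attached patch and says nothing about how the Hive aggregate is structured internally; Python's `None` stands in for SQL NULL, and the function name `argmax` is chosen here for illustration.

```python
from typing import Any, Iterable, Optional, Tuple


def argmax(pairs: Iterable[Tuple[Optional[Any], Any]]) -> Optional[Any]:
    """Return a Y whose X is maximal; None stands in for SQL NULL."""
    best_x = None
    best_y = None
    found = False
    for x, y in pairs:
        if x is None:
            # Pairs of the form (NULL, Y) are ignored.
            continue
        if not found or x > best_x:
            best_x, best_y, found = x, y, True
    # No usable pair (empty input or only NULL X values) yields NULL.
    return best_y if found else None


print(argmax([(1, "a"), (3, "b"), (None, "c"), (2, "d")]))
```

On ties this sketch happens to keep the first maximal pair it sees, which is one valid reading of "returned arbitrarily"; callers should not rely on which Y comes back.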
[jira] Commented: (HIVE-1600) need to sort hook input/output lists for test result determinism
[ https://issues.apache.org/jira/browse/HIVE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903841#action_12903841 ]

Namit Jain commented on HIVE-1600:
----------------------------------

+1
Meeting notes from the August Hive Contributors Meeting
August 8th, 2010

- Yongqiang He gave a presentation about his work on index support in Hive.
  - Slides are available here: http://files.meetup.com/1658206/Hive%20Index.pptx
- John Sichi talked about his work on filter-pushdown optimizations. This is applicable to the HBase storage handler and the new index infrastructure.
- Pradeep Kamath gave an update on progress with Howl.
  - The Howl source code is available on GitHub here: http://github.com/yahoo/howl
  - Starting to work on security for Howl. For the first iteration the plan is to base it on DFS permissions.
- General agreement that we should aim to desupport pre-0.20.0 versions of Hadoop in Hive 0.7.0. This will allow us to remove the shim layer and will make it easier to transition to the new mapreduce APIs. But we also want to get a better idea of how many users are stuck on pre-0.20 versions of Hadoop.
- Remove Thrift generated code from repository.
  - Pro: reduce noise in diffs during reviews.
  - Con: requires developers to install Thrift compiler.
- Discussed moving the documentation from the wiki to version control.
  - Probably not practical to maintain the trunk version of the docs on the wiki and roll over to version control at release time, so trunk version of docs will be maintained in vcs.
  - It was agreed that feature patches should include updates to the docs, but it is also acceptable to file a doc ticket if there is time pressure to commit.
  - Will maintain an errata page on the wiki for collecting updates/corrections from users. These notes will be rolled into the documentation in vcs on a monthly basis.
- The next meeting will be held in September at Cloudera's office in Palo Alto.