[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Ankur (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831337#action_12831337 ] Ankur commented on PIG-1229: Aaron, Thanks for the suggestions. I'll have an updated patch coming

[jira] Assigned: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Ankur (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur reassigned PIG-1229: -- Assignee: Ankur > allow pig to write output into a JDBC db > > >

[jira] Updated: (PIG-834) incorrect plan when algebraic functions are nested

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-834: - Status: Open (was: Patch Available) > incorrect plan when algebraic functions are nested >

[jira] Updated: (PIG-834) incorrect plan when algebraic functions are nested

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-834: - Status: Patch Available (was: Open) Trying to get hudson going on this. > incorrect plan when alge

[jira] Commented: (PIG-1215) Make Hadoop jobId more prominent in the client log

2010-02-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831316#action_12831316 ] Hadoop QA commented on PIG-1215: -1 overall. Here are the results of testing the latest atta

[jira] Commented: (PIG-1215) Make Hadoop jobId more prominent in the client log

2010-02-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831309#action_12831309 ] Hadoop QA commented on PIG-1215: -1 overall. Here are the results of testing the latest atta

[jira] Updated: (PIG-1231) Default DataBagIterator.hasNext() should be idempotent in all cases

2010-02-08 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1231: Description: DefaultDataBagIterator.hasNext() is not repeatable when the conditions below are met: 1. There is n

[jira] Commented: (PIG-1131) Pig simple join does not work when it contains empty lines

2010-02-08 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831251#action_12831251 ] Viraj Bhat commented on PIG-1131: - Ashutosh I was able to recreate a similar problem using th

[jira] Commented: (PIG-1131) Pig simple join does not work when it contains empty lines

2010-02-08 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831248#action_12831248 ] Viraj Bhat commented on PIG-1131: - Olga I marked it as critical since we mention that Pig can

[jira] Updated: (PIG-1230) Streaming input in POJoinPackage should use nonspillable bag to collect tuples

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1230: -- Attachment: pig-1230_1.patch Fixed findbugs warnings. Result of test-patch: {code} [exec]

[jira] Commented: (PIG-259) allow store to overwrite existing directroy

2010-02-08 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831237#action_12831237 ] Dmitriy V. Ryaboy commented on PIG-259: --- Yeah I think it makes more sense on that level.

[jira] Updated: (PIG-1231) DataBagIterator.hasNext() should be idempotent

2010-02-08 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1231: Attachment: PIG-1231-1.patch DefaultDataBagIterator is the only DataBag that has this problem. Other databag hand

[jira] Updated: (PIG-1231) DataBagIterator.hasNext() should be idempotent

2010-02-08 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1231: Description: DataBagIterator.hasNext() is not repeatable in some situations. This is not acceptable because the

[jira] Updated: (PIG-1231) DataBagIterator.hasNext() should be idempotent

2010-02-08 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1231: Status: Patch Available (was: Open) > DataBagIterator.hasNext() should be idempotent > -

[jira] Commented: (PIG-1230) Streaming input in POJoinPackage should use nonspillable bag to collect tuples

2010-02-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831224#action_12831224 ] Hadoop QA commented on PIG-1230: -1 overall. Here are the results of testing the latest atta

[jira] Commented: (PIG-259) allow store to overwrite existing directroy

2010-02-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831222#action_12831222 ] Alan Gates commented on PIG-259: If we make overwrite part of the language (as the JIRA propos

[jira] Commented: (PIG-259) allow store to overwrite existing directroy

2010-02-08 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831212#action_12831212 ] Dmitriy V. Ryaboy commented on PIG-259: --- Doesn't the StoreFunc take care of resource cre

[jira] Commented: (PIG-259) allow store to overwrite existing directroy

2010-02-08 Thread Alan Gates (JIRA)
[ https://issues.apache.org/jira/browse/PIG-259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831205#action_12831205 ] Alan Gates commented on PIG-259: Sorry, I missed that it was already for load-store redesign.

[jira] Updated: (PIG-1215) Make Hadoop jobId more prominent in the client log

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1215: -- Status: Patch Available (was: Open) > Make Hadoop jobId more prominent in the client log > -

[jira] Updated: (PIG-1215) Make Hadoop jobId more prominent in the client log

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1215: -- Attachment: pig-1215.patch With this patch, Job ids will now be printed as: 2010-02-08 13:54:26,

[jira] Updated: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Yan Zhou (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Zhou updated PIG-1227: -- Resolution: Fixed Status: Resolved (was: Patch Available) > [zebra] Missing column group meta file shoul

[jira] Commented: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Yan Zhou (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831154#action_12831154 ] Yan Zhou commented on PIG-1227: --- Patch committed to the 0.6 branch. > [zebra] Missing column g

[jira] Updated: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Chao Wang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Wang updated PIG-1227: --- Patch looks good +1. > [zebra] Missing column group meta file should not be allowed at query time > --

[jira] Updated: (PIG-1230) Streaming input in POJoinPackage should use nonspillable bag to collect tuples

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1230: -- Attachment: pig-1230.patch Patch as per description. > Streaming input in POJoinPackage should u

[jira] Updated: (PIG-1230) Streaming input in POJoinPackage should use nonspillable bag to collect tuples

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1230: -- Status: Patch Available (was: Open) > Streaming input in POJoinPackage should use nonspillable b

[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

2010-02-08 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831090#action_12831090 ] Xuefu Zhang commented on PIG-1140: -- New submission. It includes changes required for PIG LOA

[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

2010-02-08 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated PIG-1140: - Attachment: (was: zebra.0112) > [zebra] Use of Hadoop 2.0 APIs > > >

[jira] Commented: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Yan Zhou (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831086#action_12831086 ] Yan Zhou commented on PIG-1227: --- The patch is applicable to the Apache 0.6 branch only, no

[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

2010-02-08 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated PIG-1140: - Attachment: zebra.0209 > [zebra] Use of Hadoop 2.0 APIs > > >

[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

2010-02-08 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated PIG-1140: - Status: Open (was: Patch Available) > [zebra] Use of Hadoop 2.0 APIs >

[jira] Created: (PIG-1231) DataBagIterator.hasNext() should be idempotent

2010-02-08 Thread Daniel Dai (JIRA)
DataBagIterator.hasNext() should be idempotent -- Key: PIG-1231 URL: https://issues.apache.org/jira/browse/PIG-1231 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6

[jira] Created: (PIG-1230) Streaming input in POJoinPackage should use nonspillable bag to collect tuples

2010-02-08 Thread Ashutosh Chauhan (JIRA)
Streaming input in POJoinPackage should use nonspillable bag to collect tuples -- Key: PIG-1230 URL: https://issues.apache.org/jira/browse/PIG-1230 Project: Pig Issu

[jira] Updated: (PIG-1224) Collected group should change to use new (internal) bag

2010-02-08 Thread Ashutosh Chauhan (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-1224: -- Resolution: Fixed Status: Resolved (was: Patch Available) Patch checked-in. > Collected

[jira] Commented: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Aaron Kimball (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831052#action_12831052 ] Aaron Kimball commented on PIG-1229: Ian, This class looks reasonable to me. You'll pro

[jira] Commented: (PIG-1224) Collected group should change to use new (internal) bag

2010-02-08 Thread Olga Natkovich (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831048#action_12831048 ] Olga Natkovich commented on PIG-1224: - +1; please commit. > Collected group should chang

[jira] Resolved: (PIG-1228) \

2010-02-08 Thread Pradeep Kamath (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath resolved PIG-1228. - Resolution: Invalid Seems like a jira created by accident > \ > - > > Key: PIG-122

[jira] Commented: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831008#action_12831008 ] Hadoop QA commented on PIG-1227: -1 overall. Here are the results of testing the latest atta

[jira] Commented: (PIG-1227) [zebra] Missing column group meta file should not be allowed at query time

2010-02-08 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831006#action_12831006 ] Hadoop QA commented on PIG-1227: -1 overall. Here are the results of testing the latest atta

Re: Will Pig support SQL?

2010-02-08 Thread Dmitriy Ryaboy
Also, I looked at the idea you posted and it seems to me that your balance step is in effect the sample step Pig's skewed data solution implements. Except your balance step needs 100% of the data. Consider how your balancing works when there's 1000 map tasks, each of which produces outputs that wi

Re: Will Pig support SQL?

2010-02-08 Thread Dmitriy Ryaboy
Jian, If what you are looking for is something that will let you deal with skewed data and forget about how the underlying distributed system works, both Pig and Hive will help you do that to some extent. If you are looking for something that will let you exercise fine-grained control over individu

Re: Will Pig support SQL?

2010-02-08 Thread jian yi
We can regard a task as a sleep call, where the parameter of sleep is the duration. sleep(N) - For Hive, N is not certain. sleep(M) - For MBR, M is certain. 2010/2/8 jian yi > Hi Jeff, > > Thank you Jeff. > I know Hive handles skewed joins, but I think it is not enough: > 1. Need cost samp

Re: Will Pig support SQL?

2010-02-08 Thread jian yi
Hi Jeff, Thank you Jeff. I know Hive handles skewed joins, but I think it is not enough: 1. Need cost sample 2. Can't control the size of a task 3. Not exact 4. Must use Hive or Pig. I think adding a balance step between map and reduce is a fundamental solution to the skew problem. Maybe I need exp

Re: Will Pig support SQL?

2010-02-08 Thread Jeff Hammerbacher
Hey Jian, Hive supports arbitrary procedural languages through Hadoop Streaming; see http://wiki.apache.org/hadoop/Hive/LanguageManual/Transform for more. Also, both Hive and Pig have support for handling skewed joins if you use their higher-level interface. See https://issues.apache.org/jira/bro

[jira] Updated: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Ian Holsman (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Holsman updated PIG-1229: - Attachment: DbStorage.java > allow pig to write output into a JDBC db > ---

[jira] Created: (PIG-1229) allow pig to write output into a JDBC db

2010-02-08 Thread Ian Holsman (JIRA)
allow pig to write output into a JDBC db Key: PIG-1229 URL: https://issues.apache.org/jira/browse/PIG-1229 Project: Pig Issue Type: New Feature Components: impl Reporter: Ian Hol