[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496013#comment-13496013 ]

Prashant Kommireddi commented on PIG-2553:

Also, we could add kfs and maprfs to the list of file-based schemes. I agree the list could keep growing, but I suspect most users facing this issue are hdfs:// and file:// users. What do you guys think of this for a start?

> Pig shouldn't allow attempts to write multiple relations into same directory
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
> Issue Type: Improvement
> Reporter: Dmitriy V. Ryaboy
> Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
> We've seen multiple occasions where users accidentally try to store 2 or more
> different relations to the same destination directory. Currently, this passes
> the Pig planner and fails on MR side due to concurrent attempts to create the
> same part file on the reducer. This is extremely confusing to the user, and
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we
> can identify the erroneous condition from the beginning.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
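The scheme list discussed in this comment could be sketched roughly as below. This is only an illustration in Python, not Pig's Java code: `FILE_BASED_SCHEMES` and `is_file_based` are hypothetical names, and treating a location with no scheme as file-based is an assumption.

```python
from urllib.parse import urlparse

# Assumption: the schemes named in the thread (hdfs, file, s3n) plus the
# proposed additions (kfs, maprfs). The real list may differ.
FILE_BASED_SCHEMES = frozenset({"hdfs", "file", "s3n", "kfs", "maprfs"})


def is_file_based(location: str) -> bool:
    """Return True when the location's URI scheme is treated as file-based.

    A location without a scheme (for example '/user/me/out') is assumed to
    resolve against the default filesystem, so it counts as file-based here.
    """
    scheme = urlparse(location).scheme.lower()
    return scheme == "" or scheme in FILE_BASED_SCHEMES
```

Keeping the schemes in one set means that growing the list, as the comment anticipates, is a one-line change.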
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496011#comment-13496011 ]

Prashant Kommireddi commented on PIG-2553:

I feel like we could treat this as file-based vs non-file-based storage locations, similar to PIG-2924. The patch there uses a default FileBasedOutputSizeReader to determine output size, and "pig.stats.output.size.reader" to compute size based on a different implementation.

For this JIRA, can we use a similar idea and handle file-based schemes with UriUtil.isHDFSFileOrLocalOrS3N(String uri)? For all other schemes (hbase, hcat, ...) we can allow multiple relations writing to the same location.

1. Check if pig.location.check.strict is set
2. If not set, just log a warning if scheme is file-based
3. If set, check if scheme is file-based and report an error
4. If set but not a file-based scheme, continue without any warning/error message

Thoughts?
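The four steps above can be sketched as a small decision function. This is a hedged Python illustration, not the attached patch: `check_store_locations` and `FILE_BASED_SCHEMES` are hypothetical names, locations are compared as raw strings with no path normalization, scheme-less paths are assumed to be file-based, and the case the list does not cover (property not set, non-file-based scheme) is assumed to pass silently.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumption: mirrors the schemes implied by isHDFSFileOrLocalOrS3N.
FILE_BASED_SCHEMES = frozenset({"hdfs", "file", "s3n"})


def check_store_locations(locations, strict):
    """Classify every location that more than one store statement targets.

    Returns a dict mapping each duplicated location to "error", "warn"
    or "ok". Locations used only once are not reported.
    """
    result = {}
    for location, count in Counter(locations).items():
        if count < 2:
            continue  # a single writer is never a conflict
        scheme = urlparse(location).scheme
        file_based = scheme == "" or scheme in FILE_BASED_SCHEMES
        if not file_based:
            result[location] = "ok"     # step 4 (and the unlisted non-strict case)
        elif strict:
            result[location] = "error"  # step 3
        else:
            result[location] = "warn"   # step 2
    return result
```

With two stores into `hdfs://nn/out` and two into `hbase://t`, strict mode reports an error only for the HDFS path, and non-strict mode downgrades it to a warning.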
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495887#comment-13495887 ]

Dmitriy V. Ryaboy commented on PIG-2553:

I think it should be set to false by default so that current scripts that use fancy storers, etc., can keep running without change (the scripts that have bugs, which we are trying to address with this, don't run correctly at all, so we don't have to worry about being backwards compatible with them). Individual pig admins / script authors can decide to turn it on by default if they notice this happening a lot.

We are piling up quite the list of exceptions, though. Between hcat, hbase, other unknown schemes, and hdfs/s3/kfs/mapr cases, I'm getting concerned that maybe this wasn't such a well-thought-out feature wish on my part! What do you guys think?
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495811#comment-13495811 ]

Prashant Kommireddi commented on PIG-2553:

That's a good point, Dmitriy. The patch does not handle multiple relations being written to hbase. Is it sufficient to check for the scheme (hdfs://, hbase://, file://, ...)?

Rohini, you are right. Any implementation of StoreFunc similar to Hadoop MultipleOutputFormat would break this. As Dmitriy suggested, I think it makes sense to provide an option to users, in addition to logging a warning message.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495808#comment-13495808 ] Rohini Palaniswamy commented on PIG-2553: - Thanks for bringing up hbase Dmitriy. HCat table is another case. Would that property be true by default? I am fine making the property true by default as long as it checks only for filesystem locations. HCat and HBase tables are going to be more common going forward, and it would not be nice to ask the users to launch pig every time with that property set to false.
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (29 issues)

Subscriber: pigdaily

Key         Summary
PIG-3039    Not possible to use custom version of jackson jars
            https://issues.apache.org/jira/browse/PIG-3039
PIG-3029    TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution
            https://issues.apache.org/jira/browse/PIG-3029
PIG-3028    testGrunt dev test needs some command filters to run correctly without cygwin
            https://issues.apache.org/jira/browse/PIG-3028
PIG-3027    pigTest unit test needs a newline filter for comparisons of golden multi-line
            https://issues.apache.org/jira/browse/PIG-3027
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3014    CurrentTime() UDF has undesirable characteristics
            https://issues.apache.org/jira/browse/PIG-3014
PIG-3010    Allow UDF's to flatten themselves
            https://issues.apache.org/jira/browse/PIG-3010
PIG-2978    TestLoadStoreFuncLifeCycle fails with hadoop-2.0.x
            https://issues.apache.org/jira/browse/PIG-2978
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2957    TetsScriptUDF fail due to volume prefix in jar
            https://issues.apache.org/jira/browse/PIG-2957
PIG-2956    Invalid cache specification for some streaming statement
            https://issues.apache.org/jira/browse/PIG-2956
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2937    generated field in nested foreach does not inherit the variable name as the field name
            https://issues.apache.org/jira/browse/PIG-2937
PIG-2924    PigStats should not be assuming all Storage classes to be file-based storage
            https://issues.apache.org/jira/browse/PIG-2924
PIG-2873    Converting bin/pig shell script to python
            https://issues.apache.org/jira/browse/PIG-2873
PIG-2834    MultiStorage requires unused constructor argument
            https://issues.apache.org/jira/browse/PIG-2834
PIG-2824    Pushing checking number of fields into LoadFunc
            https://issues.apache.org/jira/browse/PIG-2824
PIG-2661    Pig uses an extra job for loading data in Pigmix L9
            https://issues.apache.org/jira/browse/PIG-2661
PIG-2657    Print warning if using wrong jython version
            https://issues.apache.org/jira/browse/PIG-2657
PIG-2507    Semicolon in paramenters for UDF results in parsing error
            https://issues.apache.org/jira/browse/PIG-2507
PIG-2433    Jython import module not working if module path is in classpath
            https://issues.apache.org/jira/browse/PIG-2433
PIG-2417    Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.
            https://issues.apache.org/jira/browse/PIG-2417
PIG-2362    Rework Ant build.xml to use macrodef instead of antcall
            https://issues.apache.org/jira/browse/PIG-2362
PIG-2312    NPE when relation and column share the same name and used in Nested Foreach
            https://issues.apache.org/jira/browse/PIG-2312
PIG-1942    script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
            https://issues.apache.org/jira/browse/PIG-1942
PIG-1431    Current DateTime UDFs: ISONOW(), UNIXNOW()
            https://issues.apache.org/jira/browse/PIG-1431
PIG-1237    Piggybank MutliStorage - specify field to write in output
            https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-2782) Specifying sorting field(s) at nightly.conf
[ https://issues.apache.org/jira/browse/PIG-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495766#comment-13495766 ]

Egil Sorensen commented on PIG-2782:

There were still problems with the patch here. E.g. the pig script sorts on column one and two, but the verification only checks that output is sorted on column one. For details please see the cloned PIG-3045.

> Specifying sorting field(s) at nightly.conf
>
> Key: PIG-2782
> URL: https://issues.apache.org/jira/browse/PIG-2782
> Project: Pig
> Issue Type: Bug
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
> Reporter: Allan Avendaño
> Assignee: Cheolsoo Park
> Fix For: 0.11
>
> Attachments: PIG-2782.patch
>
> After running the Checkin tests, it fails because one of the parameters
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2).
> According to this http://ss64.com/bash/sort.html, it was on an old notation.
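To make the gap described in this comment concrete, here is a minimal Python sketch. The rows and the helper `is_sorted_by` are made up for illustration and are not part of the e2e harness. It shows that a check on column one alone accepts output that is not sorted on columns one and two.

```python
def is_sorted_by(rows, key):
    """Return True when consecutive rows are in non-decreasing key order."""
    return all(key(a) <= key(b) for a, b in zip(rows, rows[1:]))


# Hypothetical output: ordered on the first column, but not on the second.
rows = [("alice", 30), ("alice", 25), ("bob", 20)]

sorted_on_first = is_sorted_by(rows, key=lambda r: r[0])
sorted_on_both = is_sorted_by(rows, key=lambda r: (r[0], r[1]))
```

Here `sorted_on_first` is true while `sorted_on_both` is false, so a verification that only looks at column one would pass a result that violates the script's ordering.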
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495765#comment-13495765 ] Dmitriy V. Ryaboy commented on PIG-2553: Rohini, what if we hide this behind a property, something like "pig.location.check.strict"? I see accidental writes to the same output path causing problems all the time.. would love to have this feature. Prashant -- I haven't looked at the patch yet, but just something to check: it does allow writes of multiple relations to, say, the same HBase table?
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Egil Sorensen updated PIG-3045:

Description:
PIG-2782 fixed a number of tests where the parameters passed to the verification sort were incorrect.

However, there are still problems with the patch in PIG-2782. E.g. the pig script sorts on column one and two, but the verification only checks that output is sorted on column one.

For file test/e2e/pig/tests/nightly.conf:
===
@@ -1728,7 +1728,7 @@
 'pig' =>q\a = load ':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int, gpa:double);
 b = order a by name, age, gpa;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0', '-1', '+1n', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,2n'],
 },
Should have been:
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,3n'],
===
Similar
@@ -1736,7 +1736,7 @@
 'pig' =>q\a = load ':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int, gpa:double);
 b = order a by name desc, age desc, gpa desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
 },
Should have been:
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,3nr'],
===
and
@@ -1752,7 +1752,7 @@
 'pig' =>q\a = load ':INPATH:/singlefile/studentnulltab10k' as (name, age:long, gpa:float);
 b = order a by name desc, age desc, gpa desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
 },
Should have been:
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,3nr'],
===
@@ -1847,7 +1847,7 @@
 'pig' => q\a = load ':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int, gpa:double);
 b = order a by *;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0', '-1', '+1n', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,2n'],
 },
Should have been:
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,3n'],
===
@@ -1855,7 +1855,7 @@
 'pig' => q\a = load ':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int, gpa:double);
 b = order a by * desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
 },
Should have been:
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2nr,3nr'],
===
@@ -1943,7 +1943,7 @@
 c = filter b by $0 > 'a'; -- break the sort/limit optimization
 d = limit c 100;
 store d into ':OUTPATH:';\,
- 'sortArgs' => ['-t', ' ', '+0', '-1'],
+ 'sortArgs' => ['-t', ' ', '-k', '1,1'],
Should have been:
+ 'sortArgs' => ['-t', ' ', '-k', '1,2'],
===
@@ -1952,7 +1952,7 @@
 b = order a by $0, $1;
 c = limit b 100;
 store c into ':OUTPATH:';\,
- 'sortArgs' => ['-t', ' ', '+0', '-1'],
+ 'sortArgs' => ['-t', ' ', '-k', '1,1'],
Should have been:
+ 'sortArgs' => ['-t', ' ', '-k', '1,2'],
===
@@ -,7 +,7 @@
 D = order B by age, extra;
 store D into ':OUTPATH:';\,
- 'sortArgs' => ['-t', ' ', '+1n', '-2'],
+ 'sortArgs' => ['-t', ' ', '-k', '2n,2n'],
 },
Should have been:
+ 'sortArgs' => ['-t', ' ', '-k', '2n,2n', '-k', '4,4'],

(This last is decidedly minor, as the 'extra' column is empty, but for sake of consistency...)

was:
After running the Checkin tests, it fails because one of the parameters passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2).
According to this http://ss64.com/bash/sort.html, it was on an old notation.

> Specifying sorting field(s) at nightly.conf - further changes
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
> Issue Type: Bug
> Components: e2e harness
> Affects Versions: 0.10.0, 0.10.1
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
> Reporter: Egil Sorensen
> Assignee: Cheolsoo Park
> Labels: test
>
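The mapping between the obsolete `+POS1 -POS2` notation and `-k` can be sketched as follows. This is a hypothetical Python helper, not part of the e2e harness; `legacy_to_k` is an invented name. It relies on the rule quoted in the issue (`+1 -2` corresponds to `-k2,2`): the start position is zero-based, so it is incremented by one, while the end position is kept as written. In this sketch, ordering flags such as `n` or `r` stay on the position they were attached to, whereas the patch in the thread repeats them on the end position.

```python
import re

# A position is digits optionally followed by ordering flags, e.g. "1nr".
_POS = re.compile(r"^(\d+)([a-zA-Z]*)$")


def legacy_to_k(args):
    """Rewrite obsolete '+POS1 [-POS2]' sort keys as '-k START[,END]'."""
    out = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-t" and i + 1 < len(args):
            # Copy the separator verbatim so it is never parsed as a key.
            out += [arg, args[i + 1]]
            i += 2
            continue
        start = _POS.match(arg[1:]) if arg.startswith("+") else None
        if start is None:
            out.append(arg)
            i += 1
            continue
        key = f"{int(start.group(1)) + 1}{start.group(2)}"
        nxt = args[i + 1] if i + 1 < len(args) else ""
        end = _POS.match(nxt[1:]) if nxt.startswith("-") else None
        if end is not None:
            key += f",{end.group(1)}{end.group(2)}"
            i += 1  # the end position has been consumed
        out += ["-k", key]
        i += 1
    return out
```

For example, `legacy_to_k(['-t', ' ', '+0', '-1', '+1n', '-2'])` yields `['-t', ' ', '-k', '1,1', '-k', '2n,2']`. That key list still covers only the columns the old arguments covered, so the wider end fields proposed in the description would have to be added by hand.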
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Affects Version/s: 0.10.1 > Specifying sorting field(s) at nightly.conf - further changes > - > > Key: PIG-3045 > URL: https://issues.apache.org/jira/browse/PIG-3045 > Project: Pig > Issue Type: Bug > Components: e2e harness >Affects Versions: 0.10.0, 0.10.1 > Environment: Mac OS X Lion 10.7.3 > Hadoop 1.0.1-SNAPSHOT > Apache Pig version 0.11.0-SNAPSHOT (r1355798) >Reporter: Egil Sorensen >Assignee: Cheolsoo Park > Labels: test > Fix For: 0.11 > > > After running the Checkin tests, it fails because one of the parameters > passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). > According to this http://ss64.com/bash/sort.html, it was on an old notation.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495729#comment-13495729 ]

Rohini Palaniswamy commented on PIG-2553:

PiggyBank also has a MultiStorage - http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/MultiStorage.java?revision=1145447&view=markup.

One thing we could do is print a warning message instead of throwing an error. I don't see a way to correctly determine and throw an error. And in cases like MultiStorage the filename is not even static. The output dir name/file name is dynamic and depends on the value of a field in the record.
[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage
[ https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-2857: - Patch Info: Patch Available > Add a -tagPath option to PigStorage > --- > > Key: PIG-2857 > URL: https://issues.apache.org/jira/browse/PIG-2857 > Project: Pig > Issue Type: New Feature >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2857_1.patch, PIG-2857.patch > > > We recently added a "-tagSource" option to PigStorage, which allows us to add > filenames from which records come to the returned tuples. > Often, users want the whole path, not just the source file. I propose we add > a "-tagPath" option to do this.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495705#comment-13495705 ] Prashant Kommireddi commented on PIG-2553: -- StoreFuncs using setStoreLocation/relToAbsPathForStoreLocation to append filenames make it difficult to handle this. Any other ideas since the earlier approach is not safe?
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Labels: test (was: ) > Specifying sorting field(s) at nightly.conf - further changes > - > > Key: PIG-3045 > URL: https://issues.apache.org/jira/browse/PIG-3045 > Project: Pig > Issue Type: Bug > Components: e2e harness >Affects Versions: 0.10.0 > Environment: Mac OS X Lion 10.7.3 > Hadoop 1.0.1-SNAPSHOT > Apache Pig version 0.11.0-SNAPSHOT (r1355798) >Reporter: Egil Sorensen >Assignee: Cheolsoo Park > Labels: test > Fix For: 0.11 > > > After running the Checkin tests, it fails because one of the parameters > passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). > According to this http://ss64.com/bash/sort.html, it was on an old notation.
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Hadoop Flags: (was: Reviewed)
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Component/s: e2e harness
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Affects Version/s: 0.10.0
[jira] [Created] (PIG-3045) CLONE - Specifying sorting field(s) at nightly.conf
Egil Sorensen created PIG-3045: -- Summary: CLONE - Specifying sorting field(s) at nightly.conf Key: PIG-3045 URL: https://issues.apache.org/jira/browse/PIG-3045 Project: Pig Issue Type: Bug Environment: Mac OS X Lion 10.7.3 Hadoop 1.0.1-SNAPSHOT Apache Pig version 0.11.0-SNAPSHOT (r1355798) Reporter: Egil Sorensen Assignee: Cheolsoo Park Fix For: 0.11 After running the Checkin tests, it fails because one of the parameters passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). According to this http://ss64.com/bash/sort.html, it was on an old notation.
[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes
[ https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egil Sorensen updated PIG-3045: --- Summary: Specifying sorting field(s) at nightly.conf - further changes (was: CLONE - Specifying sorting field(s) at nightly.conf) > Specifying sorting field(s) at nightly.conf - further changes > - > > Key: PIG-3045 > URL: https://issues.apache.org/jira/browse/PIG-3045 > Project: Pig > Issue Type: Bug > Environment: Mac OS X Lion 10.7.3 > Hadoop 1.0.1-SNAPSHOT > Apache Pig version 0.11.0-SNAPSHOT (r1355798) >Reporter: Egil Sorensen >Assignee: Cheolsoo Park > Fix For: 0.11 > > > After running the Checkin tests, it fails because one of the parameters > passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). > According to this http://ss64.com/bash/sort.html, it was on an old notation.
[jira] [Updated] (PIG-2176) add logical plan assumption checker
[ https://issues.apache.org/jira/browse/PIG-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-2176:

Assignee: Thejas M Nair

> add logical plan assumption checker
>
> Key: PIG-2176
> URL: https://issues.apache.org/jira/browse/PIG-2176
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.10.0
>
> Attachments: PIG-2176.1.patch, PIG-2176.2.patch
>
> Pig expects certain things about LogicalPlan, and optimizer logic depends on
> those to be true. Code that verifies that these assumptions are true will
> help in catching issues early on.
> Some of the assumptions that should be checked -
> 1. All schema have valid uid (not -1).
> 2. All fields in schema have distinct uid.
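The two assumptions listed in the issue can be illustrated with a small check. This is a Python sketch over a made-up representation (a list of `(field_name, uid)` pairs), not Pig's LogicalSchema API; `check_schema_uids` is a hypothetical name.

```python
def check_schema_uids(fields):
    """Return a list of problems found in a schema's field uids.

    fields is a list of (name, uid) pairs. Two assumptions are checked:
    every uid is valid (not -1), and no two fields share a uid.
    """
    problems = []
    seen = {}  # uid -> name of the first field that used it
    for name, uid in fields:
        if uid == -1:
            problems.append(f"field '{name}' has invalid uid -1")
        elif uid in seen:
            problems.append(f"field '{name}' shares uid {uid} with '{seen[uid]}'")
        else:
            seen[uid] = name
    return problems
```

Returning every violation at once, rather than stopping at the first, is a design choice for this sketch: it gives a fuller picture when a plan breaks more than one assumption.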
[jira] [Commented] (PIG-2657) Print warning if using wrong jython version
[ https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495644#comment-13495644 ] Cheolsoo Park commented on PIG-2657: Hi Johnny, Thank you very much for the patch. I have a few more comments: - I think that you have to swap {{Version.PY_VERSION}} and {{jythonVersion}}. - Looking at {{JythonScriptEngine.java}} after applying your patch, I see no reason why we nest try-catch blocks here. Can you please move the inner try-catch block outside the outer one? In addition, can you make it catch an IOException instead of an Exception, since that's specifically what is thrown by {{JarFile}}? Do you agree? - Please remove tabs in {{build.xml}}. > Print warning if using wrong jython version > --- > > Key: PIG-2657 > URL: https://issues.apache.org/jira/browse/PIG-2657 > Project: Pig > Issue Type: Bug >Reporter: Fabian Alenius > Labels: newbie > Fix For: 0.12 > > Attachments: PIG-2657.1.patch, PIG-2657.2.patch, PIG-2657.3.patch > > > Hi, > It would be good if Pig would print a warning (or refuse to run) if you are > using an unsupported version of jython. I spent a couple of hours before > figuring out that you had to use 2.5.0. I've seen posts indicating that > others have run into this problem as well. > Might write up a patch if others agree this is an issue.
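The try-catch suggestion in the comment above might look roughly like the following. This is a sketch under assumptions, not the code in JythonScriptEngine.java: ManifestVersionReader, readAttribute, and the idea of reading the version from a manifest attribute are hypothetical. It shows a single, un-nested try block that catches only IOException, which is what the JarFile constructor and getManifest() declare.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper; the class and method names are illustrative only. */
final class ManifestVersionReader {
    private ManifestVersionReader() {}

    /**
     * Returns the named main manifest attribute of the jar, or null when the
     * jar cannot be read, has no manifest, or lacks the attribute.
     */
    static String readAttribute(String jarPath, String attributeName) {
        // One flat try block: the jar is opened, read, and closed in the same place.
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null if the jar has no manifest
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(attributeName);
        } catch (IOException e) {
            // Catch only the exception JarFile actually throws, not Exception.
            return null;
        }
    }
}
```

Note that manifest attribute names may contain only letters, digits, hyphens, and underscores, so a caller would pass something like "Jython-Version" (an assumed name) rather than a dotted property name.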
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495629#comment-13495629 ] Rohini Palaniswamy commented on PIG-2553: - Yes. That would be a simple one. > Pig shouldn't allow attempts to write multiple relations into same directory > > > Key: PIG-2553 > URL: https://issues.apache.org/jira/browse/PIG-2553 > Project: Pig > Issue Type: Improvement >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2553.patch > > > We've seen multiple occasions where users accidentally try to store 2 or more > different relations to the same destination directory. Currently, this passes > the Pig planner and fails on MR side due to concurrent attempts to create the > same part file on the reducer. This is extremely confusing to the user, and > hard to debug. > We should instead fail their scripts before they are even submitted, since we > can identify the erroneous condition from the beginning.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495586#comment-13495586 ] Prashant Kommireddi commented on PIG-2553: -- Thanks for the feedback Rohini and Cheolsoo. Rohini, what does such a StoreFunc look like? Maybe the following? {code} STORE alias1 INTO 'output' using MyStoreFunc('filename1'); STORE alias2 INTO 'output' using MyStoreFunc('filename2'); {code} > Pig shouldn't allow attempts to write multiple relations into same directory > > > Key: PIG-2553 > URL: https://issues.apache.org/jira/browse/PIG-2553 > Project: Pig > Issue Type: Improvement >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2553.patch > > > We've seen multiple occasions where users accidentally try to store 2 or more > different relations to the same destination directory. Currently, this passes > the Pig planner and fails on MR side due to concurrent attempts to create the > same part file on the reducer. This is extremely confusing to the user, and > hard to debug. > We should instead fail their scripts before they are even submitted, since we > can identify the erroneous condition from the beginning.
[jira] [Updated] (PIG-2657) Print warning if using wrong jython version
[ https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johnny Zhang updated PIG-2657: -- Attachment: PIG-2657.3.patch [~cheolsoo], here is the new patch based on your comments. > Print warning if using wrong jython version > --- > > Key: PIG-2657 > URL: https://issues.apache.org/jira/browse/PIG-2657 > Project: Pig > Issue Type: Bug >Reporter: Fabian Alenius > Labels: newbie > Fix For: 0.12 > > Attachments: PIG-2657.1.patch, PIG-2657.2.patch, PIG-2657.3.patch > > > Hi, > It would be good if Pig would print a warning (or refuse to run) if you are > using an unsupported version of jython. I spent a couple of hours before > figuring out that you had to use 2.5.0. I've seen posts indicating that > others have run into this problem as well. > Might write up a patch if others agree this is an issue.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495554#comment-13495554 ] Rohini Palaniswamy commented on PIG-2553: - This might break some existing custom StoreFuncs. Currently you can write a custom storer that allows writing into the same directory, but with different file names. We have users who have written that kind of storer. > Pig shouldn't allow attempts to write multiple relations into same directory > > > Key: PIG-2553 > URL: https://issues.apache.org/jira/browse/PIG-2553 > Project: Pig > Issue Type: Improvement >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2553.patch > > > We've seen multiple occasions where users accidentally try to store 2 or more > different relations to the same destination directory. Currently, this passes > the Pig planner and fails on MR side due to concurrent attempts to create the > same part file on the reducer. This is extremely confusing to the user, and > hard to debug. > We should instead fail their scripts before they are even submitted, since we > can identify the erroneous condition from the beginning.
[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage
[ https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-2857: - Attachment: PIG-2857_1.patch I had made changes to Utils.java that I did not include in the previous patch. Attaching an updated patch. > Add a -tagPath option to PigStorage > --- > > Key: PIG-2857 > URL: https://issues.apache.org/jira/browse/PIG-2857 > Project: Pig > Issue Type: New Feature >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2857_1.patch, PIG-2857.patch > > > We recently added a "-tagSource" option to PigStorage, which allows us to add > filenames from which records come to the returned tuples. > Often, users want the whole path, not just the source file. I propose we add > a "-tagPath" option to do this.
[jira] [Commented] (PIG-2955) Fix bunch of Pig e2e tests on Windows
[ https://issues.apache.org/jira/browse/PIG-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495496#comment-13495496 ] Alan Gates commented on PIG-2955: - +1, changes look good. > Fix bunch of Pig e2e tests on Windows > --- > > Key: PIG-2955 > URL: https://issues.apache.org/jira/browse/PIG-2955 > Project: Pig > Issue Type: Sub-task >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.11, 0.10.1, 0.12 > > Attachments: PIG-2955-1.patch, PIG-2955-2_0.10.patch, PIG-2955-2.patch > > > Fix the following test aborts and failures: > ComputeSpec_1 > ComputeSpec_2 > Unicode_cmdline_1 > Warning_1 > Warning_4 > Checkin_2 > UdfDistributedCache_1 > Jython_Checkin_2 > Jython_Diagnostics_4 > Jython_Diagnostics_5 > Jython_Diagnostics_6 > Jython_Error_3 > Jython_Error_4 > Jython_Error_5 > Jython_Error_6 > Jython_Error_7 > Grunt_6 > Grunt_8 > Grunt_13 > Grunt_14
Re: Jenkins / Clover
Hi, Gianmarco I added you to the hudson-jobadmin group. Thanks, Daniel On Thu, Jul 19, 2012 at 12:33 AM, Gianmarco De Francisci Morales wrote: > Fine, > Alan, could you add me to the hudson-jobadmin group? > > modify_appgroups.pl hudson-jobadmin --add=gdfm > > On people.apache.org, according to the page. > > I have subscribed to infrastructure and builds. > > Cheers, > -- > Gianmarco > > > > > On Thu, Jul 19, 2012 at 12:17 AM, Alan Gates wrote: > >> http://wiki.apache.org/general/Jenkins?action=show&redirect=Hudson describes >> how to get an account so you can administer the Jenkins builds. >> >> Alan. >> >> On Jul 18, 2012, at 12:27 PM, Gianmarco De Francisci Morales wrote: >> >> > What is the procedure to modify the nightly build? >> > If everyone agrees (and somebody explains to me how) I volunteer to fix it. >> > >> > Cheers, >> > -- >> > Gianmarco >> > >> > >> > >> > >> > On Wed, Jul 18, 2012 at 8:25 AM, Jonathan Coveney wrote: >> > >> >> +1 >> >> >> >> A while ago I tried to get apache builds to deal with this, and nothing. >> >> Very annoying, but pending a fix, we should remove it from the nightly. >> >> >> >> 2012/7/17 Alan Gates >> >> >> >>> I'm fine with removing it from the nightly build. I don't see any reason >> >>> to run that every day, especially since it slows down the tests. Let's not >> >>> remove it from ant, as it's useful to run occasionally. >> >>> >> >>> Alan. >> >>> >> >>> On Jul 17, 2012, at 3:17 PM, Gianmarco De Francisci Morales wrote: >> >>> >> Hi, >> >> Clover constantly makes a number of our Jenkins builds fail (usually >> because of license issues, I think it is a misconfiguration). >> Do we actually use it? >> If we don't I would propose to remove it from our build. >> What do you think? >> >> Cheers, >> -- >> Gianmarco
[jira] [Updated] (PIG-2657) Print warning if using wrong jython version
[ https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-2657: --- Labels: newbie (was: ) Additional comments: - Please remove tabs. Use 4 spaces instead. - The property {{jython.version}} shouldn't be hard-coded in {{build.xml}}. It is automatically loaded from {{ivy/libraries.properties}} at compile-time, so there is no reason to define it. - The {{jython.version}} attribute shouldn't be embedded in {{pigunit.jar}}. It's useful only in {{pig.jar}} and {{pig-withouthadoop.jar}}. - Regarding the log message, why don't we print a message like "Pig is tested with ${jython.version}, so it may not work with ${runtime.jython.version}"? I think that this is flexible and informative at the same time. Thanks! > Print warning if using wrong jython version > --- > > Key: PIG-2657 > URL: https://issues.apache.org/jira/browse/PIG-2657 > Project: Pig > Issue Type: Bug >Reporter: Fabian Alenius > Labels: newbie > Fix For: 0.12 > > Attachments: PIG-2657.1.patch, PIG-2657.2.patch > > > Hi, > It would be good if Pig would print a warning (or refuse to run) if you are > using an unsupported version of jython. I spent a couple of hours before > figuring out that you had to use 2.5.0. I've seen posts indicating that > others have run into this problem as well. > Might write up a patch if others agree this is an issue.
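The log message proposed in the comment above could be produced by something like the sketch below. JythonVersionWarning and warningFor are hypothetical names, and how the tested and runtime versions are actually obtained is left out; only the message text comes from the comment.

```java
/** Hypothetical helper that builds the warning text suggested in the review. */
final class JythonVersionWarning {
    private JythonVersionWarning() {}

    /**
     * Returns the warning to log, or null when no warning is needed
     * (either version is unknown, or the two versions match).
     */
    static String warningFor(String testedVersion, String runtimeVersion) {
        if (testedVersion == null || runtimeVersion == null
                || testedVersion.equals(runtimeVersion)) {
            return null;
        }
        return "Pig is tested with " + testedVersion
                + ", so it may not work with " + runtimeVersion;
    }
}
```

Because the message names both versions, it stays accurate when the bundled Jython version changes, which seems to be the flexibility the comment is after.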
[jira] [Updated] (PIG-3039) Not possible to use custom version of jackson jars
[ https://issues.apache.org/jira/browse/PIG-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3039: --- Attachment: PIG-3039-download-jackson.patch Hi Rohini, The patch looks good. My only concern is that we have to remember this test case when we bump jackson to 1.9.9 in the future. I suggest that we at least add a comment in ivy/libraries.properties regarding this test case so that we won't forget to update it when we update the version of jackson. In addition, wouldn't it be better to download the jackson 1.9.9 binaries using ant instead of checking them in? I made a quick patch that does this, so please feel free to use it if you'd like to. This is just a suggestion, and I won't insist. Thanks! > Not possible to use custom version of jackson jars > -- > > Key: PIG-3039 > URL: https://issues.apache.org/jira/browse/PIG-3039 > Project: Pig > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Rohini Palaniswamy >Assignee: Rohini Palaniswamy > Fix For: 0.12 > > Attachments: PIG-3039-download-jackson.patch, PIG-3039-trunk.patch > > > User is trying > register jackson_core_asl-1.9.4_1.jar; > register jackson_mapper_asl-1.9.4_1.jar; > register jackson_xc-1.9.4_1.jar; > But pig.jar/pig-withouthadoop.jar has jackson jars and JarManager packages > the jackson from pig.jar into job.jar (PIG-2457). We could not find any > possible workaround with mapreduce framework to put the user jar first in the > classpath as job.jar always takes precedence. > The pig script works fine with 0.9 and is a regression in 0.10.
[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory
[ https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495260#comment-13495260 ] Cheolsoo Park commented on PIG-2553: Hi Prashant, Thank you very much for the patch! I tested it in local and mr mode, and it works fine. Can you please add a unit test, probably in TestPigServer? > Pig shouldn't allow attempts to write multiple relations into same directory > > > Key: PIG-2553 > URL: https://issues.apache.org/jira/browse/PIG-2553 > Project: Pig > Issue Type: Improvement >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2553.patch > > > We've seen multiple occasions where users accidentally try to store 2 or more > different relations to the same destination directory. Currently, this passes > the Pig planner and fails on MR side due to concurrent attempts to create the > same part file on the reducer. This is extremely confusing to the user, and > hard to debug. > We should instead fail their scripts before they are even submitted, since we > can identify the erroneous condition from the beginning.
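The early check described in the PIG-2553 issue text, failing a script before submission when two stores share a destination, can be sketched as follows. This is not the attached patch: StoreLocationCheck and duplicateLocations are hypothetical names, and the sketch assumes the store locations have already been collected from the plan as plain strings. It compares strings exactly and does not normalize paths or distinguish file-based schemes from others.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical pre-submission check for repeated store destinations. */
final class StoreLocationCheck {
    private StoreLocationCheck() {}

    /** Returns each location used by more than one store, listed once, in order of first repeat. */
    static List<String> duplicateLocations(List<String> storeLocations) {
        Set<String> seen = new HashSet<>();
        Set<String> reported = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String location : storeLocations) {
            // A failed add to 'seen' means this location was stored to before;
            // 'reported' keeps a location from being listed twice.
            if (!seen.add(location) && reported.add(location)) {
                duplicates.add(location);
            }
        }
        return duplicates;
    }
}
```

A caller could raise an error, or only log a warning, when the returned list is non-empty; a unit test of the kind requested above would assert that a script with two stores into 'output' is flagged.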
[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage
[ https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-2857: - Attachment: PIG-2857.patch Coming back to this. I think we should keep 'tagsource' in the next release for backward compatibility and, as you suggested, log the deprecation message. Maybe we can remove 'tagsource' for 0.13. Adding a patch that now uses '-tagFile' for source filename and '-tagPath' for source path. I have modified tests accordingly to handle the same. > Add a -tagPath option to PigStorage > --- > > Key: PIG-2857 > URL: https://issues.apache.org/jira/browse/PIG-2857 > Project: Pig > Issue Type: New Feature >Reporter: Dmitriy V. Ryaboy >Assignee: Prashant Kommireddi > Attachments: PIG-2857.patch > > > We recently added a "-tagSource" option to PigStorage, which allows us to add > filenames from which records come to the returned tuples. > Often, users want the whole path, not just the source file. I propose we add > a "-tagPath" option to do this.
[jira] [Updated] (PIG-3041) Improve ResourceStatistics
[ https://issues.apache.org/jira/browse/PIG-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-3041: - Issue Type: Improvement (was: Bug) > Improve ResourceStatistics > -- > > Key: PIG-3041 > URL: https://issues.apache.org/jira/browse/PIG-3041 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.12 >Reporter: Prashant Kommireddi >Assignee: Prashant Kommireddi > > This is a follow-up JIRA to PIG-2582. ResourceStatistics should be improved, > and there are a few things we should do for 0.13. > 1. Consider removing method setmBytes(Long mBytes). We deprecated this method > in 0.12, but the code does not seem intuitive as the setter is actually > working on the variable "bytes". > 2. All setter methods return a ResourceStatistics object, and this is > unnecessary. For example: > {code} > public ResourceStatistics setNumRecords(Long numRecords) { > this.numRecords = numRecords; > return this; > } > {code} > Each one of these variables has an associated getter. > I will take this up once we are in the 0.13 cycle.