[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588059#comment-13588059 ]

Eric Yang commented on PIG-1832:
--------------------------------

Hi Guido, I think the -timestamp= option makes sense for high-throughput systems. We should probably revisit per-cell timestamp writing later. This is not a high-priority item for me to work on. If anyone would like to tackle this issue, feel free to take it.

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer
[ https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588037#comment-13588037 ]

Bill Graham commented on PIG-3222:
----------------------------------

Feng, could you attach a sample test script/storer that reproduces the Pig bug without HCatalog?

> New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer
> ---------------------------------------------------------------------------
>
> Key: PIG-3222
> URL: https://issues.apache.org/jira/browse/PIG-3222
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.11
> Reporter: Feng Peng
> Labels: hcatalog
>
> Pig 0.11 assigns a different UDFContextSignature to different invocations of
> the same load/store statement. This change breaks the HCatStorer, which
> assumes all front-end and back-end invocations of the same store statement
> have the same UDFContextSignature so that it can read the previously stored
> information correctly.
> The related HCatalog code is in
> https://svn.apache.org/repos/asf/incubator/hcatalog/branches/branch-0.5/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatStorer.java
> (the setStoreLocation() function).
[jira] [Commented] (PIG-3183) rm or rmf commands should respect globbing/regex of path
[ https://issues.apache.org/jira/browse/PIG-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587865#comment-13587865 ]

Prashant Kommireddi commented on PIG-3183:
------------------------------------------

[~jcoveney] or others, do you have any comments?

> rm or rmf commands should respect globbing/regex of path
> ---------------------------------------------------------
>
> Key: PIG-3183
> URL: https://issues.apache.org/jira/browse/PIG-3183
> Project: Pig
> Issue Type: Improvement
> Components: grunt
> Affects Versions: 0.10.0
> Reporter: Prashant Kommireddi
> Assignee: Prashant Kommireddi
> Fix For: 0.12
>
> Attachments: PIG-3183.patch
>
>
> Hadoop fs commands support globbing when deleting files/dirs. Pig is not
> consistent with this behavior, and it seems we could change the rm/rmf
> commands to do the same.
> For example:
> {code}
> localhost:pig pkommireddi$ ls -ld out*
> drwxr-xr-x 12 pkommireddi SF\domain users 408 Feb 13 01:09 out
> drwxr-xr-x 2 pkommireddi SF\domain users 68 Feb 13 01:16 out1
> drwxr-xr-x 2 pkommireddi SF\domain users 68 Feb 13 01:16 out2
> localhost:pig pkommireddi$ bin/pig -x local
> grunt> rmf out*
> grunt> quit
> localhost:pig pkommireddi$ ls -ld out*
> drwxr-xr-x 12 pkommireddi SF\domain users 408 Feb 13 01:09 out
> drwxr-xr-x 2 pkommireddi SF\domain users 68 Feb 13 01:16 out1
> drwxr-xr-x 2 pkommireddi SF\domain users 68 Feb 13 01:16 out2
> {code}
> Ideally, the user would expect "rmf out*" to delete all of the above dirs.
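The behavior requested in this issue, expanding "out*" before deleting, can be illustrated with a small stand-alone Java sketch. This is not the attached patch and not Pig's Grunt code: the class and method names are hypothetical, and it uses the JDK's java.nio glob matcher against a plain list of names instead of a Hadoop file system. An actual fix would more likely rely on Hadoop's own glob support (for example FileSystem.globStatus), which is an assumption here.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Pig.
final class GlobExpansionSketch {

    // Returns the candidate names that match the glob pattern, in input order.
    static List<String> expand(String glob, List<String> candidates) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> matched = new ArrayList<>();
        for (String name : candidates) {
            if (matcher.matches(Paths.get(name))) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Directory names from the transcript above, plus one that must not match.
        List<String> dirs = Arrays.asList("out", "out1", "out2", "input");
        // Each matched name would then be handed to the delete routine.
        System.out.println(expand("out*", dirs));
    }
}
```

With expansion done first, "rmf out*" would operate on every matching directory rather than looking for a path literally named "out*".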
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (33 issues)

Subscriber: pigdaily

Key         Summary
PIG-3216    Groovy UDFs documentation has minor typos
            https://issues.apache.org/jira/browse/PIG-3216
PIG-3215    [piggybank] Add LTSVLoader to load LTSV (Labeled Tab-separated Values) files
            https://issues.apache.org/jira/browse/PIG-3215
PIG-3210    Pig fails to start when it cannot write log to log files
            https://issues.apache.org/jira/browse/PIG-3210
PIG-3205    Passing arguments to python script does not work with -f option
            https://issues.apache.org/jira/browse/PIG-3205
PIG-3198    Let users use any function from PigType -> PigType as if it were builtlin
            https://issues.apache.org/jira/browse/PIG-3198
PIG-3185    Pig release lacks UDF for Initcap function
            https://issues.apache.org/jira/browse/PIG-3185
PIG-3184    Pig release lacks UDF for functions rtrim and repeat
            https://issues.apache.org/jira/browse/PIG-3184
PIG-3183    rm or rmf commands should respect globbing/regex of path
            https://issues.apache.org/jira/browse/PIG-3183
PIG-3166    Update eclipse .classpath according to ivy library.properties
            https://issues.apache.org/jira/browse/PIG-3166
PIG-3164    Pig current releases lack a UDF endsWith.This UDF tests if a given string ends with the specified suffix.
            https://issues.apache.org/jira/browse/PIG-3164
PIG-3162    PigTest.assertOutput doesn't allow non-default delimiter
            https://issues.apache.org/jira/browse/PIG-3162
PIG-3144    Erroneous map entry alias resolution leading to "Duplicate schema alias" errors
            https://issues.apache.org/jira/browse/PIG-3144
PIG-3142    Fixed-width load and store functions for the Piggybank
            https://issues.apache.org/jira/browse/PIG-3142
PIG-3136    Introduce a syntax making declared aliases optional
            https://issues.apache.org/jira/browse/PIG-3136
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3122    Operators should not implicitly become reserved keywords
            https://issues.apache.org/jira/browse/PIG-3122
PIG-3114    Duplicated macro name error when using pigunit
            https://issues.apache.org/jira/browse/PIG-3114
PIG-3105    Fix TestJobSubmission unit test failure.
            https://issues.apache.org/jira/browse/PIG-3105
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3081    Pig progress stays at 0% for the first job in hadoop 23
            https://issues.apache.org/jira/browse/PIG-3081
PIG-3069    Native Windows Compatibility for Pig E2E Tests and Harness
            https://issues.apache.org/jira/browse/PIG-3069
PIG-3028    testGrunt dev test needs some command filters to run correctly without cygwin
            https://issues.apache.org/jira/browse/PIG-3028
PIG-3027    pigTest unit test needs a newline filter for comparisons of golden multi-line
            https://issues.apache.org/jira/browse/PIG-3027
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-3010    Allow UDF's to flatten themselves
            https://issues.apache.org/jira/browse/PIG-3010
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2643    Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc
            https://issues.apache.org/jira/browse/PIG-2643
PIG-2641    Create toJSON function for all complex types: tuples, bags and maps
            https://issues.apache.org/jira/browse/PIG-2641
PIG-2591    Unit tests should not write to /tmp but respect java.io.tmpdir
            https://issues.apache.org/jira/browse/PIG-2591
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3199) Expose LogicalPlan via PigServer API
[ https://issues.apache.org/jira/browse/PIG-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587510#comment-13587510 ]

Prashant Kommireddi commented on PIG-3199:
------------------------------------------

Mainly because we don't need the extra steps of running the optimizer, generating the Physical plan, and generating the MR plan to get to this information. It just feels like querying the LP for source/sink or load/store funcs is more efficient. I would be happy to hear your thoughts.

> Expose LogicalPlan via PigServer API
> ------------------------------------
>
> Key: PIG-3199
> URL: https://issues.apache.org/jira/browse/PIG-3199
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.10.0
> Reporter: Prashant Kommireddi
> Assignee: Prashant Kommireddi
> Fix For: 0.12
>
> Attachments: PIG-3199.patch
>
>
> LogicalPlan could be exposed to the user so that one can make validations
> based on it. For example, one could get Load/Store paths or other operators
> and perform checks such as whether I/O paths are valid.
[jira] [Commented] (PIG-3212) Race Conditions in POSort and (Internal)SortedBag during Proactive Spill.
[ https://issues.apache.org/jira/browse/PIG-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587361#comment-13587361 ]

Dmitriy V. Ryaboy commented on PIG-3212:
----------------------------------------

Ok so I think the fix works, but it seems like this bug has been there for a long time -- the synchronization issue with SMM would've been there before 0.11, as well. Is it just more visible now because SMM is faster (fewer bags to go through)?

It seems unlikely that we ever get to spill an internal sorted bag, given this patch.. seems like it almost always has an iterator open. If the concern is using the same comparator -- could we not solve this by initializing a new comparator for every bag?

> Race Conditions in POSort and (Internal)SortedBag during Proactive Spill.
> --------------------------------------------------------------------------
>
> Key: PIG-3212
> URL: https://issues.apache.org/jira/browse/PIG-3212
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11
> Reporter: Kai Londenberg
> Priority: Critical
> Fix For: 0.12, 0.11.1
>
> Attachments: PIG-3212-p1.patch
>
>
> The following bug exists in the latest release of Pig 0.11.0
> While running some large jobs involving groups and sorts like these:
> {code}
> events_by_user = GROUP events BY user_id;
> sorted_events_by_user = FOREACH events_by_user {
>     A = ORDER events BY ts, split_idx, line_num;
>     GENERATE group, A;
> }
> {code}
> I got a pretty strange behaviour: While this worked on small datasets, if I
> ran it on large datasets, the results were sometimes not sorted perfectly.
> So after a long debugging session, I tracked it down to at least one race
> condition:
> The following partial stack trace shows how a proactive spill gets triggered
> on an InternalSortedBag. A spill in turn triggers a sort of that
> InternalSortedBag.
> {code}
> at org.apache.pig.data.SortedSpillBag.proactive_spill(SortedSpillBag.java:83)
> at org.apache.pig.data.InternalSortedBag.spill(InternalSortedBag.java:455)
> at org.apache.pig.impl.util.SpillableMemoryManager.handleNotification(SpillableMemoryManager.java:243)
> at sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:138)
> at sun.management.MemoryImpl.createNotification(MemoryImpl.java:171)
> at sun.management.MemoryPoolImpl$PoolSensor.triggerAction(MemoryPoolImpl.java:272)
> at sun.management.Sensor.trigger(Sensor.java:120)
> {code}
> At the same time, the same InternalSortedBag might be sorted or accessed
> within a POSort Operation. For example using the following Code path (line
> numbers might be off, I had to add debug statements to diagnose this)
> {code}
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSort.getNext(POSort.java:346)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:492)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:582)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNext(PORelationToExprProject.java:107)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:394)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:368)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.getNext(POSplit.java:214)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> {code}
> The key here is: Bot
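The two stack traces above show one bag being sorted by the memory-notification thread while the reducer thread works on it. As a rough illustration of that kind of race and of one generic way to close it, here is a minimal Java sketch. It is not the attached patch and does not use Pig classes: GuardedSortedBagSketch and its methods are hypothetical stand-ins, and guarding both code paths with the same monitor is only an assumed remedy for the sake of the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a sorted bag; not Pig's InternalSortedBag.
final class GuardedSortedBagSketch {
    private final List<Integer> contents = new ArrayList<>();

    synchronized void add(int value) {
        contents.add(value);
    }

    // Path 1: called by the "memory manager" thread; sorts in place, as a spill would.
    synchronized void spillSort() {
        Collections.sort(contents);
    }

    // Path 2: called by the "reducer" thread. It sorts and copies under the same
    // lock as spillSort(), so a half-finished concurrent sort is never observed.
    synchronized List<Integer> sortedSnapshot() {
        Collections.sort(contents);
        return new ArrayList<>(contents);
    }

    static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) {
                return false;
            }
        }
        return true;
    }

    // Runs both paths concurrently and returns what the reader thread saw.
    static List<Integer> demo() {
        GuardedSortedBagSketch bag = new GuardedSortedBagSketch();
        for (int i = 10_000; i > 0; i--) {
            bag.add(i);
        }
        Thread spiller = new Thread(bag::spillSort);
        spiller.start();
        List<Integer> result = bag.sortedSnapshot();
        try {
            spiller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

Without the shared lock, both threads could run Collections.sort on the same ArrayList at once, which is the sort of interleaving that can leave results "not sorted perfectly".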
[jira] [Commented] (PIG-3067) HBaseStorage should be split up to become more manageable
[ https://issues.apache.org/jira/browse/PIG-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587338#comment-13587338 ]

Dmitriy V. Ryaboy commented on PIG-3067:
----------------------------------------

Just chiming in to say thanks, I like where this is going as well.

> HBaseStorage should be split up to become more manageable
> ----------------------------------------------------------
>
> Key: PIG-3067
> URL: https://issues.apache.org/jira/browse/PIG-3067
> Project: Pig
> Issue Type: Improvement
> Reporter: Christoph Bauer
> Assignee: Christoph Bauer
> Attachments: hbasestorage-split.patch
>
>
> HBaseStorage has become quite big (>1100 lines).
> I propose to split it up into more manageable parts. I believe it will become
> a lot easier to maintain.
> I split it up like this:
> HBaseStorage
> * settings:LoadStoreFuncSettings
> ** options
> ** caster
> ** udfProperties
> ** contextSignature
> ** columns:ColumnInfo - moved to its own class-file
> * loadFuncDelegate:HBaseLoadFunc - LoadFunc implementation
> ** settings:LoadStoreFuncSettings (see above)
> ** scanner:HBaseLoadFuncScanner - everything scan-specific
> ** tupleIterator:HBaseTupleIterator - interface for _public Tuple getNext()_
> * storeFuncDelegate:HBaseStorFunc - StoreFunc implementation
> ** settings:LoadStoreFuncSettings (see above)
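The outline above describes a facade that owns one shared settings object and hands it to separate load and store delegates. A minimal sketch of that shape in plain Java could look as follows. It borrows the names from the outline but is not the code in hbasestorage-split.patch: the fields, constructors, and describe methods are assumptions made for illustration, and none of the real Pig or HBase interfaces are implemented.

```java
import java.util.Arrays;
import java.util.List;

// Shared state, named after the outline above; the fields are illustrative.
final class LoadStoreFuncSettings {
    final String contextSignature;
    final List<String> columns;

    LoadStoreFuncSettings(String contextSignature, List<String> columns) {
        this.contextSignature = contextSignature;
        this.columns = columns;
    }
}

// Hypothetical load-side delegate; the real one would implement LoadFunc.
final class HBaseLoadFuncSketch {
    private final LoadStoreFuncSettings settings;

    HBaseLoadFuncSketch(LoadStoreFuncSettings settings) {
        this.settings = settings;
    }

    String describe() {
        return "load " + settings.columns + " (" + settings.contextSignature + ")";
    }
}

// Hypothetical store-side delegate; the real one would implement StoreFunc.
final class HBaseStorFuncSketch {
    private final LoadStoreFuncSettings settings;

    HBaseStorFuncSketch(LoadStoreFuncSettings settings) {
        this.settings = settings;
    }

    String describe() {
        return "store " + settings.columns + " (" + settings.contextSignature + ")";
    }
}

// The facade keeps the public entry point and forwards to the delegates.
final class HBaseStorageSketch {
    private final HBaseLoadFuncSketch loadFuncDelegate;
    private final HBaseStorFuncSketch storeFuncDelegate;

    HBaseStorageSketch(LoadStoreFuncSettings settings) {
        this.loadFuncDelegate = new HBaseLoadFuncSketch(settings);
        this.storeFuncDelegate = new HBaseStorFuncSketch(settings);
    }

    String describeLoad() {
        return loadFuncDelegate.describe();
    }

    String describeStore() {
        return storeFuncDelegate.describe();
    }

    public static void main(String[] args) {
        LoadStoreFuncSettings settings =
                new LoadStoreFuncSettings("sig", Arrays.asList("cf:a", "cf:b"));
        HBaseStorageSketch storage = new HBaseStorageSketch(settings);
        System.out.println(storage.describeLoad());
        System.out.println(storage.describeStore());
    }
}
```

Because both delegates receive the same settings instance, options parsed once by the facade are visible to the load and store sides without duplication.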
[jira] [Updated] (PIG-3067) HBaseStorage should be split up to become more manageable
[ https://issues.apache.org/jira/browse/PIG-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-3067:
-----------------------------

Summary: HBaseStorage should be split up to become more manageable (was: HBaseStorage should be split up to become more managable)

> HBaseStorage should be split up to become more manageable
> ----------------------------------------------------------
>
> Key: PIG-3067
> URL: https://issues.apache.org/jira/browse/PIG-3067
> Project: Pig
> Issue Type: Improvement
> Reporter: Christoph Bauer
> Assignee: Christoph Bauer
> Attachments: hbasestorage-split.patch
>
>
> HBaseStorage has become quite big (>1100 lines).
> I propose to split it up into more manageable parts. I believe it will become
> a lot easier to maintain.
> I split it up like this:
> HBaseStorage
> * settings:LoadStoreFuncSettings
> ** options
> ** caster
> ** udfProperties
> ** contextSignature
> ** columns:ColumnInfo - moved to its own class-file
> * loadFuncDelegate:HBaseLoadFunc - LoadFunc implementation
> ** settings:LoadStoreFuncSettings (see above)
> ** scanner:HBaseLoadFuncScanner - everything scan-specific
> ** tupleIterator:HBaseTupleIterator - interface for _public Tuple getNext()_
> * storeFuncDelegate:HBaseStorFunc - StoreFunc implementation
> ** settings:LoadStoreFuncSettings (see above)
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587328#comment-13587328 ]

Bill Graham commented on PIG-1832:
----------------------------------

I don't think there is a ticket to support returning multiple cell versions with timestamps, but we did discuss ideas for an approach here:

https://issues.apache.org/jira/browse/PIG-1782?focusedCommentId=12988192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12988192

Basically the idea is to create a new class to support this, since it would be fundamentally very different than what we currently support with {{HBaseStorage}}. That work might be better handled after we tackle PIG-3067 (HBaseStorage should be split up to become more manageable).

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587268#comment-13587268 ]

Guido Serra aka Zeph commented on PIG-1832:
-------------------------------------------

s/imaging/imagine

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587267#comment-13587267 ]

Guido Serra aka Zeph commented on PIG-1832:
-------------------------------------------

p.s. [~billgraham] I can't find a ticket addressing the outputting the timestamp... I mean, imaging I'd like to see multiple versions, given a time range... (k, I guess I need to create a feature ticket for that)

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Updated] (PIG-3216) Groovy UDFs documentation has minor typos
[ https://issues.apache.org/jira/browse/PIG-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-3216:
----------------------------

Status: Patch Available (was: Open)

> Groovy UDFs documentation has minor typos
> -----------------------------------------
>
> Key: PIG-3216
> URL: https://issues.apache.org/jira/browse/PIG-3216
> Project: Pig
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 0.11
> Reporter: Mathias Herberts
> Assignee: Mathias Herberts
> Priority: Trivial
> Attachments: PIG-3216.patch
[jira] [Commented] (PIG-3002) Pig client should handle CountersExceededException
[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587246#comment-13587246 ]

Jarek Jarcec Cecho commented on PIG-3002:
-----------------------------------------

Hi [~billgraham],
thank you very much for taking a look at this Jira and my patch. I was considering a solution similar to the one you proposed in my early work, but I noticed one side effect during experiments with my early patches.

I created a quite pathological case where my cluster was using the default configuration, but I limited the number of allowed counters to 3 on the machine where I executed pig. I noticed that with a similar fix, pig will print out a couple of counters and then bail out with an exception on the first non-existing counter. As a result, not all the counters will be printed out even though they are available in the {{Counter}} object. My experiment is obviously not entirely realistic, as it's unlikely that users will have a different hadoop configuration. However, I believe that it models the edge situation where a mapreduce job creates almost all available counters, but because the client is iterating over a predefined set, not all of them will be printed out.

I also went one step further and put the {{try-catch}} block inside the {{for}} iteration. I noticed that in this situation we might print out the error message several times, which is kind of distracting. This led me to the idea of making the changes in the shim layer that I've submitted.

Jarcec

> Pig client should handle CountersExceededException
> --------------------------------------------------
>
> Key: PIG-3002
> URL: https://issues.apache.org/jira/browse/PIG-3002
> Project: Pig
> Issue Type: Bug
> Reporter: Bill Graham
> Assignee: Jarek Jarcec Cecho
> Labels: newbie, simple
> Attachments: PIG-3002.2.patch, PIG-3002.patch
>
>
> Running a pig job that uses more than 120 counters will succeed, but a grunt
> exception will occur when trying to output counter info to the console. This
> exception should be caught and handled with friendly messaging:
> {noformat}
> org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
> at org.apache.pig.PigServer.execute(PigServer.java:1239)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:604)
> at org.apache.pig.Main.main(Main.java:154)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120
> at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
> at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
> at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
> {noformat}
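The trade-off discussed in this thread (a try-catch around the whole loop stops at the first missing counter, while a try-catch inside the loop can repeat the error message) can be sketched in plain Java. This is not the submitted shim-layer patch and does not use Hadoop classes: LimitExceeded, lookup, and report are hypothetical stand-ins. The sketch shows one assumed middle ground, catching inside the loop but warning only once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustration only; not Pig's MapReduceLauncher and not the shim-layer change.
final class CounterReportSketch {

    // Hypothetical stand-in for Hadoop's CountersExceededException.
    static final class LimitExceeded extends RuntimeException {
        LimitExceeded(String message) {
            super(message);
        }
    }

    // Hypothetical lookup: existing counters resolve; an unknown name throws,
    // mimicking a lookup that would need to create a counter past the limit.
    static long lookup(Map<String, Long> existing, String name) {
        Long value = existing.get(name);
        if (value == null) {
            throw new LimitExceeded("Exceeded limits on number of counters");
        }
        return value;
    }

    // Iterates over the predefined names, keeps going after a failure,
    // and emits the warning a single time.
    static List<String> report(Map<String, Long> existing, List<String> wanted) {
        List<String> lines = new ArrayList<>();
        boolean warned = false;
        for (String name : wanted) {
            try {
                lines.add(name + "=" + lookup(existing, name));
            } catch (LimitExceeded e) {
                if (!warned) {
                    lines.add("WARN: " + e.getMessage());
                    warned = true;
                }
            }
        }
        return lines;
    }
}
```

Every counter that does exist is still reported, and the user sees the limit warning once rather than once per missing counter.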
[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-1832:
-----------------------------

Environment: (was: Java 6, Mac OS X 10.6)

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-1832:
-----------------------------

Summary: Support timestamp in HBaseStorage when storing (was: Support timestamp in HBaseStorage)

> Support timestamp in HBaseStorage when storing
> ----------------------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587239#comment-13587239 ]

Bill Graham commented on PIG-1832:
----------------------------------

Yes, read via time ranges is done. Work on PIG-2114 seems stalled though and there's a lot going on in that patch. I propose this JIRA just add write support for -timestamp= for consistency with the current read API. That's a quick change that would be useful and would give full read/write support for timestamps. That would also help reduce the somewhat broad scope of PIG-2114.

> Support timestamp in HBaseStorage
> ---------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587189#comment-13587189 ]

Guido Serra aka Zeph commented on PIG-1832:
-------------------------------------------

They even just updated the documentation (PIG-2341):
- http://pig.apache.org/docs/r0.11.0/func.html#HBaseStorage

I'd say that just having "-timestamp=" usable both at LOAD and at STORE is all we need right now. As of version 0.11, this option is taken into consideration only at LOAD time.

p.s. there is a scenario though, which I'm covering with a custom python/jython script, that puzzles me: what if only one cell (row/column intersection) changes? HBase by design stores a new entry at a given timestamp for all the family:columns provided, even if they are identical. Shall we compute the difference within HBaseStorage, or shall the user handle it?

> Support timestamp in HBaseStorage
> ---------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
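The open question in this comment, whether unchanged cells should be filtered out before a new version is written, comes down to a per-row diff. The following Java sketch shows that diff on plain maps keyed by family:column. It is a hypothetical illustration, not HBaseStorage code and not the jython script mentioned above: CellDiffSketch and changedCells are invented names, and the stored row is assumed to have been read back already.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper; operates on plain maps instead of HBase Result/Put objects.
final class CellDiffSketch {

    // Returns only the incoming cells whose value differs from the stored value
    // (or that are not stored at all), preserving the incoming order.
    static Map<String, String> changedCells(Map<String, String> stored,
                                            Map<String, String> incoming) {
        Map<String, String> changed = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : incoming.entrySet()) {
            if (!Objects.equals(stored.get(cell.getKey()), cell.getValue())) {
                changed.put(cell.getKey(), cell.getValue());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("cf:a", "1");
        stored.put("cf:b", "2");

        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("cf:a", "1"); // unchanged, so it would not be rewritten
        incoming.put("cf:b", "3"); // changed
        incoming.put("cf:c", "4"); // new column

        System.out.println(changedCells(stored, incoming));
    }
}
```

Doing this inside the storage function would cost a read per row before each write, which is one argument for leaving the diff to the user.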
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587180#comment-13587180 ]

Guido Serra aka Zeph commented on PIG-1832:
-------------------------------------------

k, PIG-2886 is covering only the reading... this is actually attempting to cover the writing, let's keep it open.

It seems to be partially addressed in PIG-2114 though... [~eyang] any progress from your side?

> Support timestamp in HBaseStorage
> ---------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587174#comment-13587174 ]

Guido Serra aka Zeph commented on PIG-1832:
-------------------------------------------

[~eyang] in my view it is covered by PIG-2886, have a look at it

> Support timestamp in HBaseStorage
> ---------------------------------
>
> Key: PIG-1832
> URL: https://issues.apache.org/jira/browse/PIG-1832
> Project: Pig
> Issue Type: Improvement
> Environment: Java 6, Mac OS X 10.6
> Reporter: Eric Yang
>
> When storing data into HBase using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is
> stored with insertion time of the mapreduce job. It would be nice to have a
> way to populate timestamp from user data.
[jira] [Resolved] (PIG-3206) HBaseStorage does not work with Oozie pig action and secure HBase
[ https://issues.apache.org/jira/browse/PIG-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-3206.
-------------------------------------

Resolution: Fixed
Fix Version/s: 0.11.1

Thanks Dmitriy. Checked into 0.11.1 and trunk. Added a new section in CHANGES.txt for Release 0.11.1.

> HBaseStorage does not work with Oozie pig action and secure HBase
> ------------------------------------------------------------------
>
> Key: PIG-3206
> URL: https://issues.apache.org/jira/browse/PIG-3206
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.10.1
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.12, 0.11.1
>
> Attachments: PIG-3206-1.patch
>
>
> HBaseStorage always tries to fetch a delegation token for a secure hbase
> cluster. But when pig is launched through Oozie, it will fail as the TGT is
> not available in the map job. In that case, it should try to reuse the hbase
> delegation token in the JobConf passed to pig through the
> mapreduce.job.credentials.binary property.
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587161#comment-13587161 ]

Jonathan Coveney commented on PIG-3214:
---------------------------------------

Thanks for volunteering for this, Prashanth!

> New/improved mascot
> -------------------
>
> Key: PIG-3214
> URL: https://issues.apache.org/jira/browse/PIG-3214
> Project: Pig
> Issue Type: Wish
> Components: site
> Affects Versions: 0.11
> Reporter: Andrew Musselman
> Priority: Minor
> Fix For: 0.12
>
>
> Request to change pig mascot to something more graphically appealing.