Re: [VOTE] Release Pig 0.12.1 (Candidate 0)

2014-04-07 Thread Dmitriy Ryaboy
Release notes refer to CHANGES.txt which is inside a gzip. That seems less than idea. Can we make CHANGES.txt available in the release directory for future releases? On Mon, Apr 7, 2014 at 1:22 PM, Prashant Kommireddi prkommire...@apache.org wrote: I have created a candidate build for Pig

Re: [VOTE] Release Pig 0.12.1 (Candidate 0)

2014-04-07 Thread Dmitriy Ryaboy
.. ideal, I mean. Here's the CHANGES link for those who are interested: https://svn.apache.org/viewvc/pig/branches/branch-0.12/CHANGES.txt?revision=1585023&view=markup Overall looks good to me. +1. On Mon, Apr 7, 2014 at 2:58 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Release notes refer

Re: Pig 0.13.0 release

2014-02-12 Thread Dmitriy Ryaboy
to warrant a minor release. Julien On Feb 6, 2014, at 12:24 PM, Dmitriy Ryaboy wrote: Major updates since we released 12 that are currently in trunk: - lazy output (don't generate empty part files) - jar caching optimization - automatic local mode for small jobs (big wall-clock

Re: Review Request 17876: [PIG-3456] Reduce threadlocal conf access in backend for each record

2014-02-09 Thread Dmitriy Ryaboy
/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java https://reviews.apache.org/r/17876/#comment63997 let's move these string constants (batchsize, cachedbag type, etc) to PigConfiguration when we come across them - Dmitriy Ryaboy On Feb. 9

Re: Pig 0.13.0 release

2014-02-06 Thread Dmitriy Ryaboy
Major updates since we released 12 that are currently in trunk: - lazy output (don't generate empty part files) - jar caching optimization - automatic local mode for small jobs (big wall-clock wins for long-tail jobs) - improved support for BigInteger, BigDecimal - hbase loader improvements - debug

Re: Review Request 14552: PIG-3480 TFile-based tmpfile compression crashes in some cases

2013-10-11 Thread Dmitriy Ryaboy
the tmpfile compression storage config? trunk/src/org/apache/pig/impl/io/TFileStorage.java https://reviews.apache.org/r/14552/#comment52494 let's allow gzip and rewrite it to gz Looks mostly right! Just a few superficial comments. - Dmitriy Ryaboy On Oct. 11, 2013, 8:43 p.m., Aniket Mokashi

Re: Pig 0.10.1 to Pig 0.11.1 API compatibility break

2013-04-19 Thread Dmitriy Ryaboy
Hi Gerrit, we do try to keep backwards incompatible changes to a minimum, but sometimes they are needed to make progress. How about we make a practice of tagging notifications about new pig release candidates with [RC] so you can set up your filters and get a heads up to try your software with

Re: GSoC 2013

2013-04-08 Thread Dmitriy Ryaboy
; -- generate a score using the node list you can traverse the graph to the your finishing position store... Thanks Best Regards... On Mon, Apr 1, 2013 at 7:20 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: I'm somewhat familiar with WTF code (my day job

Re: HBase Types: Explicit Null Support

2013-04-03 Thread Dmitriy Ryaboy
Hiya Nick, Pig converts data for HBase storage using this class: https://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/hbase/HBaseBinaryConverter.java (which is mostly just calling into HBase's Bytes class). As long as Bytes handles the null stuff, we'll just inherit the

Re: Apache Pig 0.11.1 release candidate

2013-04-01 Thread Dmitriy Ryaboy
Roman and Mark, Joining Bill here in thanking you for BigTop and all the integration work you guys do. Since this issue came up so late (after the vote), and, while it does affect people trying to build an rpm, does not affect people just using Pig jars, etc, I'd move to release 0.11.1 and

Re: GSoC 2013

2013-04-01 Thread Dmitriy Ryaboy
... On Fri, Mar 29, 2013 at 6:10 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Hi Burakk, The general idea of making graph processing easier is a good one. I'm not sure what exactly you are proposing to do, though. Could you be more detailed about what you are thinking? On Thu, Mar 28

Re: Pig 0.10 on YARN

2013-03-29 Thread Dmitriy Ryaboy
Can you run `pig -secretDebugCmd` and look at your paths? I suspect there's something going on there, like not having the right configs or hadoop jars. On Thu, Mar 28, 2013 at 11:32 PM, Konstantin Boudnik c...@apache.org wrote: Guys, I am trying to run TestPigTest against a functional Hadoop

Re: GSoC 2013

2013-03-29 Thread Dmitriy Ryaboy
Hi Burakk, The general idea of making graph processing easier is a good one. I'm not sure what exactly you are proposing to do, though. Could you be more detailed about what you are thinking? On Thu, Mar 28, 2013 at 1:28 PM, burakkk burak.isi...@gmail.com wrote: Hi, I might be a little bit

Re: Pig 0.10 on YARN

2013-03-29 Thread Dmitriy Ryaboy
the code but didn't spot anything immediately. I will give -secretDebugCmd a try anyway - it might give me some ideas... Appreciate the help! Cos On Fri, Mar 29, 2013 at 09:08AM, Dmitriy Ryaboy wrote: Can you run `pig -secretDebugCmd` and look at your paths? I suspect there's something going

Re: Review Request: [PIG-3173] - Partition filter pushdown does not happen if partition keys condition include a AND and OR construct

2013-03-20 Thread Dmitriy Ryaboy
On March 20, 2013, 12:42 a.m., Dmitriy Ryaboy wrote: http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/PColFilterExtractor.java, line 224 https://reviews.apache.org/r/10035/diff/1/?file=272254#file272254line224 (A and B) or (C and D) is impossible

Re: Review Request: [PIG-3173] - Partition filter pushdown does not happen if partition keys condition include a AND and OR construct

2013-03-19 Thread Dmitriy Ryaboy
/#comment38256 is this new code or is this RB being funny? either way, dead code can be deleted; this is preferable to commenting it out. - Dmitriy Ryaboy On March 20, 2013, 12:16 a.m., Rohini Palaniswamy wrote

Re: Are we ready for 0.11.1 release?

2013-03-18 Thread Dmitriy Ryaboy
, 2013, at 17:08, Dmitriy Ryaboy dvrya...@gmail.com wrote: I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this? Current change log: Release 0.11.1 (unreleased) INCOMPATIBLE CHANGES

Re: Are we ready for 0.11.1 release?

2013-03-18 Thread Dmitriy Ryaboy
be nice to include this feature. On Mon, Mar 18, 2013 at 1:04 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Just +1'd it. I think after this one we are good to go? On Sun, Mar 17, 2013 at 9:09 PM, Daniel Dai da...@hortonworks.com wrote: Can I include PIG-3132? Thanks

Are we ready for 0.11.1 release?

2013-03-15 Thread Dmitriy Ryaboy
I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this? Current change log: Release 0.11.1 (unreleased) INCOMPATIBLE CHANGES IMPROVEMENTS PIG-2988: start deploying pigunit maven artifact part of

Introducing Parquet: efficient columnar storage for Hadoop.

2013-03-12 Thread Dmitriy Ryaboy
to contribute Parquet to the Apache Incubator when the development is farther along. Regards, Nong Li, Julien Le Dem, Marcel Kornacker, Todd Lipcon, Dmitriy Ryaboy, Jonathan Coveney, and friends.

Re: Contribute to PIG-3225

2013-03-11 Thread Dmitriy Ryaboy
+ Gianmarco On Mon, Mar 11, 2013 at 11:20 AM, Sadari Jayawardena sjayawardena...@gmail.com wrote: I am a final year undergraduate in Computer Science Engineering. I have a good experience in Java programming and interested in mathematics and statistics. I would like to contribute to this

Re: pig 0.11 candidate 2 feedback: Several problems

2013-03-08 Thread Dmitriy Ryaboy
-3194 http://goo.gl/UQ3zs. Please take a look when you folks have a chance. On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: I'd like to get the gc fix in as well, but looks like Rohini is about to commit it so we are good there. On Mar 1, 2013, at 11:33 AM

Re: pig 0.11 candidate 2 feedback: Several problems

2013-03-01 Thread Dmitriy Ryaboy
Jarcec Cecho jar...@apache.org wrote: Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20. Jarcec On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote: I agree -- this is a good release. The bugs Kai pointed out should be fixed, but as they are not critical

How do we post to the apache pig blog?

2013-02-22 Thread Dmitriy Ryaboy
I prepared a detailed post going over the pig 0.11 release, and realized I don't know how to post to the apache pig blog. Does anyone have a pointer?

Re: ON_ERROR command

2013-02-22 Thread Dmitriy Ryaboy
Adam it would be *great* if someone worked on this. I would love to hear your thoughts on the design, it was written a while ago and could probably be improved (though it's pretty viable still, I think). You guys are using Pig at trifacta? On Fri, Feb 22, 2013 at 12:10 PM, Adam Silberstein

Pig 0.11: new features and improvements

2013-02-22 Thread Dmitriy Ryaboy
I pulled together some of the highlights of the pig 0.11 release on the Apache Pig blog (which now officially exists!): https://blogs.apache.org/pig/ D

Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-20 Thread Dmitriy Ryaboy
I agree -- this is a good release. The bugs Kai pointed out should be fixed, but as they are not critical regressions, we can fix them in 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are committed, I won't mind and will dutifully vote for the release). I think the Hadoop 20.2

Re: [VOTE] Release Pig 0.11.0 (candidate 2)

2013-02-20 Thread Dmitriy Ryaboy
+1 tests pass. ran sample jobs in local mode and on cluster. Verified HBaseStorage still works. On Tue, Feb 19, 2013 at 2:12 PM, Daniel Dai da...@hortonworks.com wrote: +1. Checked piggybank.jar in tarball/rpm/deb, checked release notes, all looks good this time. Daniel On Thu, Feb 14,

Re: Pig 11.0

2013-01-29 Thread Dmitriy Ryaboy
Sounds like we can roll a release. Who wants to do the honors? I think Olga and Daniel have done it in the past, not sure if they have the time? On Mon, Jan 28, 2013 at 11:37 PM, Bill Graham billgra...@gmail.com wrote: I've just committed the 2 documentation jiras and all Pig 0.11 issues are

Re: Reducer estimation

2012-12-06 Thread Dmitriy Ryaboy
How would flat files work? The data needs to be updated by every pig run. On Dec 3, 2012, at 11:10 PM, Prashant Kommireddi prash1...@gmail.com wrote: Awesome! It would be good to have a flat-file based impl as there will probably a lot of pig users not having an hbase instance setup for

Re: [DISCUSS] Remove Penny from contrib

2012-11-01 Thread Dmitriy Ryaboy
How would this work -- would we dump it on github somewhere in case someone wants to use it? On Thu, Nov 1, 2012 at 10:59 AM, Julien Le Dem jul...@twitter.com wrote: Sad but +1 Maybe we should review what target we run before checking in changes. Like compiling contrib projects ? Or possibly

Re: CHANGES.txt in branches

2012-10-16 Thread Dmitriy Ryaboy
Guilty.. I guess we should be putting them under 0.11 in trunk. On Tue, Oct 16, 2012 at 8:18 PM, Jonathan Coveney jcove...@gmail.com wrote: AFAIK (and I don't really know), I thought that if we put it in both, that it'd go in the pig 11 section in trunk, and if not, we don't. Is this correct?

Cutting Pig-11 branch at 1pm PST

2012-10-12 Thread Dmitriy Ryaboy
I will begin branching Pig 0.11 around 1pm PST, in about 2 hours. I will send another email when I start, and another when I finish; please refrain from committing any patches between those two messages. -Dmitriy

Please hold all commits to Pig trunk.

2012-10-12 Thread Dmitriy Ryaboy
I am branching the 0.11 branch, will let you know when that's done. D

Re: Please hold all commits to Pig trunk.

2012-10-12 Thread Dmitriy Ryaboy
All clear. Please remember to commit bug fixes to both trunk and 0.11. Please keep new features and exploratory stuff out of 0.11. D On Fri, Oct 12, 2012 at 1:02 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: I am branching the 0.11 branch, will let you know when that's done. D

Re: Pig 0.11

2012-10-12 Thread Dmitriy Ryaboy
...@yahoo.com wrote: Dmitry, I would be happy to help with the release process. Want to get back into this now that I am back at work. Let me know what you would like me to do. Olga From: Dmitriy Ryaboy dvrya...@gmail.com To: dev@pig.apache.org Cc: billgra

Re: Pig 0.11

2012-10-11 Thread Dmitriy Ryaboy
I think we are good to branch. If the gem thing is not a big change, I'm ok with putting it into the branch post-factum. Daniel -- will you be managing this release? D On Thu, Oct 11, 2012 at 11:03 AM, Russell Jurney russell.jur...@gmail.com wrote: I'm gonna test JRuby gems tomorrow, in hopes

Re: PigServer API

2012-10-11 Thread Dmitriy Ryaboy
Doesn't executeBatch() return exactly what you want? On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi prash1...@gmail.com wrote: I knew I had those negotiation skills :) Patch is available, please review. It's a minor one https://issues.apache.org/jira/browse/PIG-2964 -Prashant On

Re: PigServer API

2012-10-11 Thread Dmitriy Ryaboy
, Oct 11, 2012 at 12:27 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Doesn't executeBatch() return exactly what you want? On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi prash1...@gmail.com wrote: I knew I had those negotiation skills :) Patch is available, please review. It's

Re: PigServer API

2012-10-11 Thread Dmitriy Ryaboy
actually, no it's not, if all we are changing is return type from void to something better. carry on. On Thu, Oct 11, 2012 at 2:28 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Backwards compatibility is an issue.. On Thu, Oct 11, 2012 at 12:54 PM, Prashant Kommireddi prash1...@gmail.com wrote

Re: Pig 0.11

2012-10-11 Thread Dmitriy Ryaboy
Ok I will branch 0.11 tomorrow morning unless someone objects. From then on, committers should be careful to commit bug fixes to both 0.11 branch and trunk; minor polish can go into the branch, but whole new features should not (we can discuss on the list if something is in the gray area). D On

Re: Unexpected data type

2012-10-07 Thread Dmitriy Ryaboy
with hadoop version 1.0.0, I checkout three times Pig from trunk and I got the same exception. What could I do to solve it? On Sun, Oct 7, 2012 at 2:17 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: This is a serialization error. What version of Pig are you running?

Re: Unexpected data type

2012-10-06 Thread Dmitriy Ryaboy
This is a serialization error. What version of Pig are you running? On Sat, Oct 6, 2012 at 10:38 AM, Allan aaven...@gmail.com wrote: Hi everybody, I was trying to run a simple script: A = LOAD 'test01' AS (f1:chararray,f2:int,f3:chararray); B = order A by f1,f2,f3 DESC; but, suddenly I

Re: Time to branch 0.11?

2012-09-23 Thread Dmitriy Ryaboy
to me Julien On Sat, Sep 22, 2012 at 4:59 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Hi folks, Should we branch 0.11? I don't see anything major left outstanding other than Jon's SchemaTuple integration work (which is practically ready and can be pushed

Time to branch 0.11?

2012-09-22 Thread Dmitriy Ryaboy
Hi folks, Should we branch 0.11? I don't see anything major left outstanding other than Jon's SchemaTuple integration work (which is practically ready and can be pushed to both a branch and trunk), just a few bug fixes here and there. I'd like to branch before merging in Prasanth's CUBE operator

Re: CI server and /tmp

2012-09-19 Thread Dmitriy Ryaboy
Wrong pig-dev :) On Wed, Sep 19, 2012 at 10:35 AM, Julien Le Dem jul...@twitter.com wrote: Hello, Do the Jenkins slaves cleanup their /tmp regularly? If this is the case, would it be possible to delete only stuff that is older than a day ? Many things write to /tmp and it seems tests fail

Re: Modifying databag on the fly

2012-09-08 Thread Dmitriy Ryaboy
FYI -- we wound up going with a much cleaner and memory-friendly solution of returning a new databag implementation which simply proxied all the calls to the original bag, but returned a special Iterator which applied the necessary transformation to tuples on the fly. That way, we don't need to
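The proxy-plus-transforming-iterator idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain `java.lang.Iterable` rather than Pig's actual bag interface, and the class name `TransformingView` and the helper `toList` are hypothetical, not anything from the Pig codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/**
 * Hypothetical stand-in for a proxying bag: a read-only view over an
 * existing collection that transforms each element only when iterated.
 * Not Pig's DataBag API; plain Iterable is assumed for illustration.
 */
final class TransformingView<T> implements Iterable<T> {
    private final Iterable<T> original;     // the wrapped bag; never copied
    private final Function<T, T> transform; // applied per element during iteration

    TransformingView(Iterable<T> original, Function<T, T> transform) {
        this.original = original;
        this.transform = transform;
    }

    @Override
    public Iterator<T> iterator() {
        final Iterator<T> inner = original.iterator();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            // The transformation happens here, one element at a time.
            @Override public T next() { return transform.apply(inner.next()); }
        };
    }

    /** Demo helper: drains any Iterable into a list. */
    static <T> List<T> toList(Iterable<T> items) {
        List<T> out = new ArrayList<>();
        for (T item : items) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> bag = Arrays.asList("a", "b");
        TransformingView<String> view = new TransformingView<>(bag, String::toUpperCase);
        System.out.println(toList(view)); // transformed copies exist only while iterating
        System.out.println(bag);          // the original collection is left untouched
    }
}
```

The point of the design is that no second bag is ever materialized; memory use stays at whatever the original bag already needed, plus one transformed element at a time.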

Re: Current patch available' and open issues

2012-09-04 Thread Dmitriy Ryaboy
be a great way to be reminded of what patches need review. I dug around jira a bit and couldn't figure out how to set this up. Alan, did you mention that you knew a way to do this? On Sun, Sep 2, 2012 at 5:01 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Hi folks, Here's a link to almost 40

Current patch available' and open issues

2012-09-02 Thread Dmitriy Ryaboy
Hi folks, Here's a link to almost 40 JIRAs where patches are available. We should review them and either close as won't fix, drop patch available if the patch doesn't pass muster, or commit.

Re: Number of mappers in MRCompiler

2012-08-23 Thread Dmitriy Ryaboy
I think we decided to instead stub in a special loader that reads a few records from each underlying split, in a single mapper (by using a single wrapping split), right? On Thu, Aug 23, 2012 at 7:55 PM, Prasanth J buckeye.prasa...@gmail.com wrote: I see. Thanks Alan for your reply. Also one

Re: Review Request: PIG-2831: MR-Cube implementation (Distributed cubing for holistic measures)

2012-08-17 Thread Dmitriy Ryaboy
these be added to POCube and extracted from within the contained plans instead? - Dmitriy Ryaboy On Aug. 16, 2012, 10:19 p.m., Prasanth_J wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6651

Re: Where can I download Pig OR Lets put a download link on the home page!

2012-08-10 Thread Dmitriy Ryaboy
No objection. And a giant link to current docs. Google keeps sending people to 0.7. On Aug 9, 2012, at 10:29 PM, Russell Jurney russell.jur...@gmail.com wrote: I just clicked 14 times to find the Pig download link. While I have a special skill of emulating the dumbest of users, I think we

Re: illustrate

2012-08-10 Thread Dmitriy Ryaboy
I must be missing some tricky detail... Which of these operations could not be done by clever udfs? On Aug 9, 2012, at 9:01 AM, Gianmarco De Francisci Morales g...@apache.org wrote: Hi Allan, I think I found an answer to your problem: 1) Modify PhysicalPlanResetter by adding:

Re: Documentation fix for SAMPLE operator

2012-07-15 Thread Dmitriy Ryaboy
Docs are managed just like code, via Jira. Please file a bug, I'll commit. Thanks for fixing this! On Jul 12, 2012, at 3:44 PM, Prasanth J buckeye.prasa...@gmail.com wrote: Just noticed that the description for SAMPLE operator has the description of SPLIT operator.

Re: Including wonderdog in Pig contrib

2012-07-10 Thread Dmitriy Ryaboy
I don't see the need for Pig to include much of anything in contrib (and that includes things that are currently in contrib). Wonderdog is a great library, and it's fantastic that Russell wants to make sure it works with latest Pig versions. I don't see how putting the apache process in Russell's

Re: Pig for MongoDB

2012-07-09 Thread Dmitriy Ryaboy
You'd have to do something about shipping the UDFs to Mongo. But try it -- the generalization code that was pulled was stuff like fs abstraction (want to work on something that's not HDFS? just implement the FileSystem interface from hadoop, like S3 and Cassandra did) and slices (just use an

Known Jira issues?

2012-07-05 Thread Dmitriy Ryaboy
I haven't been able to get into Apache Jira for days. Apparently several others have had more success. Is this a known thing? Does apache infra have an email? (the only thing I've found for them is the jira, which, of course, is unreachable from my machine...) D

Re: Review Request: PIG-2726: Handling legitimate NULL values in CUBE operator

2012-07-01 Thread Dmitriy Ryaboy
. http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/CubeDimensions.java https://reviews.apache.org/r/5470/#comment18553 I would rather we set the fields, rather than copy the tuple and then set them. Unnecessary copies = time + gc pressure. - Dmitriy Ryaboy On June 21

Re: Review Request: PIG-2765: Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-01 Thread Dmitriy Ryaboy
/TestRollupDimensions.java https://reviews.apache.org/r/5521/#comment18590 this will need to change with the new null handling - Dmitriy Ryaboy On June 22, 2012, 7:35 a.m., Prasanth_J wrote: --- This is an automatically generated e-mail. To reply, visit

Re: Build failed in Jenkins: Pig-trunk #1268

2012-06-28 Thread Dmitriy Ryaboy
That's my fault, xml error in the new merge join docs. Will fix. On Thu, Jun 28, 2012 at 3:09 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: See https://builds.apache.org/job/Pig-trunk/1268/changes Changes: [julien] PIG-2750: add artifacts to the ivy.xml for other jars Pig

Re: Can we set up a pig-git reviewboard, similar to hcatalog-git?

2012-06-25 Thread Dmitriy Ryaboy
+1. Hcat people, how did you guys do that? Infra ticket? D On Thu, Jun 21, 2012 at 11:37 AM, Jonathan Coveney jcove...@gmail.com wrote: Currently our pig reviewboard is configured for svn diffs... I believe it is possible to set up another reviewboard base that would accept diffs from git.

Re: CUBE/ROLLUP/GROUPING SETS syntax

2012-06-22 Thread Dmitriy Ryaboy
One happens on the mapper. On Thu, Jun 21, 2012 at 2:52 PM, Prasanth J buckeye.prasa...@gmail.com wrote: Thanks Alan. Your suggestion looks correct. I think with this I can achieve what I wanted in the same syntax out = CUBE rel BY CUBE(a,b,c), ROLLUP(c,d), CUBE(e,f); Just curious to know.

Re: [jira] [Commented] (PIG-2726) Handling legitimate NULL values

2012-06-16 Thread Dmitriy Ryaboy
Approach as described seems sound to me. Will review patch early next week. On Jun 13, 2012, at 9:46 PM, Prasanth J (JIRA) j...@apache.org wrote: [

Re: [jira] [Commented] (PIG-2747) Support more predicate pushdown to a data source by pulling up multiple predicates from branches using the same data source

2012-06-16 Thread Dmitriy Ryaboy
I don't think a union is required for this to make sense. On Jun 11, 2012, at 11:58 AM, Daniel Dai (JIRA) j...@apache.org wrote: [

Re: Persisting Pig Scripts

2012-06-06 Thread Dmitriy Ryaboy
You can write a nightly cron that runs the JobHistoryLoader job and stores parsed scripts to hdfs... D On Wed, Jun 6, 2012 at 5:16 PM, Prashant Kommireddi prash1...@gmail.com wrote: I think that would be more of a post-process vs having Pig write the same to a HDFS location. That would avoid

Re: Handle NULL values in Cube dimensions

2012-06-06 Thread Dmitriy Ryaboy
Note that the current CubeDimensions UDF does a third thing -- instead of rebranding nulls as unknown and using null to mean * or all values, the UDF allows you to specify a custom value to stand for * or all values. That way null can be an individual valid cell value. This is (imho) much nicer
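To make the custom-marker idea concrete, here is a minimal sketch of expanding one row of dimension values into its cube combinations. It is not the actual CubeDimensions UDF: the class `CubeMarkerSketch`, the method `expand`, and the marker string `"ALL"` are assumptions chosen for illustration. The property it demonstrates is the one described above: because a caller-supplied marker stands for "all values", a null dimension value passes through unchanged as an ordinary cell value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the real org.apache.pig.builtin.CubeDimensions. */
final class CubeMarkerSketch {

    /**
     * Expands one row of n dimension values into all 2^n cube combinations.
     * Each rolled-up dimension is replaced by allMarker; dimensions that are
     * kept retain their original value, including null.
     */
    static List<List<String>> expand(List<String> dims, String allMarker) {
        int n = dims.size();
        List<List<String>> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> combo = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                boolean rolledUp = (mask & (1 << i)) != 0;
                combo.add(rolledUp ? allMarker : dims.get(i));
            }
            out.add(combo);
        }
        return out;
    }

    public static void main(String[] args) {
        // The second dimension is a legitimate null, distinct from the "ALL" marker.
        for (List<String> combo : expand(Arrays.asList("us", null), "ALL")) {
            System.out.println(combo);
        }
    }
}
```

With a marker distinct from null, a downstream GROUP can tell "this dimension was aggregated away" apart from "this dimension's value really was null", which is the distinction the message argues for.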

Re: Newbie / Simple issues

2012-04-30 Thread Dmitriy Ryaboy
' tag, with a link to a broken JIRA filter: https://cwiki.apache.org/confluence/display/PIG/HowToContribute We should swap in this link. Should we just change to use simple and get rid of the newbie tag? Or do we say to look for either? On Sun, Apr 29, 2012 at 3:44 PM, Dmitriy Ryaboy dvrya

Re: Review Request: PIG-2167 - Naive implementation of CUBE operator

2012-04-29 Thread Dmitriy Ryaboy
On 2012-04-07 23:50:33, Dmitriy Ryaboy wrote: http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCube.java, line 69 https://reviews.apache.org/r/4670/diff/1/?file=100633#file100633line69 That doesn't seem right. Prasanth_J wrote

Re: Review Request: PIG-2167 - Naive implementation of CUBE operator

2012-04-29 Thread Dmitriy Ryaboy
and Dmitriy Ryaboy. Summary --- This is a review board for https://issues.apache.org/jira/browse/PIG-2167 This addresses bug https://issues.apache.org/jira/browse/PIG-2167. Diffs

Re: Review Request: PIG-2167 - Naive implementation of CUBE operator

2012-04-29 Thread Dmitriy Ryaboy
/ --- (Updated 2012-04-12 07:12:48) Review request for pig and Dmitriy Ryaboy. Summary --- This is a review board for https://issues.apache.org/jira/browse/PIG-2167 This addresses bug https://issues.apache.org/jira

Re: [jira] [Commented] (PIG-2674) Parameter substitution with spaces does not work

2012-04-28 Thread Dmitriy Ryaboy
Maybe the appropriate patch here is a documentation one?

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-24 Thread Dmitriy Ryaboy
...@gmail.com wrote: Can someone from LinkedIn try this release candidate? It may break your AvroStorage, so that would be good to know. Russell Jurney http://datasyndrome.com On Apr 23, 2012, at 6:36 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: +1 Verified several jobs using

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-24 Thread Dmitriy Ryaboy
0.1, I don't think we should block 0.10 for it.  The change is to change the directions in HowToRelease. Alan. On Apr 24, 2012, at 3:16 PM, Dmitriy Ryaboy wrote: That's a good catch. We shouldn't officially publish a SNAPSHOT build... Or does that get fixed only when you officially

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-24 Thread Dmitriy Ryaboy
and get it in 0.10? https://issues.apache.org/jira/browse/PIG-2266 On Tue, Apr 24, 2012 at 4:01 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Heh. Ok my +1 stands. On Tue, Apr 24, 2012 at 3:25 PM, Alan Gates ga...@hortonworks.com wrote: Oddly enough that's how we've always done the version

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-23 Thread Dmitriy Ryaboy
+1 Verified several jobs using Elephant-Bird loaders. Tested correctness with pig.exec.mapPartAgg both true and false. Verified license. Verified release notes. Ran test-commit D On Sat, Apr 21, 2012 at 12:27 PM, Daniel Dai da...@hortonworks.com wrote: We should do sanity check of the

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-23 Thread Dmitriy Ryaboy
) Something about my environment? D On Mon, Apr 23, 2012 at 6:36 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: +1 Verified several jobs using Elephant-Bird loaders. Tested correctness with pig.exec.mapPartAgg both true and false. Verified license. Verified release notes. Ran test-commit D

Re: [VOTE] Release Pig 0.10.0 (candidate 0)

2012-04-23 Thread Dmitriy Ryaboy
- Connecting to hadoop file system at: file:/// Is this something with my script, or may be the new version (0.10.0)? -Prashant On Mon, Apr 23, 2012 at 8:30 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Uh, actually, one of the test-commit tests failed in my environment. In TestPigServer

Apache Pig hackday @ Twitter (SF)

2012-04-18 Thread Dmitriy Ryaboy
Hi folks, The Analytics Infra team at Twitter will be hosting a Pig hackday on May 11. On the agenda: - get newcomers set up with the apache ticket process - review and commit a bunch of stuff that's not been getting love - hack on exciting new features - fix boring old problems - the Dmitriy

Re: Java 7 and Pig, hijacked from PIG-2643

2012-04-16 Thread Dmitriy Ryaboy
Last time I tried to run a Java 7 jar on a java 6 jvm, things blew up. This suggests to me that before Pig can use Java 7, Hadoop has to be able to (since hadoop spins out the jvm process that runs the pig map/reduce jobs). Do you know if Hadoop can run on Java 7? D On Mon, Apr 16, 2012 at 10:18

Re: [jira] [Commented] (PIG-2065) IsEmpty should be Accumulative

2012-04-11 Thread Dmitriy Ryaboy
Cause if it's accumulative, the bag doesn't need to be loaded in ram to be checked for emptiness. On Apr 11, 2012, at 10:55 AM, Jonathan Coveney (Commented) (JIRA) j...@apache.org wrote: [

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Dmitriy Ryaboy
, 2012 at 5:38 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Jackson is your friend. On Mon, Apr 9, 2012 at 5:14 PM, Russell Jurney russell.jur...@gmail.com wrote: I need to be able to JSONize and return json:chararray's of any pig datatypes, to be able to index complex types in ElasticSearch

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-10 Thread Dmitriy Ryaboy
to postpone accessibility. Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com On Apr 10, 2012, at 7:53 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: first question: you can do this when outputSchema() is called, as it's passed the input schema. IIRC, in trunk you have hooks

Re: Scalar problem

2012-04-09 Thread Dmitriy Ryaboy
Alan, which idea are you +1 on? I think (int) D is the current syntax. There are a couple problems that people hit in the current scalar implementation, both of which I think can be fixed without introducing new syntax: 1) Require the cast, don't do it implicitly. This was actually in the design

Re: toJSON function for tuples, bags and strings, PIG-2641

2012-04-09 Thread Dmitriy Ryaboy
Jackson is your friend. On Mon, Apr 9, 2012 at 5:14 PM, Russell Jurney russell.jur...@gmail.com wrote: I need to be able to JSONize and return json:chararray's of any pig datatypes, to be able to index complex types in ElasticSearch via Wonderdog.  See:

Re: Is there any reason why the private instance methods of BinInterSedes shouldn't be made protected static?

2012-04-06 Thread Dmitriy Ryaboy
I think you can safely pull out such functionality into a general helper class. On Fri, Apr 6, 2012 at 9:29 AM, Jonathan Coveney jcove...@gmail.com wrote: https://issues.apache.org/jira/browse/PIG-2632 I'm working on a way to use code generation to generate custom Tuples when the Schema is

Re: What outstanding patches are must haves for 0.10? (path to a RC theater)

2012-04-06 Thread Dmitriy Ryaboy
WOOT. On Fri, Apr 6, 2012 at 1:53 PM, Daniel Dai da...@hortonworks.com wrote: Ok, let's cut off the 0.10.0 release. I will start testing during the weekend. Daniel On Wed, Apr 4, 2012 at 11:17 PM, Russell Jurney russell.jur...@gmail.com wrote: JRuby and a working AvroStorage. Woohoo!

Welcome Pig's newest committer, Bill Graham!

2012-04-05 Thread Dmitriy Ryaboy
Hi all, On behalf of the Pig PMC, I'm very happy to announce that Bill Graham has been invited to become a Pig committer. Bill's been involved in the Pig project for a long time now, and has made a number of significant contributions -- big improvements to HBase and Avro support, memory leak

Re: [gsoc2012] plan/data flow visualizer web interface - PIG-2586

2012-04-05 Thread Dmitriy Ryaboy
Do we have a committer who's signed up to mentor this issue? Russ can't according to apache's guidelines for gsoc (though he can certainly help advise / review code / etc -- just not be an official apache foundation rep to google regarding progress of the project) D On Thu, Apr 5, 2012 at 5:01

Re: pig-dev and dev@pig

2012-03-27 Thread Dmitriy Ryaboy
Pig-dev is from when pig was in incubator. Just use dev. On Mar 27, 2012, at 2:00 AM, Russell Jurney russell.jur...@gmail.com wrote: I am on both of these. Its all dupes. Is there a reason both exist? -- Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com

Re: [GSoC 2012] Interested in PIG Cubing [PIG-2167]

2012-03-27 Thread Dmitriy Ryaboy
Yup, I'll be happy to work with students on this. Looking forward to your application. On Tue, Mar 27, 2012 at 1:08 PM, Daniel Dai da...@hortonworks.com wrote: I think it will be Dmitriy. Daniel On Tue, Mar 27, 2012 at 12:19 PM, ASF - Maillists buckeye.prasa...@gmail.com wrote: Thanks

Re: Making git the repo of choice for Pig?

2012-03-27 Thread Dmitriy Ryaboy
in my humble opinion. Cheers, -- Gianmarco On Sat, Mar 24, 2012 at 01:27, Dmitriy Ryaboy dvrya...@gmail.com wrote: There is a check box you check when you upload a patch. If we committed without verifying you checked it, that's an unfortunate oversight. No, you can't send a pull

Re: Making git the repo of choice for Pig?

2012-03-23 Thread Dmitriy Ryaboy
: What's the work? Russell Jurney http://datasyndrome.com On Mar 21, 2012, at 9:39 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: All that's left is for someone to volunteer to do the work. On Mar 21, 2012, at 9:17 PM, Russell Jurney russell.jur...@gmail.com wrote: What do we have

Re: Making git the repo of choice for Pig?

2012-03-22 Thread Dmitriy Ryaboy
. Russell Jurney http://datasyndrome.com On Mar 21, 2012, at 10:20 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Answering that question is the first part of the answer to that question. #recursion D On Wed, Mar 21, 2012 at 9:57 PM, Russell Jurney russell.jur...@gmail.com wrote

Re: Making git the repo of choice for Pig?

2012-03-21 Thread Dmitriy Ryaboy
All that's left is for someone to volunteer to do the work. On Mar 21, 2012, at 9:17 PM, Russell Jurney russell.jur...@gmail.com wrote: What do we have to do to be the first real project that uses git? Let's do that. Or, let's just sink svn to github. It will propel the project forward.

Re: Making git the repo of choice for Pig?

2012-03-21 Thread Dmitriy Ryaboy
Answering that question is the first part of the answer to that question. #recursion D On Wed, Mar 21, 2012 at 9:57 PM, Russell Jurney russell.jur...@gmail.com wrote: What's the work? Russell Jurney http://datasyndrome.com On Mar 21, 2012, at 9:39 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote

Re: Pig User Group

2012-03-19 Thread Dmitriy Ryaboy
Twitter can host :) D On Mon, Mar 19, 2012 at 3:44 PM, Jonathan Coveney jcove...@gmail.com wrote: I'd love for there to be a PUG, and equally would love to know if there is still a contributor meeting... 2012/3/19 Russell Jurney russell.jur...@gmail.com Are the contributor meetings still

Re: [ANNOUNCE] Welcome new Apache Pig Committers and PMC members

2012-03-19 Thread Dmitriy Ryaboy
Huge congrats to both, well-deserved! D On Mon, Mar 19, 2012 at 5:03 PM, Daniel Dai da...@hortonworks.com wrote: Pig users and developers, The Apache Pig PMCs is pleased to announce the new additions to Pig project: * Jonathan Coveney is now Apache Pig committer * Julien Le Dem is now

Re: How can I track the actual map and reduce tasks executed in pig?

2012-03-12 Thread Dmitriy Ryaboy
the execution MapReduce plan by manipulating the script. Thanks! On Fri, Mar 9, 2012 at 2:02 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Pig always runs the same piece of code, generically speaking. There is no codegen. What actually happens is driven by serialized DAGs of the physical

yslow optimizations

2012-03-10 Thread Dmitriy Ryaboy
Yslow does some clever correlation-based optimizations to achieve significant speedups. They have a good paper about it: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf Note the Hive/Pig numbers.. we are generating unnecessary jobs, and too much intermediate data, it

Re: How can I track the actual map and reduce tasks executed in pig?

2012-03-09 Thread Dmitriy Ryaboy
to the worker? Or can you tell me which piece of source code in the pig project generate the map and reduce tasks parsed to the slave worker? Thanks! On Thu, Mar 8, 2012 at 8:23 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: That's what I get for reading explain plans on an iphone. Sorry. So
