[jira] [Updated] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer

2013-03-20 Thread Feng Peng (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Peng updated PIG-3222: --- Attachment: PigStorerDemo.java A test class that decorates a StoreFunc and shows the UDFContextSignature set

[jira] [Commented] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer

2013-03-20 Thread Feng Peng (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607326#comment-13607326 ] Feng Peng commented on PIG-3222: Using the attached test class and the following script

[jira] [Updated] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi updated PIG-3251: -- Attachment: pig-3251-trunk-v02.patch (1) Current status (before any patch) ||hadoop version ||

Fwd: Call for papers: Management of Big Data track - ICAC'13 by USENIX/ACM-SIGARCH

2013-03-20 Thread Alan Gates
Begin forwarded message: From: Dani Abel Rayan ira...@gatech.edu Date: March 14, 2013 10:55:00 AM PDT To: user u...@hadoop.apache.org Subject: Call for papers: Management of Big Data track - ICAC'13 by USENIX/ACM-SIGARCH Reply-To: u...@hadoop.apache.org Hi, Join us for the 10th

[jira] [Commented] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Richard Ding (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607820#comment-13607820 ] Richard Ding commented on PIG-3251: --- With HADOOP-7823, can we remove Bzip2TextInputFormat

[jira] [Commented] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607860#comment-13607860 ] Koji Noguchi commented on PIG-3251: --- bq. With HADOOP-7823, can we remove

[jira] [Commented] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607886#comment-13607886 ] Koji Noguchi commented on PIG-3251: --- bq. With HADOOP-7823, can we remove

[jira] [Commented] (PIG-3222) New UDFContextSignature assignments in Pig 0.11 breaks HCatalog.HCatStorer

2013-03-20 Thread Feng Peng (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607904#comment-13607904 ] Feng Peng commented on PIG-3222: Modified the test class to add instruments for all

Anybody using custom Serializer/Deserializer in Pig Streaming?

2013-03-20 Thread Koji Noguchi
Hi. Do you know of anyone using a custom serializer/deserializer in Pig streaming? I was looking at http://wiki.apache.org/pig/PigStreamingFunctionalSpec and was impressed by the various features it supports. Then, looking at the code, I was sad to see how much additional data copying is done to support those

Re: Review Request: [PIG-3173] - Partition filter pushdown does not happen if partition keys condition include a AND and OR construct

2013-03-20 Thread Rohini Palaniswamy
On March 20, 2013, 12:42 a.m., Dmitriy Ryaboy wrote: http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/PColFilterExtractor.java, line 229 https://reviews.apache.org/r/10035/diff/1/?file=272254#file272254line229 looks like tabulation is off? The whole file uses

[jira] [Commented] (PIG-3223) AvroStorage does not handle comma separated input paths

2013-03-20 Thread Johnny Zhang (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608025#comment-13608025 ] Johnny Zhang commented on PIG-3223: --- Thanks for the clarification; I will post a patch soon.

[jira] [Commented] (PIG-3141) Giving CSVExcelStorage an option to handle header rows

2013-03-20 Thread Cheolsoo Park (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608029#comment-13608029 ] Cheolsoo Park commented on PIG-3141: Thank you Jonathan P. for the patch! I made some

[jira] [Commented] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608030#comment-13608030 ] Daniel Dai commented on PIG-3251: - Makes sense, we shall move to the new approach for Hadoop

[jira] [Commented] (PIG-3254) Fail a failed Pig script quicker

2013-03-20 Thread Dmitriy V. Ryaboy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608070#comment-13608070 ] Dmitriy V. Ryaboy commented on PIG-3254: Can I add a request for whoever will work

Re: Review Request: [PIG-3173] - Partition filter pushdown does not happen if partition keys condition include a AND and OR construct

2013-03-20 Thread Dmitriy Ryaboy
On March 20, 2013, 12:42 a.m., Dmitriy Ryaboy wrote: http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/PColFilterExtractor.java, line 224 https://reviews.apache.org/r/10035/diff/1/?file=272254#file272254line224 (A and B) or (C and D) is impossible

Re: Anybody using custom Serializer/Deserializer in Pig Streaming?

2013-03-20 Thread Rohini Palaniswamy
Nice summary, Koji. I wish we had an object that holds a byte[] and a length, instead of a bare byte[], as the return type of serialize() and the method parameter of deserialize(). That would enable reuse and cut down on some of the copying. At least there is one copy we can cut without any API changes by having
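To make the "byte[] plus length" idea concrete, here is a minimal sketch in Java. It is not Pig's API: the class name ByteSlice and all of its methods are hypothetical and exist only for illustration. The point it shows is that a holder can be reused across records, and a consumer can read just the valid prefix without allocating an exact-length copy of the array.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical holder (not part of Pig): a backing array plus the count of valid bytes. */
final class ByteSlice {
    private byte[] buf = new byte[16];
    private int length = 0;

    /** Replaces the contents, growing the backing array only when it is too small. */
    void set(byte[] src, int offset, int len) {
        if (len > buf.length) {
            buf = new byte[Math.max(len, buf.length * 2)];
        }
        System.arraycopy(src, offset, buf, 0, len);
        length = len;
    }

    /** The backing array; only the first length() bytes are valid. */
    byte[] array() { return buf; }

    int length() { return length; }

    /** Decodes only the valid prefix; no trimmed copy of the array is made. */
    String toUtf8String() {
        return new String(buf, 0, length, StandardCharsets.UTF_8);
    }
}
```

With a type like this, a serializer could fill the same ByteSlice for every record, and the caller would write array() from 0 to length() to the stream, so a short record after a long one reuses the already-allocated buffer.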

[jira] [Created] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-3255: --- Summary: Avoid extra byte array copy in streaming deserialize Key: PIG-3255 URL: https://issues.apache.org/jira/browse/PIG-3255 Project: Pig Issue

[jira] [Updated] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-3255: Description: PigStreaming.java: public Tuple deserialize(byte[] bytes) throws IOException

[jira] [Assigned] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy reassigned PIG-3255: --- Assignee: Rohini Palaniswamy Avoid extra byte array copy in streaming deserialize

[jira] [Commented] (PIG-3110) pig corrupts chararrays with trailing whitespace when converting them to long

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608125#comment-13608125 ] Daniel Dai commented on PIG-3110: - Looks good. Do we still need to check ret == null case if

[jira] [Updated] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-3255: Attachment: PIG-3255-1.patch Avoid extra byte array copy in streaming deserialize

[jira] [Updated] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-3255: Status: Patch Available (was: Open) Avoid extra byte array copy in streaming

[jira] [Updated] (PIG-3251) Bzip2TextInputFormat requires double the memory of maximum record size

2013-03-20 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Noguchi updated PIG-3251: -- Attachment: pig-3251-trunk-v03.patch bq. Makes sense, we shall move to the new approach for Hadoop

Re: Review Request: PIG-3141 [piggybank] Giving CSVExcelStorage an option to handle header rows

2013-03-20 Thread Cheolsoo Park
On March 20, 2013, 7:05 p.m., Cheolsoo Park wrote: contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/CSVExcelStorage.java, line 538 https://reviews.apache.org/r/9697/diff/2/?file=263987#file263987line538 Can you move this line to inside the if block? That's

[jira] [Updated] (PIG-3253) Misleading comment w.r.t getSplitIndex() method in PigSplit.java

2013-03-20 Thread Cheolsoo Park (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3253: --- Resolution: Fixed Fix Version/s: 0.12 Status: Resolved (was: Patch Available)

[jira] [Commented] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Koji Noguchi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608227#comment-13608227 ] Koji Noguchi commented on PIG-3255: --- +1 Looks good to me. Probably another Jira, but I

[jira] [Created] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
Daniel Dai created PIG-3256: --- Summary: Upgrade jython to 2.5.3 (legal concern) Key: PIG-3256 URL: https://issues.apache.org/jira/browse/PIG-3256 Project: Pig Issue Type: Bug Reporter:

[jira] [Updated] (PIG-3255) Avoid extra byte array copy in streaming deserialize

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-3255: Status: Open (was: Patch Available) Had a chat with Koji. He pointed out HADOOP-6109 which

[jira] [Commented] (PIG-3110) pig corrupts chararrays with trailing whitespace when converting them to long

2013-03-20 Thread Prashant Kommireddi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608326#comment-13608326 ] Prashant Kommireddi commented on PIG-3110: -- Yes, we do need that in the case input

[jira] [Commented] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608356#comment-13608356 ] Rohini Palaniswamy commented on PIG-3256: - +1. Should we put it in 0.11 branch also?

[jira] [Updated] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3256: Attachment: PIG-3256-2.patch Thanks Rohini. Updated the patch to include your suggestion. I would like to

[jira] [Updated] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3256: Fix Version/s: 0.11.1 Upgrade jython to 2.5.3 (legal concern) ---

[jira] [Updated] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3256: Priority: Critical (was: Major) Upgrade jython to 2.5.3 (legal concern)

[jira] [Updated] (PIG-3110) pig corrupts chararrays with trailing whitespace when converting them to long

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3110: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Sounds

[jira] [Commented] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608437#comment-13608437 ] Rohini Palaniswamy commented on PIG-3256: - loadproperties

[jira] [Updated] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar

2013-03-20 Thread Prashant Kommireddi (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-3249: - Patch Info: Patch Available Pig startup script prints out a wrong version of hadoop

[jira] [Updated] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3256: Attachment: PIG-3256-3.patch Yes, I missed that. Uploaded again. Upgrade jython to 2.5.3 (legal

[jira] [Updated] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3249: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1, patch

[jira] [Commented] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608455#comment-13608455 ] Rohini Palaniswamy commented on PIG-3256: - Sorry. One last comment. Should have

[jira] [Updated] (PIG-3193) Fix ant docs warnings

2013-03-20 Thread Cheolsoo Park (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3193: --- Attachment: PIG-3193.patch I am attaching a patch that includes the following changes: * Fixed Javadoc

[jira] [Updated] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3256: Attachment: PIG-3256-4.patch Attached new patch. Thanks again! Upgrade jython to 2.5.3

[jira] [Commented] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Rohini Palaniswamy (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608469#comment-13608469 ] Rohini Palaniswamy commented on PIG-3256: - +1 Upgrade jython to

Bytes to Long/Integer conversions

2013-03-20 Thread Prashant Kommireddi
Daniel and I were discussing the way Pig currently does these conversions and how we could possibly simplify/optimize it further. Long ret = null; if (sanityCheckIntegerLong(s)) { try { ret = Long.valueOf(s); } catch (NumberFormatException nfe) {
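The snippet above is cut off in the archive, so the following is only a sketch of one way such a conversion could be written, not a reconstruction of Pig's code. The class and method names (LenientLongParser, parseLong) are hypothetical, and two behaviours are assumptions made for illustration: trimming surrounding whitespace before parsing (the concern raised in PIG-3110) and falling back to a floating-point parse when the text is not a plain integer.

```java
/** Hypothetical helper for illustration; this is not Pig's actual conversion code. */
final class LenientLongParser {
    private LenientLongParser() {}

    /** Returns the parsed value, or null when the input cannot be read as a number. */
    static Long parseLong(String s) {
        if (s == null) {
            return null;
        }
        // Assumption: surrounding whitespace is not significant.
        String t = s.trim();
        if (t.isEmpty()) {
            return null;
        }
        try {
            return Long.valueOf(t);
        } catch (NumberFormatException nfe) {
            // Assumption: accept decimal text and truncate it toward zero.
            try {
                return Double.valueOf(t).longValue();
            } catch (NumberFormatException nfe2) {
                return null;
            }
        }
    }
}
```

For example, parseLong(" 42 ") yields 42, parseLong("12.7") yields 12, and parseLong("abc") yields null. Whether a pre-check such as sanityCheckIntegerLong is still worthwhile next to the try/catch is exactly the question the thread raises; this sketch simply omits it.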

[jira] Subscription: PIG patch available

2013-03-20 Thread jira
Issue Subscription Filter: PIG patch available (31 issues) Subscriber: pigdaily Key Summary PIG-3247 Piggybank functions to mimic OVER clause in SQL https://issues.apache.org/jira/browse/PIG-3247 PIG-3238 Pig current releases lack a UDF Stuff(). This UDF deletes

[jira] [Resolved] (PIG-3256) Upgrade jython to 2.5.3 (legal concern)

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved PIG-3256. - Resolution: Fixed Hadoop Flags: Reviewed Patch committed to 0.11 branch and trunk. Thanks Rohini!

[jira] [Updated] (PIG-1271) Provide a more flexible data format to load complex field (bag/tuple/map) in PigStorage

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1271: Labels: gsoc2013 (was: gsoc2012) Provide a more flexible data format to load complex field

[jira] [Updated] (PIG-1271) Provide a more flexible data format to load complex field (bag/tuple/map) in PigStorage

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1271: Description: With [PIG-613|https://issues.apache.org/jira/browse/PIG-613], we are able to load txt files

[jira] [Updated] (PIG-2586) A better plan/data flow visualizer

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2586: Labels: gsoc2013 (was: gsoc2012) A better plan/data flow visualizer

[jira] [Updated] (PIG-2586) A better plan/data flow visualizer

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2586: Description: Pig supports a dot graph style plan to visualize the logical/physical/mapreduce plan (explain

[jira] [Updated] (PIG-2599) Mavenize Pig

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2599: Labels: gsoc2013 (was: ) Mavenize Pig Key: PIG-2599

Put up a Google Summer of Code 2013 cwiki page

2013-03-20 Thread Daniel Dai
https://cwiki.apache.org/confluence/display/PIG/GSoc2013 Feel free to add more projects that could fit in the timeline of a student summer project. I remember there are several projects we discussed in our last meetup: * Allow Pig to use Hive UDFs. Alan, do we have a ticket for that? * A general

[jira] [Updated] (PIG-2599) Mavenize Pig

2013-03-20 Thread Daniel Dai (JIRA)
[ https://issues.apache.org/jira/browse/PIG-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2599: Description: Switch the Pig build system from Ant to Maven. This is a candidate project for Google Summer of