Re: Move pig project from svn to git repository

2013-02-01 Thread Bill Graham
I'm a huge fan of git and use it exclusively for Pig with the exception of
committing patches. I haven't personally experienced any reliability issues
with the git mirror. What are the reliability issues you've seen?


On Fri, Feb 1, 2013 at 6:46 PM, Jarek Jarcec Cecho wrote:

> Hi Pig developers,
> I personally prefer git over svn, so I'm using the git mirrors that Apache
> provides. As those mirrors do not seem entirely reliable, I was wondering
> whether there are other Pig developers who also prefer git over svn, as I
> do. The Apache Infrastructure Team supports projects that work
> primarily with git, so my question is: would the Pig developer
> community be interested in migrating the repository from svn to git?
>
> I've recently participated in three projects that made this change, namely
> Sqoop, Flume and MRUnit, and it's not a big deal. The process is rather
> simple; it just takes some time, as most of the work is done by the
> Infrastructure team. I would be more than happy to help or even drive the
> process if the community wants this change.
>
> Jarcec
>



-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


Move pig project from svn to git repository

2013-02-01 Thread Jarek Jarcec Cecho
Hi Pig developers,
I personally prefer git over svn, so I'm using the git mirrors that Apache
provides. As those mirrors do not seem entirely reliable, I was wondering
whether there are other Pig developers who also prefer git over svn, as I do.
The Apache Infrastructure Team supports projects that work primarily
with git, so my question is: would the Pig developer community be interested in
migrating the repository from svn to git?

I've recently participated in three projects that made this change, namely
Sqoop, Flume and MRUnit, and it's not a big deal. The process is rather simple;
it just takes some time, as most of the work is done by the Infrastructure team. I
would be more than happy to help or even drive the process if the community
wants this change.

Jarcec




[jira] [Commented] (PIG-3015) Rewrite of AvroStorage

2013-02-01 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569397#comment-13569397
 ] 

Cheolsoo Park commented on PIG-3015:


[~jadler], if you could add documentation, that would be awesome!

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Reporter: Joseph Adler
> Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-2.patch, PIG-3015-3.patch,
> PIG-3015-4.patch, PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch,
> TestInput.java, Test.java
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Subscription: PIG patch available

2013-02-01 Thread jira
Issue Subscription
Filter: PIG patch available (26 issues)

Subscriber: pigdaily

Key       Summary
PIG-3142  Fixed-width load and store functions for the Piggybank
          https://issues.apache.org/jira/browse/PIG-3142
PIG-3137  fix Piggybank test to not using /tmp dir
          https://issues.apache.org/jira/browse/PIG-3137
PIG-3136  Introduce a syntax making declared aliases optional
          https://issues.apache.org/jira/browse/PIG-3136
PIG-3123  Simplify Logical Plans By Removing Unneccessary Identity Projections
          https://issues.apache.org/jira/browse/PIG-3123
PIG-3122  Operators should not implicitly become reserved keywords
          https://issues.apache.org/jira/browse/PIG-3122
PIG-3114  Duplicated macro name error when using pigunit
          https://issues.apache.org/jira/browse/PIG-3114
PIG-3108  HBaseStorage returns empty maps when mixing wildcard- with other columns
          https://issues.apache.org/jira/browse/PIG-3108
PIG-3105  Fix TestJobSubmission unit test failure.
          https://issues.apache.org/jira/browse/PIG-3105
PIG-3098  Add another test for the self join case
          https://issues.apache.org/jira/browse/PIG-3098
PIG-3088  Add a builtin udf which removes prefixes
          https://issues.apache.org/jira/browse/PIG-3088
PIG-3069  Native Windows Compatibility for Pig E2E Tests and Harness
          https://issues.apache.org/jira/browse/PIG-3069
PIG-3028  testGrunt dev test needs some command filters to run correctly without cygwin
          https://issues.apache.org/jira/browse/PIG-3028
PIG-3027  pigTest unit test needs a newline filter for comparisons of golden multi-line
          https://issues.apache.org/jira/browse/PIG-3027
PIG-3026  Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
          https://issues.apache.org/jira/browse/PIG-3026
PIG-3025  TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
          https://issues.apache.org/jira/browse/PIG-3025
PIG-3024  TestEmptyInputDir unit test - hadoop version detection logic is brittle
          https://issues.apache.org/jira/browse/PIG-3024
PIG-3015  Rewrite of AvroStorage
          https://issues.apache.org/jira/browse/PIG-3015
PIG-3010  Allow UDF's to flatten themselves
          https://issues.apache.org/jira/browse/PIG-3010
PIG-2959  Add a pig.cmd for Pig to run under Windows
          https://issues.apache.org/jira/browse/PIG-2959
PIG-2955  Fix bunch of Pig e2e tests on Windows
          https://issues.apache.org/jira/browse/PIG-2955
PIG-2873  Converting bin/pig shell script to python
          https://issues.apache.org/jira/browse/PIG-2873
PIG-2834  MultiStorage requires unused constructor argument
          https://issues.apache.org/jira/browse/PIG-2834
PIG-2661  Pig uses an extra job for loading data in Pigmix L9
          https://issues.apache.org/jira/browse/PIG-2661
PIG-1942  script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
          https://issues.apache.org/jira/browse/PIG-1942
PIG-1914  Support load/store JSON data in Pig
          https://issues.apache.org/jira/browse/PIG-1914
PIG-1237  Piggybank MutliStorage - specify field to write in output
          https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Updated] (PIG-2878) Pig current releases lack a UDF equalIgnoreCase.This function returns a Boolean value indicating whether string left is equal to string right. This check is case insensitiv

2013-02-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-2878:


   Resolution: Fixed
Fix Version/s: 0.12
       Status: Resolved  (was: Patch Available)

Patch 1 checked into trunk.  Thanks Shami for your work on this.

> Pig current releases lack a UDF equalIgnoreCase.This function returns a 
> Boolean value indicating whether string left is equal to string right. This 
> check is case insensitive.
> --
>
> Key: PIG-2878
> URL: https://issues.apache.org/jira/browse/PIG-2878
> Project: Pig
> Issue Type: Bug
> Components: internal-udfs
> Affects Versions: 0.10.0
> Reporter: Arjun K R
> Assignee: Shami B
> Labels: features
> Fix For: 0.12
>
> Attachments: PIG-2878-1.patch, PIG-2878.patch, PIG-2878-UnitTest.patch
>
>
> Current Pig releases lack a UDF equalIgnoreCase. This function returns a
> Boolean value indicating whether string left is equal to string right. This
> check is case insensitive.
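
The comparison described in the issue can be sketched in plain Java. This is only an illustration of the semantics, not the code from PIG-2878-1.patch: the class name, the helper name, and the choice to return null when either operand is null are assumptions made for the example.

```java
// Minimal sketch of a case-insensitive string equality check.
// Hypothetical names; this is not the UDF that was committed to trunk.
class EqualsIgnoreCaseSketch {

    // Assumption: a null operand yields null rather than false.
    static Boolean equalsIgnoreCase(String left, String right) {
        if (left == null || right == null) {
            return null;
        }
        // String.equalsIgnoreCase compares the two strings ignoring case.
        return left.equalsIgnoreCase(right);
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCase("Pig", "PIG"));   // prints "true"
        System.out.println(equalsIgnoreCase("pig", "hive"));  // prints "false"
        System.out.println(equalsIgnoreCase(null, "pig"));    // prints "null"
    }
}
```

In an actual Pig UDF the two strings would presumably arrive as fields of the input tuple; that wiring is left out here so the sketch runs without Pig on the classpath.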



[jira] [Commented] (PIG-3015) Rewrite of AvroStorage

2013-02-01 Thread Joseph Adler (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569190#comment-13569190
 ] 

Joseph Adler commented on PIG-3015:
---

Let me know what help you need. I can work on the documentation as well. Is 
early next week enough time? (Also, check out Avro-1241. I couldn't get 
adequate performance without it.)

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
>  Issue Type: Improvement
>  Components: piggybank
>Reporter: Joseph Adler
>Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-2.patch, PIG-3015-3.patch, 
> PIG-3015-4.patch, PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, 
> TestInput.java, Test.java
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.



[jira] [Updated] (PIG-3145) Parameters in core-site.xml and mapred-site.xml are not correctly substituted

2013-02-01 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3145:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you Santhosh for the review. Committed to trunk.

> Parameters in core-site.xml and mapred-site.xml are not correctly substituted
> -
>
> Key: PIG-3145
> URL: https://issues.apache.org/jira/browse/PIG-3145
> Project: Pig
> Issue Type: Bug
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Attachments: PIG-3145.patch
>
>
> To reproduce the issue, please do the following:
> # Parameterize the address of name node in core-site.xml.
> {code}
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://${foo}:8020</value>
>   </property>
> {code}
> # Set the value of "foo" via -D option.
> {code}
> export PIG_OPTS="-Dfoo=mr1-0.cheolsoo.com"
> {code}
> # Pig fails with the following error.
> {code}
> 2013-01-28 18:54:02,786 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: hdfs://${foo}:8020
> 2013-01-28 18:54:02,805 [main] ERROR org.apache.pig.Main - ERROR 2999: 
> Unexpected internal error. null
> Details at logfile: /home/cheolsoo/pig-cdh/pig_1359428042522.log
> {code}
> Note that the parameter ${foo} in core-site.xml is not expanded. This is
> because the addresses of the name node and job tracker are read directly from
> core-site.xml instead of via Configuration.get().
> {code:title=HExecutionEngine.java}
> // properties is Java Properties
> cluster = properties.getProperty(JOB_TRACKER_LOCATION);
> nameNode = properties.getProperty(FILE_SYSTEM_LOCATION);
> {code}
> Replacing these lines with Configuration.get() fixes the issue.
> {code:title=HExecutionEngine.java}
> // jc is Hadoop Configuration
> cluster = jc.get(JOB_TRACKER_LOCATION);
> nameNode = jc.get(FILE_SYSTEM_LOCATION);
> {code}
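
The difference between the two lookups can be reproduced without Hadoop. The sketch below is a simplified stand-in, not Hadoop's Configuration implementation: ExpansionSketch and getExpanded are hypothetical names, and the helper resolves ${name} only from JVM system properties in a single pass. It shows why Properties.getProperty() hands back the literal ${foo} while an expanding lookup picks up the value set with -Dfoo=...

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an expanding configuration lookup.
class ExpansionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Returns the property value with each ${name} replaced by the system
    // property of that name; unresolved variables are left untouched.
    static String getExpanded(Properties props, String key) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return null;
        }
        Matcher m = VAR.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = System.getProperty(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Equivalent to passing -Dfoo=mr1-0.cheolsoo.com on the command line.
        System.setProperty("foo", "mr1-0.cheolsoo.com");

        Properties props = new Properties();
        props.setProperty("fs.default.name", "hdfs://${foo}:8020");

        // Plain Properties lookup: no substitution takes place.
        System.out.println(props.getProperty("fs.default.name"));
        // Expanding lookup: the variable is resolved.
        System.out.println(getExpanded(props, "fs.default.name"));
    }
}
```

The first line printed is the unexpanded hdfs://${foo}:8020 seen in the error log; the second is hdfs://mr1-0.cheolsoo.com:8020.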



[jira] [Updated] (PIG-3145) Parameters in core-site.xml and mapred-site.xml are not correctly substituted

2013-02-01 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3145:
---

Fix Version/s: 0.12

> Parameters in core-site.xml and mapred-site.xml are not correctly substituted
> -
>
> Key: PIG-3145
> URL: https://issues.apache.org/jira/browse/PIG-3145
> Project: Pig
>  Issue Type: Bug
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.12
>
> Attachments: PIG-3145.patch
>
>
> To reproduce the issue, please do the following:
> # Parameterize the address of name node in core-site.xml.
> {code}
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://${foo}:8020</value>
>   </property>
> {code}
> # Set the value of "foo" via -D option.
> {code}
> export PIG_OPTS="-Dfoo=mr1-0.cheolsoo.com"
> {code}
> # Pig fails with the following error.
> {code}
> 2013-01-28 18:54:02,786 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: hdfs://${foo}:8020
> 2013-01-28 18:54:02,805 [main] ERROR org.apache.pig.Main - ERROR 2999: 
> Unexpected internal error. null
> Details at logfile: /home/cheolsoo/pig-cdh/pig_1359428042522.log
> {code}
> Note that the parameter ${foo} in core-site.xml is not expanded. This is
> because the addresses of the name node and job tracker are read directly from
> core-site.xml instead of via Configuration.get().
> {code:title=HExecutionEngine.java}
> // properties is Java Properties
> cluster = properties.getProperty(JOB_TRACKER_LOCATION);
> nameNode = properties.getProperty(FILE_SYSTEM_LOCATION);
> {code}
> Replacing these lines with Configuration.get() fixes the issue.
> {code:title=HExecutionEngine.java}
> // jc is Hadoop Configuration
> cluster = jc.get(JOB_TRACKER_LOCATION);
> nameNode = jc.get(FILE_SYSTEM_LOCATION);
> {code}



[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang updated PIG-3137:
--

Status: Patch Available  (was: Open)

[~cheolsoo], thanks for your comments. I just posted a new patch that:
1. changes contrib/piggybank/java/build.xml to create the log in user.dir
instead of /tmp
2. stops using FileLocalizer and uses contrib/piggybank/java/build/test/ to
store hsqldb in TestDBStorage
3. uses pig.temp.dir to set PigContext's temp dir to
contrib/piggybank/java/build/test/tmp/ before using FileLocalizer in
TestAvroStorage

> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11
> Reporter: Johnny Zhang
> Assignee: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.nows.patch.txt, PIG-3137.patch.txt,
> PIG-3137.patch.txt
>
>
> Right now several Piggybank tests create directories under /tmp to store test
> data. These tests can fail when the user doesn't have permission to create
> directories under /tmp. It is better to move the test data dir under the
> build dir to avoid this problem.
> I will submit a patch soon.
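
The approach described in the issue, keeping test data under the build directory rather than /tmp, might look roughly like the following in plain Java. This is a sketch under assumptions, not the contents of the attached patch: BuildDirTempSketch and createTestDir are hypothetical names, and the relative path build/test/tmp is chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that creates test directories below the working
// directory instead of /tmp.
class BuildDirTempSketch {

    static Path createTestDir(String prefix) {
        // user.dir is the directory the JVM was started from, typically the
        // project directory when tests are launched by the build tool.
        Path base = Paths.get(System.getProperty("user.dir"), "build", "test", "tmp");
        try {
            Files.createDirectories(base);
            // A uniquely named directory avoids clashes between test runs.
            return Files.createTempDirectory(base, prefix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = createTestDir("piggybank-");
        System.out.println("Test data directory: " + dir);
    }
}
```

Because the directory lives under the build output, a clean build would normally remove it, and no write permission on /tmp is needed.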



[jira] [Commented] (PIG-3145) Parameters in core-site.xml and mapred-site.xml are not correctly substituted

2013-02-01 Thread Santhosh Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569146#comment-13569146
 ] 

Santhosh Srinivasan commented on PIG-3145:
--

+1 - the changes look good.

> Parameters in core-site.xml and mapred-site.xml are not correctly substituted
> -
>
> Key: PIG-3145
> URL: https://issues.apache.org/jira/browse/PIG-3145
> Project: Pig
>  Issue Type: Bug
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Attachments: PIG-3145.patch
>
>
> To reproduce the issue, please do the following:
> # Parameterize the address of name node in core-site.xml.
> {code}
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://${foo}:8020</value>
>   </property>
> {code}
> # Set the value of "foo" via -D option.
> {code}
> export PIG_OPTS="-Dfoo=mr1-0.cheolsoo.com"
> {code}
> # Pig fails with the following error.
> {code}
> 2013-01-28 18:54:02,786 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: hdfs://${foo}:8020
> 2013-01-28 18:54:02,805 [main] ERROR org.apache.pig.Main - ERROR 2999: 
> Unexpected internal error. null
> Details at logfile: /home/cheolsoo/pig-cdh/pig_1359428042522.log
> {code}
> Note that the parameter ${foo} in core-site.xml is not expanded. This is
> because the addresses of the name node and job tracker are read directly from
> core-site.xml instead of via Configuration.get().
> {code:title=HExecutionEngine.java}
> // properties is Java Properties
> cluster = properties.getProperty(JOB_TRACKER_LOCATION);
> nameNode = properties.getProperty(FILE_SYSTEM_LOCATION);
> {code}
> Replacing these lines with Configuration.get() fixes the issue.
> {code:title=HExecutionEngine.java}
> // jc is Hadoop Configuration
> cluster = jc.get(JOB_TRACKER_LOCATION);
> nameNode = jc.get(FILE_SYSTEM_LOCATION);
> {code}



[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang updated PIG-3137:
--

Attachment: PIG-3137.patch.txt

> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11
>Reporter: Johnny Zhang
>Assignee: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.nows.patch.txt, PIG-3137.patch.txt, 
> PIG-3137.patch.txt
>
>
> Right now several Piggybank tests create directories under /tmp to store test
> data. These tests can fail when the user doesn't have permission to create
> directories under /tmp. It is better to move the test data dir under the
> build dir to avoid this problem.
> I will submit a patch soon.



[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang updated PIG-3137:
--

Attachment: PIG-3137.nows.patch.txt

> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11
>Reporter: Johnny Zhang
>Assignee: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.nows.patch.txt, PIG-3137.patch.txt, 
> PIG-3137.patch.txt
>
>
> Right now several Piggybank tests create directories under /tmp to store test
> data. These tests can fail when the user doesn't have permission to create
> directories under /tmp. It is better to move the test data dir under the
> build dir to avoid this problem.
> I will submit a patch soon.



[jira] [Updated] (PIG-3145) Parameters in core-site.xml and mapred-site.xml are not correctly substituted

2013-02-01 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3145:
---

Status: Patch Available  (was: Open)

I ran the full unit tests and e2e tests and found no regressions.

Note that the following test cases are failing in trunk:
* org.apache.pig.test.TestScriptUDF PIG-3153
* org.apache.pig.test.TestPackage PIG-3154
* org.apache.pig.test.TestTypeCheckingValidatorNewLP PIG-3155
* org.apache.pig.data.TestSchemaTuple PIG-3156

However, these failures are unrelated to this patch, and I filed JIRAs for them.

> Parameters in core-site.xml and mapred-site.xml are not correctly substituted
> -
>
> Key: PIG-3145
> URL: https://issues.apache.org/jira/browse/PIG-3145
> Project: Pig
>  Issue Type: Bug
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Attachments: PIG-3145.patch
>
>
> To reproduce the issue, please do the following:
> # Parameterize the address of name node in core-site.xml.
> {code}
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://${foo}:8020</value>
>   </property>
> {code}
> # Set the value of "foo" via -D option.
> {code}
> export PIG_OPTS="-Dfoo=mr1-0.cheolsoo.com"
> {code}
> # Pig fails with the following error.
> {code}
> 2013-01-28 18:54:02,786 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: hdfs://${foo}:8020
> 2013-01-28 18:54:02,805 [main] ERROR org.apache.pig.Main - ERROR 2999: 
> Unexpected internal error. null
> Details at logfile: /home/cheolsoo/pig-cdh/pig_1359428042522.log
> {code}
> Note that the parameter ${foo} in core-site.xml is not expanded. This is
> because the addresses of the name node and job tracker are read directly from
> core-site.xml instead of via Configuration.get().
> {code:title=HExecutionEngine.java}
> // properties is Java Properties
> cluster = properties.getProperty(JOB_TRACKER_LOCATION);
> nameNode = properties.getProperty(FILE_SYSTEM_LOCATION);
> {code}
> Replacing these lines with Configuration.get() fixes the issue.
> {code:title=HExecutionEngine.java}
> // jc is Hadoop Configuration
> cluster = jc.get(JOB_TRACKER_LOCATION);
> nameNode = jc.get(FILE_SYSTEM_LOCATION);
> {code}



[jira] [Created] (PIG-3156) TestSchemaTuple fails in trunk

2013-02-01 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-3156:
--

 Summary: TestSchemaTuple fails in trunk
 Key: PIG-3156
 URL: https://issues.apache.org/jira/browse/PIG-3156
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12
Reporter: Cheolsoo Park
 Fix For: 0.12


To reproduce the issue, do:
{code}
ant clean test -Dtestcase=TestSchemaTuple
{code}
All 3 test cases fail with the following error:
{code}
Caused by: java.lang.RuntimeException: Unable to compile
    at org.apache.pig.impl.util.JavaCompilerHelper.compile(JavaCompilerHelper.java:83)
    at org.apache.pig.data.SchemaTupleClassGenerator.compileCodeString(SchemaTupleClassGenerator.java:233)
    at org.apache.pig.data.SchemaTupleClassGenerator.generateSchemaTuple(SchemaTupleClassGenerator.java:186)
    at org.apache.pig.data.SchemaTupleFrontend$SchemaTupleFrontendGenHelper.generateAll(SchemaTupleFrontend.java:203)
    at org.apache.pig.data.SchemaTupleFrontend$SchemaTupleFrontendGenHelper.access$100(SchemaTupleFrontend.java:91)
    at org.apache.pig.data.SchemaTupleFrontend.copyAllGeneratedToDistributedCache(SchemaTupleFrontend.java:278)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:656)
{code}
I found that this was introduced by PIG-2764.



[jira] [Commented] (PIG-3015) Rewrite of AvroStorage

2013-02-01 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569048#comment-13569048
 ] 

Russell Jurney commented on PIG-3015:
-

I'll start testing this again.

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
>  Issue Type: Improvement
>  Components: piggybank
>Reporter: Joseph Adler
>Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-2.patch, PIG-3015-3.patch, 
> PIG-3015-4.patch, PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, 
> TestInput.java, Test.java
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.



[jira] [Created] (PIG-3155) TestTypeCheckingValidatorNewLP.testSortWithInnerPlan3 fails in trunk

2013-02-01 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-3155:
--

 Summary: TestTypeCheckingValidatorNewLP.testSortWithInnerPlan3 
fails in trunk
 Key: PIG-3155
 URL: https://issues.apache.org/jira/browse/PIG-3155
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12
Reporter: Cheolsoo Park
 Fix For: 0.12


To reproduce the failure, do:
{code}
ant clean test -Dtestcase=TestTypeCheckingValidatorNewLP
{code}
The test fails with the following error:
{code}
Error expected
junit.framework.AssertionFailedError: Error expected
    at org.apache.pig.test.TestTypeCheckingValidatorNewLP.testSortWithInnerPlan3(TestTypeCheckingValidatorNewLP.java:1570)
{code}
I found that this was introduced by PIG-2764.



[jira] [Assigned] (PIG-3153) TestScriptUDF.testJavascriptExampleScript fails in trunk

2013-02-01 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang reassigned PIG-3153:
-

Assignee: Johnny Zhang

> TestScriptUDF.testJavascriptExampleScript fails in trunk
> 
>
> Key: PIG-3153
> URL: https://issues.apache.org/jira/browse/PIG-3153
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.12
> Reporter: Cheolsoo Park
> Assignee: Johnny Zhang
> Fix For: 0.12
>
>
> To reproduce the failure, do:
> {code}
> ant clean test -Dtestcase=TestScriptUDF
> {code}
> The test fails with the following error:
> {code}
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Given UDF returns an improper Schema. Schema should only contain one field of a Tuple, Bag, or a single type. Returns: {word: chararray,num: long}
>     at org.apache.pig.newplan.logical.expression.UserFuncExpression.getFieldSchema(UserFuncExpression.java:206)
>     at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:264)
>     at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:143)
>     at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:88)
>     at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>     at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visitAll(SchemaResetter.java:67)
>     at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:122)
>     at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>     at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
>     at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:76)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>     at org.apache.pig.parser.LogicalPlanBuilder.expandAndResetVisitor(LogicalPlanBuilder.java:402)
> {code}



[jira] [Created] (PIG-3154) TestPackage.testOperator fails in trunk

2013-02-01 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-3154:
--

 Summary: TestPackage.testOperator fails in trunk
 Key: PIG-3154
 URL: https://issues.apache.org/jira/browse/PIG-3154
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12
Reporter: Cheolsoo Park
 Fix For: 0.12


To reproduce the issue, do:
{code}
ant clean test -Dtestcase=TestPackage
{code}
The test fails with the following error:
{code}
No test case for type biginteger
junit.framework.AssertionFailedError: No test case for type biginteger
at org.apache.pig.test.TestPackage.pickTest(TestPackage.java:153)
at org.apache.pig.test.TestPackage.testOperator(TestPackage.java:171)
{code}
Apparently, this is broken by PIG-2764.



[jira] [Created] (PIG-3153) TestScriptUDF.testJavascriptExampleScript fails in trunk

2013-02-01 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-3153:
--

 Summary: TestScriptUDF.testJavascriptExampleScript fails in trunk
 Key: PIG-3153
 URL: https://issues.apache.org/jira/browse/PIG-3153
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12
Reporter: Cheolsoo Park
 Fix For: 0.12


To reproduce the failure, do:
{code}
ant clean test -Dtestcase=TestScriptUDF
{code}
The test fails with the following error:
{code}
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Given UDF returns an improper Schema. Schema should only contain one field of a Tuple, Bag, or a single type. Returns: {word: chararray,num: long}
    at org.apache.pig.newplan.logical.expression.UserFuncExpression.getFieldSchema(UserFuncExpression.java:206)
    at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:264)
    at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:143)
    at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:88)
    at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visitAll(SchemaResetter.java:67)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:122)
    at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:114)
    at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:76)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
    at org.apache.pig.parser.LogicalPlanBuilder.expandAndResetVisitor(LogicalPlanBuilder.java:402)
{code}



[jira] [Updated] (PIG-2878) Pig current releases lack a UDF equalIgnoreCase.This function returns a Boolean value indicating whether string left is equal to string right. This check is case insensitiv

2013-02-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-2878:


Attachment: PIG-2878-1.patch

Attaching a single patch with the previous two combined.  I also took the 
liberty of expanding the unit test to have a negative case.  This patch 
represents what I will check in.

> Pig current releases lack a UDF equalIgnoreCase.This function returns a 
> Boolean value indicating whether string left is equal to string right. This 
> check is case insensitive.
> --
>
> Key: PIG-2878
> URL: https://issues.apache.org/jira/browse/PIG-2878
> Project: Pig
>  Issue Type: Bug
>  Components: internal-udfs
>Affects Versions: 0.10.0
>Reporter: Arjun K R
>Assignee: Arjun K R
>  Labels: features
> Attachments: PIG-2878-1.patch, PIG-2878.patch, PIG-2878-UnitTest.patch
>
>
> Current Pig releases lack an equalIgnoreCase UDF. This function returns a 
> Boolean value indicating whether the left string equals the right string. 
> The check is case-insensitive.



[jira] [Updated] (PIG-2878) Pig current releases lack a UDF equalIgnoreCase.This function returns a Boolean value indicating whether string left is equal to string right. This check is case insensitiv

2013-02-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-2878:


Assignee: Shami B  (was: Arjun K R)

> Pig current releases lack a UDF equalIgnoreCase.This function returns a 
> Boolean value indicating whether string left is equal to string right. This 
> check is case insensitive.
> --
>
> Key: PIG-2878
> URL: https://issues.apache.org/jira/browse/PIG-2878
> Project: Pig
>  Issue Type: Bug
>  Components: internal-udfs
>Affects Versions: 0.10.0
>Reporter: Arjun K R
>Assignee: Shami B
>  Labels: features
> Attachments: PIG-2878-1.patch, PIG-2878.patch, PIG-2878-UnitTest.patch
>
>
> Current Pig releases lack an equalIgnoreCase UDF. This function returns a 
> Boolean value indicating whether the left string equals the right string. 
> The check is case-insensitive.



[jira] [Created] (PIG-3152) HTable class in Pig 0.10.1

2013-02-01 Thread Ionut Ignatescu (JIRA)
Ionut Ignatescu created PIG-3152:


 Summary: HTable class in Pig 0.10.1
 Key: PIG-3152
 URL: https://issues.apache.org/jira/browse/PIG-3152
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10.1
Reporter: Ionut Ignatescu
Priority: Blocker


In Pig 0.10.1, the HTable class is defined under the same package as in HBase. 
Moreover, the bundled version of this class seems to be very old: several 
methods do not exist or have a different signature.
Since HBase is a transitive dependency in my use case, I cannot remove it, and 
I need its latest version (the same one deployed on my cluster).
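
A quick way to check which copy of a class wins on the classpath is to ask the JVM where it loaded the class from. The following is a minimal sketch using only the Java standard library; the class name ClassOrigin and the helper locate are hypothetical, and the default class name assumes the HBase client class is org.apache.hadoop.hbase.client.HTable.

```java
import java.security.CodeSource;

public class ClassOrigin {
    // Returns the jar or directory a class was loaded from, or a short note
    // when the class cannot be loaded or has no recorded code source.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "bootstrap/platform class (no code source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Assumed default; pass another fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.hbase.client.HTable";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Run with the same classpath as the failing job: if the printed location is the Pig jar rather than the HBase jar, the old copy is shadowing the newer one.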



[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3137:
---

Status: Open  (was: Patch Available)

Canceling patch until it gets updated.

> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11
>Reporter: Johnny Zhang
>Assignee: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.patch.txt
>
>
> Right now, several Piggybank tests create directories under /tmp to store 
> test data. These tests can fail when the user doesn't have permission to 
> create a directory under /tmp. It is better to move the test data dir under 
> the build dir to avoid this problem.
> I will submit a patch soon.



[jira] [Updated] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3137:
---

Assignee: Johnny Zhang

> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11
>Reporter: Johnny Zhang
>Assignee: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.patch.txt
>
>
> Right now, several Piggybank tests create directories under /tmp to store 
> test data. These tests can fail when the user doesn't have permission to 
> create a directory under /tmp. It is better to move the test data dir under 
> the build dir to avoid this problem.
> I will submit a patch soon.



[jira] [Commented] (PIG-3137) fix Piggybank test to not using /tmp dir

2013-02-01 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568760#comment-13568760
 ] 

Cheolsoo Park commented on PIG-3137:


[~dreambird], thank you very much for the patch. I have two suggestions:
* FileLocalizer.getTemporaryPath() is for generating random paths in a Hadoop 
cluster (whether it's local, a mini cluster, or a real cluster). So it makes sense 
to use FileLocalizer in TestAvroStorage where we need temp paths for test 
outputs. But in TestDBStorage, we need a temp dir for Hsqldb, so I don't think 
we want to use FileLocalizer there. Using a temporary path under build (e.g. 
contrib/piggybank/java/build/blah) would be better.
* You can control the root dir of FileLocalizer.getTemporaryPath() using the 
pig.temp.dir property. It would be nice if it's set to somewhere under the 
build directory, so temporary dirs can be deleted by ant clean.

Let me know what you think. Thanks!
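
As an illustration of the first suggestion, here is a minimal sketch of creating a test directory under the build tree with plain Java instead of /tmp. The class name BuildTempDir, the helper create, the build/test/tmp location, and the hsqldb- prefix are all hypothetical; the actual patch may choose a different location and mechanism.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BuildTempDir {
    // Creates a uniquely named directory under buildDir, creating buildDir
    // itself first if it does not exist yet.
    static Path create(Path buildDir, String prefix) throws IOException {
        Files.createDirectories(buildDir);
        return Files.createTempDirectory(buildDir, prefix);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location, relative to the working directory of the test run.
        Path root = Paths.get("build", "test", "tmp");
        Path dbDir = create(root, "hsqldb-");
        System.out.println("Test data dir: " + dbDir);
    }
}
```

Because the directory lives under the build tree, it needs no write access to /tmp and goes away whenever the build directory is deleted.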



> fix Piggybank test to not using /tmp dir
> 
>
> Key: PIG-3137
> URL: https://issues.apache.org/jira/browse/PIG-3137
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.11
>Reporter: Johnny Zhang
> Fix For: 0.12
>
> Attachments: PIG-3137.patch.txt
>
>
> Right now, several Piggybank tests create directories under /tmp to store 
> test data. These tests can fail when the user doesn't have permission to 
> create a directory under /tmp. It is better to move the test data dir under 
> the build dir to avoid this problem.
> I will submit a patch soon.



[jira] [Commented] (PIG-3015) Rewrite of AvroStorage

2013-02-01 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568753#comment-13568753
 ] 

Cheolsoo Park commented on PIG-3015:


I think the patch is very close to being committed. The two main obstacles are:
# Tests do not pass with Hadoop-2.0.x (i.e. ant clean test 
-Dtestcase=TestAvroStorage -Dhadoopversion=23).
# Documentation is missing.

I will take another shot at debugging #1 when I get more time, but any help 
would be appreciated!

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
>  Issue Type: Improvement
>  Components: piggybank
>Reporter: Joseph Adler
>Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-2.patch, PIG-3015-3.patch, 
> PIG-3015-4.patch, PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, 
> TestInput.java, Test.java
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.
