[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496013#comment-13496013
 ] 

Prashant Kommireddi commented on PIG-2553:
--

Also, we could add kfs and maprfs to the list of file-based schemes. I agree 
the list could keep growing, but I expect most users facing this issue to be 
hdfs:// and file:// users. What do you guys think of this as a start?
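
A minimal sketch of what a scheme-based check along these lines might look like. The class name, the scheme set, and the treatment of scheme-less paths are assumptions made for illustration; this is not Pig's UriUtil.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Pig.
class FileSchemeSketch {
    // Schemes named in this thread as file-based; the list is an assumption and could grow.
    static final Set<String> FILE_BASED =
            new HashSet<>(Arrays.asList("hdfs", "file", "s3n", "kfs", "maprfs"));

    static boolean isFileBased(String location) {
        String scheme = URI.create(location).getScheme();
        // Assumption: a location without a scheme resolves against the default file system.
        return scheme == null || FILE_BASED.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isFileBased("maprfs:///out/daily"));
        System.out.println(isFileBased("hbase://mytable"));
    }
}
```

Keeping the schemes in one set means adding another file system is a one-line change, which is the growth concern raised above.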

> Pig shouldn't allow attempts to write multiple relations into same directory
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
> Issue Type: Improvement
> Reporter: Dmitriy V. Ryaboy
> Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
> We've seen multiple occasions where users accidentally try to store 2 or more 
> different relations to the same destination directory. Currently, this passes 
> the Pig planner and fails on MR side due to concurrent attempts to create the 
> same part file on the reducer. This is extremely confusing to the user, and 
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we 
> can identify the erroneous condition from the beginning.
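
The condition described in the issue can in principle be spotted from the store locations alone, before anything is submitted. A minimal sketch, assuming the planner can hand over the list of store locations as strings; the class and method names are hypothetical and this is not the attached patch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pre-submission check, not taken from PIG-2553.patch.
class DuplicateStoreSketch {
    // Returns every location that more than one store statement targets.
    static Set<String> duplicateLocations(List<String> storeLocations) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String location : storeLocations) {
            // Assumption: stripping trailing slashes is enough normalization for this sketch.
            String normalized = location.replaceAll("/+$", "");
            if (!seen.add(normalized)) {
                duplicates.add(normalized);
            }
        }
        return duplicates;
    }
}
```

A non-empty result would be the point at which a script could be rejected or a warning logged.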

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496011#comment-13496011
 ] 

Prashant Kommireddi commented on PIG-2553:
--

I feel like we could treat this as file-based vs non-file based storage 
locations, similar to PIG-2924. The patch there uses a default 
FileBasedOutputSizeReader to determine output size and 
"pig.stats.output.size.reader" to compute size based on a different 
implementation.

For this JIRA, can we use a similar idea and handle file-based schemes 
with UriUtil.isHDFSFileOrLocalOrS3N(String uri)? For all other schemes (hbase, 
hcat, ...) we can allow multiple relations to write to the same location.

1. Check if pig.location.check.strict is set
2. If not set, just log a warning if scheme is file-based
3. If set, check if scheme is file-based and report an error
4. If set but not a file-based scheme, continue without any warning/error 
message

Thoughts?
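
The four steps above can be sketched as a small decision function. The enum and method names are hypothetical, and the case the list does not spell out (property not set, scheme not file-based) is assumed here to continue silently.

```java
// Hypothetical decision logic for pig.location.check.strict, for illustration only.
class LocationCheckSketch {
    enum Action { ERROR, WARN, ALLOW }

    // strict:    whether pig.location.check.strict is set
    // fileBased: whether the store location uses a file-based scheme
    static Action decide(boolean strict, boolean fileBased) {
        if (!fileBased) {
            // Step 4, plus the assumed "not set, not file-based" case: no message at all.
            return Action.ALLOW;
        }
        // Step 3 when strict, step 2 otherwise.
        return strict ? Action.ERROR : Action.WARN;
    }
}
```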



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495887#comment-13495887
 ] 

Dmitriy V. Ryaboy commented on PIG-2553:


I think it should be set to false by default so that current scripts that use 
fancy storers, etc, can keep running without change (the scripts that have 
bugs, which we are trying to address with this, don't run correctly at all, so 
we don't have to worry about being backwards compatible with them). Individual 
pig admins / script authors can decide to turn it on by default if they notice 
this happening a lot.

We are piling up quite the list of exceptions, though. Between hcat, hbase, 
other unknown schemes, and the hdfs/s3/kfs/mapr cases, I'm getting concerned 
that maybe this wasn't such a well thought out feature wish on my part! 

What do you guys think?



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495811#comment-13495811
 ] 

Prashant Kommireddi commented on PIG-2553:
--

That's a good point, Dmitriy. The patch does not handle multiple relations being 
written to hbase. Is it sufficient to check the scheme (hdfs://, hbase://, 
file://, ...)?

Rohini, you are right. Any implementation of StoreFunc similar to Hadoop 
MultipleOutputFormat would break this. As Dmitriy suggested, I think it makes 
sense to provide an option to users, in addition to logging a warning message.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495808#comment-13495808
 ] 

Rohini Palaniswamy commented on PIG-2553:
-

Thanks for bringing up hbase Dmitriy. HCat table is another case. 

Would that property be true by default? I am fine making the property true by 
default as long as it checks only for filesystem locations. HCat and HBase 
tables are going to be more common going forward, and it would not be nice to 
ask the users to launch pig every time with that property set to false.



[jira] Subscription: PIG patch available

2012-11-12 Thread jira
Issue Subscription
Filter: PIG patch available (29 issues)

Subscriber: pigdaily

Key         Summary
PIG-3039    Not possible to use custom version of jackson jars
            https://issues.apache.org/jira/browse/PIG-3039
PIG-3029    TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution
            https://issues.apache.org/jira/browse/PIG-3029
PIG-3028    testGrunt dev test needs some command filters to run correctly without cygwin
            https://issues.apache.org/jira/browse/PIG-3028
PIG-3027    pigTest unit test needs a newline filter for comparisons of golden multi-line
            https://issues.apache.org/jira/browse/PIG-3027
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3014    CurrentTime() UDF has undesirable characteristics
            https://issues.apache.org/jira/browse/PIG-3014
PIG-3010    Allow UDF's to flatten themselves
            https://issues.apache.org/jira/browse/PIG-3010
PIG-2978    TestLoadStoreFuncLifeCycle fails with hadoop-2.0.x
            https://issues.apache.org/jira/browse/PIG-2978
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2957    TetsScriptUDF fail due to volume prefix in jar
            https://issues.apache.org/jira/browse/PIG-2957
PIG-2956    Invalid cache specification for some streaming statement
            https://issues.apache.org/jira/browse/PIG-2956
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2937    generated field in nested foreach does not inherit the variable name as the field name
            https://issues.apache.org/jira/browse/PIG-2937
PIG-2924    PigStats should not be assuming all Storage classes to be file-based storage
            https://issues.apache.org/jira/browse/PIG-2924
PIG-2873    Converting bin/pig shell script to python
            https://issues.apache.org/jira/browse/PIG-2873
PIG-2834    MultiStorage requires unused constructor argument
            https://issues.apache.org/jira/browse/PIG-2834
PIG-2824    Pushing checking number of fields into LoadFunc
            https://issues.apache.org/jira/browse/PIG-2824
PIG-2661    Pig uses an extra job for loading data in Pigmix L9
            https://issues.apache.org/jira/browse/PIG-2661
PIG-2657    Print warning if using wrong jython version
            https://issues.apache.org/jira/browse/PIG-2657
PIG-2507    Semicolon in paramenters for UDF results in parsing error
            https://issues.apache.org/jira/browse/PIG-2507
PIG-2433    Jython import module not working if module path is in classpath
            https://issues.apache.org/jira/browse/PIG-2433
PIG-2417    Streaming UDFs -  allow users to easily write UDFs in scripting languages with no JVM implementation.
            https://issues.apache.org/jira/browse/PIG-2417
PIG-2362    Rework Ant build.xml to use macrodef instead of antcall
            https://issues.apache.org/jira/browse/PIG-2362
PIG-2312    NPE when relation and column share the same name and used in Nested Foreach
            https://issues.apache.org/jira/browse/PIG-2312
PIG-1942    script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
            https://issues.apache.org/jira/browse/PIG-1942
PIG-1431    Current DateTime UDFs: ISONOW(), UNIXNOW()
            https://issues.apache.org/jira/browse/PIG-1431
PIG-1237    Piggybank MutliStorage - specify field to write in output
            https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Commented] (PIG-2782) Specifying sorting field(s) at nightly.conf

2012-11-12 Thread Egil Sorensen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495766#comment-13495766
 ] 

Egil Sorensen commented on PIG-2782:


There were still problems with the patch here. E.g. the pig script sorts on 
column one and two, but the verification only checks that output is sorted on 
column one.
For details please see the cloned PIG-3045.
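
To illustrate why checking only the first column is too weak, here is a minimal sketch of a sortedness check over a configurable number of leading columns. It is a stand-in written for this note, not the e2e harness code, and it compares every key as a string for brevity.

```java
import java.util.List;

// Hypothetical verification helper, for illustration only.
class SortCheckSketch {
    // True if the rows are in non-decreasing order on the first numKeys columns.
    static boolean isSorted(List<String[]> rows, int numKeys) {
        for (int i = 1; i < rows.size(); i++) {
            if (compare(rows.get(i - 1), rows.get(i), numKeys) > 0) {
                return false;
            }
        }
        return true;
    }

    // Lexicographic comparison on the first numKeys columns (string comparison only).
    static int compare(String[] a, String[] b, int numKeys) {
        for (int k = 0; k < numKeys; k++) {
            int c = a[k].compareTo(b[k]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }
}
```

With rows (alice, 30), (alice, 20), (bob, 25), a check on one key passes while a check on two keys fails, which is the gap described above.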

> Specifying sorting field(s) at nightly.conf
> ---
>
> Key: PIG-2782
> URL: https://issues.apache.org/jira/browse/PIG-2782
> Project: Pig
>  Issue Type: Bug
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Allan Avendaño
>Assignee: Cheolsoo Park
> Fix For: 0.11
>
> Attachments: PIG-2782.patch
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495765#comment-13495765
 ] 

Dmitriy V. Ryaboy commented on PIG-2553:


Rohini, what if we hide this behind a property, something like 
"pig.location.check.strict"?
I see accidental writes to the same output path causing problems all the time; 
I would love to have this feature.

Prashant -- I haven't looked at the patch yet, but just something to check: 
does it allow writes of multiple relations to, say, the same HBase table?



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Description: 
PIG-2782 fixed a number of tests where the parameters passed to the 
verification sort were incorrect.
However, there are still problems with the patch in PIG-2782. E.g. the pig 
script sorts on columns one and two, but the verification only checks that 
the output is sorted on column one.

For file test/e2e/pig/tests/nightly.conf:
===

@@ -1728,7 +1728,7 @@
'pig' =>q\a = load
':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int,
gpa:double);
 b = order a by name, age, gpa;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0', '-1', '+1n', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,2n'],
},

Should have been: 
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,3n'],

===

Similarly:

@@ -1736,7 +1736,7 @@
'pig' =>q\a = load
':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int,
gpa:double);
 b = order a by name desc, age desc, gpa desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
},


Should have been: 
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,3nr'],

===

and 

@@ -1752,7 +1752,7 @@
'pig' =>q\a = load
':INPATH:/singlefile/studentnulltab10k' as (name, age:long, gpa:float);
 b = order a by name desc, age desc, gpa desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
},

Should have been: 
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,3nr'],

===

@@ -1847,7 +1847,7 @@
'pig' => q\a = load
':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int,
gpa:double);
 b = order a by *;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0', '-1', '+1n', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,2n'],
},

Should have been: 
+'sortArgs' => ['-t', ' ', '-k', '1,1', '-k', '2n,3n'],

===

@@ -1855,7 +1855,7 @@
'pig' => q\a = load
':INPATH:/singlefile/studentnulltab10k' as (name:chararray, age:int,
gpa:double);
 b = order a by * desc;
 store b into ':OUTPATH:';\,
-'sortArgs' => ['-t', ' ', '+0r', '-1', '+1nr', '-2'],
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,2nr'],
},

Should have been: 
+'sortArgs' => ['-t', ' ', '-k', '1r,1r', '-k', '2nr,3nr'],

===

@@ -1943,7 +1943,7 @@
 c = filter b by $0 > 'a'; -- break the sort/limit optimization
 d = limit c 100;
 store d into ':OUTPATH:';\,
-   'sortArgs' => ['-t', '  ', '+0', '-1'],
+   'sortArgs' => ['-t', '  ', '-k', '1,1'],

Should have been: 
+   'sortArgs' => ['-t', '  ', '-k', '1,2'],

===

@@ -1952,7 +1952,7 @@
 b = order a by $0, $1;
 c = limit b 100;
 store c into ':OUTPATH:';\,
-   'sortArgs' => ['-t', '  ', '+0', '-1'],
+   'sortArgs' => ['-t', '  ', '-k', '1,1'],

Should have been: 
+   'sortArgs' => ['-t', '  ', '-k', '1,2'],

===

@@ -,7 +,7 @@
 D = order B by age, extra;
 store D into ':OUTPATH:';\,

-   'sortArgs' => ['-t', '  ', '+1n', '-2'],
+   'sortArgs' => ['-t', '  ', '-k', '2n,2n'],
},

Should have been: 
+   'sortArgs' => ['-t', '  ', '-k', '2n,2n', '-k', '4,4'],

(This last is decidedly minor, as the 'extra' column is empty, but for sake of 
consistency...) 

  

  was:
After running the Checkin tests, it fails because one of the parameters passed 
to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 

According to this http://ss64.com/bash/sort.html, it was on an old notation.


> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0, 0.10.1
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
>

[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Affects Version/s: 0.10.1

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0, 0.10.1
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
> Fix For: 0.11
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495729#comment-13495729
 ] 

Rohini Palaniswamy commented on PIG-2553:
-

PiggyBank also has a MultiStorage - 
http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/MultiStorage.java?revision=1145447&view=markup.
 

One thing we could do is print a warning message instead of throwing an error. 
I don't see a way to correctly determine and throw an error. And in cases like 
MultiStorage the filename is not even static. The output dir name/file name is 
dynamic and depends on the value of a field in the record.
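
A minimal sketch of the dynamic-path point, assuming a storer that derives the output directory from one field of each record. The names are hypothetical and this is not MultiStorage's actual code.

```java
// Hypothetical illustration of a record-dependent output path.
class DynamicPathSketch {
    // Builds the output directory from the base location and the value of one field.
    static String outputDir(String baseDir, String[] record, int splitFieldIndex) {
        return baseDir + "/" + record[splitFieldIndex];
    }
}
```

Because the final path only exists once a record is seen, a planner-time comparison of base locations cannot tell whether two such stores would actually collide.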



[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage

2012-11-12 Thread Prashant Kommireddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Kommireddi updated PIG-2857:
-

Patch Info: Patch Available

> Add a -tagPath option to PigStorage
> ---
>
> Key: PIG-2857
> URL: https://issues.apache.org/jira/browse/PIG-2857
> Project: Pig
>  Issue Type: New Feature
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2857_1.patch, PIG-2857.patch
>
>
> We recently added a "-tagSource" option to PigStorage, which allows us to add 
> filenames from which records come to the returned tuples.
> Often, users want the whole path, not just the source file. I propose we add 
> a "-tagPath" option to do this.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495705#comment-13495705
 ] 

Prashant Kommireddi commented on PIG-2553:
--

StoreFuncs using setStoreLocation/relToAbsPathForStoreLocation to append 
filenames make it difficult to handle this. Any other ideas since the earlier 
approach is not safe?



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Labels: test  (was: )

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
> Fix For: 0.11
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Hadoop Flags:   (was: Reviewed)

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
> Fix For: 0.11
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Component/s: e2e harness

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
> Fix For: 0.11
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Affects Version/s: 0.10.0

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
>  Components: e2e harness
>Affects Versions: 0.10.0
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
>  Labels: test
> Fix For: 0.11
>
>
> After running the Checkin tests, it fails because one of the parameters 
> passed to the sort is incorrect (instead of +1 -2, on POSIX is -k2,2). 
> According to this http://ss64.com/bash/sort.html, it was on an old notation.



[jira] [Created] (PIG-3045) CLONE - Specifying sorting field(s) at nightly.conf

2012-11-12 Thread Egil Sorensen (JIRA)
Egil Sorensen created PIG-3045:
--

 Summary: CLONE - Specifying sorting field(s) at nightly.conf
 Key: PIG-3045
 URL: https://issues.apache.org/jira/browse/PIG-3045
 Project: Pig
  Issue Type: Bug
 Environment: Mac OS X Lion 10.7.3
Hadoop 1.0.1-SNAPSHOT
Apache Pig version 0.11.0-SNAPSHOT (r1355798)
Reporter: Egil Sorensen
Assignee: Cheolsoo Park
 Fix For: 0.11


The Checkin tests fail because one of the parameters passed to sort is 
incorrect: the obsolete form +1 -2 is used where POSIX expects -k2,2. 

According to http://ss64.com/bash/sort.html, the +POS form is an old notation.
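
The relationship between the two notations can be sketched as a small converter. It covers only whole-field positions with optional ordering flags (for example +1n -2), and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical converter from the obsolete "+POS1 -POS2" sort keys to a -k key spec.
class SortKeySketch {
    private static final Pattern START = Pattern.compile("\\+(\\d+)([a-zA-Z]*)");
    private static final Pattern END = Pattern.compile("-(\\d+)([a-zA-Z]*)");

    // Example: ("+1n", "-2") becomes "2n,2". Character offsets such as +1.3 are not handled.
    static String toKeySpec(String start, String end) {
        Matcher s = START.matcher(start);
        Matcher e = END.matcher(end);
        if (!s.matches() || !e.matches()) {
            throw new IllegalArgumentException("unsupported position: " + start + " " + end);
        }
        // +N counts fields from zero, while -k counts from one.
        int startField = Integer.parseInt(s.group(1)) + 1;
        // -N stops after N whole fields, which is one-based field N.
        int endField = Integer.parseInt(e.group(1));
        return startField + s.group(2) + "," + endField + e.group(2);
    }
}
```

The result is meant to be passed after -k, so the pair +1 -2 from the description corresponds to -k 2,2.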



[jira] [Updated] (PIG-3045) Specifying sorting field(s) at nightly.conf - further changes

2012-11-12 Thread Egil Sorensen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egil Sorensen updated PIG-3045:
---

Summary: Specifying sorting field(s) at nightly.conf - further changes  
(was: CLONE - Specifying sorting field(s) at nightly.conf)

> Specifying sorting field(s) at nightly.conf - further changes
> -
>
> Key: PIG-3045
> URL: https://issues.apache.org/jira/browse/PIG-3045
> Project: Pig
>  Issue Type: Bug
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Egil Sorensen
>Assignee: Cheolsoo Park
> Fix For: 0.11
>
>
> The Checkin tests fail because one of the parameters passed to sort is 
> incorrect: it uses +1 -2, whereas the POSIX form is -k2,2. 
> According to http://ss64.com/bash/sort.html, +1 -2 is an old notation.



[jira] [Updated] (PIG-2176) add logical plan assumption checker

2012-11-12 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-2176:
---

Assignee: Thejas M Nair

> add logical plan assumption checker 
> 
>
> Key: PIG-2176
> URL: https://issues.apache.org/jira/browse/PIG-2176
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.9.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.10.0
>
> Attachments: PIG-2176.1.patch, PIG-2176.2.patch
>
>
> Pig expects certain things about LogicalPlan, and optimizer logic depends on 
> those to be true. Code that verifies that these assumptions are true will 
> help in catching issues early on. 
> Some of the assumptions that should be checked: 
> 1. All schemas have valid uids (not -1).
> 2. All fields in a schema have distinct uids. 
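
The two assumptions listed above could be checked along the following lines. This is only a minimal sketch, not Pig's actual checker: the class name, the method name, and the representation of a schema as a plain array of uids are all hypothetical. Only the two rules (no uid of -1, no repeated uid) come from the description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: a "schema" is modelled as nothing more than its field uids.
final class UidAssumptionChecker {
    private UidAssumptionChecker() {}

    /** Returns one message per violation; an empty list means both assumptions hold. */
    static List<String> check(long[] fieldUids) {
        List<String> problems = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < fieldUids.length; i++) {
            long uid = fieldUids[i];
            if (uid == -1) {
                // Assumption 1: every field has a valid uid (not -1).
                problems.add("field " + i + " has invalid uid -1");
            } else if (!seen.add(uid)) {
                // Assumption 2: uids are distinct within the schema.
                problems.add("field " + i + " reuses uid " + uid);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(check(new long[] {1, 2, 3}));
        System.out.println(check(new long[] {1, -1, 1}));
    }
}
```

Returning a list of violations rather than throwing on the first one is a design choice made for the example; it lets a caller report every broken assumption in a plan at once.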



[jira] [Commented] (PIG-2657) Print warning if using wrong jython version

2012-11-12 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495644#comment-13495644
 ] 

Cheolsoo Park commented on PIG-2657:


Hi Johnny,

Thank you very much for the patch. I have a few more comments:
- I think that you have to swap {{Version.PY_VERSION}} and {{jythonVersion}}.
- Looking at {{JythonScriptEngine.java}} after applying your patch, I see no 
reason why we nest try-catch blocks here. Can you please move the inner 
try-catch block outside the outer one? In addition, can you make it catch 
an IOException instead of an Exception, since that's specifically what is thrown 
by {{JarFile}}? Do you agree?
- Please remove tabs in {{build.xml}}.
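
As a rough illustration of the second comment above, the sketch below reads a jar manifest attribute inside a single, un-nested try block that catches only IOException. It is not the code from the patch: the class name, the method name, and the assumption that the version is stored as a main manifest attribute are made up for the example. The JarFile and Manifest calls themselves are standard java.util.jar API.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Pig.
final class JarAttributeReader {
    private JarAttributeReader() {}

    /** Returns the named main manifest attribute, or null if it is absent or unreadable. */
    static String readAttribute(String jarPath, String attributeName) {
        // try-with-resources closes the jar; no inner try-catch is needed.
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(attributeName);
        } catch (IOException e) {
            // JarFile reports open and read failures as IOException,
            // so there is no need to catch the broader Exception.
            return null;
        }
    }

    public static void main(String[] args) {
        // The path and attribute name below are placeholders.
        System.out.println(readAttribute("/no/such/file.jar", "Example-Attribute"));
    }
}
```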

> Print warning if using wrong jython version
> ---
>
> Key: PIG-2657
> URL: https://issues.apache.org/jira/browse/PIG-2657
> Project: Pig
>  Issue Type: Bug
>Reporter: Fabian Alenius
>  Labels: newbie
> Fix For: 0.12
>
> Attachments: PIG-2657.1.patch, PIG-2657.2.patch, PIG-2657.3.patch
>
>
> Hi,
> It would be good if Pig would print a warning (or refuse to run) if you are 
> using an unsupported version of jython. I spent a couple of hours before 
> figuring out that you had to use 2.5.0. I've seen posts indicating that 
> others have run into this problem as well.
> Might write up a patch if others agree this is an issue.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495629#comment-13495629
 ] 

Rohini Palaniswamy commented on PIG-2553:
-

Yes. That would be a simple one. 



> Pig shouldn't allow attempts to write multiple relations into same directory
> 
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
>
> We've seen multiple occasions where users accidentally try to store 2 or more 
> different relations to the same destination directory. Currently, this passes 
> the Pig planner and fails on MR side due to concurrent attempts to create the 
> same part file on the reducer. This is extremely confusing to the user, and 
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we 
> can identify the erroneous condition from the beginning.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495586#comment-13495586
 ] 

Prashant Kommireddi commented on PIG-2553:
--

Thanks for the feedback Rohini and Cheolsoo.

Rohini, what does such a StoreFunc look like? Maybe the following?
{code}
STORE alias1 INTO 'output' using MyStoreFunc('filename1');
STORE alias2 INTO 'output' using MyStoreFunc('filename2');
{code}



> Pig shouldn't allow attempts to write multiple relations into same directory
> 
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
>
> We've seen multiple occasions where users accidentally try to store 2 or more 
> different relations to the same destination directory. Currently, this passes 
> the Pig planner and fails on MR side due to concurrent attempts to create the 
> same part file on the reducer. This is extremely confusing to the user, and 
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we 
> can identify the erroneous condition from the beginning.



[jira] [Updated] (PIG-2657) Print warning if using wrong jython version

2012-11-12 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang updated PIG-2657:
--

Attachment: PIG-2657.3.patch

[~cheolsoo], here is the new patch based on your comments.

> Print warning if using wrong jython version
> ---
>
> Key: PIG-2657
> URL: https://issues.apache.org/jira/browse/PIG-2657
> Project: Pig
>  Issue Type: Bug
>Reporter: Fabian Alenius
>  Labels: newbie
> Fix For: 0.12
>
> Attachments: PIG-2657.1.patch, PIG-2657.2.patch, PIG-2657.3.patch
>
>
> Hi,
> It would be good if Pig would print a warning (or refuse to run) if you are 
> using an unsupported version of jython. I spent a couple of hours before 
> figuring out that you had to use 2.5.0. I've seen posts indicating that 
> others have run into this problem as well.
> Might write up a patch if others agree this is an issue.



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495554#comment-13495554
 ] 

Rohini Palaniswamy commented on PIG-2553:
-

  This might break some existing custom StoreFuncs. Currently you can write a 
custom storer that writes into the same directory, but with different 
file names. We have users who have written that kind of storer. 
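
To make the trade-off concrete, here is a minimal sketch of a duplicate-location check of the kind this issue asks for. It is not the logic in PIG-2553.patch: the class, the method, and the set of schemes treated as file-based are assumptions for illustration. A check keyed only on the location string, like this one, would also reject storers that deliberately share a directory, which is the breakage described above.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Pig.
final class StoreLocationCheck {
    // Assumed set of schemes treated as file-based; not taken from the patch.
    private static final Set<String> FILE_BASED =
            new HashSet<>(Arrays.asList("hdfs", "file"));

    private StoreLocationCheck() {}

    /** Returns the first file-based location stored to more than once, or null if none. */
    static String firstDuplicate(List<String> storeLocations) {
        Set<String> seen = new HashSet<>();
        for (String location : storeLocations) {
            String scheme = URI.create(location).getScheme();
            // Locations without a scheme are treated as file-based here (an assumption).
            boolean fileBased = scheme == null || FILE_BASED.contains(scheme);
            if (fileBased && !seen.add(location)) {
                return location;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicate(Arrays.asList("output", "output")));
        System.out.println(firstDuplicate(Arrays.asList("hdfs://nn/a", "hdfs://nn/b")));
    }
}
```

The comparison is on the raw location string, so two spellings of the same directory would not be detected; a real check would need to normalize paths first.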

> Pig shouldn't allow attempts to write multiple relations into same directory
> 
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
>
> We've seen multiple occasions where users accidentally try to store 2 or more 
> different relations to the same destination directory. Currently, this passes 
> the Pig planner and fails on MR side due to concurrent attempts to create the 
> same part file on the reducer. This is extremely confusing to the user, and 
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we 
> can identify the erroneous condition from the beginning.



[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage

2012-11-12 Thread Prashant Kommireddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Kommireddi updated PIG-2857:
-

Attachment: PIG-2857_1.patch

I had made changes to Utils.java that I did not include in the previous patch. 
Attaching an updated patch.

> Add a -tagPath option to PigStorage
> ---
>
> Key: PIG-2857
> URL: https://issues.apache.org/jira/browse/PIG-2857
> Project: Pig
>  Issue Type: New Feature
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2857_1.patch, PIG-2857.patch
>
>
> We recently added a "-tagSource" option to PigStorage, which allows us to add 
> filenames from which records come to the returned tuples.
> Often, users want the whole path, not just the source file. I propose we add 
> a "-tagPath" option to do this.



[jira] [Commented] (PIG-2955) Fix bunch of Pig e2e tests on Windows

2012-11-12 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495496#comment-13495496
 ] 

Alan Gates commented on PIG-2955:
-

+1, changes look good.

>  Fix bunch of Pig e2e tests on Windows 
> ---
>
> Key: PIG-2955
> URL: https://issues.apache.org/jira/browse/PIG-2955
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11, 0.10.1, 0.12
>
> Attachments: PIG-2955-1.patch, PIG-2955-2_0.10.patch, PIG-2955-2.patch
>
>
> Fix the following test aborts and failures:
> ComputeSpec_1
> ComputeSpec_2
> Unicode_cmdline_1
> Warning_1
> Warning_4
> Checkin_2
> UdfDistributedCache_1
> Jython_Checkin_2
> Jython_Diagnostics_4
> Jython_Diagnostics_5
> Jython_Diagnostics_6
> Jython_Error_3
> Jython_Error_4
> Jython_Error_5
> Jython_Error_6
> Jython_Error_7
> Grunt_6
> Grunt_8
> Grunt_13
> Grunt_14



Re: Jenkins / Clover

2012-11-12 Thread Daniel Dai
Hi, Gianmarco
I added you to hudson-jobadmin group.

Thanks,
Daniel

On Thu, Jul 19, 2012 at 12:33 AM, Gianmarco De Francisci Morales
 wrote:
> Fine,
> Alan, could you add me to the hudson-jobadmin group?
>
> modify_appgroups.pl hudson-jobadmin --add=gdfm
>
> On people.apache.org, according to the page.
>
> I have subscribed to infrastructure and builds.
>
> Cheers,
> --
> Gianmarco
>
>
>
>
> On Thu, Jul 19, 2012 at 12:17 AM, Alan Gates  wrote:
>
>> http://wiki.apache.org/general/Jenkins?action=show&redirect=Hudson describes 
>> how to get an account so you can administer the Jenkins builds.
>>
>> Alan.
>>
>> On Jul 18, 2012, at 12:27 PM, Gianmarco De Francisci Morales wrote:
>>
>> > What is the procedure to modify the nightly build?
>> > If everyone agrees (and somebody explains to me how) I volunteer to fix it.
>> >
>> > Cheers,
>> > --
>> > Gianmarco
>> >
>> >
>> >
>> >
>> > On Wed, Jul 18, 2012 at 8:25 AM, Jonathan Coveney wrote:
>> >
>> >> +1
>> >>
>> >> A while ago I tried to get apache builds to deal with this, and nothing.
>> >> Very annoying, but pending a fix, we should remove it from the nightly.
>> >>
>> >> 2012/7/17 Alan Gates 
>> >>
>> >>> I'm fine with removing it from the nightly build.  I don't see any
>> reason
>> >>> to run that every day, especially since it slows down the tests.  Let's
>> >> not
>> >>> remove it from ant, as it's useful to run occasionally.
>> >>>
>> >>> Alan.
>> >>>
>> >>> On Jul 17, 2012, at 3:17 PM, Gianmarco De Francisci Morales wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> Clover constantly makes a number of our Jenkins builds fail (usually
>> >>>> because of license issues, I think it is a misconfiguration).
>> >>>> Do we actually use it?
>> >>>> If we don't I would propose to remove it from our build.
>> >>>> What do you think?
>> >>>>
>> >>>> Cheers,
>> >>>> --
>> >>>> Gianmarco
>> >>>
>> >>>
>> >>
>>
>>


[jira] [Updated] (PIG-2657) Print warning if using wrong jython version

2012-11-12 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2657:
---

Labels: newbie  (was: )

Additional comments:
- Please remove tabs. Use 4 spaces instead.
- The property {{jython.version}} shouldn't be hard-coded in {{build.xml}}. It 
is automatically loaded from {{ivy/libraries.properties}} at compile-time, so 
there is no reason to define it.
- The {{jython.version}} attribute shouldn't be embedded in {{pigunit.jar}}. 
It's useful only in {{pig.jar}} and {{pig-withouthadoop.jar}}.
- Regarding the log message, why don't we print a message like "Pig is tested 
with ${jython.version}, so it may not work with ${runtime.jython.version}"? I 
think that this is flexible and informative at the same time.
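
The message suggested in the last comment could be produced roughly as follows. This is a sketch only: the class and method names are hypothetical, and how the tested and runtime versions are obtained is left out. Only the wording of the message comes from the comment above.

```java
// Hypothetical helper, not part of Pig.
final class JythonVersionWarning {
    private JythonVersionWarning() {}

    /** Returns a warning when the runtime version differs from the tested one, else null. */
    static String warningFor(String testedVersion, String runtimeVersion) {
        if (testedVersion == null || testedVersion.equals(runtimeVersion)) {
            // Nothing to compare against, or the versions match: no warning.
            return null;
        }
        return "Pig is tested with " + testedVersion
                + ", so it may not work with " + runtimeVersion;
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(warningFor("2.5.0", "2.5.2"));
    }
}
```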

Thanks!

> Print warning if using wrong jython version
> ---
>
> Key: PIG-2657
> URL: https://issues.apache.org/jira/browse/PIG-2657
> Project: Pig
>  Issue Type: Bug
>Reporter: Fabian Alenius
>  Labels: newbie
> Fix For: 0.12
>
> Attachments: PIG-2657.1.patch, PIG-2657.2.patch
>
>
> Hi,
> It would be good if Pig would print a warning (or refuse to run) if you are 
> using an unsupported version of jython. I spent a couple of hours before 
> figuring out that you had to use 2.5.0. I've seen posts indicating that 
> others have run into this problem as well.
> Might write up a patch if others agree this is an issue.



[jira] [Updated] (PIG-3039) Not possible to use custom version of jackson jars

2012-11-12 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3039:
---

Attachment: PIG-3039-download-jackson.patch

Hi Rohini,

The patch looks good. My only concern is that we have to remember this test 
case when we bump jackson to 1.9.9 in the future.

I suggest that we should at least make a comment in ivy/libraries.properties 
regarding this test case so that we won't forget to update this test case when 
we update the version of jackson.

In addition, wouldn't it be better to download the jackson 1.9.9 binaries using 
ant instead of checking them in? I made a quick patch that does this, so please 
feel free to use it if you'd like to. This is just a suggestion, and I won't 
insist.

Thanks!

> Not possible to use custom version of jackson jars
> --
>
> Key: PIG-3039
> URL: https://issues.apache.org/jira/browse/PIG-3039
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.12
>
> Attachments: PIG-3039-download-jackson.patch, PIG-3039-trunk.patch
>
>
> User is trying
> register jackson_core_asl-1.9.4_1.jar;
> register jackson_mapper_asl-1.9.4_1.jar;
> register jackson_xc-1.9.4_1.jar;
> But pig.jar/pig-withouthadoop.jar has jackson jars and JarManager packages 
> the jackson from pig.jar into job.jar (PIG-2457). We could not find any 
> possible workaround with the mapreduce framework to put the user jar first in 
> the classpath, as job.jar always takes precedence.
> The Pig script works fine with 0.9, so this is a regression in 0.10.  



[jira] [Commented] (PIG-2553) Pig shouldn't allow attempts to write multiple relations into same directory

2012-11-12 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495260#comment-13495260
 ] 

Cheolsoo Park commented on PIG-2553:


Hi Prashant,

Thank you very much for the patch! I tested it in local and mr mode, and it 
works fine. Can you please add a unit test, probably in TestPigServer?

> Pig shouldn't allow attempts to write multiple relations into same directory
> 
>
> Key: PIG-2553
> URL: https://issues.apache.org/jira/browse/PIG-2553
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2553.patch
>
>
> We've seen multiple occasions where users accidentally try to store 2 or more 
> different relations to the same destination directory. Currently, this passes 
> the Pig planner and fails on MR side due to concurrent attempts to create the 
> same part file on the reducer. This is extremely confusing to the user, and 
> hard to debug.
> We should instead fail their scripts before they are even submitted, since we 
> can identify the erroneous condition from the beginning.



[jira] [Updated] (PIG-2857) Add a -tagPath option to PigStorage

2012-11-12 Thread Prashant Kommireddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Kommireddi updated PIG-2857:
-

Attachment: PIG-2857.patch

Coming back to this. I think we should keep 'tagsource' in the next release for 
backward compatibility and, as you suggested, log the deprecation message. 
Maybe we can remove 'tagsource' for 0.13. Adding a patch that now uses '-tagFile' 
for source filename and '-tagPath' for source path. I have modified the tests 
accordingly.

> Add a -tagPath option to PigStorage
> ---
>
> Key: PIG-2857
> URL: https://issues.apache.org/jira/browse/PIG-2857
> Project: Pig
>  Issue Type: New Feature
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prashant Kommireddi
> Attachments: PIG-2857.patch
>
>
> We recently added a "-tagSource" option to PigStorage, which allows us to add 
> filenames from which records come to the returned tuples.
> Often, users want the whole path, not just the source file. I propose we add 
> a "-tagPath" option to do this.



[jira] [Updated] (PIG-3041) Improve ResourceStatistics

2012-11-12 Thread Prashant Kommireddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Kommireddi updated PIG-3041:
-

Issue Type: Improvement  (was: Bug)

> Improve ResourceStatistics
> --
>
> Key: PIG-3041
> URL: https://issues.apache.org/jira/browse/PIG-3041
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.12
>Reporter: Prashant Kommireddi
>Assignee: Prashant Kommireddi
>
> This is a follow-up JIRA to PIG-2582. ResourceStatistics should be improved; 
> here are a few things we should do for 0.13. 
> 1. Consider removing method setmBytes(Long mBytes). We deprecated this method 
> in 0.12, but the code is not intuitive, as the setter actually 
> works on the variable "bytes".
> 2. All setter methods return the ResourceStatistics object, which is 
> unnecessary. For example:
> {code}
> public ResourceStatistics setNumRecords(Long numRecords) {
>     this.numRecords = numRecords;
>     return this;
> }
> {code}
> Each one of these variables has an associated getter.
> I will take this up once we are in the 0.13 cycle.
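
For context on the second point, the sketch below shows what a setter returning this makes possible and what a void setter would require instead. The class is a hypothetical stand-in, not Pig's ResourceStatistics, and the field names other than numRecords are invented for the example.

```java
// Hypothetical stand-in used only to contrast the two setter styles.
final class FluentStats {
    private Long numRecords;
    private Long sizeInBytes;

    // Fluent style, as in the quoted setNumRecords: returning this allows chaining.
    FluentStats setNumRecords(Long numRecords) {
        this.numRecords = numRecords;
        return this;
    }

    FluentStats setSizeInBytes(Long sizeInBytes) {
        this.sizeInBytes = sizeInBytes;
        return this;
    }

    Long getNumRecords() {
        return numRecords;
    }

    Long getSizeInBytes() {
        return sizeInBytes;
    }

    public static void main(String[] args) {
        // Chained calls work only because each setter returns the object.
        FluentStats stats = new FluentStats().setNumRecords(10L).setSizeInBytes(2048L);
        System.out.println(stats.getNumRecords() + " records, "
                + stats.getSizeInBytes() + " bytes");
    }
}
```

With void setters the same setup takes one statement per field, and any caller that already chains the calls would stop compiling, which is worth weighing before changing the return type.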
