[jira] Created: (PIG-1779) Wrong stats shown when there are multiple loads but same file names

2010-12-21 Thread Vivek Padmanabhan (JIRA)
Wrong stats shown when there are multiple loads but same file names
---

 Key: PIG-1779
 URL: https://issues.apache.org/jira/browse/PIG-1779
 Project: Pig
  Issue Type: Bug
  Components: tools
Affects Versions: 0.8.0
Reporter: Vivek Padmanabhan


In Pig 0.8, the stats show wrong information whenever I have multiple 
loads and the file names are similar.

a) Problem 1
Sample Script : 
A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/anotherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here I have 10 records for A and 3 records for B, but Pig says 
Successfully read 6 records from: "/myfolder/anotherfolder/tryme"
Successfully read 6 records from: "myfolder/tryme"
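
To reproduce the layout described above, a small fixture script can help. This is only a sketch under assumptions: the helper name make_fixture and the record contents (the integers 1..10 and 1..3, one per line) are hypothetical and not taken from the report; only the two paths and the record counts are. It writes to the local filesystem, so it would suit a run in Pig's local mode.

```python
from pathlib import Path


def make_fixture(root):
    """Create two inputs that share the file name 'tryme' in different folders.

    The record values are placeholders; the report only states the counts
    (10 records for A, 3 records for B).
    """
    root = Path(root)
    a = root / "myfolder" / "tryme"
    b = root / "myfolder" / "anotherfolder" / "tryme"
    # Creating the deeper folder also creates 'myfolder'.
    b.parent.mkdir(parents=True, exist_ok=True)
    a.write_text("".join(f"{i}\n" for i in range(1, 11)))  # 10 records
    b.write_text("".join(f"{i}\n" for i in range(1, 4)))   # 3 records
    return a, b
```

Running the sample script from the directory passed as root should then load 10 and 3 records respectively, which makes it easy to compare against the counts Pig reports.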

b) Problem 2
A = LOAD 'myfolder/tryme' AS (f1);
B = LOAD 'myfolder/anotherfolder/tryme' AS (f2);
C = JOIN A BY f1, B BY f2;
DUMP C;

Here there is no folder named anotherfolder, while "myfolder/tryme" exists. 
But Pig says
Failed to read data from "/myfolder/anotherfolder/tryme"
Failed to read data from "/myfolder/tryme"


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2010-12-21 Thread Soren Macbeth (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974027#action_12974027
 ] 

Soren Macbeth commented on PIG-1778:


guava's List object is used in HBaseStorage to handle the parameter inputs. 

> Some dependencies not packaged with Pig 0.8 release
> ---
>
> Key: PIG-1778
> URL: https://issues.apache.org/jira/browse/PIG-1778
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Dmitriy V. Ryaboy
>
> Some of the libraries required for new Pig features are not included in the 
> built tarball of 0.8 release:
> guava, required for HBaseStorage
> jython, required for Jython UDFs
> We should discuss how to properly package these dependencies.




[jira] Commented: (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2010-12-21 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973942#action_12973942
 ] 

Daniel Dai commented on PIG-1778:
-

It seems guava.jar is already in pig-0.8.0-core.jar, so we need to find out what 
the root cause of the HBaseStorage problem is. 

Jython.jar should be on the classpath, but not necessarily in pig-0.8.0-core.jar. 
PythonScriptEngine will package jython.jar automatically. We need to modify 
pig.pl to include jython.jar.
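
One way to check whether a jar already bundles a library is to list its entries, since a jar is a zip archive. The following is a minimal sketch; the helper name jar_has_package is hypothetical, and com/google/common/ is the package prefix used by Guava classes.

```python
import zipfile


def jar_has_package(jar_path, package_prefix):
    """Return True if any entry in the jar lives under package_prefix.

    package_prefix uses slashes, e.g. 'com/google/common/'.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.startswith(package_prefix) for name in jar.namelist())
```

For example, jar_has_package("pig-0.8.0-core.jar", "com/google/common/") would indicate whether Guava classes are present in the core jar, assuming that jar is in the current directory.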

> Some dependencies not packaged with Pig 0.8 release
> ---
>
> Key: PIG-1778
> URL: https://issues.apache.org/jira/browse/PIG-1778
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Dmitriy V. Ryaboy
>
> Some of the libraries required for new Pig features are not included in the 
> built tarball of 0.8 release:
> guava, required for HBaseStorage
> jython, required for Jython UDFs
> We should discuss how to properly package these dependencies.




[jira] Updated: (PIG-1777) LoadFunc in a scripting language

2010-12-21 Thread John Meagher (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Meagher updated PIG-1777:
--

Patch Info: [Patch Available]

> LoadFunc in a scripting language
> 
>
> Key: PIG-1777
> URL: https://issues.apache.org/jira/browse/PIG-1777
> Project: Pig
>  Issue Type: New Feature
>Reporter: John Meagher
> Fix For: 0.9.0
>
> Attachments: Initial-scripted-load.patch, Initial-scripted-load2.patch
>
>
> Provide a mechanism for loading custom objects from a Sequence file with the 
> conversion from the object to Pig objects happening in a scripting language.  




[jira] Updated: (PIG-1777) LoadFunc in a scripting language

2010-12-21 Thread John Meagher (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Meagher updated PIG-1777:
--

Fix Version/s: 0.9.0
   Status: Patch Available  (was: Open)

> LoadFunc in a scripting language
> 
>
> Key: PIG-1777
> URL: https://issues.apache.org/jira/browse/PIG-1777
> Project: Pig
>  Issue Type: New Feature
>Reporter: John Meagher
> Fix For: 0.9.0
>
> Attachments: Initial-scripted-load.patch, Initial-scripted-load2.patch
>
>
> Provide a mechanism for loading custom objects from a Sequence file with the 
> conversion from the object to Pig objects happening in a scripting language.  




[jira] Commented: (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2010-12-21 Thread Soren Macbeth (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973819#action_12973819
 ] 

Soren Macbeth commented on PIG-1778:


I ran into this trying to load data using the 0.8.0 build from HBase 0.20.6 and 
Hadoop 0.20.2 in pseudo-distributed mode on my dev machine. Pig complained 
about a missing class from guava, which wasn't included anywhere in the 
pig-0.8.0 tarball. Downloading the latest guava and putting it on my classpath 
solved the issue. 

In regard to Dmitriy's comment on Jython, I agree that you probably don't want 
to start bundling jars for every scripting language, but Python-based UDFs are 
specifically called out as a new feature in the documentation, and they don't 
work as advertised until you manually go out and put Jython on your classpath.

> Some dependencies not packaged with Pig 0.8 release
> ---
>
> Key: PIG-1778
> URL: https://issues.apache.org/jira/browse/PIG-1778
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Dmitriy V. Ryaboy
>
> Some of the libraries required for new Pig features are not included in the 
> built tarball of 0.8 release:
> guava, required for HBaseStorage
> jython, required for Jython UDFs
> We should discuss how to properly package these dependencies.




[jira] Commented: (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2010-12-21 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973804#action_12973804
 ] 

Dmitriy V. Ryaboy commented on PIG-1778:


Guava is a general-purpose library that can be used throughout the Pig code; I 
think we should make it available in the default build.

Jython is very specific to the jython feature, and I doubt we want to get into 
distributing jars for every supported scripting language; perhaps we can 
distribute some bootstrapping script that would fetch Jython from maven if 
people want to use it.

> Some dependencies not packaged with Pig 0.8 release
> ---
>
> Key: PIG-1778
> URL: https://issues.apache.org/jira/browse/PIG-1778
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Dmitriy V. Ryaboy
>
> Some of the libraries required for new Pig features are not included in the 
> built tarball of 0.8 release:
> guava, required for HBaseStorage
> jython, required for Jython UDFs
> We should discuss how to properly package these dependencies.




[jira] Created: (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2010-12-21 Thread Dmitriy V. Ryaboy (JIRA)
Some dependencies not packaged with Pig 0.8 release
---

 Key: PIG-1778
 URL: https://issues.apache.org/jira/browse/PIG-1778
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Dmitriy V. Ryaboy


Some of the libraries required for new Pig features are not included in the 
built tarball of 0.8 release:

guava, required for HBaseStorage
jython, required for Jython UDFs

We should discuss how to properly package these dependencies.




[jira] Commented: (PIG-1755) Clean up duplicated code in Physical Operators

2010-12-21 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973797#action_12973797
 ] 

Dmitriy V. Ryaboy commented on PIG-1755:


bump :)

> Clean up duplicated code in Physical Operators
> --
>
> Key: PIG-1755
> URL: https://issues.apache.org/jira/browse/PIG-1755
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
>Priority: Minor
> Fix For: 0.9.0
>
> Attachments: PIG-1755.2.patch, PIG-1755.3.patch, PIG-1755.patch
>
>
> Many of the getNext() implementations in PhysicalOperators are copy-pasted, 
> with only the method signatures and casts changing. 
> Shorter code leads to fewer bugs and is easier to read.
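
The kind of cleanup described can be illustrated in a language-neutral way. The sketch below is in Python and is not Pig's Java code or the attached patch; the class Operator and its methods are hypothetical. It shows the general idea of moving the shared body into one helper parameterized by the expected type, leaving thin typed wrappers.

```python
class Operator:
    """Toy operator that yields values from an input sequence."""

    def __init__(self, source):
        self._source = iter(source)

    def _get_next(self, expected_type):
        """Shared logic: fetch one value and check its type.

        Returns None when the input is exhausted.
        """
        try:
            value = next(self._source)
        except StopIteration:
            return None
        if not isinstance(value, expected_type):
            raise TypeError(
                f"expected {expected_type.__name__}, got {type(value).__name__}"
            )
        return value

    # Typed entry points differ only in the type they pass along.
    def get_next_int(self):
        return self._get_next(int)

    def get_next_str(self):
        return self._get_next(str)
```

With this shape, a fix to the shared logic is made once in _get_next instead of in every typed copy.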
