[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2014-07-21 Thread Lorand Bendig (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068321#comment-14068321
 ] 

Lorand Bendig commented on PIG-3445:


{quote} Caused by: java.lang.NoClassDefFoundError: parquet/pig/ParquetStorer 
{quote}
ParquetLoader and ParquetStorer are just wrappers; you need to have 
{{parquet-pig-bundle-1.2.3.jar}}, which contains the concrete 
implementations, on Pig's classpath. 
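
A quick way to confirm whether the concrete class is visible is to try loading it by name. The following is only a minimal sketch and not part of Pig: {{ClasspathCheck}} and {{isLoadable}} are made-up names, and the check is only meaningful when run with the same classpath that Pig uses.

```java
final class ClasspathCheck {
    /** Returns true if the named class can be located by this class's loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace quoted above.
        String name = args.length > 0 ? args[0] : "parquet.pig.ParquetStorer";
        System.out.println(name + (isLoadable(name) ? " is loadable" : " is NOT loadable"));
    }
}
```

If the class is reported as not loadable, the bundle jar is probably missing from the classpath the check was run with.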

> Make Parquet format available out of the box in Pig
> ---
>
> Key: PIG-3445
> URL: https://issues.apache.org/jira/browse/PIG-3445
> Project: Pig
>  Issue Type: Improvement
>Reporter: Julien Le Dem
>Assignee: Lorand Bendig
> Fix For: 0.12.0
>
> Attachments: PIG-3445-2.patch, PIG-3445-3.patch, PIG-3445-4.patch, 
> PIG-3445-5.patch, PIG-3445.patch
>
>
> We would add the Parquet jar to the Pig packages to make it available out of 
> the box to Pig users.
> On top of that we could add the parquet.pig package to the list of packages 
> to search for UDFs. (Alternatively, the Parquet jar could contain classes 
> named org.apache.pig.builtin.ParquetLoader and ParquetStorer.)
> This way users can use Parquet simply by typing:
> A = LOAD 'foo' USING ParquetLoader();
> STORE A INTO 'bar' USING ParquetStorer();



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2014-07-20 Thread Amit Mor (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067912#comment-14067912
 ] 

Amit Mor commented on PIG-3445:
---

odd, I'm getting this:

2014-07-20 15:28:03,900 [main] INFO  org.apache.pig.Main - Apache Pig version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
2014-07-20 15:28:03,900 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/amit/Projects/sql.hg/mapreduce/pigx/scripts/pig_1405859283897.log
2014-07-20 15:28:04,176 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/amit/.pigbootup not found
2014-07-20 15:28:04,320 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-07-20 15:28:06,175 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
Failed to parse: Pig script failed to parse: 
 pig script failed to validate: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:607)
    at org.apache.pig.Main.main(Main.java:156)
Caused by: 
 pig script failed to validate: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.parser.LogicalPlanBuilder.buildStoreOp(LogicalPlanBuilder.java:969)
    at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7780)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    ... 11 more
Caused by: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:748)
    at org.apache.pig.parser.LogicalPlanBuilder.buildStoreOp(LogicalPlanBuilder.java:948)
    ... 17 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2259: Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
    at org.apache.pig.builtin.ParquetStorer.<init>(ParquetStorer.java:38)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:374)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:718)
    ... 18 more
Caused by: java.lang.NoClassDefFoundError: parquet/pig/ParquetStorer
    at org.apache.pig.builtin.ParquetStorer.<init>(ParquetStorer.java:34)
    ... 24 more
Caused by: java.lang.ClassNotFoundException: parquet.pig.ParquetStorer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 25 more
2014-07-20 15:28:06,182 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2259: Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
Details at logfile: /home/amit/Projects/sql.hg/mapreduce/pigx/scripts/pig_1405859283897.log

when calling:

STORE cgrp_rena

[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-04 Thread Aniket Mokashi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786694#comment-13786694
 ] 

Aniket Mokashi commented on PIG-3445:
-

Committed to trunk and Pig-0.12. Thanks [~lbendig] and [~julienledem].



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-04 Thread Lorand Bendig (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786186#comment-13786186
 ] 

Lorand Bendig commented on PIG-3445:


I have the modified patch for parquet-pig-bundle, but I'd like to attach it 
when it becomes visible in maven central,
just to be sure.



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785785#comment-13785785
 ] 

Daniel Dai commented on PIG-3445:
-

Great, thanks!



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785742#comment-13785742
 ] 

Julien Le Dem commented on PIG-3445:


I just released parquet-pig-bundle-1.2.3.
It should show up in Maven Central overnight.



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785616#comment-13785616
 ] 

Daniel Dai commented on PIG-3445:
-

Hi, [~julienledem], I am trying to roll a Pig 0.12.0 RC tomorrow, can we get it 
done by then?



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785610#comment-13785610
 ] 

Julien Le Dem commented on PIG-3445:


We merged the PR for parquet-pig-bundle.
I'm making a release so that this can be merged into Pig 0.12.




[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785565#comment-13785565
 ] 

Julien Le Dem commented on PIG-3445:



parquet-format.version should be 1.0.0



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-03 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785564#comment-13785564
 ] 

Julien Le Dem commented on PIG-3445:


I added a parquet-pig-bundle and the shading of fastutil:
https://github.com/Parquet/parquet-mr/pull/186
We can make a new release to simplify things.



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-02 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784627#comment-13784627
 ] 

Dmitriy V. Ryaboy commented on PIG-3445:


That's a great addition, thanks Lorand.

The code looks really tidy now.

Looks like ParquetUtil is actually a general util? Maybe add that functionality 
to org.apache.pig.impl.util.JarManager or something along those lines?

[~julienledem] do we need to publish a new artifact version so fastutil isn't 
required for dictionary encoding?




[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-10-01 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783614#comment-13783614
 ] 

Dmitriy V. Ryaboy commented on PIG-3445:


[~lbendig] might be more succinct to use StoreFuncWrapper ?



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-09-24 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776879#comment-13776879
 ] 

Dmitriy V. Ryaboy commented on PIG-3445:


Other loaders like csv, avro, json, xml, etc (even RC, though it's in piggybank 
due to heavy dependencies and lack of support) are all in already so I don't 
see this as unfair, but as consistent.
Not packaging the pq jars into pig monojar and instead adding them, the way we 
add guava et al for hbase, sounds like a good idea.
[~julienledem] should we do that by providing a simple wrapper in pig builtins, 
or by messing with the job conf in parquet's own loader/storer?
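
The "simple wrapper" option can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from any of the attached patches: {{Storer}}, {{LazyStorerWrapper}} and {{UpperCaseStorer}} are hypothetical names rather than Pig or Parquet APIs, and resolving the delegate through reflection is just one possible choice. The point is that the wrapper compiles without the implementation jar and fails with a clear error only when it is actually used without that jar.

```java
/** Hypothetical stand-in for a store function; this is not Pig's StoreFunc API. */
interface Storer {
    String store(String record);
}

/** Wrapper that resolves its concrete implementation by class name when constructed. */
final class LazyStorerWrapper implements Storer {
    private final Storer delegate;

    LazyStorerWrapper(String implClassName) {
        try {
            // Reflection keeps the wrapper free of a compile-time dependency on the implementation.
            delegate = (Storer) Class.forName(implClassName)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            // Surfaces a missing implementation jar as one descriptive error.
            throw new IllegalStateException("Cannot instantiate class " + implClassName, e);
        }
    }

    @Override
    public String store(String record) {
        return delegate.store(record); // plain delegation
    }
}

/** Toy implementation used only to exercise the wrapper. */
final class UpperCaseStorer implements Storer {
    public UpperCaseStorer() {}

    @Override
    public String store(String record) {
        return record.toUpperCase();
    }
}
```

With this shape, {{new LazyStorerWrapper(UpperCaseStorer.class.getName())}} delegates normally, while a class name that is not on the classpath fails at construction time.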

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-09-24 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776764#comment-13776764
 ] 

Daniel Dai commented on PIG-3445:
-

Size may be one thing, but still, doing a favor for Parquet sounds unfair to 
other loaders. Is it possible to push the jar dependency logic into LoadFunc, 
shipping the jar to the backend only when the LoadFunc is used?
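
One ingredient of that idea is finding out which jar a loader class was loaded from, so that only that jar has to be shipped. A minimal sketch using the standard {{CodeSource}} API is shown below; {{JarLocator}} is a hypothetical name, and nothing in this thread confirms that Pig does it this way.

```java
import java.net.URL;
import java.security.CodeSource;

final class JarLocator {
    /**
     * Returns the jar or directory a class was loaded from,
     * or null when the location is unknown (for example, core JDK classes).
     */
    public static URL containingLocation(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }
}
```

A loader could call this with its own implementation class and register the returned location for shipping only when the loader is actually instantiated.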



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-09-24 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776172#comment-13776172
 ] 

Dmitriy V. Ryaboy commented on PIG-3445:


The size of the dependency introduced by this is orders of magnitude smaller 
than the HBase (or Avro) one, since everything comes from a single project 
(unlike HBase's liberal use of guava, metric, ZK, and everything else under the 
sun). The total size is less than 1 meg.

Can we add parquet.pig to udf import list in the same patch?



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-09-13 Thread Lorand Bendig (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766515#comment-13766515
 ] 

Lorand Bendig commented on PIG-3445:


Yes, that's definitely a drawback of this patch.
Is it an option here to utilize pig.additional.jars and udf.import.list?
If so, I can think of the following:
pig.properties:
{code}
pig.additional.jars.parquet.column=/path/to/parquet-column.jar
pig.additional.jars.parquet.common=
pig.additional.jars.parquet.encoding=
...

or: pig.additional.jars.parquet=parquet-column.jar:parquet-common.jar

udf.import.list.parquet=parquet.pig.
{code}
At the point where third-party jars and import packages are initialized, 
additional code could take care of these grouped properties. If some checks 
(which can be defined per group) succeed, such as valid paths, then these props 
would be merged into pig.additional.jars and udf.import.list. 
The rest is the same as before.
However, this might be a silly solution which may not address all the issues 
that can arise; I'm curious whether it can be an option.
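
The merge step described above could look roughly like this. It is only a sketch of the proposal, not existing Pig code: {{GroupedPigProperties}} and {{mergeGroups}} are hypothetical names, the colon separator is taken from the example above, and the per-group checks (such as path validation) are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

final class GroupedPigProperties {
    /**
     * Collects the values of every key of the form "<base>.<group>" and
     * appends them to the value stored under "<base>", joined by the separator.
     */
    public static void mergeGroups(Properties props, String base, String separator) {
        List<String> parts = new ArrayList<>();
        String existing = props.getProperty(base);
        if (existing != null && !existing.isEmpty()) {
            parts.add(existing); // keep whatever the user already configured
        }
        String prefix = base + ".";
        // TreeSet gives a deterministic, alphabetical merge order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (key.startsWith(prefix)) {
                String value = props.getProperty(key).trim();
                if (!value.isEmpty()) {
                    parts.add(value); // groups left empty are skipped
                }
            }
        }
        if (!parts.isEmpty()) {
            props.setProperty(base, String.join(separator, parts));
        }
    }
}
```

Calling {{mergeGroups(props, "pig.additional.jars", ":")}} and then {{mergeGroups(props, "udf.import.list", ":")}} would fold the grouped entries into the two existing properties, after which the usual initialization could proceed unchanged.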



[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig

2013-09-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762182#comment-13762182
 ] 

Daniel Dai commented on PIG-3445:
-

This reminds me of a similar ticket for HBase, PIG-3285. Not sure packing a 
bunch of jars for a new loader is a good idea. I am not objecting to the patch, 
but it seems we need a better solution for that in the future.
