[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068321#comment-14068321 ]

Lorand Bendig commented on PIG-3445:

{quote}
Caused by: java.lang.NoClassDefFoundError: parquet/pig/ParquetStorer
{quote}
ParquetLoader and ParquetStorer are just wrappers. You need to have {{parquet-pig-bundle-1.2.3.jar}}, which contains the concrete implementations, on Pig's classpath.

> Make Parquet format available out of the box in Pig
>
> Key: PIG-3445
> URL: https://issues.apache.org/jira/browse/PIG-3445
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Assignee: Lorand Bendig
> Fix For: 0.12.0
>
> Attachments: PIG-3445-2.patch, PIG-3445-3.patch, PIG-3445-4.patch, PIG-3445-5.patch, PIG-3445.patch
>
> We would add the Parquet jar in the Pig packages to make it available out of the box to pig users.
> On top of that we could add the parquet.pig package to the list of packages to search for UDFs. (alternatively, the parquet jar could contain classes named org.apache.pig.builtin.ParquetLoader and ParquetStorer)
> This way users can use Parquet simply by typing:
> A = LOAD 'foo' USING ParquetLoader();
> STORE A INTO 'bar' USING ParquetStorer();

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067912#comment-14067912 ]

Amit Mor commented on PIG-3445:

Odd, I'm getting this:

2014-07-20 15:28:03,900 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
2014-07-20 15:28:03,900 [main] INFO org.apache.pig.Main - Logging error messages to: /home/amit/Projects/sql.hg/mapreduce/pigx/scripts/pig_1405859283897.log
2014-07-20 15:28:04,176 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/amit/.pigbootup not found
2014-07-20 15:28:04,320 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-07-20 15:28:06,175 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
Failed to parse: Pig script failed to parse: pig script failed to validate: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
    at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
    at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
    at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:607)
    at org.apache.pig.Main.main(Main.java:156)
Caused by: pig script failed to validate: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.parser.LogicalPlanBuilder.buildStoreOp(LogicalPlanBuilder.java:969)
    at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7780)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
    ... 11 more
Caused by: java.lang.RuntimeException: could not instantiate 'ParquetStorer' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:748)
    at org.apache.pig.parser.LogicalPlanBuilder.buildStoreOp(LogicalPlanBuilder.java:948)
    ... 17 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2259: Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
    at org.apache.pig.builtin.ParquetStorer.<init>(ParquetStorer.java:38)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at java.lang.Class.newInstance(Class.java:374)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:718)
    ... 18 more
Caused by: java.lang.NoClassDefFoundError: parquet/pig/ParquetStorer
    at org.apache.pig.builtin.ParquetStorer.<init>(ParquetStorer.java:34)
    ... 24 more
Caused by: java.lang.ClassNotFoundException: parquet.pig.ParquetStorer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 25 more
2014-07-20 15:28:06,182 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2259: Cannot instantiate class org.apache.pig.builtin.ParquetStorer (parquet.pig.ParquetStorer)
Details at logfile: /home/amit/Projects/sql.hg/mapreduce/pigx/scripts/pig_1405859283897.log

when calling: STORE cgrp_rena
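Editorial note: the innermost cause in the trace is a ClassNotFoundException for parquet.pig.ParquetStorer, raised while the built-in wrapper org.apache.pig.builtin.ParquetStorer was being constructed. A quick way to confirm which half is missing might be a small check like the one below. This is only a sketch: the class ParquetClasspathCheck and its methods are hypothetical helpers, not part of Pig or Parquet, and the class names it probes are taken from the trace and the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Pig or Parquet. */
public class ParquetClasspathCheck {

    /** Returns true if the loader can find the class; the class is not initialized. */
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, as seen in the trace.
            return false;
        }
    }

    /** Returns the subset of names that the loader cannot find, in input order. */
    public static List<String> missing(List<String> classNames, ClassLoader loader) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isLoadable(name, loader)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ClassLoader loader = ParquetClasspathCheck.class.getClassLoader();
        List<String> names = Arrays.asList(
                "org.apache.pig.builtin.ParquetLoader", // wrapper, shipped with Pig
                "org.apache.pig.builtin.ParquetStorer", // wrapper, shipped with Pig
                "parquet.pig.ParquetLoader",            // implementation, from the bundle jar
                "parquet.pig.ParquetStorer");           // implementation, from the bundle jar
        for (String name : missing(names, loader)) {
            System.out.println("missing: " + name);
        }
    }
}
```

Run with the same classpath that Pig uses. If only the parquet.pig.* names are reported, the wrappers are present but the bundle jar is not, which would match the trace. One way to supply the jar is probably the PIG_CLASSPATH environment variable read by the bin/pig launcher script, though that is worth verifying against your Pig version.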
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786694#comment-13786694 ]

Aniket Mokashi commented on PIG-3445:

Committed to trunk and Pig-0.12. Thanks [~lbendig] and [~julienledem].
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786186#comment-13786186 ]

Lorand Bendig commented on PIG-3445:

I have the modified patch for parquet-pig-bundle, but I'd like to attach it when it becomes visible in maven central, just to be sure.
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785785#comment-13785785 ]

Daniel Dai commented on PIG-3445:

Great, thanks!
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785742#comment-13785742 ]

Julien Le Dem commented on PIG-3445:

I just released parquet-pig-bundle-1.2.3. It should show up in Maven Central overnight.
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785616#comment-13785616 ]

Daniel Dai commented on PIG-3445:

Hi [~julienledem], I am trying to roll a Pig 0.12.0 RC tomorrow. Can we get it done by then?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785610#comment-13785610 ]

Julien Le Dem commented on PIG-3445:

We merged the PR for parquet-pig-bundle. I'm making a release so that this can be merged into Pig 0.12.
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785565#comment-13785565 ]

Julien Le Dem commented on PIG-3445:

parquet-format.version should be 1.0.0
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785564#comment-13785564 ]

Julien Le Dem commented on PIG-3445:

I added a parquet-pig-bundle and the shading of fastutil:
https://github.com/Parquet/parquet-mr/pull/186
We can make a new release to simplify things.
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784627#comment-13784627 ]

Dmitriy V. Ryaboy commented on PIG-3445:

That's a great addition, thanks Lorand. The code looks really tidy now.
It looks like ParquetUtil is actually a general util? Maybe add that functionality to org.apache.pig.impl.util.JarManager or something along those lines?
[~julienledem] do we need to publish a new artifact version so fastutil isn't required for dictionary encoding?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783614#comment-13783614 ]

Dmitriy V. Ryaboy commented on PIG-3445:

[~lbendig] it might be more succinct to use StoreFuncWrapper?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776879#comment-13776879 ]

Dmitriy V. Ryaboy commented on PIG-3445:

Other loaders like csv, avro, json, xml, etc. (even RC, though it's in piggybank due to heavy dependencies and lack of support) are all in already, so I don't see this as unfair, but as consistent.
Not packaging the pq jars into the pig monojar and instead adding them the way we add guava et al for hbase sounds like a good idea. [~julienledem] should we do that by providing a simple wrapper in pig builtins, or by messing with the job conf in parquet's own loader/storer?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776764#comment-13776764 ]

Daniel Dai commented on PIG-3445:

Size may be one thing, but still, doing a favor for Parquet sounds unfair to other loaders. Is it possible to push the jar dependency logic into LoadFunc, shipping the jar to the backend only when the LoadFunc is used?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776172#comment-13776172 ]

Dmitriy V. Ryaboy commented on PIG-3445:

The size of the dependency introduced by this is orders of magnitude smaller than the HBase (or Avro) one, since everything comes from a single project (unlike HBase's liberal use of guava, metric, ZK, and everything else under the sun). The total size is less than 1 meg.
Can we add parquet.pig to udf import list in the same patch?
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766515#comment-13766515 ]

Lorand Bendig commented on PIG-3445:

Yes, that's definitely a drawback of this patch. Is it an option here to utilize pig.additional.jars and udf.import.list? If so, I can think of the following:

pig.properties:
{code}
pig.additional.jars.parquet.column=/path/to/parquet-column.jar
pig.additional.jars.parquet.common=
pig.additional.jars.parquet.encoding=
...
or:
pig.additional.jars.parquet=parquet-column.jar:parquet-common.jar
udf.import.list.parquet=parquet.pig.
{code}

At the point where 3rd party jars and import packages are initialized, some additional code could take care of these grouped properties. If some checks (which can be defined per group) succeed, such as the paths being valid, then these props would be merged into pig.additional.jars and udf.import.list. The rest is the same as before.
This might be a silly solution that does not address all the issues that can arise, but I'm curious whether it can be an option.
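Editorial note: the grouped-property idea in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not code from any attached patch: the class GroupedPropertyMerger is hypothetical, the group name is assumed to be the first key segment after the base property name, ":" is assumed as the separator for both properties (as in the example above), and a group is merged only when every one of its jar paths passes a caller-supplied check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical sketch of merging grouped properties; not part of Pig. */
public class GroupedPropertyMerger {
    static final String JARS = "pig.additional.jars";
    static final String IMPORTS = "udf.import.list";
    static final String SEP = ":"; // assumed separator for both properties

    /** Splits a ":"-separated value, dropping empty entries. */
    static List<String> split(String value) {
        List<String> parts = new ArrayList<>();
        if (value == null) return parts;
        for (String p : value.split(SEP)) {
            if (!p.trim().isEmpty()) parts.add(p.trim());
        }
        return parts;
    }

    /** "pig.additional.jars.parquet.column" with prefix JARS gives "parquet". */
    static String groupOf(String key, String prefix) {
        String rest = key.substring(prefix.length() + 1);
        int dot = rest.indexOf('.');
        return dot < 0 ? rest : rest.substring(0, dot);
    }

    /** Returns a copy in which every valid group is folded into the two base properties. */
    public static Properties merge(Properties in, Predicate<String> jarCheck) {
        Map<String, List<String>> groupJars = new LinkedHashMap<>();
        Map<String, List<String>> groupImports = new LinkedHashMap<>();
        Properties out = new Properties();
        for (String key : new TreeSet<>(in.stringPropertyNames())) {
            String value = in.getProperty(key);
            if (key.startsWith(JARS + ".")) {
                groupJars.computeIfAbsent(groupOf(key, JARS), g -> new ArrayList<>())
                         .addAll(split(value));
            } else if (key.startsWith(IMPORTS + ".")) {
                groupImports.computeIfAbsent(groupOf(key, IMPORTS), g -> new ArrayList<>())
                            .addAll(split(value));
            } else {
                out.setProperty(key, value); // ungrouped properties pass through
            }
        }
        List<String> jars = split(in.getProperty(JARS));
        List<String> imports = split(in.getProperty(IMPORTS));
        TreeSet<String> groups = new TreeSet<>(groupJars.keySet());
        groups.addAll(groupImports.keySet());
        for (String group : groups) {
            List<String> gj = groupJars.getOrDefault(group, new ArrayList<>());
            if (!gj.stream().allMatch(jarCheck)) continue; // per-group check failed: skip group
            jars.addAll(gj);
            imports.addAll(groupImports.getOrDefault(group, new ArrayList<>()));
        }
        if (!jars.isEmpty()) out.setProperty(JARS, String.join(SEP, jars));
        if (!imports.isEmpty()) out.setProperty(IMPORTS, String.join(SEP, imports));
        return out;
    }
}
```

A caller might pass `path -> new java.io.File(path).isFile()` as the check, so that a group whose jars are not installed contributes neither jars nor UDF import packages. Skipping the whole group on a single bad path is one possible reading of "checks defined per group"; the comment leaves the exact rule open.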
[jira] [Commented] (PIG-3445) Make Parquet format available out of the box in Pig
[ https://issues.apache.org/jira/browse/PIG-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762182#comment-13762182 ]

Daniel Dai commented on PIG-3445:

This reminds me of a similar ticket for HBase, PIG-3285. I'm not sure packing a bunch of jars for a new loader is a good idea. I am not objecting to the patch, but it seems we need a better solution for that in the future.