[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068089#comment-14068089 ]
Rohini Palaniswamy commented on PIG-3558:
+1

> ORC support for Pig
> ---
>
> Key: PIG-3558
> URL: https://issues.apache.org/jira/browse/PIG-3558
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Labels: porc
> Fix For: 0.14.0
>
> Attachments: PIG-3558-1.patch, PIG-3558-2.patch, PIG-3558-3.patch, PIG-3558-4.patch, PIG-3558-5.patch, PIG-3558-6.patch
>
>
> Adding LoadFunc and StoreFunc for ORC.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067407#comment-14067407 ]
Daniel Dai commented on PIG-3558:
There will be 3 compile/test dependencies: hive-exec-core.jar, hive-common.jar, hive-serde.jar. None of these will be packed into pig-withouthadoop.jar. We will not release Pig 0.14 with a snapshot dependency, but I'd like to commit the patch to trunk sooner to facilitate follow-up development.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067289#comment-14067289 ]
Dmitriy V. Ryaboy commented on PIG-3558:
Nice. How much does this increase the weight of the pig build, and what packages does it pull in? I assume this won't get pushed to trunk until hive 0.14.0-SNAPSHOT becomes available as a stable version?
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999169#comment-13999169 ]
Daniel Dai commented on PIG-3558:
You can. The key point here is that bytearray is not equivalent to binary. Instead of the semantic check used for the other data types, bytearray requires a runtime check.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996845#comment-13996845 ]
Daniel Dai commented on PIG-3558:
bq. I don't see hive binary being any different than pig bytearray
Technically it is different. Pig bytearray means an unknown data type. Consider the following script and UDF:
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MapGenerate extends EvalFunc<Map> {
    @Override
    public Map exec(Tuple input) throws IOException {
        // The map value is an Integer, but the output schema below declares no value type.
        Map<String, Object> m = new HashMap<String, Object>();
        m.put("key", Integer.valueOf(input.size()));
        return m;
    }

    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema(null, DataType.MAP));
    }
}
{code}
{code}
a = load '1.txt' as (a0);
b = foreach a generate a0, MapGenerate(*) as m:map[];
c = group b by m#'key';
dump c;
{code}
The group key will be of data type bytearray (since its type is unknown), and the map output key is a NullableBytesWritable. NullableBytesWritable takes any Object instead of just DataByteArray to accommodate this case. It is possible to map Pig bytearray to binary, but we must deal with the fact that the data may not be a DataByteArray.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996986#comment-13996986 ]
Rohini Palaniswamy commented on PIG-3558:
In that case, can we allow storing Pig bytearray as binary, but throw an error at runtime if the value is not a DataByteArray but some other object?
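The runtime check discussed in this exchange could look roughly like the sketch below. It is not taken from any PIG-3558 patch: the class BytearrayToBinarySketch, the method toOrcBinary, and the nested DataByteArray stand-in (used so the example compiles without Pig on the classpath) are all hypothetical. The idea is only that a value arriving from a bytearray column is accepted when it really is a DataByteArray and rejected otherwise.

```java
import java.util.Arrays;

public class BytearrayToBinarySketch {

    /** Hypothetical stand-in for Pig's DataByteArray; it only wraps a byte[]. */
    public static final class DataByteArray {
        private final byte[] data;

        public DataByteArray(byte[] data) {
            this.data = data;
        }

        public byte[] get() {
            return data;
        }
    }

    /**
     * Converts one value from a Pig bytearray column into the bytes for a
     * binary column. The check is done per value at runtime because the
     * declared type bytearray says nothing about the actual Java class.
     */
    public static byte[] toOrcBinary(Object value) {
        if (value == null) {
            return null; // nulls pass through unchanged
        }
        if (value instanceof DataByteArray) {
            return ((DataByteArray) value).get();
        }
        // Anything else (for example an Integer produced by a UDF) is rejected.
        throw new IllegalArgumentException(
                "Cannot store " + value.getClass().getName()
                + " as binary; add an explicit cast in the script");
    }

    public static void main(String[] args) {
        byte[] ok = toOrcBinary(new DataByteArray(new byte[] {1, 2, 3}));
        System.out.println(Arrays.toString(ok));
        try {
            toOrcBinary(Integer.valueOf(42));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Throwing on a mismatch, as suggested above, surfaces the missing cast to the user instead of silently writing unexpected bytes. A real StoreFunc would presumably use org.apache.pig.data.DataByteArray rather than the stand-in class.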
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991488#comment-13991488 ]
Mona Chitnis commented on PIG-3558:
Yes, I meant the user. [~daijy] [~rohini] For storing the bytearray as binary in ORC, we need to add an ObjectInspector subclass, similar to how JodaTimeObjectInspector or PigDecimalObjectInspector handles the serde API. It's minor, and I can include it in my patch for tests.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990889#comment-13990889 ]
Rohini Palaniswamy commented on PIG-3558:
bq. Okay. Then when storing back into OrcStorage, we'd have to do explicit cast on the column either at load or later.
We should not be trying to cast into anything. The cast should be explicit and come from the user.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990888#comment-13990888 ]
Rohini Palaniswamy commented on PIG-3558:
[~daijy], We should allow storing bytearray as binary in ORC if the user has not done an explicit cast. Even in AvroStorage, Pig bytearray is translated to Avro Type.BYTES and stored, and we don't throw an error. I don't see hive binary being any different than pig bytearray (https://cwiki.apache.org/confluence/display/Hive/Binary+DataType+Proposal - Sometimes, user is just interested in few of those columns and doesn't want to bother about exact type information for rest of columns. In such cases, he may just declare the types of those columns as binary and Hive will not try to interpret those columns.)
On a different note, can we have the class renamed to ORCStorage instead of OrcStorage, as ORC is an abbreviation of Optimized Row Columnar?
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990285#comment-13990285 ]
Mona Chitnis commented on PIG-3558:
Okay. Then when storing back into OrcStorage, we'd have to do explicit cast on the column either at load or later.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987469#comment-13987469 ]
Daniel Dai commented on PIG-3558:
[~chitnis] Not sure if I understand correctly. Hive BINARY has a clear meaning, but Pig bytearray does not. It means an unknown datatype, where the user does not explicitly declare the datatype. The real data can be anything, not just DataByteArray. So it should be safe to convert Hive BINARY to Pig bytearray, but not vice versa.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986811#comment-13986811 ]
Mona Chitnis commented on PIG-3558:
If Hive type Binary is equated to Pig's bytearray
{code}
public static Object getPrimaryFromOrc(Object obj, PrimitiveObjectInspector poi) {
    Object result;
    switch (poi.getPrimitiveCategory()) {
    . . .
    case BINARY:
        result = new DataByteArray(((BytesWritable) obj).copyBytes());
        break;
{code}
Is there a reason we cannot typecast it back to Binary during a store back to Orc? I can make it work by explicitly casting that column to int, but I'd like to confirm it doesn't violate any Hive metadata serialization.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968236#comment-13968236 ]
Lorand Bendig commented on PIG-3558:
[~daijy], [~chitnis], thanks for pointing out the issue with direct fetch. I filed a patch (PIG-3888) that fixes it, so you may re-enable fetch in TestOrcStorage#setup().
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966108#comment-13966108 ]
Mona Chitnis commented on PIG-3558:
Thanks [~daijy]. Some tests pass now, and I will continue testing this on my setup.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964465#comment-13964465 ]
Mona Chitnis commented on PIG-3558:
[~daijy] When I run the unit tests in TestOrcStorage, they fail with NPE at OrcUtils.java:69
{code}
public static Object convertOrcToPig(Object obj, ObjectInspector oi, boolean[] includedColumns) {
    Object result;
    switch (oi.getCategory()) {   // OrcUtils.java:69
{code}
because the ObjectInspector oi which should have been initialized in OrcStorage.setLocation() is still null. Do you have an updated patch for passing unit tests?
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905824#comment-13905824 ]
Daniel Dai commented on PIG-3558:
It is possible HIVE-860 will clean up the hive-exec.jar. It might strip away guava/thrift etc from hive-exec.jar.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904684#comment-13904684 ]
Daniel Dai commented on PIG-3558:
Please ignore my previous comment. It is actually an upgrade not downgrade. Sorry for the confusion.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904663#comment-13904663 ]
Dmitriy V. Ryaboy commented on PIG-3558:
Help me understand this. My understanding is as follows: Compile is minimum required to compile main code. Test is minimum required to compile main code + stuff needed to test (hence, "extends"). Pushing a dependency up to compile means everything, not just test, needs the dependency. Also, the bump from 0.8 to 0.12 is 6 megs worth of code. That's a pretty big version bump.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904641#comment-13904641 ]
Daniel Dai commented on PIG-3558:
This is actually a downgrade. A test dependency implies a compile dependency, plus use of the jar in unit tests.
{code} {code}
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904627#comment-13904627 ]
Dmitriy V. Ryaboy commented on PIG-3558:
[~daijy] not quite:
{code}
- conf="test->master" />
+ conf="compile->master" />
{code}
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904598#comment-13904598 ]
Daniel Dai commented on PIG-3558:
[~dvryaboy], this is only a compile-time dependency. We don't need to ship hive-exec.jar in the distribution. We already have a compile-time dependency on hive-exec.jar in Pig and only need to upgrade the version. I agree it is desirable for Hive to clean up its dependencies, but it will be hard to make that happen in a short time.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904600#comment-13904600 ]
Daniel Dai commented on PIG-3558:
I am fine to unlink from 0.13 release though.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904590#comment-13904590 ]
Dmitriy V. Ryaboy commented on PIG-3558:
So that's a -1. I would +1 this if it was going into piggybank. Since this depends on unpublished changes, I'd rather we unlink it from 0.13 release (as that would tie us to Hive's release schedule -- obviously we can't make a release that depends on a snapshot).
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904574#comment-13904574 ]
Dmitriy V. Ryaboy commented on PIG-3558:
I am pro adding ORC support in Pig, but against introducing massive dependencies. According to http://mvnrepository.com/artifact/org.apache.hive/hive-exec/0.12.0 the hive-exec jar for 0.12 is 9 megs, and hides within it specific versions of jackson, snappy, org.json, chunks of thrift, hadoop.io (?!), avro, commons, protobuf, and guava. If ORC authors are not interested in improving their dependency hygiene, they have to live with the fact that their project is unlikely to get integrated into other projects. This is self-inflicted jar hell. Please don't do this. When ORC cleans up their dependencies, let's revisit.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902277#comment-13902277 ]
Daniel Dai commented on PIG-3558:
Hive does not plan to make a separate jar for ORC unfortunately.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902218#comment-13902218 ]
Julien Le Dem commented on PIG-3558:
Is hive-exec the fat jar that assembles the runtime dependencies of hive in one jar? Could we depend on the individual hive modules that we need instead?
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843747#comment-13843747 ]
Daniel Dai commented on PIG-3558:
Thanks Alan for the review! Will commit once HIVE-5728 is committed and there is a Maven artifact to build against.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843632#comment-13843632 ]
Alan Gates commented on PIG-3558:
+1.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811752#comment-13811752 ]
Daniel Dai commented on PIG-3558:
Also, there is a binary file which cannot be put in the patch. Copy http://svn.apache.org/viewvc/hive/trunk/ql/src/test/resources/orc-file-11-format.orc?revision=1519868&view=co to test/org/apache/pig/builtin/orc.
[jira] [Commented] (PIG-3558) ORC support for Pig
[ https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811736#comment-13811736 ]
Daniel Dai commented on PIG-3558:
The patch depends on HIVE-5728, which provides the InputFormat/OutputFormat Pig needs.