[jira] [Commented] (PIG-3558) ORC support for Pig

2014-07-20 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068089#comment-14068089
 ] 

Rohini Palaniswamy commented on PIG-3558:
-

+1

> ORC support for Pig
> ---
>
> Key: PIG-3558
> URL: https://issues.apache.org/jira/browse/PIG-3558
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>  Labels: porc
> Fix For: 0.14.0
>
> Attachments: PIG-3558-1.patch, PIG-3558-2.patch, PIG-3558-3.patch, 
> PIG-3558-4.patch, PIG-3558-5.patch, PIG-3558-6.patch
>
>
> Adding LoadFunc and StoreFunc for ORC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (PIG-3558) ORC support for Pig

2014-07-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067407#comment-14067407
 ] 

Daniel Dai commented on PIG-3558:
-

There will be three compile/test dependencies: hive-exec-core.jar, 
hive-common.jar, and hive-serde.jar. None of these will be packed into 
pig-withouthadoop.jar.

We will not release Pig 0.14 with a snapshot dependency. But I'd like to commit 
the patch to trunk sooner to facilitate follow-up development.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-07-18 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067289#comment-14067289
 ] 

Dmitriy V. Ryaboy commented on PIG-3558:


Nice.

How much does this increase the weight of the pig build, and what packages does 
it pull in?

I assume this won't get pushed to trunk until hive 0.14.0-SNAPSHOT becomes 
available as a stable version?



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-16 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999169#comment-13999169
 ] 

Daniel Dai commented on PIG-3558:
-

You can. The key point here is that bytearray is not equivalent to binary. 
Instead of the semantic check used for other data types, bytearray requires a 
runtime check.
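A rough illustration of such a runtime check follows. It is only a sketch, not the code from the patch: `ByteArrayValue` is a hypothetical stand-in for Pig's DataByteArray so that the example compiles on its own, and `toBinary` is an assumed helper name.

```java
import java.util.Arrays;

public class BytearrayStoreCheck {
    // Hypothetical stand-in for Pig's DataByteArray (assumption, keeps the sketch self-contained).
    static final class ByteArrayValue {
        private final byte[] bytes;
        ByteArrayValue(byte[] bytes) { this.bytes = bytes.clone(); }
        byte[] get() { return bytes.clone(); }
    }

    // A bytearray-typed field may hold any Object, so the check is made per value at store time.
    static byte[] toBinary(Object field) {
        if (field == null) {
            return null; // nulls pass through unchanged
        }
        if (field instanceof ByteArrayValue) {
            return ((ByteArrayValue) field).get();
        }
        throw new IllegalArgumentException(
            "Cannot store " + field.getClass().getName() + " as binary; add an explicit cast");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBinary(new ByteArrayValue(new byte[] {1, 2}))));
        try {
            toBinary(Integer.valueOf(3)); // e.g. a map value whose type was never declared
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the schema alone cannot reject the second call; only inspecting the value at runtime can.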



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-13 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996845#comment-13996845
 ] 

Daniel Dai commented on PIG-3558:
-

bq. I don't see hive binary being any different than pig bytearray
Technically it is different. Pig bytearray means unknown data type. Consider 
the following script & UDF:
{code}
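import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MapGenerate extends EvalFunc<Map<String, Object>> {
    @Override
    public Map<String, Object> exec(Tuple input) throws IOException {
        // The map value is an Integer, but the declared schema does not say so.
        Map<String, Object> m = new HashMap<String, Object>();
        m.put("key", Integer.valueOf(input.size()));
        return m;
    }

    @Override
    public Schema outputSchema(Schema input) {
        // Declare an untyped map: Pig treats its values as bytearray (unknown type).
        return new Schema(new Schema.FieldSchema(null, DataType.MAP));
    }
}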
{code}
{code}
a = load '1.txt' as (a0);
b = foreach a generate a0, MapGenerate(*) as m:map[];
c = group b by m#'key';
dump c;
{code}
The group key will be of data type bytearray (since it is unknown), and the map 
key is NullableBytesWritable. NullableBytesWritable takes any Object instead of 
just DataByteArray to accommodate this case.

It is possible to map Pig bytearray to binary, but we must deal with the fact 
that the data may not be a DataByteArray.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-13 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996986#comment-13996986
 ] 

Rohini Palaniswamy commented on PIG-3558:
-

In that case, can we allow mapping Pig bytearray to binary, but throw an error 
at runtime if the value is not a DataByteArray but some other object?



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-07 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991488#comment-13991488
 ] 

Mona Chitnis commented on PIG-3558:
---

Yes, I meant the user. [~daijy] [~rohini] For storing the bytearray as binary in 
ORC, we need to add an ObjectInspector subclass, similar to how 
JodaTimeObjectInspector or PigDecimalObjectInspector handles the serde API. 
It's minor and I can include it in my patch for tests.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-06 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990889#comment-13990889
 ] 

Rohini Palaniswamy commented on PIG-3558:
-

bq. Okay. Then when storing back into OrcStorage, we'd have to do explicit cast 
on the column either at load or later.
  We should not be trying to cast into anything. Any cast should be made 
explicitly by the user.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-06 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990888#comment-13990888
 ] 

Rohini Palaniswamy commented on PIG-3558:
-

[~daijy],
   We should allow storing bytearray as binary in ORC if the user has not done an 
explicit cast. Even in AvroStorage, Pig bytearray is translated to Avro 
Type.BYTES and stored, and we don't throw an error. I don't see hive binary 
being any different than pig bytearray. From 
https://cwiki.apache.org/confluence/display/Hive/Binary+DataType+Proposal: 
"Sometimes, user is just interested in few of those columns and doesn't want 
to bother about exact type information for rest of columns. In such cases, he 
may just declare the types of those columns as binary and Hive will not try to 
interpret those columns."

On a different note, can we have the class renamed to ORCStorage instead of 
OrcStorage, as ORC is an abbreviation of Optimized Row Columnar?



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-05 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990285#comment-13990285
 ] 

Mona Chitnis commented on PIG-3558:
---

Okay. Then when storing back into OrcStorage, we'd have to do explicit cast on 
the column either at load or later.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-02 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987469#comment-13987469
 ] 

Daniel Dai commented on PIG-3558:
-

[~chitnis] Not sure if I understand correctly. Hive BINARY has a clear meaning, 
but Pig bytearray does not. It means an unknown data type, where the user has 
not explicitly declared the type. The actual data can be anything, not just 
DataByteArray. So it should be safe to convert Hive BINARY to Pig bytearray, but 
not vice versa.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-05-01 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986811#comment-13986811
 ] 

Mona Chitnis commented on PIG-3558:
---

If Hive type Binary is equated to Pig's bytearray
{code}
public static Object getPrimaryFromOrc(Object obj, PrimitiveObjectInspector poi) {
    Object result;
    switch (poi.getPrimitiveCategory()) {
    . . .
    case BINARY:
        result = new DataByteArray(((BytesWritable) obj).copyBytes());
        break;
{code}
Is there a reason we cannot typecast it back to Binary when storing back to 
ORC? I can make it work by explicitly casting that column to int, but I'd like 
to confirm it doesn't violate any Hive metadata serialization.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-04-14 Thread Lorand Bendig (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968236#comment-13968236
 ] 

Lorand Bendig commented on PIG-3558:


[~daijy], [~chitnis], thanks for pointing out the issue with direct fetch. I 
filed a patch (PIG-3888) that fixes it, so you may re-enable fetch in 
TestOrcStorage#setup(). 



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-04-10 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966108#comment-13966108
 ] 

Mona Chitnis commented on PIG-3558:
---

Thanks [~daijy]. Some tests pass now; I will continue testing this on my setup.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-04-09 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964465#comment-13964465
 ] 

Mona Chitnis commented on PIG-3558:
---

[~daijy] When I run the unit tests in TestOrcStorage, they fail with NPE at 
OrcUtils.java:69

{code}
public static Object convertOrcToPig(Object obj, ObjectInspector oi, boolean[] includedColumns) {
    Object result;
    switch (oi.getCategory()) {  // OrcUtils.java:69
{code}

because the ObjectInspector oi which should have been initialized in 
OrcStorage.setLocation() is still null.
Do you have an updated patch for passing unit tests?
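
While tracking down a failure like this, a fail-fast guard can make the cause visible at the point of use. The following is only a sketch under stated assumptions, not part of the patch: `Inspector` is a hypothetical stand-in for Hive's ObjectInspector, and `categoryOf` is an assumed helper name.

```java
import java.util.Objects;

public class InspectorGuard {
    // Hypothetical stand-in for Hive's ObjectInspector; only the one method used above is modelled.
    interface Inspector {
        String getCategory();
    }

    // Fails with a descriptive message instead of a bare NullPointerException at the switch.
    static String categoryOf(Inspector oi) {
        Objects.requireNonNull(oi, "ObjectInspector is null; was it initialized before reading?");
        return oi.getCategory();
    }

    public static void main(String[] args) {
        System.out.println(categoryOf(() -> "STRUCT"));
        try {
            categoryOf(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This does not fix the missing initialization; it only turns the failure into a message that names the uninitialized field.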



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-19 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905824#comment-13905824
 ] 

Daniel Dai commented on PIG-3558:
-

It is possible HIVE-860 will clean up the hive-exec.jar. It might strip away 
guava/thrift etc from hive-exec.jar.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904684#comment-13904684
 ] 

Daniel Dai commented on PIG-3558:
-

Please ignore my previous comment. It is actually an upgrade not downgrade. 
Sorry for the confusion.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904663#comment-13904663
 ] 

Dmitriy V. Ryaboy commented on PIG-3558:


Help me understand this. My understanding is as follows: 

Compile is the minimum required to compile the main code. Test is the minimum 
required to compile the main code plus whatever is needed to test (hence, 
"extends"). Pushing a dependency up to compile means everything, not just test, 
needs the dependency. 

Also, the bump from 0.8 to 0.12 is 6 megs worth of code. That's a pretty big 
version bump.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904641#comment-13904641
 ] 

Daniel Dai commented on PIG-3558:
-

This is actually a downgrade. A test dependency implies a compile dependency, 
plus use of the jar in unit tests.
{code}

{code}



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904627#comment-13904627
 ] 

Dmitriy V. Ryaboy commented on PIG-3558:


[~daijy] not quite:

{code}
-  conf="test->master" />
+  conf="compile->master" />
{code}



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904598#comment-13904598
 ] 

Daniel Dai commented on PIG-3558:
-

[~dvryaboy], this is only a compile-time dependency. We don't need to ship 
hive-exec.jar in the distribution. Pig already has a compile-time dependency on 
hive-exec.jar; we only need to upgrade the version. I agree it is desirable for 
Hive to clean up its dependencies, but it will be hard to make that happen in a 
short time.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904600#comment-13904600
 ] 

Daniel Dai commented on PIG-3558:
-

I am fine with unlinking it from the 0.13 release, though.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904590#comment-13904590
 ] 

Dmitriy V. Ryaboy commented on PIG-3558:


So that's a -1.

I would +1 this if it was going into piggybank.

Since this depends on unpublished changes, I'd rather we unlink it from 0.13 
release (as that would tie us to Hive's release schedule -- obviously we can't 
make a release that depends on a snapshot).



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-18 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904574#comment-13904574
 ] 

Dmitriy V. Ryaboy commented on PIG-3558:


I am pro adding ORC support in Pig, but against introducing massive 
dependencies.

According to http://mvnrepository.com/artifact/org.apache.hive/hive-exec/0.12.0 
the hive-exec jar for 0.12 is 9 megs, and hides within it specific versions of 
jackson, snappy, org.json, chunks of thrift, hadoop.io (?!), avro, commons, 
protobuf, and guava. If the ORC authors are not interested in improving their 
dependency hygiene, they have to live with the fact that their project is 
unlikely to get integrated into other projects.

This is self-inflicted jar hell. Please don't do this. When ORC cleans up their 
dependencies, let's revisit.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-14 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902277#comment-13902277
 ] 

Daniel Dai commented on PIG-3558:
-

Hive does not plan to make a separate jar for ORC unfortunately.



[jira] [Commented] (PIG-3558) ORC support for Pig

2014-02-14 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902218#comment-13902218
 ] 

Julien Le Dem commented on PIG-3558:


Is hive-exec the fat jar that assembles the runtime dependencies of hive in one 
jar?
Could we depend on the individual hive modules that we need instead?




[jira] [Commented] (PIG-3558) ORC support for Pig

2013-12-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843747#comment-13843747
 ] 

Daniel Dai commented on PIG-3558:
-

Thanks Alan for review!

Will commit once HIVE-5728 is committed and there is a Maven artifact to build 
against.



[jira] [Commented] (PIG-3558) ORC support for Pig

2013-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843632#comment-13843632
 ] 

Alan Gates commented on PIG-3558:
-

+1.



[jira] [Commented] (PIG-3558) ORC support for Pig

2013-11-01 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811752#comment-13811752
 ] 

Daniel Dai commented on PIG-3558:
-

Also, there is a binary file which cannot be included in the patch. Copy 
http://svn.apache.org/viewvc/hive/trunk/ql/src/test/resources/orc-file-11-format.orc?revision=1519868&view=co
 to test/org/apache/pig/builtin/orc. 



[jira] [Commented] (PIG-3558) ORC support for Pig

2013-11-01 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811736#comment-13811736
 ] 

Daniel Dai commented on PIG-3558:
-

The patch depends on HIVE-5728, which provides the InputFormat/OutputFormat that 
Pig needs.
