[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12706103#action_12706103
]
Ashish Thusoo commented on HIVE-352:
Very cool contribution Yongqiang!!
Make Hive
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704617#action_12704617
]
Zheng Shao commented on HIVE-352:
-
hive-352-2009-4-30-4.patch:
Thanks Yongqiang. I tried it
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704765#action_12704765
]
Zheng Shao commented on HIVE-352:
-
Writer: how do you pass the column number from Hive to the
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704807#action_12704807
]
Zheng Shao commented on HIVE-352:
-
hive-352-2009-5-1-3.patch
Can you remove the extra
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704329#action_12704329
]
Zheng Shao commented on HIVE-352:
-
hive-352-2009-4-30-2.patch
1. It seems you compiled
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12704397#action_12704397
]
Zheng Shao commented on HIVE-352:
-
hive-352-2009-4-30-3.patch
It seems there is a bug - only
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703606#action_12703606
]
Zheng Shao commented on HIVE-352:
-
Nice work Yongqiang!
I totally agree with your analysis
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703616#action_12703616
]
Zheng Shao commented on HIVE-352:
-
Good news: A test on some of our internal data shows that
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703647#action_12703647
]
Ashish Thusoo commented on HIVE-352:
That is very encouraging from the storage
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703649#action_12703649
]
Joydeep Sen Sarma commented on HIVE-352:
What does read performance look like?
Make
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703980#action_12703980
]
He Yongqiang commented on HIVE-352:
---
Zheng, can you post your profiling results?
I did a
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703993#action_12703993
]
Zheng Shao commented on HIVE-352:
-
The following numbers are all for 128MB gzip compressed
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702267#action_12702267
]
Zheng Shao commented on HIVE-352:
-
@Yongqiang,
The reason that native codec matters more for
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701829#action_12701829
]
Zheng Shao commented on HIVE-352:
-
The numbers look much more reasonable than before. 1.7s to read
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701837#action_12701837
]
He Yongqiang commented on HIVE-352:
---
Thanks, Zheng.
0. Did you try that with hadoop 0.17.0?
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701843#action_12701843
]
He Yongqiang commented on HIVE-352:
---
More explanation of the formula used:
{noformat}
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701876#action_12701876
]
Zheng Shao commented on HIVE-352:
-
Running Yongqiang's tests with hadoop native library,
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701930#action_12701930
]
Ashish Thusoo commented on HIVE-352:
Can we also get some numbers on the amount of memory
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12702146#action_12702146
]
He Yongqiang commented on HIVE-352:
---
Can we also get some numbers on the amount of memory
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700764#action_12700764
]
He Yongqiang commented on HIVE-352:
---
While testing RCFile's read performance, I noticed
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700765#action_12700765
]
He Yongqiang commented on HIVE-352:
---
More explanation of the sharp decrease in read performance:
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700941#action_12700941
]
Zheng Shao commented on HIVE-352:
-
I agree we should do 1 and 2, but I don't feel 3 is worth
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700557#action_12700557
]
Zheng Shao commented on HIVE-352:
-
I am surprised that RCFile is at least 2 times faster than
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700542#action_12700542
]
Zheng Shao commented on HIVE-352:
-
2 major approaches for the RCFileFormat to work are:
1.
Agreed.
Can we have both?
1 is absolutely better for high selectivity filter clauses. With 2, we can
skip loading unnecessary (compressed) columns into memory.
I ran a simple RCFile performance test on my local single machine. It
seems RCFile performs much better at reading than block-compressed
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12700098#action_12700098
]
Zheng Shao commented on HIVE-352:
-
hive-352-2009-4-17.patch:
Very nice job!
2 more tests to
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12699046#action_12699046
]
He Yongqiang commented on HIVE-352:
---
More tests are needed, especially on HiveOutputFormat.
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12699050#action_12699050
]
Zheng Shao commented on HIVE-352:
-
Please try a simple test for writing/reading using the new
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12693664#action_12693664
]
Zheng Shao commented on HIVE-352:
-
I haven't looked through it completely yet.
Some initial
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12693665#action_12693665
]
Zheng Shao commented on HIVE-352:
-
@Raghu: Thanks for the references. For this issue, we are
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12689492#action_12689492
]
He Yongqiang commented on HIVE-352:
---
{quote}
impose our own structure on the sequencefile
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12689705#action_12689705
]
Raghotham Murthy commented on HIVE-352:
---
Here are a few insightful articles about using
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12689794#action_12689794
]
He Yongqiang commented on HIVE-352:
---
Thanks, Raghotham Murthy.
Besides these two posts,
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12689151#action_12689151
]
He Yongqiang commented on HIVE-352:
---
One problem with this RCFile is that it needs to know
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12689221#action_12689221
]
Zheng Shao commented on HIVE-352:
-
@Yongqiang: The reason that we do that lazy operation is
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12688606#action_12688606
]
He Yongqiang commented on HIVE-352:
---
Thank you for the advice, Joydeep.
Yeah, I am
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12688611#action_12688611
]
He Yongqiang commented on HIVE-352:
---
So I guess I may need to discard SequenceFile,
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683783#action_12683783
]
He Yongqiang commented on HIVE-352:
---
Thanks, Joydeep and Zheng. The advice is really
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683348#action_12683348
]
Joydeep Sen Sarma commented on HIVE-352:
B2.2 is easier to implement, because we
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683353#action_12683353
]
Zheng Shao commented on HIVE-352:
-
Let's do B2.2 first. I guess there will need to be some
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683293#action_12683293
]
Zheng Shao commented on HIVE-352:
-
Hi Yongqiang,
Sorry for jumping on this issue late.
Let
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12682716#action_12682716
]
Joydeep Sen Sarma commented on HIVE-352:
Thanks for taking this on. This could be
[
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12682740#action_12682740
]
He Yongqiang commented on HIVE-352:
---
Thanks, Joydeep Sen Sarma. Your feedback is really