[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-05-05 Thread Ashish Thusoo (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706103#action_12706103 ] Ashish Thusoo commented on HIVE-352: Very cool contribution Yongqiang!! Make Hive

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-30 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704617#action_12704617 ] Zheng Shao commented on HIVE-352: - hive-352-2009-4-30-4.patch: Thanks Yongqiang. I tried it

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-30 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704765#action_12704765 ] Zheng Shao commented on HIVE-352: - Writer: how do you pass the column number from Hive to the

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-30 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704807#action_12704807 ] Zheng Shao commented on HIVE-352: - hive-352-2009-5-1-3.patch Can you remove the extra

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-29 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704329#action_12704329 ] Zheng Shao commented on HIVE-352: - hive-352-2009-4-30-2.patch 1. It seems you compiled

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-29 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704397#action_12704397 ] Zheng Shao commented on HIVE-352: - hive-352-2009-4-30-3.patch It seems there is a bug - only

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703606#action_12703606 ] Zheng Shao commented on HIVE-352: - Nice work Yongqiang! I totally agree with your analysis

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703616#action_12703616 ] Zheng Shao commented on HIVE-352: - Good news: A test on some of our internal data shows that

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread Ashish Thusoo (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703647#action_12703647 ] Ashish Thusoo commented on HIVE-352: That is very encouraging from the storage

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread Joydeep Sen Sarma (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703649#action_12703649 ] Joydeep Sen Sarma commented on HIVE-352: How does read performance look? Make

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703980#action_12703980 ] He Yongqiang commented on HIVE-352: --- Zheng, can you post your profiling results? I did a

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-28 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703993#action_12703993 ] Zheng Shao commented on HIVE-352: - The following numbers are all for 128MB gzip compressed

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-24 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702267#action_12702267 ] Zheng Shao commented on HIVE-352: - @Yongqiang, The reason that native codec matters more for

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701829#action_12701829 ] Zheng Shao commented on HIVE-352: - The numbers look much more reasonable than before. 1.7s to read

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701837#action_12701837 ] He Yongqiang commented on HIVE-352: --- Thanks, Zheng. 0. Did you try that with hadoop 0.17.0?

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701843#action_12701843 ] He Yongqiang commented on HIVE-352: --- More explanation of the formula used: {noformat}

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701876#action_12701876 ] Zheng Shao commented on HIVE-352: - Running Yongqiang's tests with hadoop native library,

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread Ashish Thusoo (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701930#action_12701930 ] Ashish Thusoo commented on HIVE-352: Can we also get some numbers on the amount of memory

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-23 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702146#action_12702146 ] He Yongqiang commented on HIVE-352: --- Can we also get some numbers on the amount of memory

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-20 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700764#action_12700764 ] He Yongqiang commented on HIVE-352: --- When testing RCFile's read performance, I noticed

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-20 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700765#action_12700765 ] He Yongqiang commented on HIVE-352: --- More explanation of the sharp decrease in read performance:

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-20 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700941#action_12700941 ] Zheng Shao commented on HIVE-352: - I agree we should do 1 and 2, but I don't feel 3 is worth

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-19 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700557#action_12700557 ] Zheng Shao commented on HIVE-352: - I am surprised that RCFile is at least 2 times faster than

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-18 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700542#action_12700542 ] Zheng Shao commented on HIVE-352: - 2 major approaches for the RCFileFormat to work are: 1.

Re: [jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-18 Thread He Yongqiang
Agreed. Can we have both? 1 is absolutely better for high-selectivity filter clauses. With 2, we can skip loading unnecessary (compressed) columns into memory. I have run a simple RCFile performance test on my local single machine. It seems RCFile performs much better at reading than block-compressed

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-17 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700098#action_12700098 ] Zheng Shao commented on HIVE-352: - hive-352-2009-4-17.patch: Very nice job! 2 more tests to

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-14 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699046#action_12699046 ] He Yongqiang commented on HIVE-352: --- More tests are needed, especially on HiveOutputFormat.

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-04-14 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699050#action_12699050 ] Zheng Shao commented on HIVE-352: - Please try a simple test for writing/reading using the new

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-29 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693664#action_12693664 ] Zheng Shao commented on HIVE-352: - Haven't looked through it completely yet. Some initial

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-29 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693665#action_12693665 ] Zheng Shao commented on HIVE-352: - @Raghu: Thanks for the references. For this issue, we are

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-26 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689492#action_12689492 ] He Yongqiang commented on HIVE-352: --- {quote} impose our own structure on the sequencefile

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-26 Thread Raghotham Murthy (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689705#action_12689705 ] Raghotham Murthy commented on HIVE-352: --- Here are a few insightful articles about using

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-26 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689794#action_12689794 ] He Yongqiang commented on HIVE-352: --- Thanks, Raghotham Murthy. Besides these two posts,

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-25 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689151#action_12689151 ] He Yongqiang commented on HIVE-352: --- One problem with this RCFile is that it needs to know

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-25 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689221#action_12689221 ] Zheng Shao commented on HIVE-352: - @Yongqiang: The reason that we do that lazy operation is

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-24 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688606#action_12688606 ] He Yongqiang commented on HIVE-352: --- Thank you for the advice, Joydeep. Yeah, I am

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-24 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688611#action_12688611 ] He Yongqiang commented on HIVE-352: --- By So I guess I may need to discard SequenceFile,

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-20 Thread He Yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683783#action_12683783 ] He Yongqiang commented on HIVE-352: --- Thanks, Joydeep and Zheng. The advice is really

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-19 Thread Joydeep Sen Sarma (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683348#action_12683348 ] Joydeep Sen Sarma commented on HIVE-352: B2.2 is easier to implement, because we

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-19 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683353#action_12683353 ] Zheng Shao commented on HIVE-352: - Let's do B2.2 first. I guess there will need to be some

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-18 Thread Zheng Shao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683293#action_12683293 ] Zheng Shao commented on HIVE-352: - Hi Yongqiang, Sorry for jumping on this issue late. Let

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-17 Thread Joydeep Sen Sarma (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682716#action_12682716 ] Joydeep Sen Sarma commented on HIVE-352: Thanks for taking this on. This could be

[jira] Commented: (HIVE-352) Make Hive support column based storage

2009-03-17 Thread he yongqiang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682740#action_12682740 ] he yongqiang commented on HIVE-352: --- Thanks, Joydeep Sen Sarma. Your feedback is really