[GitHub] orc pull request #188: ORC-263: Implement column writers of compound types

2017-11-07 Thread wgtmac
GitHub user wgtmac opened a pull request:

https://github.com/apache/orc/pull/188

ORC-263: Implement column writers of compound types

Implemented ListColumnWriter, MapColumnWriter and
UnionColumnWriter. Also corresponding test cases
are added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wgtmac/orc ORC-263

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/orc/pull/188.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #188


commit c88f4cbb3fef95d3e2f02a59d57f811ea88b127a
Author: Gang Wu 
Date:   2017-11-08T04:58:56Z

ORC-263: Implement column writers of compound types

Implemented ListColumnWriter, MapColumnWriter and
UnionColumnWriter. Also corresponding test cases
are added.




---


[GitHub] orc pull request #186: ORC-187: Simplify BitFieldReader to only support sing...

2017-11-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/186


---


[GitHub] orc issue #186: ORC-187: Simplify BitFieldReader to only support single bits

2017-11-07 Thread t3rmin4t0r
Github user t3rmin4t0r commented on the issue:

https://github.com/apache/orc/pull/186
  
@asfgit merge


---


[jira] [Created] (ORC-263) Implement column writers of compound types

2017-11-07 Thread Gang Wu (JIRA)
Gang Wu created ORC-263:
---

 Summary: Implement column writers of compound types
 Key: ORC-263
 URL: https://issues.apache.org/jira/browse/ORC-263
 Project: ORC
 Issue Type: Sub-task
 Components: C++
 Reporter: Gang Wu
 Assignee: Gang Wu


The scope of this ticket is to implement column writers for list, map, and 
union types.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ORC-262) Support async prefetch in Orc reader

2017-11-07 Thread Xiening Dai (JIRA)
Xiening Dai created ORC-262:
---

 Summary: Support async prefetch in Orc reader
 Key: ORC-262
 URL: https://issues.apache.org/jira/browse/ORC-262
 Project: ORC
 Issue Type: Improvement
 Components: C++
 Reporter: Xiening Dai


Currently the RowReader::next() method reads a batch of rows and returns them 
to be processed by the runtime. The call is synchronous, meaning that the 
execution thread is blocked while the reader loads data from disk. We could 
potentially parallelize execution and data loading through async prefetch, 
using the logic described below.

In SeekableFileInputStream::Next(), we first check whether the requested data 
block has already been prefetched. If so, we simply return the buffer to the 
caller; otherwise we issue a synchronous call to read the data from the file 
stream. Regardless of how the requested block was loaded, we always issue 
another async call to prefetch the next block within the current stream.

Additionally, orc::InputStream will need a new method that performs an async 
read for a given offset and length.

According to our experiments, async prefetch can significantly reduce the IO 
wait time on a heavily loaded distributed file system. By carefully choosing 
the prefetch block size, we can maximize the overlap between runtime execution 
and data loading, and achieve a relatively high cache hit rate (~85%).
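
The prefetch logic described above could be sketched roughly as follows. This
is only an illustration under stated assumptions, not the proposed ORC
implementation: BlockSource, readAsync() and PrefetchingStream are hypothetical
names standing in for orc::InputStream, its new async read method and
SeekableFileInputStream, and std::async is used purely as a convenient way to
run the read on another thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file-backed input; NOT the orc::InputStream API.
class BlockSource {
 public:
  explicit BlockSource(std::string data) : data_(std::move(data)) {}

  uint64_t length() const { return data_.size(); }

  // Synchronous read of up to `len` bytes starting at `offset`.
  std::vector<char> read(uint64_t offset, uint64_t len) const {
    if (offset >= data_.size()) return {};
    const uint64_t n = std::min<uint64_t>(len, data_.size() - offset);
    const char* begin = data_.data() + offset;
    return std::vector<char>(begin, begin + n);
  }

  // Assumed shape of the "async read for a given offset and length".
  // The source must outlive the returned future.
  std::future<std::vector<char>> readAsync(uint64_t offset, uint64_t len) const {
    return std::async(std::launch::async,
                      [this, offset, len] { return read(offset, len); });
  }

 private:
  std::string data_;
};

// Hypothetical sequential stream that prefetches one block ahead.
class PrefetchingStream {
 public:
  PrefetchingStream(const BlockSource& source, uint64_t blockSize)
      : source_(source), blockSize_(blockSize) {
    assert(blockSize > 0);
  }

  // Returns the next block; an empty vector signals end of stream.
  std::vector<char> next() {
    std::vector<char> block;
    if (pending_.valid() && pendingOffset_ == position_) {
      block = pending_.get();  // hit: wait for (or take) the prefetched block
      ++hits_;
    } else {
      block = source_.read(position_, blockSize_);  // miss: synchronous read
      ++misses_;
    }
    position_ += block.size();

    // However the block was obtained, prefetch the following one.
    if (position_ < source_.length()) {
      pendingOffset_ = position_;
      pending_ = source_.readAsync(position_, blockSize_);
    }
    return block;
  }

  int hits() const { return hits_; }
  int misses() const { return misses_; }

 private:
  const BlockSource& source_;
  uint64_t blockSize_;
  uint64_t position_ = 0;
  uint64_t pendingOffset_ = 0;
  std::future<std::vector<char>> pending_;
  int hits_ = 0;
  int misses_ = 0;
};
```

In this sketch only the first read of a sequential scan is a miss; a real
reader would also have to discard or reuse the pending block after a seek,
which is presumably where the hit rate drops below 100%.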






[GitHub] orc issue #186: ORC-187: BitFieldReader has an unnecessary loop

2017-11-07 Thread omalley
Github user omalley commented on the issue:

https://github.com/apache/orc/pull/186
  
Since this removes the old functionality, you should also remove the old 
constructor that takes the bitWidth. With this change, someone who passed in a 
non-one width would be surprised by the result.


---


[GitHub] orc pull request #187: ORC-260 - Fix a bug in masking data for Decimal

2017-11-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/187


---


[GitHub] orc pull request #185: ORC-261: [C++] Fix installation to include all header...

2017-11-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/185


---


[GitHub] orc issue #185: ORC-261: [C++] Fix installation to include all header files

2017-11-07 Thread omalley
Github user omalley commented on the issue:

https://github.com/apache/orc/pull/185
  
+1 This looks good.


---


[GitHub] orc pull request #149: ORC-224: Implement column writers of primitive types

2017-11-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/149


---


[GitHub] orc issue #170: ORC-207: [C++] Enable users the ability to provide their own...

2017-11-07 Thread majetideepak
Github user majetideepak commented on the issue:

https://github.com/apache/orc/pull/170
  
@jcrist  Will review that PR. Thanks!


---


[GitHub] orc pull request #170: ORC-207: [C++] Enable users the ability to provide th...

2017-11-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/170


---


[GitHub] orc issue #170: ORC-207: [C++] Enable users the ability to provide their own...

2017-11-07 Thread jcrist
Github user jcrist commented on the issue:

https://github.com/apache/orc/pull/170
  
Thanks @majetideepak. If you also have a chance, it would be nice to get 
#185 reviewed and merged too. Waiting on both of these so I can use orc in a 
larger cmake project.


---