Re: A quick question on the intent for some of the methods in AbstractRecordReader

Abdel Hakim Deneche Tue, 08 Sep 2015 10:34:00 -0700

Some answers (to the best of my knowledge) inline:

On Sun, Sep 6, 2015 at 8:41 PM, Edmon Begoli <[email protected]> wrote:


> In AbstractRecordReader:
>
> What is the intent of the following methods? (My comments or questions
> follow each #. I will try to add the answers as javadoc.)
>
> # is this a collection of columns specified in the query projection
> (select columns[1],...[n])?
>
>   protected final void setColumns(Collection<SchemaPath> projected) {
>     assert Preconditions.checkNotNull(projected, COL_NULL_ERROR).size() > 0 : COL_EMPTY_ERROR;
>     if (projected instanceof ColumnList) {
>       final ColumnList columns = ColumnList.class.cast(projected);
>       isSkipQuery = columns.getMode() == ColumnList.Mode.SKIP_ALL;
>     }
>     isStarQuery = isStarQuery(projected);
>     columns = transformColumns(projected);
>   }
>

It receives the collection of columns that the reader should read. This
includes every column referenced in the query.


>   # does this return all columns in the storage, or only those specified
> in the projection?
>   protected Collection<SchemaPath> getColumns() {
>

It returns the columns that this reader will read, as specified by
setColumns().


>  # what is the intention of this?
>   protected Collection<SchemaPath> transformColumns(Collection<SchemaPath> projected) {
>

This can be used to transform the projected columns from Drill's SchemaPath
to the underlying scanner's representation. HBaseRecordReader and
MongoRecordReader override the default implementation.
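
To illustrate the override pattern, here is a minimal sketch. It is not
Drill's code: BaseReader and TopLevelFieldReader are hypothetical stand-ins,
plain strings such as "a.b" stand in for SchemaPath, and the rule of
collapsing nested paths to their top-level field is only an assumed example
of what a scanner might need.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

class TransformColumnsSketch {

  // Hypothetical stand-in for AbstractRecordReader. Plain strings such as
  // "a.b" stand in for Drill's SchemaPath.
  static class BaseReader {
    private Collection<String> columns;

    protected final void setColumns(Collection<String> projected) {
      // The subclass hook decides what is actually stored and later read.
      columns = transformColumns(projected);
    }

    protected Collection<String> getColumns() {
      return columns;
    }

    // Default behaviour in this sketch: pass the projection through unchanged.
    protected Collection<String> transformColumns(Collection<String> projected) {
      return projected;
    }
  }

  // Hypothetical reader whose scanner can only fetch whole top-level fields,
  // so "a.b" and "a.c" both become a single request for "a".
  static class TopLevelFieldReader extends BaseReader {
    @Override
    protected Collection<String> transformColumns(Collection<String> projected) {
      Set<String> roots = new LinkedHashSet<>();
      for (String path : projected) {
        int dot = path.indexOf('.');
        roots.add(dot < 0 ? path : path.substring(0, dot));
      }
      return roots;
    }
  }

  // Helper for the demo: run a projection through the overriding reader.
  static Collection<String> columnsRead(String... projected) {
    TopLevelFieldReader reader = new TopLevelFieldReader();
    reader.setColumns(Arrays.asList(projected));
    return reader.getColumns();
  }

  public static void main(String[] args) {
    System.out.println(columnsRead("a.b", "a.c", "d")); // prints [a, d]
  }
}
```

The point of the sketch is only that getColumns() returns whatever
transformColumns() produced, not necessarily the raw projection.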


>  # where is this actually determined?
>   protected boolean isStarQuery() {
>     return isStarQuery;
>   }
>

In the same class: look at setColumns(), which assigns the flag from the
static isStarQuery(projected) helper.
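
A simplified sketch of that flow, without the Guava calls used in the quoted
code. The Reader class here is a hypothetical stand-in, strings stand in for
SchemaPath, and representing the star column as "*" is an assumption made
for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Objects;

class StarQuerySketch {

  // Assumption for this sketch: the star column is the plain string "*".
  static final String STAR_COLUMN = "*";

  // Hypothetical stand-in for AbstractRecordReader.
  static class Reader {
    private boolean isStarQuery;
    private Collection<String> columns;

    void setColumns(Collection<String> projected) {
      Objects.requireNonNull(projected, "projected columns must not be null");
      if (projected.isEmpty()) {
        throw new IllegalArgumentException("projected columns must not be empty");
      }
      // The flag is computed once here; the getter below only reports it.
      isStarQuery = StarQuerySketch.isStarQuery(projected);
      columns = projected;
    }

    boolean isStarQuery() {
      return isStarQuery;
    }

    Collection<String> getColumns() {
      return columns;
    }
  }

  // True if any projected column is the star column.
  static boolean isStarQuery(Collection<String> projected) {
    for (String path : projected) {
      if (STAR_COLUMN.equals(path)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Reader star = new Reader();
    star.setColumns(Arrays.asList("*"));
    System.out.println(star.isStarQuery()); // prints true

    Reader named = new Reader();
    named.setColumns(Arrays.asList("a", "b"));
    System.out.println(named.isStarQuery()); // prints false
  }
}
```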


>
> # where is this set? Is it only set inside the setColumns(...) ?
>   /**
>    * Returns true if reader should skip all of the columns, reporting number of records only.
>    * Handling of a skip query is storage plugin-specific.
>    */
>   protected boolean isSkipQuery() {
>     return isSkipQuery;
>   }
>

Yes, it is only set inside setColumns().


>
>   # what exactly is a schema path and what does it look like in a query?
> How is it related to a star query?
>   public static boolean isStarQuery(Collection<SchemaPath> projected) {
>     return Iterables.tryFind(Preconditions.checkNotNull(projected, COL_NULL_ERROR), new Predicate<SchemaPath>() {
>       @Override
>       public boolean apply(SchemaPath path) {
>         return Preconditions.checkNotNull(path).equals(STAR_COLUMN);
>       }
>     }).isPresent();
>   }
>
>   # what is the allocation? what drives it? are there any limits or options
> for "lazy" allocation?
>   @Override
>   public void allocate(Map<Key, ValueVector> vectorMap) throws OutOfMemoryException {
>
>
> Thank you,
> Edmon
>



-- 

Abdelhakim Deneche

Software Engineer

