[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

concretevitamin Wed, 16 Jul 2014 10:22:41 -0700

Github user concretevitamin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1439#discussion_r15013611
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
    @@ -241,4 +252,37 @@ private[hive] object HadoopTableReader {
         val bufferSize = System.getProperty("spark.buffer.size", "65536")
         jobConf.set("io.file.buffer.size", bufferSize)
       }
    +
    +  /**
    +   * Transform the raw data(Writable object) into the Row object for an iterable input
    +   * @param iter Iterable input which represented as Writable object
    +   * @param deserializer Deserializer associated with the input writable object
    +   * @param attrs Represents the row attribute names and its zero-based position in the MutableRow
    +   * @param row reusable MutableRow object
    +   * 
    +   * @return Iterable Row object that transformed from the given iterable input.
    +   */
    +  def fillObject(iter: Iterator[Writable], deserializer: Deserializer, 
    +      attrs: Seq[(Attribute, Int)], row: GenericMutableRow): Iterator[Row] = {
    +    val soi = deserializer.getObjectInspector().asInstanceOf[StructObjectInspector]
    +    // get the field references according to the attributes(output of the reader) required
    +    val fieldRefs = attrs.map { case (attr, idx) => (soi.getStructFieldRef(attr.name), idx) }
    +      
    +    // Map each tuple to a row object
    +    iter.map { value =>
    +      val raw = deserializer.deserialize(value)
    +      var idx = 0;
    +      while(idx < fieldRefs.length) {
    --- End diff --
    
    nit: add a space after `while`, i.e. `while (idx < fieldRefs.length) {`
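
For illustration, here is a minimal, self-contained sketch of the loop with the suggested spacing. It is not the PR's code: the diff excerpt ends at the `while` line, so the loop body here is an assumption, and `FillObjectSketch`, `fillRows`, and the plain `Array[Any]` rows are hypothetical stand-ins for `fillObject`, `Writable`, `Deserializer`, and `GenericMutableRow`, so that the example runs without Hive or Spark on the classpath.

```scala
// Hypothetical stand-in for the pattern in the diff: copy selected fields of
// each input record into one reusable mutable row, using an index-based loop.
object FillObjectSketch {
  def fillRows(
      iter: Iterator[Array[Any]],
      fieldRefs: Seq[(Int, Int)],    // (source field index, target row position)
      row: Array[Any]): Iterator[Array[Any]] = {
    val refs = fieldRefs.toArray     // array gives cheap indexed access in the loop
    iter.map { raw =>
      var idx = 0
      while (idx < refs.length) {    // note the space between `while` and `(`
        val (src, dst) = refs(idx)
        row(dst) = raw(src)          // assumed body: copy one field into the row
        idx += 1
      }
      row                            // the same mutable row is returned every time
    }
  }

  def main(args: Array[String]): Unit = {
    val input = Iterator(Array[Any]("a", 1, true), Array[Any]("b", 2, false))
    val row = new Array[Any](2)
    // Each row is printed before the iterator advances and overwrites it.
    fillRows(input, Seq((2, 0), (0, 1)), row).foreach(r => println(r.mkString(",")))
    // prints "true,a" then "false,b"
  }
}
```

Because the returned iterator hands back the same mutable row on every step, a caller in this sketch must consume or copy each row before advancing.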


