Yes, it seems related. I think the query string is not refreshed when Hive
decides to run a query without a MapReduce job. The problem is that I read
the query string to apply an early filter in the record reader. Is there any
other known way to detect that no MapReduce job is spawned, so that I
can work around this issue?
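
A minimal sketch of one possible workaround, not a confirmed Hive mechanism. It assumes that only a task launched by MapReduce has the Hadoop property mapred.task.id set in its configuration, and that a client-side fetch does not; this has not been verified against Hive 0.11. A plain Map stands in for the job configuration so the example is self-contained, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EarlyFilterSketch {
    // Matches one simple equality predicate such as: where mycolumn='myvalue'
    private static final Pattern EQ_FILTER =
        Pattern.compile("(?i)\\bwhere\\s+(\\w+)\\s*=\\s*'([^']*)'");

    // Assumption: mapred.task.id is present only inside a MapReduce task.
    static boolean runsInsideMapReduceTask(Map<String, String> conf) {
        String taskId = conf.get("mapred.task.id");
        return taskId != null && !taskId.isEmpty();
    }

    // Returns {column, value}, or empty when the query string should not be trusted
    // or does not contain a predicate this sketch understands.
    static Optional<String[]> earlyFilter(Map<String, String> conf) {
        if (!runsInsideMapReduceTask(conf)) {
            return Optional.empty(); // hive.query.string may be stale: read everything
        }
        String query = conf.get("hive.query.string");
        if (query == null) {
            return Optional.empty();
        }
        Matcher m = EQ_FILTER.matcher(query);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {m.group(1), m.group(2)});
    }
}
```

In a real RecordReader the same checks would be made against the JobConf passed to the reader. Returning no filter is the safe fallback here, since Hive still applies the WHERE clause itself and the early filter is only an optimization.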

/Petter

On Tuesday, 3 December 2013, Adam Kawa wrote:

> Hmmm?
>
> Maybe it is related to the fact that a query like
> > select * from mytable limit 100;
> does not start any MapReduce job. It only starts a read operation from
> HDFS (and communicates with the MetaStore to learn the schema and how
> to parse the data using the InputFormat and SerDe).
>
> For example, if you run a query that has the same functionality (i.e.
> shows all content of the table by specifying all columns in SELECT)
> > select column1, column2, ... columnN from mytable limit 100;
> then a map-only job will be started and maybe (?) hive.query.string will
> contain this query.
>
>
> 2013/12/3 Petter von Dolwitz (Hem) <petter.von.dolw...@gmail.com>
>
>> Hi,
>>
>> I use Hive 0.11 on a five-machine cluster. I am reading the property
>> hive.query.string from a custom RecordReader (used for reading external
>> tables).
>>
>> If I first invoke a query like
>>
>> select * from mytable where mycolumn='myvalue';
>>
>> I get the correct query string in this property.
>>
>> If I then invoke
>>
>> select * from mytable limit 100;
>>
>> the property hive.query.string still contains the first query. It seems
>> that Hive uses local mode for the second query. I don't know if it is related.
>>
>> Does anybody know why the query string is not updated in the second case?
>>
>> Thanks,
>> Petter
>>
>
>
