Re: hive.query.string not reflecting the current query

2013-12-11 Thread Petter von Dolwitz (Hem)
Hi,

thank you all for your replies.

I switched to using 'hive.io.filter.text', in line with Peter's reply. I also
applied the filter negotiation mechanism (HiveStoragePredicateHandler) in
my storage handler. It works very well (so far), even though the filter
negotiation mechanism is a bit limited in the expressions it allows. I'll
bring up that question in a separate thread.
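
A minimal sketch of the lookup this implies, not the actual storage handler
code: it reads 'hive.io.filter.text' and treats a missing or blank value as
"no filter pushed down". A plain java.util.Map stands in for the job
configuration a real RecordReader would be given, and FilterLookup and
pushedFilterText are hypothetical names used only for illustration. The exact
text Hive writes into the property is not shown in this thread, so the sample
value below is made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch only: a plain Map stands in for the Hadoop job configuration that a
// real RecordReader would receive. FilterLookup and pushedFilterText are
// hypothetical names, not part of Hive.
class FilterLookup {
    // Property name taken from this thread.
    static final String FILTER_TEXT = "hive.io.filter.text";

    // Returns the filter text if the property is present and non-blank,
    // otherwise an empty Optional (meaning: read everything).
    static Optional<String> pushedFilterText(Map<String, String> conf) {
        String text = conf.get(FILTER_TEXT);
        if (text == null || text.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(text);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(pushedFilterText(conf).isPresent());

        // Illustrative value only; the real format is an assumption.
        conf.put(FILTER_TEXT, "mycolumn = 'myvalue'");
        System.out.println(pushedFilterText(conf).orElse("<none>"));
    }
}
```

In a real reader the same check would presumably go through the configuration
object's own get method. Whether Hive resets the property between queries is
not covered in this thread.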

Br,
Petter




2013/12/5 Peter Marron 

> Hi,
>
> Sorry for the late reply.
>
> Maybe the property 'hive.io.filter.expr.serialized' is something that can
> help?
>
> It works for me, and it certainly works in the case where the query does
> not result in a Map/Reduce (which is something that I rely on).
>
> (If you google you should be able to find out about it.)
>
> Regards,
>
> Peter Marron
> Senior Developer, Research & Development
>
> From: Petter von Dolwitz (Hem) [mailto:petter.von.dolw...@gmail.com]
> Sent: 03 December 2013 12:46
> To: user@hive.apache.org
> Subject: hive.query.string not reflecting the current query
>
> Hi,
>
> I use hive 0.11 with a five machine cluster. I am reading the property
> hive.query.string from a custom RecordReader (used for reading external
> tables).
>
> If I first invoke a query like
>
> select * from mytable where mycolumn='myvalue';
>
> I get the correct query string in this property.
>
> If I then invoke
>
> select * from mytable limit 100;
>
> the property hive.query.string still contains the first query. Seems like
> hive uses local mode for the second query. Don't know if it is related.
>
> Does anybody know why the query string is not updated in the second case?
>
> Thanks,
>
> Petter
>
<><><><>

RE: hive.query.string not reflecting the current query

2013-12-05 Thread Peter Marron
Hi,

Sorry for the late reply.
Maybe the property 'hive.io.filter.expr.serialized' is something that can help?
It works for me, and it certainly works in the case where the query does not
result in a Map/Reduce (which is something that I rely on).

(If you google you should be able to find out about it.)

Regards,

Peter Marron
Senior Developer, Research & Development


From: Petter von Dolwitz (Hem) [mailto:petter.von.dolw...@gmail.com]
Sent: 03 December 2013 12:46
To: user@hive.apache.org
Subject: hive.query.string not reflecting the current query

Hi,

I use hive 0.11 with a five machine cluster. I am reading the property
hive.query.string from a custom RecordReader (used for reading external tables).

If I first invoke a query like

select * from mytable where mycolumn='myvalue';

I get the correct query string in this property.

If I then invoke

select * from mytable limit 100;

the property hive.query.string still contains the first query. Seems like hive
uses local mode for the second query. Don't know if it is related.

Does anybody know why the query string is not updated in the second case?

Thanks,
Petter
<><><><>

Re: hive.query.string not reflecting the current query

2013-12-03 Thread Navis류승우
Looks like a bug. I've filed it as
https://issues.apache.org/jira/browse/HIVE-5935.


2013/12/4 Adam Kawa 

> Maybe you can parse the output of the EXPLAIN statement applied to your query
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
> or look for another configuration property (e.g. one saying that the number
> of map and reduce tasks is equal to 0, or something).
>
>
> 2013/12/3 Petter von Dolwitz (Hem) 
>
>> Yes, it seems related. I think the query string is not refreshed when
>> hive decides to run without a map reduce job. Problem is that I try to
>> interact with the query string to apply an early filter in the record
>> reader. Any other known way to detect that a map reduce job is not spawned
>> so that I can work around this issue?
>>
>> /Petter
>>
>> On Tuesday, 3 December 2013, Adam Kawa wrote:
>>
>>> Hmmm?
>>>
>>> Maybe it is related to the fact that a query:
>>> > select * from mytable limit 100;
>>> does not start any MapReduce job. It starts a reading operation from
>>> HDFS (and a communication with MetaStore to know what is the schema and how
>>> to parse the data using InputFormat and SerDe).
>>>
>>> For example, if you run a query that has the same functionality (i.e. to
>>> show all content of the table by specifying all columns in SELECT)
>>> > select column1, column2, ... columnN from mytable limit 100;
>>> then a map-only job will be started and maybe (?) hive.query.string
>>> will contain this query.
>>>
>>>
>>> 2013/12/3 Petter von Dolwitz (Hem)
>>>
>>>> Hi,
>>>>
>>>> I use hive 0.11 with a five machine cluster. I am reading the property
>>>> hive.query.string from a custom RecordReader (used for reading external
>>>> tables).
>>>>
>>>> If I first invoke a query like
>>>>
>>>> select * from mytable where mycolumn='myvalue';
>>>>
>>>> I get the correct query string in this property.
>>>>
>>>> If I then invoke
>>>>
>>>> select * from mytable limit 100;
>>>>
>>>> the property hive.query.string still contains the first query. Seems
>>>> like hive uses local mode for the second query. Don't know if it is
>>>> related.
>>>>
>>>> Does anybody know why the query string is not updated in the second case?
>>>>
>>>> Thanks,
>>>> Petter
>>>>
>>>
>>>
>


Re: hive.query.string not reflecting the current query

2013-12-03 Thread Adam Kawa
Maybe you can parse the output of the EXPLAIN statement applied to your query
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain or
look for another configuration property (e.g. one saying that the number of map
and reduce tasks is equal to 0, or something).


2013/12/3 Petter von Dolwitz (Hem) 

> Yes, it seems related. I think the query string is not refreshed when hive
> decides to run without a map reduce job. Problem is that I try to interact
> with the query string to apply an early filter in the record reader. Any
> other known way to detect that a map reduce job is not spawned so that I
> can work around this issue?
>
> /Petter
>
> On Tuesday, 3 December 2013, Adam Kawa wrote:
>
>> Hmmm?
>>
>> Maybe it is related to the fact that a query:
>> > select * from mytable limit 100;
>> does not start any MapReduce job. It starts a reading operation from
>> HDFS (and a communication with MetaStore to know what is the schema and how
>> to parse the data using InputFormat and SerDe).
>>
>> For example, if you run a query that has the same functionality (i.e. to
>> show all content of the table by specifying all columns in SELECT)
>> > select column1, column2, ... columnN from mytable limit 100;
>> then a map-only job will be started and maybe (?) hive.query.string will
>> contain this query.
>>
>>
>> 2013/12/3 Petter von Dolwitz (Hem) 
>>
>>> Hi,
>>>
>>> I use hive 0.11 with a five machine cluster. I am reading the property
>>> hive.query.string from a custom RecordReader (used for reading external
>>> tables).
>>>
>>> If I first invoke a query like
>>>
>>> select * from mytable where mycolumn='myvalue';
>>>
>>> I get the correct query string in this property.
>>>
>>> If I then invoke
>>>
>>> select * from mytable limit 100;
>>>
>>> the property hive.query.string still contains the first query. Seems
>>> like hive uses local mode for the second query. Don't know if it is related.
>>>
>>> Does anybody know why the query string is not updated in the second case?
>>>
>>> Thanks,
>>> Petter
>>>
>>
>>


Re: hive.query.string not reflecting the current query

2013-12-03 Thread Petter von Dolwitz (Hem)
Yes, it seems related. I think the query string is not refreshed when hive
decides to run without a map reduce job. Problem is that I try to interact
with the query string to apply an early filter in the record reader. Any
other known way to detect that a map reduce job is not spawned so that I
can work around this issue?
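
To make the goal concrete, here is a rough sketch of what an early filter in a
reader could look like, assuming the filter condition has already been
obtained somehow. It is plain Java rather than Hive code: EarlyFilterReader
and readAll are hypothetical names, and an Iterator stands in for the real
record reader interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Sketch only: wraps any row source and drops non-matching rows before they
// are handed on. EarlyFilterReader is a hypothetical name; a real Hive reader
// would implement the Hadoop RecordReader interface instead of Iterator.
class EarlyFilterReader<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<T> filter;
    private T nextRow;
    private boolean ready;

    EarlyFilterReader(Iterator<T> source, Predicate<T> filter) {
        this.source = source;
        this.filter = filter;
    }

    // Advances the underlying source until a matching row is buffered.
    @Override
    public boolean hasNext() {
        while (!ready && source.hasNext()) {
            T candidate = source.next();
            if (filter.test(candidate)) {
                nextRow = candidate;
                ready = true;
            }
        }
        return ready;
    }

    // Hands out the buffered row and clears the buffer.
    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T row = nextRow;
        nextRow = null;
        return row;
    }

    // Convenience helper for the example: drain the filtered rows into a list.
    static <T> List<T> readAll(Iterator<T> source, Predicate<T> filter) {
        List<T> out = new ArrayList<>();
        new EarlyFilterReader<>(source, filter).forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("myvalue", "other", "myvalue");
        System.out.println(readAll(rows.iterator(), "myvalue"::equals));
    }
}
```

The point of filtering at this level is that rejected rows never leave the
reader, which is only safe if the condition really belongs to the query being
run.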

/Petter

On Tuesday, 3 December 2013, Adam Kawa wrote:

> Hmmm?
>
> Maybe it is related to the fact that a query:
> > select * from mytable limit 100;
> does not start any MapReduce job. It starts a reading operation from
> HDFS (and a communication with MetaStore to know what is the schema and how
> to parse the data using InputFormat and SerDe).
>
> For example, if you run a query that has the same functionality (i.e. to
> show all content of the table by specifying all columns in SELECT)
> > select column1, column2, ... columnN from mytable limit 100;
> then a map-only job will be started and maybe (?) hive.query.string will
> contain this query.
>
>
> 2013/12/3 Petter von Dolwitz (Hem) 
>
>> Hi,
>>
>> I use hive 0.11 with a five machine cluster. I am reading the property
>> hive.query.string from a custom RecordReader (used for reading external
>> tables).
>>
>> If I first invoke a query like
>>
>> select * from mytable where mycolumn='myvalue';
>>
>> I get the correct query string in this property.
>>
>> If I then invoke
>>
>> select * from mytable limit 100;
>>
>> the property hive.query.string still contains the first query. Seems like
>> hive uses local mode for the second query. Don't know if it is related.
>>
>> Does anybody know why the query string is not updated in the second case?
>>
>> Thanks,
>> Petter
>>
>
>


Re: hive.query.string not reflecting the current query

2013-12-03 Thread Adam Kawa
Hmmm?

Maybe it is related to the fact that a query:
> select * from mytable limit 100;
does not start any MapReduce job. It starts a reading operation from
HDFS (and a communication with MetaStore to know what is the schema and how
to parse the data using InputFormat and SerDe).

For example, if you run a query that has the same functionality (i.e. to
show all content of the table by specifying all columns in SELECT)
> select column1, column2, ... columnN from mytable limit 100;
then a map-only job will be started and maybe (?) hive.query.string will
contain this query.


2013/12/3 Petter von Dolwitz (Hem) 

> Hi,
>
> I use hive 0.11 with a five machine cluster. I am reading the property
> hive.query.string from a custom RecordReader (used for reading external
> tables).
>
> If I first invoke a query like
>
> select * from mytable where mycolumn='myvalue';
>
> I get the correct query string in this property.
>
> If I then invoke
>
> select * from mytable limit 100;
>
> the property hive.query.string still contains the first query. Seems like
> hive uses local mode for the second query. Don't know if it is related.
>
> Does anybody know why the query string is not updated in the second case?
>
> Thanks,
> Petter
>