Re: hive.query.string not reflecting the current query

2013-12-11 Thread Petter von Dolwitz (Hem)
Hi,

thank you all for your replies.

I switched to using 'hive.io.filter.text', in line with Peter's reply. I also
applied the filter negotiation mechanism (HiveStoragePredicateHandler) in
my storage handler. It works very well (so far), even though the filter
negotiation mechanism is a bit limited in the expressions it allows. I'll
bring up that question in a separate thread.

Br,
Petter
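
For readers following along, here is a minimal sketch of the idea above: look
up the pushed-down filter in the job configuration rather than parsing
hive.query.string. The two property names are the ones quoted in this thread.
Everything else is an assumption made for illustration: a plain Map stands in
for the Hadoop job configuration so the example has no dependencies, and the
class name, method names and sample filter value are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Sketch only: a Map stands in for the Hadoop job configuration (assumption),
 * so the lookup logic can be shown without Hadoop or Hive on the classpath.
 */
final class PushedFilterLookup {
    // Property names as quoted in this thread.
    static final String FILTER_TEXT = "hive.io.filter.text";
    static final String FILTER_EXPR = "hive.io.filter.expr.serialized";

    /** Returns the trimmed filter text, or empty if no filter text is set. */
    static Optional<String> filterText(Map<String, String> conf) {
        String text = conf.get(FILTER_TEXT);
        if (text == null || text.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(text.trim());
    }

    /** True when a serialized filter expression is present in the configuration. */
    static boolean hasSerializedFilter(Map<String, String> conf) {
        String expr = conf.get(FILTER_EXPR);
        return expr != null && !expr.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Hypothetical value; the exact text Hive writes may look different.
        conf.put(FILTER_TEXT, "(mycolumn = 'myvalue')");
        System.out.println(filterText(conf).orElse("<no filter, read everything>"));
        System.out.println(filterText(new HashMap<>()).orElse("<no filter, read everything>"));
    }
}
```

In a real RecordReader the same lookups would presumably go through the
configuration object handed to the reader (for example its get(String)
method) instead of a Map. Treating a missing property as "no filter" keeps the
reader correct for queries without a WHERE clause.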





RE: hive.query.string not reflecting the current query

2013-12-05 Thread Peter Marron
Hi,

Sorry for the late reply.
Maybe the property 'hive.io.filter.expr.serialized' is something that can help?
It works for me, and it certainly works in the case where the query does not
result in a Map/Reduce (which is something that I rely on).

(If you google you should be able to find out about it.)

Regards,

Peter Marron


hive.query.string not reflecting the current query

2013-12-03 Thread Petter von Dolwitz (Hem)
Hi,

I use Hive 0.11 on a five-machine cluster. I am reading the property
hive.query.string from a custom RecordReader (used for reading external
tables).

If I first invoke a query like

select * from mytable where mycolumn='myvalue';

I get the correct query string in this property.

If I then invoke

select * from mytable limit 100;

the property hive.query.string still contains the first query. It seems like
Hive uses local mode for the second query. I don't know if that is related.

Does anybody know why the query string is not updated in the second case?

Thanks,
Petter


Re: hive.query.string not reflecting the current query

2013-12-03 Thread Adam Kawa
Hmmm?

Maybe it is related to the fact that a query like
 select * from mytable limit 100;
does not start any MapReduce job. It just starts a read operation from
HDFS (and communicates with the MetaStore to learn the schema and how
to parse the data using the InputFormat and SerDe).

For example, if you run a query that has the same functionality (i.e. one
that shows all content of the table by specifying all columns in SELECT)
 select column1, column2, ... columnN from mytable limit 100;
then a map-only job will be started and maybe (?) hive.query.string will
contain this query.





Re: hive.query.string not reflecting the current query

2013-12-03 Thread Petter von Dolwitz (Hem)
Yes, it seems related. I think the query string is not refreshed when Hive
decides to run without a MapReduce job. The problem is that I try to interact
with the query string to apply an early filter in the record reader. Is there
any other known way to detect that a MapReduce job is not spawned, so that I
can work around this issue?

/Petter






Re: hive.query.string not reflecting the current query

2013-12-03 Thread Navis류승우
Looks like a bug. I've filed it as
https://issues.apache.org/jira/browse/HIVE-5935.


2013/12/4 Adam Kawa kawa.a...@gmail.com

 Maybe you can parse the output of the EXPLAIN operator applied to your query
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
 or look for another configuration property (e.g. one saying that the number
 of map and reduce tasks is equal to 0, or something).
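
The EXPLAIN suggestion above could be sketched roughly as follows. This is
only an illustration under stated assumptions: it assumes the textual plan
announces a MapReduce stage with a line reading "Map Reduce" and a fetch-only
plan with "Fetch Operator", which may vary between Hive versions. The class
name, method name and sample plan lines are hypothetical, and obtaining the
plan text (for example by running EXPLAIN on the query) is not shown.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: decide from EXPLAIN text whether a query would spawn a MapReduce job. */
final class ExplainPlanCheck {
    /**
     * Returns true if any line of the plan, ignoring indentation, reads
     * "Map Reduce". The marker text is an assumption about the plan layout.
     */
    static boolean spawnsMapReduce(List<String> explainLines) {
        for (String line : explainLines) {
            if (line.trim().equalsIgnoreCase("Map Reduce")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical plan excerpts, abbreviated for the example.
        List<String> fetchOnly = Arrays.asList(
                "STAGE PLANS:",
                "  Stage: Stage-0",
                "    Fetch Operator",
                "      limit: 100");
        List<String> withJob = Arrays.asList(
                "STAGE PLANS:",
                "  Stage: Stage-1",
                "    Map Reduce",
                "  Stage: Stage-0",
                "    Fetch Operator");
        System.out.println("fetch-only plan spawns a job: " + spawnsMapReduce(fetchOnly));
        System.out.println("map-only plan spawns a job: " + spawnsMapReduce(withJob));
    }
}
```

Matching a whole trimmed line rather than a substring avoids false positives
from column names or comments that happen to contain the same words.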

