Re: Tez session after closing CLI

2014-12-08 Thread Gopal V

On 12/8/14, 10:09 PM, Fabio wrote:

Hi everyone,
when running Hive on Tez, a Tez session is alive within the Hive CLI
until I leave the CLI. So if I run on the terminal something like "hive
-f query.sql", once the query is completed the Tez session is closed. Is
there a way to run a query in this way (let's say from the linux
terminal using the -f parameter) but still keeping the Tez session open
among subsequent commands of this kind?


Not in the way you described, but HiveServer2 can pool connections.

Then beeline (not fat client, but the thin SQL CLI) can connect to it 
and pick up an existing session for reuse.


There's an earlier thread which discusses how this is configured

http://mail-archives.apache.org/mod_mbox/hive-user/201412.mbox/%3ccag6lhyf3taze7grem1psuquk6g+mpmdyno8j6okpcleqnvy...@mail.gmail.com%3E

To use this effectively, you need HiveServer2 to use the GRANT/ALLOW 
security model (similar to how mysql runs all queries as mysql user, 
while allowing granular table level security).
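
As a rough sketch of the beeline route described above, the wrapper below sends each script to a long-running HiveServer2 instead of starting a fresh Hive CLI. The JDBC URL (host, port 10000, database `default`), the `run_query` helper, and the `HS2_URL`/`BEELINE` variables are assumptions for illustration, not taken from this thread; whether a Tez session is actually reused depends on how HiveServer2 is configured.

```shell
# Assumed HiveServer2 JDBC URL; adjust host, port, and database for your cluster.
HS2_URL="${HS2_URL:-jdbc:hive2://localhost:10000/default}"

# Client binary; overridable so the wrapper can be dry-run (e.g. BEELINE=echo).
BEELINE="${BEELINE:-beeline}"

# Hypothetical helper: run one SQL script through beeline.
# -u gives the JDBC connection URL, -f names the script file to execute.
run_query() {
  "$BEELINE" -u "$HS2_URL" -f "$1"
}
```

Calling `run_query query.sql` repeatedly opens a new thin connection each time while HiveServer2 itself stays up. Server-side pooling is, to my knowledge, governed by properties such as `hive.server2.tez.initialize.default.sessions` and `hive.server2.tez.default.queues`; the thread linked above covers the details.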


Cheers,
Gopal


Specify encoding for columns in parquet

2014-12-08 Thread Liu, Jun A
Hi everyone
I've been searching for this for a while and haven't found an answer.
Does anyone know whether I can explicitly specify the encoding algorithm to use 
for individual columns of a Hive table stored in the Parquet file format?

For example, is it possible to choose RLE for column1, dictionary encoding for 
column2, etc.?


Tez session after closing CLI

2014-12-08 Thread Fabio

Hi everyone,
when running Hive on Tez, a Tez session is alive within the Hive CLI 
until I leave the CLI. So if I run on the terminal something like "hive 
-f query.sql", once the query is completed the Tez session is closed. Is 
there a way to run a query in this way (let's say from the linux 
terminal using the -f parameter) but still keeping the Tez session open 
among subsequent commands of this kind?


Thanks in advance

Fabio


Using xPATH and Hive SQL to access XML data, but xPath a problem

2014-12-08 Thread David Novogrodsky
I created a Hive table using one column. Each row contains one XML record.
Here is the script I used to create this first table:

CREATE EXTERNAL TABLE xml_event_table (
xmlevent string)
STORED AS TEXTFILE
LOCATION "/user/cloudera/vector/events";

Here is a sample XML in one row of the xml_event_table:

http://schemas.microsoft.com/win/2004/08/events/event”> 4672 0…

I want to create a view that contains the EventID. But the XPath is not
working correctly:

CREATE VIEW xpath_xml_event_view01(event_id, computer, user_id)
AS SELECT
xpath_string(xmlevent, 'Event/System/EventID')
FROM xml_event_table;

I modeled the solution using this web site:

https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/08/15/hadoop-tutorialsingesting-xml-in-hive-using-xpath

Also, if I change the Hive script to:

CREATE VIEW xpath_xml_event_view01(event_id)
AS SELECT
xpath(xmlevent, '/Event[@xmlns="http://schemas.microsoft.com/win/2004/08/events/event"]/System/EventID[@Qualifiers=""]/text()')
FROM xml_event_table;

I get this result when I select all using this view:

0   []
1   []

If I try this Hive script:

CREATE VIEW xpath_xml_event_view01(event_id)
AS SELECT
xpath_string(xmlevent, '/Event[1]/System[1]/EventID')
FROM xml_event_table;

or this Hive script:

CREATE VIEW xpath_xml_event_view01(event_id)
AS SELECT
xpath_string(xmlevent, '/Event[@xmlns="http://schemas.microsoft.com/win/2004/08/events/event"]/System/EventID[@Qualifiers=""]/text()')
FROM xml_event_table;

I get this result (empty rows):

0
1
2

David Novogrodsky
david.novogrod...@gmail.com
http://www.linkedin.com/in/davidnovogrodsky


Hive returns different results with/without LZO index when hive.hadoop.supports.splittable.combineinputformat=true

2014-12-08 Thread Nathalie Blais
Hello,

We are experiencing this old issue in our current installation:

https://issues.apache.org/jira/browse/MAPREDUCE-5537

All our data is LZO compressed and indexed; the case is 100% reproducible on 
our CDH 5.2.0 cluster (using MR2 and Yarn).

Do you know if we might be missing a patch or if maybe this particular problem 
found a way back into the code?

Best regards,

Nathalie Blais
B.I. Developer - Ubisoft Montreal