Re: Tez session after closing CLI
On 12/8/14, 10:09 PM, Fabio wrote:

> Hi everyone,
> when running Hive on Tez, a Tez session is alive within the Hive CLI until I leave the CLI. So if I run on the terminal something like "hive -f query.sql", once the query is completed the Tez session is closed.
> Is there a way to run a query in this way (let's say from the linux terminal using the -f parameter) but still keeping the Tez session open among subsequent commands of this kind?

Not in the way you described, but HiveServer2 can pool connections. Then beeline (not the fat client, but the thin SQL CLI) can connect to it and pick up an existing session for reuse.

There's an earlier thread which discusses how this is configured:
http://mail-archives.apache.org/mod_mbox/hive-user/201412.mbox/%3ccag6lhyf3taze7grem1psuquk6g+mpmdyno8j6okpcleqnvy...@mail.gmail.com%3E

To use this effectively, you need HiveServer2 to use the GRANT/ALLOW security model (similar to how mysql runs all queries as the mysql user, while allowing granular table-level security).

Cheers,
Gopal
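As a rough illustration of the beeline route described above, here is a minimal Python sketch that builds the command line for running a script file against a long-lived HiveServer2 instead of starting a fresh "hive -f" process. The host name hs2.example.com, port 10000 and the helper names are assumptions made up for the example. The session pool itself is configured on the server side (as far as I know through properties such as hive.server2.tez.default.queues, hive.server2.tez.sessions.per.default.queue and hive.server2.tez.initialize.default.sessions, with hive.server2.enable.doAs set to false), so nothing in this client-side sketch creates or manages Tez sessions.

```python
import subprocess


def beeline_command(jdbc_url, script_path, user=None):
    """Build the argv for running a SQL script through beeline.

    -u (JDBC URL), -n (user name) and -f (script file) are standard
    beeline options; the URL and file passed in are up to the caller.
    """
    cmd = ["beeline", "-u", jdbc_url]
    if user:
        cmd += ["-n", user]
    cmd += ["-f", script_path]
    return cmd


def run_script(jdbc_url, script_path, user=None):
    """Run the script via beeline and return the process exit code.

    Requires beeline on the PATH and a reachable HiveServer2.
    """
    completed = subprocess.run(beeline_command(jdbc_url, script_path, user))
    return completed.returncode
```

Calling run_script("jdbc:hive2://hs2.example.com:10000/default", "query.sql") repeatedly would then go through the same server each time, which is what lets HiveServer2 hand out an already running Tez session rather than starting a new one per invocation.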
Specify encoding for columns in parquet
Hi everyone,

I've been searching for this for a while and haven't found an answer. Does anyone know whether I can explicitly specify the encoding algorithm to use for individual columns of a Hive table stored in the Parquet file format? For example, is it possible to choose RLE for column1, dictionary encoding for column2, etc.?
Tez session after closing CLI
Hi everyone,

when running Hive on Tez, a Tez session is alive within the Hive CLI until I leave the CLI. So if I run on the terminal something like "hive -f query.sql", once the query is completed the Tez session is closed.

Is there a way to run a query in this way (let's say from the linux terminal using the -f parameter) but still keeping the Tez session open among subsequent commands of this kind?

Thanks in advance
Fabio
Using XPath and Hive SQL to access XML data, but XPath is a problem
I created a Hive table with one column. Each row contains one XML record. Here is the script I used to create this first table:

    CREATE EXTERNAL TABLE xml_event_table (
      xmlevent string)
    STORED AS TEXTFILE
    LOCATION "/user/cloudera/vector/events";

Here is a sample XML in one row of the xml_event_table:

    http://schemas.microsoft.com/win/2004/08/events/event"> 4672 0…

I want to create a view that contains the EventID, but the XPath is not working correctly:

    CREATE VIEW xpath_xml_event_view01(event_id, computer, user_id) AS
    SELECT xpath_string(xmlevent, 'Event/System/EventID')
    FROM xml_event_table;

I modeled the solution on this web site:
https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/08/15/hadoop-tutorialsingesting-xml-in-hive-using-xpath

Also, if I change the Hive script to:

    CREATE VIEW xpath_xml_event_view01(event_id) AS
    SELECT xpath(xmlevent, '/Event[@xmlns=" http://schemas.microsoft.com/win/2004/08/events/event "]/System/EventID[@Qualifiers=""]/text()')
    FROM xml_event_table;

I get this result when I select all using this view:

    0 []
    1 []

If I try this Hive script:

    CREATE VIEW xpath_xml_event_view01(event_id) AS
    SELECT xpath_string(xmlevent, '/Event[1]/System[1]/EventID')
    FROM xml_event_table;

or this Hive script:

    CREATE VIEW xpath_xml_event_view01(event_id) AS
    SELECT xpath_string(xmlevent, '/Event[@xmlns=" http://schemas.microsoft.com/win/2004/08/events/event "]/System/EventID[@Qualifiers=""]/text()')
    FROM xml_event_table;

I get this result (empty rows):

    0
    1
    2

David Novogrodsky
david.novogrod...@gmail.com
http://www.linkedin.com/in/davidnovogrodsky
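One plausible cause of the empty results above is the default namespace declared on the Event element: in XPath 1.0 an unprefixed step such as System only matches elements in no namespace, and xmlns is a namespace declaration rather than an attribute, so a predicate like [@xmlns="..."] does not select on it. The following Python sketch reproduces the effect outside Hive. The SAMPLE record is a hypothetical reconstruction (only the namespace URL and the value 4672 come from the message), the function names are made up for the example, and Python's ElementTree is used purely for illustration since Hive evaluates XPath through Java.

```python
import xml.etree.ElementTree as ET

# Namespace URL taken from the sample row in the message.
NS = "http://schemas.microsoft.com/win/2004/08/events/event"

# Hypothetical, simplified record; the real event has more elements.
SAMPLE = (
    '<Event xmlns="%s">'
    "<System><EventID>4672</EventID></System>"
    "</Event>" % NS
)


def event_id_unqualified(xml_text):
    """Look up EventID with plain names, ignoring the namespace."""
    root = ET.fromstring(xml_text)  # root is the Event element
    return root.findtext("System/EventID")


def event_id_qualified(xml_text):
    """Look up EventID with namespace-qualified names."""
    root = ET.fromstring(xml_text)
    return root.findtext("{%s}System/{%s}EventID" % (NS, NS))
```

With SAMPLE as input, the unqualified lookup finds nothing and the qualified one returns the event ID. In Hive, where namespace prefixes cannot be bound for the xpath functions as far as I know, the commonly suggested equivalent is to match on local names, for example /*[local-name()='Event']/*[local-name()='System']/*[local-name()='EventID']; that expression is an untested suggestion here, not a confirmed fix for this table.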
Hive returns different results with/without LZO index when hive.hadoop.supports.splittable.combineinputformat=true
Hello,

We are experiencing this old issue in our current installation:
https://issues.apache.org/jira/browse/MAPREDUCE-5537

All our data is LZO compressed and indexed; the case is 100% reproducible on our CDH 5.2.0 cluster (using MR2 and YARN).

Do you know if we might be missing a patch, or if this particular problem has found its way back into the code?

Best regards,
Nathalie Blais
B.I. Developer - Ubisoft Montreal