Re: Predicate pushdown optimisation not working for ORC

2014-04-04 Thread Abhay Bansal
It is not necessarily the client side. The file can be generated by a running MR job, or by any other process for that matter, which can then feed data to the Hadoop cluster to run hive queries. -Abhay On Thu, Apr 3, 2014 at 7:40 PM, Bogala, Chandra Reddy chandra.bog...@gs.com wrote: I thought ORC file

Re: Predicate pushdown optimisation not working for ORC

2014-04-04 Thread Abhay Bansal
I was able to find the property with some digging around and experimentation. Never knew that ppd had something to do with this property. On Thu, Apr 3, 2014 at 7:23 PM, Stephen Sprague sprag...@gmail.com wrote: wow. good find. i hope these config settings are well documented and that you

Unable to set hive.fetch.output.serde via hive2 jdbc

2014-04-04 Thread Adrian Hains
Background: I have some somewhat deeply nested types in a hive table that are causing queries to error out when executing through JDBC. The same queries are successful via the CLI and via hue beeswax clients. The JDBC error states Number of levels of nesting supported for LazySimpleSerde is 7

Re: Problem querying deeply nested data with Parquet and ORC File Hive SerDes

2014-04-04 Thread mpeterson2
I figured out the problem. The JSON SerDe I wrote is not case sensitive, but the ORC and Parquet SerDes are case sensitive. So this works: select ClientCode, Encounter.Number from parquet_tbl; but this does not: select clientcode, encounter.Number from parquet_tbl; -Michael On Thu, Apr 3,

Setting mapred.child.java.opts in Hive script results in MR job getting 'killed' right away

2014-04-04 Thread Decimus Phostle
Hello Folks, I have been having a few jobs failing due to OutOfMemory and GC overhead limit exceeded errors. To counter these I tried setting `SET mapred.child.java.opts=-Xmx3G -XX:+UseConcMarkSweepGC;` at the start of the hive script**. Basically any time I add this option to the script, the MR

Unable to run Join queries

2014-04-04 Thread saquib khan
Dear Folks, Whenever I run join queries, no output is displayed, though I do get output for queries on a single table. Thanks and Regards, Saky

Re: Unable to run Join queries

2014-04-04 Thread saquib khan
I get java exceptions while running the queries: java.lang.InstantiationException: org.antlr.runtime.CommonToken Continuing ... java.lang.RuntimeException: failed to evaluate: unbound=Class.new(); Continuing ... java.lang.InstantiationException: org.antlr.runtime.CommonToken Continuing ...

Re: Unable to run Join queries

2014-04-04 Thread Decimus Phostle
It might help if you post details on the queries themselves. On Fri, Apr 4, 2014 at 1:14 PM, saquib khan skhan...@gmail.com wrote: I get java exceptions while running the queries: java.lang.InstantiationException: org.antlr.runtime.CommonToken Continuing ... java.lang.RuntimeException:

Re: Unable to run Join queries

2014-04-04 Thread saquib khan
Thanks Decimus. Query: SELECT exposed_time, ROUND(COUNT(ses_tx_20130805.pid)/10) COUNT FROM tx_demography_info join ses_tx_20130805 on (tx_demography_info.pid=ses_tx_20130805.pid) where countyid='50015' GROUP BY exposed_time ORDER BY exposed_time; On Fri, Apr 4, 2014 at 4:22 PM, Decimus

Re: Unable to run Join queries

2014-04-04 Thread saquib khan
Query : SELECT exposed_time, ROUND(COUNT(ses_tx_20130805.pid)/10) COUNT FROM tx_demography_info join ses_tx_20130805 on (tx_demography_info.pid=ses_tx_20130805.pid) where countyid='50015' GROUP BY exposed_time ORDER BY exposed_time; On Fri, Apr 4, 2014 at 4:14 PM, saquib khan

Re: Unable to run Join queries

2014-04-04 Thread Decimus Phostle
You seem to have a dangling aggregate function in your SELECT: SELECT exposed_time, ROUND(COUNT(ses_tx_20130805.pid)/10) ***COUNT*** FROM tx_demography_info JOIN ses_tx_20130805 ON (tx_demography_info.pid=ses_tx_20130805.pid) WHERE countyid='50015' GROUP BY exposed_time ORDER BY exposed_time;
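One way to read the suggestion above is to give the aggregated column an explicit alias that is not itself a function name. A minimal sketch, using the same tables and columns as the quoted query; the alias `session_count` is an assumption chosen for illustration and does not appear in the thread:

```sql
-- Sketch only: same tables, columns, and filter as the query quoted above.
-- The alias name session_count is hypothetical; any non-reserved name would do.
SELECT exposed_time,
       ROUND(COUNT(ses_tx_20130805.pid) / 10) AS session_count
FROM tx_demography_info
JOIN ses_tx_20130805
  ON (tx_demography_info.pid = ses_tx_20130805.pid)
WHERE countyid = '50015'
GROUP BY exposed_time
ORDER BY exposed_time;
```

This only removes the ambiguity around the trailing `COUNT`; it may not be the whole cause of the missing output.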

Error using ORC format

2014-04-04 Thread Amit Tewari
Hi All, I am just trying to do some simple tests to see speedup in hive query with Hive 0.14 (trunk version this morning). Just tried to use sample test case to start with. First wanted to see how much I can speed up using ORC format. However for some reason I can't insert data into the

Can I update just one row in Hive table using Hive INSERT OVERWRITE

2014-04-04 Thread Raj Hadoop
Can I update (in effect, delete and insert) just one row, keeping the remaining rows intact, in a Hive table using Hive INSERT OVERWRITE? There is no partition in the Hive table. INSERT OVERWRITE TABLE tablename SELECT col1,col2,col3 from tabx where col2='abc'; Does the above work? Please

Re: Unable to run Join queries

2014-04-04 Thread saquib khan
I removed the count, but it still does not output the query results. These are the parameters I set for hive: set hive.stats.autogather=false set hive.optimize.autoindex=true; set hive.optimize.index.filter=true; set hive.exec.parallel=true; set mapred.reduce.tasks=5; set

Re: Can I update just one row in Hive table using Hive INSERT OVERWRITE

2014-04-04 Thread Nitin Pawar
For non-partitioned columns, the answer in one word: NO. Detailed answer: this feature is still being built as part of https://issues.apache.org/jira/browse/HIVE-5317 On Sat, Apr 5, 2014 at 2:28 AM, Raj Hadoop hadoop...@yahoo.com wrote: Can I update ( delete and insert kind of) just one row
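For context on the question above: INSERT OVERWRITE on an unpartitioned table replaces the table's entire contents with the result of the SELECT, so the statement as posted would leave only the rows where col2='abc'. A common workaround is to rewrite the whole table and change the one row inside the SELECT. The following is a minimal sketch, not a confirmed recipe from this thread; the key condition `col1 = 'key1'` and the value `'new_value'` are hypothetical, and it assumes the Hive version in use allows a table to be read and overwritten in the same statement:

```sql
-- Sketch only: rewrite every row of the unpartitioned table,
-- substituting a new col2 value for the single row being "updated".
-- The condition col1 = 'key1' and the literal 'new_value' are hypothetical.
INSERT OVERWRITE TABLE tablename
SELECT col1,
       CASE WHEN col1 = 'key1' THEN 'new_value' ELSE col2 END AS col2,
       col3
FROM tablename;
```

The cost is a full rewrite of the table for every change, which is why row-level updates were being tracked as a separate feature.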

Error using ORC Format with Hive

2014-04-04 Thread Amit Tewari
Hi All, I am just trying to do some simple tests to see speedup in hive query with Hive 0.14 (trunk version this morning). Just tried to use sample test case to start with. First wanted to see how much I can speed up using ORC format. However for some reason I can't insert data into the

Re: Setting mapred.child.java.opts in Hive script results in MR job getting 'killed' right away

2014-04-04 Thread Decimus Phostle
Turns out it was just a trivial/inane Hive-ism of setting the 'value' in a particular way. *sigh*. The SO link (http://goo.gl/j9II0V) has details. On Fri, Apr 4, 2014 at 12:41 PM, Decimus Phostle decimusphos...@gmail.com wrote: Hello Folks, I have been having a few jobs failing due to

Re: Error using ORC Format with Hive

2014-04-04 Thread Amit Tewari
I checked out and built hive 0.13. Tried again, with the same results, i.e. eRpcServer.addBlock(NameNodeRpcServer.java:555) at File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 could only be replicated to 0 nodes instead of minReplication (=1).

Re: Error using ORC Format with Hive

2014-04-04 Thread Bryan Jeffrey
Amit, Are you executing your select for conversion to orc via beeline, or hive cli? From looking at your logs, it appears that you do not have permissions in hdfs to write the resultant orc data. Check permissions in hdfs to ensure that your user has write access to the hive warehouse.