Apache hive Thrift PHP

2016-02-04 Thread Archana Patel
Hello, I have configured Hadoop 2.7 and Apache Hive 1.2.1. I want to connect Apache Hive to PHP using Thrift. I followed the tutorial at https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-PHP and the connection succeeds, but there is a syntax error. Error which I am f

Re: NPE from simple nested ANSI Join

2016-02-04 Thread Sergey Shelukhin
The stack below looks like a bug; Hive should support joins like these, or it should fail with a parse error, not an NPE. Can you open a JIRA? On 16/2/4, 15:15, "Nicholas Hakobian" wrote: >I'm only aware of this: >https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins >but it's unc

Re: NPE from simple nested ANSI Join

2016-02-04 Thread Nicholas Hakobian
I'm only aware of this: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins but it's unclear if it supports your syntax or not. Nicholas Szandor Hakobian Data Scientist Rally Health nicholas.hakob...@rallyhealth.com On Thu, Feb 4, 2016 at 12:57 PM, Dave Nicodemus wrote: > Thanks

Re: NPE from simple nested ANSI Join

2016-02-04 Thread Dave Nicodemus
Thanks Nick, I did a few experiments and found that the version of the query below does work, so I'm not sure about your theory. Do you know if there is a document that spells out the exact accepted syntax? SELECT COUNT(*) FROM (nation n INNER JOIN customer c ON n.n_nationkey = c.c_nationkey) IN

Re: NPE from simple nested ANSI Join

2016-02-04 Thread Nicholas Hakobian
I don't believe Hive supports that join format. It's expecting either a table name or a subquery. If it's a subquery, it usually requires a table name alias so it can be referenced in an outer statement. -Nick Nicholas Szandor Hakobian Data Scientist Rally Health nicholas.hakob...@rallyh
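
The suggestion above (a subquery with a table alias in place of the parenthesized join) can be sketched as follows. This is an illustration only: it runs against an in-memory SQLite database through Python's sqlite3 module rather than against Hive, the rows are invented, the alias co is hypothetical, and it has not been verified that Hive 1.2.1 accepts the rewritten query. It only checks that the aliased-subquery form returns the same count as a flat chain of joins on the toy data.

```python
import sqlite3

# Toy stand-ins for the nation/customer/orders tables named in the thread.
# SQLite is used here only as a convenient SQL engine; it is NOT Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nation   (n_nationkey INTEGER);
CREATE TABLE customer (c_custkey INTEGER, c_nationkey INTEGER);
CREATE TABLE orders   (o_orderkey INTEGER, o_custkey INTEGER);
INSERT INTO nation   VALUES (1), (2);
INSERT INTO customer VALUES (10, 1), (11, 2), (12, 2);
INSERT INTO orders   VALUES (100, 10), (101, 10), (102, 11);
""")

# The inner customer/orders join is wrapped in a subquery and given the
# (hypothetical) alias "co", so the outer ON clause can reference it.
SUBQUERY_FORM = """
SELECT COUNT(*)
FROM nation n
INNER JOIN (
    SELECT c.c_nationkey AS c_nationkey
    FROM customer c
    INNER JOIN orders o ON c.c_custkey = o.o_custkey
) co ON n.n_nationkey = co.c_nationkey
"""

# The same three-way inner join written as a flat chain, for comparison.
FLAT_FORM = """
SELECT COUNT(*)
FROM nation n
INNER JOIN customer c ON n.n_nationkey = c.c_nationkey
INNER JOIN orders o ON c.c_custkey = o.o_custkey
"""

subquery_count = conn.execute(SUBQUERY_FORM).fetchone()[0]
flat_count = conn.execute(FLAT_FORM).fetchone()[0]
print(subquery_count, flat_count)
```

Because every join here is an inner join, the two forms are expected to agree; with outer joins the nesting would change the result and the flat form would not be a safe substitute.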

NPE from simple nested ANSI Join

2016-02-04 Thread Dave Nicodemus
Using Hive 1.2.1.2.3. Connecting over JDBC and issuing the following query: SELECT COUNT(*) FROM nation n INNER JOIN (customer c INNER JOIN orders o ON c.c_custkey = o.o_custkey) ON n.n_nationkey = c.c_nationkey; generates the NPE and stack below. Fields are integ

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Elliot West
Related to this and for the benefit of anyone who is using Hive: The issues around testing and some possible approaches are summarised here: https://cwiki.apache.org/confluence/display/Hive/Unit+testing+HQL Ultimately there are no elegant solutions to the limitations correctly described by Koert

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Mich Talebzadeh
Hi Edward, There is another angle to it as well: fit for purpose. We are currently migrating from a proprietary DW on SAN to Hive on JBOD. It is going smoothly. It will save us $$ in licensing fees at a time when technology and storage dollars are at a premium. Our DBAs that look afte

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
fair enough On Thu, Feb 4, 2016 at 12:41 PM, Edward Capriolo wrote: > Hive is not the correct tool for every problem. Use the tool that makes > the most sense for your problem and your experience. > > Many people like hive because it is generally applicable. In my case study > for the hive book

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo
Hive is not the correct tool for every problem. Use the tool that makes the most sense for your problem and your experience. Many people like Hive because it is generally applicable. In my case study for the Hive book I highlighted that many smart, capable organizations use Hive. Your argument is total

Re: Hive optimizer

2016-02-04 Thread Ashok Kumar
Thank you for the link. On Thursday, 4 February 2016, 8:07, Lefty Leverenz wrote: You can find Hive CBO information in Cost Based Optimizer in Hive. -- Lefty On Wed, Feb 3, 2016 at 11:48 AM, John Pullokkaran wrote: It's both. Some of the optimizations are rule based and some are cost

RE: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Koert Kuipers
Is the sky the limit? I know UDFs can be used inside Hive, basically like lambdas I assume, and I will assume you have something similar for aggregations. But those are just abstractions inside a single map or reduce phase, pretty low-level stuff. What you really need is abstractions around many map an

Re: Hive on Spark Engine versus Spark using Hive metastore

2016-02-04 Thread Edward Capriolo
Lol, very offbeat convo for the Hive list. Let's not drag ourselves too far down here. On Wednesday, February 3, 2016, Stephen Sprague wrote: > i refuse to take anybody seriously who has a sig file longer than one line > and that there is just plain repugnant. > > On Wed, Feb 3, 2016 at 1:47 PM,

Re: Is Hive Index officially not recommended?

2016-02-04 Thread Amey Barve
Hi Gopal, As you suggested in your email above, *Part #1 of using hive indexes effectively is to write your own HiveIndexHandler, with usesIndexTable=false;* *And then write an IndexPredicateAnalyzer, which lets you map arbitrary lookups into other range conditions.* Is anybody storing there

Querying Hive from R via SparkR

2016-02-04 Thread Thomas Achache
Hello everyone, We are running a Hive cluster in a Kerberos environment, which we usually access via ssh from our local Windows machines. I would like to be able to query Hive directly from R on those same Windows machines by using the SparkR p

Re: Hive optimizer

2016-02-04 Thread Lefty Leverenz
You can find Hive CBO information in Cost Based Optimizer in Hive. -- Lefty On Wed, Feb 3, 2016 at 11:48 AM, John Pullokkaran <jpullokka...@hortonworks.com> wrote: > It's both. > Some of the optimizations are ru