Re: NPE from simple nested ANSI Join

2016-02-05 Thread Dave Nicodemus
(Rally Health, nicholas.hakob...@rallyhealth.com) On Thu, Feb 4, 2016 at 12:57 PM, Dave Nicodemus wrote: > Thanks Nick, I did a few experiments and found that the version of the query below d

Re: NPE from simple nested ANSI Join

2016-02-04 Thread Dave Nicodemus
requires it to have a table name alias so it can be referenced in an outer statement. -Nick (Nicholas Szandor Hakobian, Data Scientist, Rally Health, nicholas.hakob...@rallyhealth.com) On Thu, Feb 4, 2016 at 11:28 AM, Dave Nicodemus wrote:
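
For reference, a minimal HiveQL sketch of the aliasing fix described above: the nested join has to carry a subquery alias before an outer statement can reference it (all table and column names here are hypothetical):

    SELECT j.id, c.name
    FROM (
      SELECT a.id, b.total
      FROM orders a
      JOIN order_totals b ON a.id = b.order_id
    ) j                                  -- alias required for the nested join result
    JOIN customers c ON j.id = c.customer_id;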

NPE from simple nested ANSI Join

2016-02-04 Thread Dave Nicodemus
integer data type. Does anyone know if this is a known issue and whether it's fixed someplace? Thanks, Dave. Stack: Caused by: java.lang.NullPointerException: Remote java.lang.NullPointerException: null at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPr

HiveException: No type found for column type entry

2016-02-03 Thread Dave Maughan
orcfiledump of various ORC files being used in the query: some are DICTIONARY_V2 and some are DIRECT_V2 encoded, depending on the data for column 584. We can work around it by disabling hive.vectorized.execution.enabled. Has anyone else experienced anything similar? Thanks -Dave
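
The workaround mentioned above can be applied per session rather than globally; a minimal sketch:

    -- Disable vectorized execution for the current session only
    SET hive.vectorized.execution.enabled=false;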

Re: query orc file by hive

2015-11-13 Thread Dave Maughan
) LOCATION '/hdfs/folder/containing/orc/files/col1=val1/col2=val2'; Thanks - Dave. On Fri, 13 Nov 2015 at 11:59 patcharee wrote: > Hi, it works with non-partitioned ORC, but does not work with (2-column) partitioned ORC. Thanks, Patcharee
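
A hedged sketch of one way to expose a partitioned ORC directory tree like the one above: define the external table at the root and register each partition directory explicitly (the table name and column list are hypothetical; the paths are the ones from the thread):

    CREATE EXTERNAL TABLE orc_ext (id INT)               -- hypothetical columns
    PARTITIONED BY (col1 STRING, col2 STRING)
    STORED AS ORC
    LOCATION '/hdfs/folder/containing/orc/files';

    -- Register an existing partition directory with the metastore
    ALTER TABLE orc_ext ADD PARTITION (col1='val1', col2='val2')
    LOCATION '/hdfs/folder/containing/orc/files/col1=val1/col2=val2';

On recent Hive versions, MSCK REPAIR TABLE orc_ext can register all partition directories in one pass instead.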

Insert with dynamic partitioning from an ORC table fails

2015-07-08 Thread Dave Maughan
seen this error before? Is it fixed in a later version? I've included reproduction steps below. Thanks - Dave *Create a sample text file* echo "a1,b1" > part *Create a textfile table and load the data into it* CREATE TABLE t1 (a STRING, b STRING) PARTITIONED BY (c STRING)
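
The excerpt cuts off before the failing step; a hedged sketch of the shape such a repro usually takes: load t1, convert it to ORC, then run a dynamic-partition insert whose source is the ORC table (the t1_orc and t2 names are hypothetical):

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;   -- all partition columns are dynamic here

    -- Hypothetical ORC copy of t1 and a second table to insert into
    CREATE TABLE t1_orc (a STRING, b STRING) PARTITIONED BY (c STRING) STORED AS ORC;
    INSERT OVERWRITE TABLE t1_orc PARTITION (c) SELECT a, b, c FROM t1;

    CREATE TABLE t2 (a STRING, b STRING) PARTITIONED BY (c STRING);
    -- The reported failure occurs on a dynamic-partition insert reading from the ORC table
    INSERT OVERWRITE TABLE t2 PARTITION (c) SELECT a, b, c FROM t1_orc;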

Writing uniontype to ORC file outside of hive

2015-06-12 Thread Dave Maughan
With the current class/method accessibility this is the only option available, which leaves me with the ugly OrcUnion factory workaround. Is there something I've missed? Is there a specific reason for these two observations, or were they just an oversight? Thanks, Dave. package org.apache.had

ORC HiveChar, HiveVarchar & HiveDecimal

2015-04-01 Thread Dave Maughan
e what I'm trying to do that I've missed? Regards, Dave

Re: hiveserver2 Thrift Interface With Perl

2013-07-01 Thread Dave Cardwell
://metacpan.org/release/Thrift-API-HiveClient2 -- Best wishes, Dave Cardwell. http://davecardwell.co.uk/ On 14 May 2013 11:35, Dave Cardwell wrote: > I wrote a few reporting scripts in Perl that connected to Hive via the Thrift interface. Since we upgraded CDH to 4.2.0 and hiv

Re: hiveserver2 Thrift Interface With Perl

2013-05-15 Thread Dave Cardwell
Hi guys, I had already used the NOSASL setting to turn off that authentication, so was able to connect to the cluster fine. My issue is with how to use the new API to execute a query and get the response. -- Best wishes, Dave Cardwell. http://davecardwell.co.uk/ On 15 May 2013 00:05, Carl

hiveserver2 Thrift Interface With Perl

2013-05-14 Thread Dave Cardwell
ion() without an argument, the TCLIService module itself complains that it cannot create a TOpenSessionResp object because the class is not loaded. I have attached example code. Can anyone advise me on how to get past this block? -- Best wishes, Dave Cardwell. http://davecardwell.co

Lag & Lead

2012-01-11 Thread Dave Houston
Hi guys, trying to calculate the dwell time of pages in a weblog. In Oracle we would use the LEAD analytic function to find the next row for a particular cookie. What is the best approach for Hive? Thanks, Dave. Dave Houston r...@crankyadmin.net
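
Hive gained LEAD/LAG windowing functions in 0.11 (released after this thread); with them the Oracle approach translates directly. A sketch assuming ts is a unix-epoch BIGINT and the table/column names are hypothetical:

    SELECT cookie_id,
           page,
           -- time until the same cookie's next pageview = dwell time on this page
           LEAD(ts) OVER (PARTITION BY cookie_id ORDER BY ts) - ts AS dwell_seconds
    FROM weblog;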

Insert based on whether string contains

2012-01-04 Thread Dave Houston
ture where regexp_extract(event_list, '\d+') = "239"; is what I have at the minute, but it always returns "0 Rows loaded to video_plays_for_sept". Many thanks, Dave Houston r...@crankyadmin.net
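
A hedged guess at the fix, assuming event_list is a comma-delimited string: Hive string literals need doubled backslashes ('\\d+', not '\d+'), and comparing regexp_extract output with "239" only tests the first match anyway. An anchored RLIKE sidesteps both problems and keeps values like 1239 from matching (the table name is hypothetical):

    SELECT * FROM hit_data
    WHERE event_list RLIKE '(^|,)239(,|$)';   -- 239 as a whole list element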

Re: Ignore subdirectories when querying external table

2011-08-19 Thread Dave
DELIMITED FIELDS TERMINATED BY '\t' STORED AS INPUTFORMAT 'com.example.mapreduce.input.TextFileInputFormatIgnoreSubDir' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/data/test/users'; Hope this saves someone else the trouble of
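
A hedged reconstruction of the full statement the excerpt above comes from (table name and column list are hypothetical; the input format class is the custom one named in the thread):

    CREATE EXTERNAL TABLE users (id STRING, name STRING)   -- hypothetical columns
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS
      INPUTFORMAT 'com.example.mapreduce.input.TextFileInputFormatIgnoreSubDir'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION '/data/test/users';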

Ignore subdirectories when querying external table

2011-08-18 Thread Dave
look at files? Thanks in advance, -Dave

Re: Hive MAP/REDUCE/TRANSFORM output creates many small files

2011-08-16 Thread Dave Brondsema
T tkey, tvalue > In my case, 32 reducers are launched, and dest1 always ends up with 32 files. If I set hive.exec.reducers.max=1, it does launch only 1 reducer (instead of 32), but I still get 32 teeny output files. Setting the various "hive.merge.*" options does not see

Re: cannot start the transform script. reason : "argument list too long"

2011-03-02 Thread Dave Brondsema
{"key":{"reducesinkkey0":"AA11223344","reducesinkkey1":"20110210_02"},"value":{"_col0":"x","_col1":"m1","_col2":"20110210_02","_col3":"{'m07': >> 'x12', 'm02': 'x34', 'm01': 'm45'}","_col4":"0A9"},"alias":0} >> >> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:265) >> >> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:467) >> >> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415) >> >> at org.apache.hadoop.mapred.Child$4.run(Child.java:217) >> >> at java.security.AccessController.doPrivileged(Native Method) >> >> at javax.security.auth.Subject.doAs(Subject.java:396) >> >> at >> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063) >> >> at org.apache.hadoop.mapred.Child.main(Child.java:211) >> >> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime >> Error while processing row >> (tag=0) >> {"key":{"reducesinkkey0":"AA11223344","reducesinkkey1":"20110210_02"},"value":{"_col0":"x","_col1":"m1","_col2":"20110210_02","_col3":"{'m07': >> 'x12', 'm02': 'x34', 'm01': 'm45'}","_col4":"0A9"},"alias":0} >> >> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:253) >> >> ... 7 more >> >> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot >> initialize ScriptOperator >> >> at >> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:320) >> >> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457) >> >> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697) >> >> at >> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) >> >> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457) >> >> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697) >> >> at >> org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45) >> >> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457) >> >> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:244) >> >> ... 7 more >> >> Caused by: java.io.IOException: Cannot run program "/usr/bin/python2.6": >> java.io.IOException: error=7, Argument list too long >> >> at java.lang.ProcessBuilder.start(ProcessBuilder.java:459) >> >> at >> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:279) >> >> ... 15 more >> >> Caused by: java.io.IOException: java.io.IOException: error=7, Argument >> list too long >> >> at java.lang.UNIXProcess.(UNIXProcess.java:148) >> >> at java.lang.ProcessImpl.start(ProcessImpl.java:65) >> >> at java.lang.ProcessBuilder.start(ProcessBuilder.java:452) >> >> ... 16 more >> >> 2011-03-01 14:46:13,784 INFO org.apache.hadoop.mapred.Task: Runnning >> cleanup for the task >> >> >> >> >> >> >> > > -- Dave Brondsema Software Engineer Geeknet www.geek.net

Re: counting impressions strategy

2011-03-01 Thread Dave Viner
org/hadoop/Hive/LanguageManual/UDF). Possibly use of a Map type would be best; not sure. HTH, Dave Viner. On Tue, Mar 1, 2011 at 4:33 AM, Cam Bazz wrote: > Hello, now I would like to count impressions per item. To achieve this, I made a logger: for instance, when the user goes i
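
For the basic per-item count, a plain GROUP BY is usually enough before reaching for a Map type or a UDF; a minimal sketch with hypothetical table and column names:

    SELECT item_id, COUNT(*) AS impressions
    FROM impression_log
    GROUP BY item_id;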

Re: Stopping Hive Metastore Service

2011-01-27 Thread Dave Brondsema
-- Dave Brondsema Software Engineer Geeknet www.geek.net

Re: hive not stable,weird exception

2010-11-29 Thread Dave Brondsema
deal with it as a failed job, but hive can't return the correct result. 2010-11-29, shangan -- Dave Brondsema Software Engineer Geeknet www.geek.net

Re: Hive produces very small files despite hive.merge...=true settings

2010-11-19 Thread Dave Brondsema
InputFormat, which is used for the new merge job. Someone reported previously that merge was not successful because of this. If that's the case, you can turn off CombineHiveInputFormat and use the old HiveInputFormat (though slower) by setting hive.mergejob.maponly=false. - Ning
On Nov 17, 2010, at 6:00 PM, Leo Alekseyev wrote:
> I have jobs that sample (or generate) a small amount of data from a large table. At the end, I get e.g. about 3000 or more files of 1kb or so. This becomes a nuisance. How can I make Hive do another pass to merge the output? I have the following settings:
> hive.merge.mapfiles=true
> hive.merge.mapredfiles=true
> hive.merge.size.per.task=25600
> hive.merge.size.smallfiles.avgsize=1600
> After setting hive.merge* to true, Hive started indicating "Total MapReduce jobs = 2". However, after generating the lots-of-small-files table, Hive says:
> Ended Job = job_201011021934_1344
> Ended Job = 781771542, job is filtered out (removed at runtime).
> Is there a way to force the merge, or am I missing something? --Leo
-- Dave Brondsema Software Engineer Geeknet www.geek.net
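
Ning's workaround collected as session settings: a sketch using the property names from the thread (the size values here are hypothetical; note the commonly documented spelling is hive.merge.smallfiles.avgsize, not hive.merge.size.smallfiles.avgsize as quoted above):

    SET hive.merge.mapfiles=true;                 -- merge outputs of map-only jobs
    SET hive.merge.mapredfiles=true;              -- merge outputs of map-reduce jobs
    SET hive.merge.size.per.task=256000000;       -- hypothetical target size per merge task (bytes)
    SET hive.merge.smallfiles.avgsize=16000000;   -- hypothetical threshold that triggers the merge pass
    SET hive.mergejob.maponly=false;              -- fall back to the old HiveInputFormat for the merge job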

Re: Merging small files with dynamic partitions

2010-11-12 Thread Dave Brondsema
I copied Hadoop19Shims' implementation of getCombineFileInputFormat (HIVE-1121) into Hadoop18Shims and it worked, if anyone is interested. And hopefully we can upgrade our Hadoop version soon :) On Fri, Nov 12, 2010 at 12:44 PM, Dave Brondsema wrote: > It seems that I can't use thi

Re: Merging small files with dynamic partitions

2010-11-12 Thread Dave Brondsema
on of getCombineFileInputFormat into Hadoop18Shims? On Wed, Nov 10, 2010 at 4:31 PM, yongqiang he wrote: > I think the problem was solved in hive trunk. You can just try hive trunk. On Wed, Nov 10, 2010 at 10:05 AM, Dave Brondsema wrote: > Hi, has there been any resolution to thi

Re: Merging small files with dynamic partitions

2010-11-10 Thread Dave Brondsema
> raised hive.merge.smallfiles.avgsize. I'm wondering if the filtering at runtime is causing the merge process to be skipped. Attached are the hive output and log files. Thanks, Sammy -- Chief Architect, BrightEdge | email: s...@brightedge.com | mobile: 650.539.4867 | fax: 650.521.9678 | address: 1850 Gateway Dr Suite 400, San Mateo, CA 94404 -- Dave Brondsema Software Engineer Geeknet www.geek.net

USING .. AS column names

2010-10-13 Thread Dave Brondsema
'foo', 'bar', 'baz') USING '/bin/cat' AS (x, y, z) LIMIT 1; select * from test2 returns ['foo', 'bar', 'baz']. I'd recommend that Hive either support column reordering with the AS clause or make it completely optional (although this may be backwards-incompatible with the docs at the link above). -- Dave Brondsema Software Engineer Geeknet www.geek.net
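
A hedged sketch reproducing the behavior described above: the AS names are bound positionally, so they rename the transform's output columns without reordering them (src is the conventional one-row sample table):

    -- x gets 'foo', y gets 'bar', z gets 'baz', whatever the names suggest
    SELECT TRANSFORM('foo', 'bar', 'baz') USING '/bin/cat' AS (x, y, z)
    FROM src LIMIT 1;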

Re: boolean types thru a transform script

2010-10-13 Thread Dave Brondsema
; When I log the 'folder' value from inside reduce.py, it shows: 2010-10-12 15:32:10,914 - dstat - INFO - reduce to stdout, h[folder]: i.e., an empty string. But when the INSERT executes, it seems to treat the value as TRUE (or the string 'true')? select folder from dl_day returns ['true'] ['true'] ['true'] ['true'] ... How can I preserve the FALSE value through the transform script? Thanks, -L -- Dave Brondsema Software Engineer Geeknet www.geek.net
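
One hedged workaround, given that everything crossing a TRANSFORM boundary is serialized as text: have the script emit explicit 'true'/'false' strings, declare the output column as STRING, and convert it back in the outer query so an empty value cannot be coerced to TRUE (dl_day_raw and the script invocation are hypothetical):

    ADD FILE reduce.py;

    SELECT IF(folder_str = 'true', TRUE, FALSE) AS folder   -- empty string maps to FALSE
    FROM (
      SELECT TRANSFORM(line) USING 'python reduce.py' AS (folder_str STRING)
      FROM dl_day_raw
    ) t;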