>Rally Health
> >nicholas.hakob...@rallyhealth.com
> >
> >On Thu, Feb 4, 2016 at 12:57 PM, Dave Nicodemus
> > wrote:
> >> Thanks Nick,
> >>
> >> I did a few experiments and found that the version of the query below
> >> requires it to have a table name alias so it can be referenced in an
> >> outer statement.
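A minimal illustration of the alias requirement, for anyone following along
(table and column names here are invented):

SELECT t.a, t.cnt
FROM (SELECT a, COUNT(*) AS cnt FROM src GROUP BY a) t
WHERE t.cnt > 1;
-- omitting the alias "t" leaves the subquery unreferenceable, and Hive
-- rejects a FROM-clause subquery that has no alias at all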
>
> -Nick
>
> Nicholas Szandor Hakobian
> Data Scientist
> Rally Health
> nicholas.hakob...@rallyhealth.com
>
> On Thu, Feb 4, 2016 at 11:28 AM, Dave Nicodemus
> wrote:
>
integer data type
Does anyone know if this is a known issue and whether it's fixed someplace?
Thanks,
Dave
Stack trace:
Caused by: java.lang.NullPointerException: Remote
java.lang.NullPointerException: null
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPr
orcfiledump of the various ORC files used in the query shows some are
DICTIONARY_V2 and some are DIRECT_V2 encoded, depending on the data for
column 584.
We can work around it by setting hive.vectorized.execution.enabled=false.
Has anyone else experienced anything similar?
Thanks
-Dave
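For reference, the workaround as a session command (this is the standard
Hive property named above):

set hive.vectorized.execution.enabled=false;
-- rerun the query; the vectorized ORC reader path is skipped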
) LOCATION
'/hdfs/folder/containing/orc/files/col1=val1/col2=val2';
Thanks - Dave
On Fri, 13 Nov 2015 at 11:59 patcharee wrote:
> Hi,
>
> It works with non-partitioned ORC, but does not work with (2-column)
> partitioned ORC.
>
> Thanks,
> Patcharee
>
>
>
seen this error
before? Is it fixed in a later version? I've included reproduction steps
below.
Thanks - Dave
*Create a sample text file*
echo "a1,b1" > part
*Create a textfile table and load the data into it*
CREATE TABLE t1 (a STRING, b STRING)
PARTITIONED BY (c STRING)
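(The remainder of the statement and the load step were presumably along
these lines; the comma delimiter matches the sample file, and the partition
value is invented:)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'part' INTO TABLE t1 PARTITION (c='c1');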
With the current
class/method accessibility this is the only option available, which
leaves me with the ugly OrcUnion factory workaround.
Is there something I've missed? Is there a specific reason for these
two observations or were they just an oversight?
Thanks,
Dave
package org.apache.had
e what I'm trying to do that I've missed?
Regards,
Dave
://metacpan.org/release/Thrift-API-HiveClient2
--
Best wishes,
Dave Cardwell.
http://davecardwell.co.uk/
On 14 May 2013 11:35, Dave Cardwell wrote:
> I wrote a few reporting scripts in Perl that connected to Hive via the
> Thrift interface.
>
> Since we upgraded CDH to 4.2.0 and hiv
Hi guys,
I had already used the NOSASL setting to turn off that authentication, so
was able to connect to the cluster fine.
My issue is with how to use the new API to execute a query and get the
response.
--
Best wishes,
Dave Cardwell.
http://davecardwell.co.uk/
On 15 May 2013 00:05, Carl
ion() without an argument, the TCLIService
module itself complains that it cannot create a TOpenSessionResp object
because the class is not loaded.
I have attached example code. Can anyone advise me on how to get past this
block?
--
Best wishes,
Dave Cardwell.
http://davecardwell.co.uk/
Hi guys,
I'm trying to calculate the dwell time of pages in a weblog. In Oracle we
would use the LEAD analytic function to find the next row for a particular
cookie. What is the best approach for Hive?
Thanks
Dave
Dave Houston
r...@crankyadmin.net
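One approach, assuming Hive 0.11 or later (which added windowing functions)
and a numeric epoch timestamp column, is LEAD, much as in Oracle; table and
column names here are invented:

SELECT cookie_id, page_url, hit_time,
  LEAD(hit_time) OVER (PARTITION BY cookie_id ORDER BY hit_time) - hit_time
    AS dwell_seconds
FROM weblog;

On older Hive without windowing, the usual fallbacks are a self-join on row
position or a custom reduce script that compares consecutive rows per cookie.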
ture where regexp_extract(event_list, '\d+') = "239";
is what I have at the minute, but it always returns 0 Rows loaded to
video_plays_for_sept.
Many thanks
Dave Houston
r...@crankyadmin.net
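Two things worth checking, assuming the goal is to find rows whose
event_list contains event 239. First, in Hive string literals the backslash
is itself an escape character, so '\d+' reaches the regex engine as 'd+';
it needs doubling. Second, regexp_extract returns only the first match,
which misses 239 when it is not the first number. A sketch (the comma
separator for event_list is my assumption):

WHERE array_contains(split(event_list, ','), '239')
-- or, with the escaping fixed, for a first-number-only check:
WHERE regexp_extract(event_list, '(\\d+)', 1) = '239'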
DELIMITED FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
'com.example.mapreduce.input.TextFileInputFormatIgnoreSubDir'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/data/test/users';
Hope this saves someone else the trouble of
look at files?
Thanks in advance,
-Dave
T tkey, tvalue
>
> In my case, 32 reducers are launched, and dest1 always ends up with 32
> files. If I set hive.exec.reducers.max=1, it does launch only 1 reducer
> (instead of 32), but I still get 32 teeny output files. Setting the
> various "hive.merge.*" options does not seem to help.
{"key":{"reducesinkkey0":"AA11223344","reducesinkkey1":"20110210_02"},"value":{"_col0":"x","_col1":"m1","_col2":"20110210_02","_col3":"{'m07':
>> 'x12', 'm02': 'x34', 'm01': 'm45'}","_col4":"0A9"},"alias":0}
>>
>> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:265)
>>
>> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:467)
>>
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
>>
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>>
>> at java.security.AccessController.doPrivileged(Native Method)
>>
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>>
>> at org.apache.hadoop.mapred.Child.main(Child.java:211)
>>
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
>> Error while processing row
>> (tag=0)
>> {"key":{"reducesinkkey0":"AA11223344","reducesinkkey1":"20110210_02"},"value":{"_col0":"x","_col1":"m1","_col2":"20110210_02","_col3":"{'m07':
>> 'x12', 'm02': 'x34', 'm01': 'm45'}","_col4":"0A9"},"alias":0}
>>
>> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:253)
>>
>> ... 7 more
>>
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot
>> initialize ScriptOperator
>>
>> at
>> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:320)
>>
>> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>>
>> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>>
>> at
>> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>>
>> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>>
>> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>>
>> at
>> org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
>>
>> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>>
>> at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:244)
>>
>> ... 7 more
>>
>> Caused by: java.io.IOException: Cannot run program "/usr/bin/python2.6":
>> java.io.IOException: error=7, Argument list too long
>>
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
>>
>> at
>> org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:279)
>>
>> ... 15 more
>>
>> Caused by: java.io.IOException: java.io.IOException: error=7, Argument
>> list too long
>>
>> at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>>
>> at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>>
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>>
>> ... 16 more
>>
>> 2011-03-01 14:46:13,784 INFO org.apache.hadoop.mapred.Task: Runnning
>> cleanup for the task
>>
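For what it's worth, "error=7, Argument list too long" from ScriptOperator
usually means the environment handed to the child process was too large
(Hive exports the job configuration as environment variables), not that the
script's own arguments were. Later Hive versions added a property to
truncate oversized environment values (see HIVE-2372), roughly:

set hive.script.operator.truncate.env=true;

Trimming large custom properties out of the job conf is the other common fix.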
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
org/hadoop/Hive/LanguageManual/UDF). Possibly use of a Map type would be
best; not sure.
HTH
Dave Viner
On Tue, Mar 1, 2011 at 4:33 AM, Cam Bazz wrote:
> Hello,
>
> Now I would like to count impressions per item. To achieve this, I
> made a logger, for instance when the user goes i
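For the counting itself, a plain aggregation is usually enough once each
impression is one log row (names invented):

SELECT item_id, COUNT(*) AS impressions
FROM impression_log
GROUP BY item_id;

A MAP column, as suggested above, helps more when several counters live in
a single row.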
>
> 4 Park Plaza, suite 1500, Irvine, CA 92614
>
> *we deliver specific audiences to advertisers*
>
> Visit *www.specificmedia.com*.
>
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
> deal with it as a failed job, but
> Hive can't return the correct result.
>
> 2010-11-29
> --
> shangan
>
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
InputFormat, which is used for the new merge job. Someone
> reported previously merge was not successful because of this. If that's the
> case, you can turn off CombineHiveInputFormat and use the old
> HiveInputFormat (though slower) by setting hive.mergejob.maponly=false.
> >>>
> >>> Ning
> >>> On Nov 17, 2010, at 6:00 PM, Leo Alekseyev wrote:
> >>>
> >>>> I have jobs that sample (or generate) a small amount of data from a
> >>>> large table. At the end, I get e.g. about 3000 or more files of 1kb
> >>>> or so. This becomes a nuisance. How can I make Hive do another pass
> >>>> to merge the output? I have the following settings:
> >>>>
> >>>> hive.merge.mapfiles=true
> >>>> hive.merge.mapredfiles=true
> >>>> hive.merge.size.per.task=25600
> >>>> hive.merge.size.smallfiles.avgsize=1600
> >>>>
> >>>> After setting hive.merge* to true, Hive started indicating "Total
> >>>> MapReduce jobs = 2". However, after generating the
> >>>> lots-of-small-files table, Hive says:
> >>>> Ended Job = job_201011021934_1344
> >>>> Ended Job = 781771542, job is filtered out (removed at runtime).
> >>>>
> >>>> Is there a way to force the merge, or am I missing something?
> >>>> --Leo
> >>>
> >>>
> >
> >
>
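For reference, the knobs discussed above as session commands (the values
shown are that era's defaults, not recommendations; note the property name
is hive.merge.smallfiles.avgsize):

set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.size.per.task=256000000;
set hive.merge.smallfiles.avgsize=16000000;
-- fall back to the slower HiveInputFormat-based merge if the
-- CombineHiveInputFormat merge job is being filtered out:
set hive.mergejob.maponly=false;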
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
I copied Hadoop19Shims' implementation of getCombineFileInputFormat
(HIVE-1121) into Hadoop18Shims and it worked, if anyone is interested.
And hopefully we can upgrade our Hadoop version soon :)
On Fri, Nov 12, 2010 at 12:44 PM, Dave Brondsema wrote:
> It seems that I can't use thi
on of getCombineFileInputFormat into
Hadoop18Shims?
On Wed, Nov 10, 2010 at 4:31 PM, yongqiang he wrote:
> I think the problem was solved in hive trunk. You can just try hive trunk.
>
> On Wed, Nov 10, 2010 at 10:05 AM, Dave Brondsema
> wrote:
> > Hi, has there been any resolution to thi
> >> raised hive.merge.smallfiles.avgsize. I'm wondering if the filtering
> >> at runtime is causing the merge process to be skipped. Attached are
> >> the hive output and log files.
> >>
> >>
> >> Thanks,
> >> Sammy
> >>
> >
> >
>
>
>
> --
> Chief Architect, BrightEdge
> email: s...@brightedge.com | mobile: 650.539.4867 | fax:
> 650.521.9678 | address: 1850 Gateway Dr Suite 400, San Mateo, CA
> 94404
>
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
'foo', 'bar', 'baz')
USING '/bin/cat' AS (x, y, z) limit 1
select * from test2
> ['foo', 'bar', 'baz']
I'd recommend that Hive either support column reordering via the AS
clause, or make it completely optional (although this may be
backwards-incompatible with the docs at the link above).
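A contrived illustration of the current positional behavior, assuming a
one-row table src:

SELECT TRANSFORM ('foo', 'bar', 'baz')
USING '/bin/cat' AS (z, y, x)
FROM src LIMIT 1;
-- returns ['foo', 'bar', 'baz']: z, y and x are bound by position, not name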
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net
; When I log the 'folder' value from inside reduce.py, it shows:
>
> 2010-10-12 15:32:10,914 - dstat - INFO - reduce to stdout, h[folder]:
>
> i.e., an empty string. But when the INSERT executes, it seems to treat the
> value as TRUE (or string 'true')?
>
> > select folder from dl_day
> ['true']
> ['true']
> ['true']
> ['true']
> ...
>
> How can I preserve the FALSE value thru the transform script?
>
> Thanks,
> -L
>
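If the surprising trues come from untyped TRANSFORM output (always strings)
being coerced on insert, one thing to try is declaring the output type in
the AS clause and emitting literal true/false from the script; a sketch
with invented table and column names:

SELECT TRANSFORM (cookie, folder)
USING 'reduce.py'
AS (cookie STRING, folder BOOLEAN)
FROM dl_day_raw;

I have not verified this against that Hive version, so treat it as a
direction rather than a confirmed fix.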
--
Dave Brondsema
Software Engineer
Geeknet
www.geek.net