Re: Permissions preventing me from inserting data into table I have just created

2012-11-26 Thread Tim Havens
Make sure /home/yaboulnaga/tmp/hive-scratch/ is writable by your
processes.
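
For example, before running the INSERT you can point Hive's scratch space at a
directory every process can write to and relax the umask. A minimal sketch
using the two properties discussed in this thread (the table names are
illustrative):

hive> SET hive.exec.scratchdir=/tmp/hive-scratch;  -- scratch dir all daemons can write to
hive> SET hive.files.umask.value=0000;             -- let created subdirs stay open
hive> INSERT OVERWRITE TABLE rcfile_table SELECT * FROM staging_table;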


On Mon, Nov 26, 2012 at 10:07 AM,  wrote:

> Hello,
>
> I'm using Cloudera's CDH4 with Hive 0.9 and Hive Server 2. I am trying to
> load data into Hive using the JDBC driver (the one distributed with
> Cloudera CDH4, "org.apache.hive.jdbc.HiveDriver"). I can create the
> staging table and LOAD LOCAL into it. However, when I try to insert data
> into a table with Columnar SerDe Stored As RCFILE, I get an error caused by
> file permissions. I don't think that the SerDe or the Stored As parameters
> have anything to do with the problem, but I mention them for completeness.
> The problem is that Hive creates a temporary file in its (local) scratch
> folder, owned by hive:hive with permissions 755, then passes it as input
> to a mapper running as the user mapred:mapred. The mapper then tries to
> create something inside the input folder (it could probably do this
> elsewhere), and the following exception is thrown:
>
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException:
> Mkdirs failed to create file:/home/yaboulnaga/tmp/hive-scratch/hive_2012-11-26_10-46-44_887_2004468370569495405/_task_tmp.-ext-10002
>         at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:709)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
>
>
> As you might have noticed, I moved the scratch folder to a directory under
> my home dir so that I could give this directory 777 permissions. The idea was
> to use a hive.files.umask.value of 0000 to cause subdirectories to inherit
> the same open permissions (not the best workaround, but it wouldn't hurt on my
> local machine). Unfortunately this didn't work, even when I added a umask of
> 0000 to /etc/init.d/hiveserver2. Can someone please tell me what's the
> right way to do this? I mean, create a table and then insert values into it!
> The Hive QL statements I use are very similar to the ones in the tutorials
> about loading data.
>
> Cheers!
> -- Younos
>
>
>
>


-- 
"The whole world is you. Yet you keep thinking there is something else." -
Xuefeng Yicun 822-902 A.D.

Tim R. Havens
Google Phone: 573.454.1232
ICQ: 495992798
ICBM:  37°51'34.79"N   90°35'24.35"W
ham radio callsign: NW0W


FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask

2012-01-22 Thread Tim Havens
I have a similar UDF to this one which creates just fine.

I can't seem to resolve what 'return code -101' means, however, with this
one.

Can anyone tell me what 'return code -101' means?

My StemTermsUDF.jar already has the proper classpath for the JWNL jars;
I'm trying to ensure they're REALLY available by 'add jar'
within hive.

All the paths are correct and I've checked them MANY times to be sure
before posting this.

hive> add file /usr/lib/hadoop/lib/jwnl/jwnl_properties.xml;
Added resource: /usr/lib/hadoop/lib/jwnl/jwnl_properties.xml
hive> add jar /usr/lib/hive/lib/jwnl/jwnl.jar;
Added /usr/lib/hive/lib/jwnl/jwnl.jar to class path
Added resource: /usr/lib/hive/lib/jwnl/jwnl.jar
hive> add jar /usr/lib/hive/lib/jwnl/commons-logging.jar;
Added /usr/lib/hive/lib/jwnl/commons-logging.jar to class path
Added resource: /usr/lib/hive/lib/jwnl/commons-logging.jar
hive> add jar StemTermsUDF.jar;
Added StemTermsUDF.jar to class path
Added resource: StemTermsUDF.jar
hive> create temporary function StemTermsUDF as
'org.apache.hadoop.hive.ql.udf.StemTermsUDF';
java.lang.NoClassDefFoundError: net/didion/jwnl/JWNLException
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:264)
   at org.apache.hadoop.hive.ql.exec.FunctionTask.getUdfClass(FunctionTask.java:119)
   at org.apache.hadoop.hive.ql.exec.FunctionTask.createFunction(FunctionTask.java:75)
   at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:63)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:516)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:616)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.lang.ClassNotFoundException: net.didion.jwnl.JWNLException
   at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
   ... 18 more
FAILED: Execution Error, return code -101 from
org.apache.hadoop.hive.ql.exec.FunctionTask


Re: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask

2012-01-22 Thread Tim Havens
Unfortunately the issue appears to be something with the Jar, or my UDF.

What I can't seem to resolve is what is causing the -101 Error Code.

Tim

On Sun, Jan 22, 2012 at 3:26 PM, Aniket Mokashi  wrote:

> The simplest way would be to put the jar in the auxlib directory. That does
> both for you, I guess. After that you can directly create a temporary
> function in hive.
>
> ~Aniket
>
>
> On Sun, Jan 22, 2012 at 1:24 PM, Aniket Mokashi wrote:
>
>> Add the jar to HADOOP_CLASSPATH when you launch hive. That should help.
>>
>> Thanks,
>> Aniket
>>
>>
>> On Sun, Jan 22, 2012 at 9:25 AM, Tim Havens  wrote:
>>
>>> I have a similar UDF to this one which creates just fine.
>>>
>>> Can anyone tell me what 'return code -101' means?
>>>
>>> [rest of the original message and stack trace quoted verbatim above; snipped]
>>
>>
>>
>>
>> --
>> "...:::Aniket:::... Quetzalco@tl"
>>
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>


Re: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask

2012-01-22 Thread Tim Havens
hive --auxpath /reports/hive/ddc_jars/jwnl.jar
Hive history
file=/tmp/thavens/hive_job_log_thavens_20120105_2003418921.txt
hive> create temporary function StemTermsUDF as
'org.apache.hadoop.hive.ql.udf.StemTermsUDF';
OK
Time taken: 0.005 seconds

however, with this in .hiverc:

add jar /reports/hive/ddc_jars/jwnl.jar

AND
/etc/hive/conf/hive-site.xml

<property>
  <name>hive.aux.jars.path</name>
  <value>/reports/hive/ddc_jars</value>
</property>

results in:

hive> create temporary function StemTermsUDF as
'org.apache.hadoop.hive.ql.udf.StemTermsUDF';
java.lang.NoClassDefFoundError: net/didion/jwnl/JWNLException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hadoop.hive.ql.exec.FunctionTask.getUdfClass(FunctionTask.java:119)
at org.apache.hadoop.hive.ql.exec.FunctionTask.createFunction(FunctionTask.java:75)
at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:63)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:209)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:286)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:516)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: java.lang.ClassNotFoundException: net.didion.jwnl.JWNLException
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 18 more
FAILED: Execution Error, return code -101 from
org.apache.hadoop.hive.ql.exec.FunctionTask
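
(A possible explanation, based on my understanding of hive.aux.jars.path in
Hive of this vintage rather than anything confirmed in this thread: the
property expects a comma-separated list of jar files, not a bare directory,
e.g.

<property>
  <name>hive.aux.jars.path</name>
  <value>/reports/hive/ddc_jars/jwnl.jar,/reports/hive/ddc_jars/StemTermsUDF.jar</value>
</property>

while the auxlib directory and --auxpath are expanded by the hive launcher
script itself.)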

On Sun, Jan 22, 2012 at 3:43 PM, Tim Havens  wrote:

> Unfortunately the issue appears to be something with the Jar, or my UDF.
>
> What I can't seem to resolve is what is causing the -101 Error Code.
>
> Tim
>
>
> On Sun, Jan 22, 2012 at 3:26 PM, Aniket Mokashi wrote:
>
>> The simplest way would be to put the jar in the auxlib directory. That does
>> both for you, I guess. After that you can directly create a temporary
>> function in hive.
>>
>> ~Aniket
>>
>>
>> On Sun, Jan 22, 2012 at 1:24 PM, Aniket Mokashi wrote:
>>
>>> Add the jar to HADOOP_CLASSPATH when you launch hive. That should help.
>>>
>>> Thanks,
>>> Aniket
>>>
>>>
>>> On Sun, Jan 22, 2012 at 9:25 AM, Tim Havens  wrote:
>>>
>>>> I have a similar UDF to this one which creates just fine.
>>>>
>>>> [rest of the original message and stack trace quoted verbatim above; snipped]


Re: Order by Sort by partitioned columns

2012-05-16 Thread Tim Havens
I frequently sort by partitioned columns without issues. Post your table
schema and the query that's failing; let's see what's going on.
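
For instance, something like this works for me (a sketch; web_logs and its
logdate partition column are made up for illustration):

SELECT hostname, logdate
FROM web_logs
WHERE logdate >= '2012-05-01'
ORDER BY logdate;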

Tim

On Mon, May 14, 2012 at 1:28 AM, Shin Chan  wrote:

> Hi All
>
> Just curious if it's possible to ORDER BY or SORT BY partitioned columns.
>
> I tried it; it fails as it is not able to find those columns. They are
> present as partitions.
>
> Any help would be appreciated.
>
> Thanks and Regards ,






Please HELP: HIVE alter table add new partition to schema...

2012-06-19 Thread Tim Havens
So...I have a table that has thousands of files and billions of rows
related to it.

Let's make this a simple table:

CREATE TABLE test_table (
ts BIGINT,
exec_time DOUBLE,
domain_id BIGINT,
domain_name STRING,
)
PARTITIONED BY (logdate STRING, source STRING, datacenter STRING,
hostname STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY '\001'
MAP KEYS TERMINATED BY '\002'
LINES TERMINATED BY '\n'
STORED AS TextFile;

So...what I need to do is ADD a partition column to the PARTITIONED BY spec
above.

The partition column I want to add is 'loghour STRING'.

I can't seem to find any way to actually add a NEW
PARTITION COLUMN to the 'PARTITIONED BY' spec without completely
recreating and reloading the table.

What's the correct way of adding a new column like 'loghour STRING'
to the partition schema?

I'm not trying to add an entry into the table DATA, I'm trying to add
a completely new PARTITIONED BY column...


Re: Please HELP: HIVE alter table add new partition to schema...

2012-06-19 Thread Tim Havens
Thanks for the 'small files' heads up. The current LOGDATE file is about
7.2 GB gzipped, so I don't think we'll be running into many small files
when it's split into 24 hours. But thanks for the reminder about 'small
files'.

I guess I'd rather just reload the 2500 logs or so...than have to jump
through hoops...
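
(If you do go the reload route with a dynamic partition insert, as suggested
below, it needs these switches set first. A minimal sketch, with an
illustrative partition limit:)

hive> SET hive.exec.dynamic.partition=true;
hive> SET hive.exec.dynamic.partition.mode=nonstrict;
hive> SET hive.exec.max.dynamic.partitions=10000;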

On Tue, Jun 19, 2012 at 10:45 AM, Mark Grover  wrote:

> I agree.
> It would be best if you wrote a script that iterates through each
> leaf-level partition of your existing table (logdate='A', source='B',
> datacenter='C', hostname='D')
> and populate new leaf-level partitions in the new table (logdate='A',
> source='B', datacenter='C', hostname='D', loghour).
>
> By leaf-level partition, I am referring to the partition column that
> appears last in the list of partition columns.
>
> Technically, you can do dynamic partitioning in 1 query with something
> like:
> FROM test_table src
> INSERT OVERWRITE TABLE dest PARTITION(logdate, source, datacenter,
> hostname, loghour)
>   SELECT ts, exec_time, domain_id, domain_name, logdate, source,
> datacenter, hostname, loghour;
>
> However, as far as I understand, you wouldn't be able to take advantage of
> the existing partitioning in your source table (test_table). If you would
> like to take advantage of existing partitioning, you would have to issue a
> series of queries like this:
> FROM test_table src
> INSERT OVERWRITE TABLE dest PARTITION(logdate='A', source='B',
> datacenter='C', hostname='D', loghour)
>   SELECT ts, exec_time, domain_id, domain_name, loghour
>   WHERE logdate='A' AND source='B' AND datacenter='C' AND hostname='D';
>
> Just as a side note, whenever you think of your table partitioning, keep
> in mind to not overdo it. Creating more partitions could lead to a lot of
> small files on HDFS, which reduces the performance of your Hadoop cluster. A
> couple of people have talked about this small files problem:
>
> http://arunxjacob.blogspot.ca/2011/04/hdfs-file-size-vs-allocation-other.html
> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>
> http://blog.rapleaf.com/dev/2008/11/20/give-me-liberty-or-give-me-death-but-dont-give-me-small-files/
>
> In general, you would like your file sizes to be at least the HDFS block
> size, most likely a small multiple of the block size. If you do find that
> you are running into the small files problem, there are other ways to get
> around like bucketing.
>
> Good luck!
> Mark
>
> ----- Original Message -----
> From: "Edward Capriolo" 
> To: user@hive.apache.org
> Sent: Tuesday, June 19, 2012 11:12:48 AM
> Subject: Re: Please HELP: HIVE alter table add new partition to schema...
>
> You can not change the partition columns. I would use a dynamic
> partition insert to select all the data from the original table into
> the new table.
>
> On 6/19/12, Tim Havens  wrote:
> > So...I have a table that has thousands of files and billions of rows
> > related to it.
> >
> > [rest of the original message quoted verbatim above; snipped]
>





Re: How to override a property within a hive query

2012-07-02 Thread Tim Havens
Before your query, type: SET <property>=<value>;
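
For example (the property and query are illustrative; substitute your own):

hive> SET mapred.reduce.tasks=32;
hive> SELECT a.id, b.name FROM a JOIN b ON (a.id = b.id);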

On Mon, Jul 2, 2012 at 7:13 PM, Abhishek  wrote:

> Hi all,
>
> Can I know how to override a property of mapred-site.xml in a Hive join
> query?
> Suppose for a map reduce job we override using -D <property>=<value>; how
> do we do it within a hive query?
>
> Regards
> Abhishek.
>
>




Re: How to override a property within a hive query

2012-07-03 Thread Tim Havens
What are you trying to accomplish that a method like this won't work for?

On Mon, Jul 2, 2012 at 10:25 PM, Abhishek  wrote:

> Hi Tim,
>
> Is this the only way, or do we have other ways?
>
> Sent from my iPhone
>
> On Jul 2, 2012, at 8:49 PM, Tim Havens  wrote:
>
> Before your query, type: SET <property>=<value>;
>
> On Mon, Jul 2, 2012 at 7:13 PM, Abhishek wrote:
>
>> Hi all,
>>
>> Can I know how to override a property of mapred-site.xml in a Hive join
>> query?
>> Suppose for a map reduce job we override using -D <property>=<value>;
>> how do we do it within a hive query?
>>
>> Regards
>> Abhishek.
>>
>>
>>
>
>
>
>




Re: How to list all views

2012-07-05 Thread Tim Havens
There may be a way to select table names from the metastore...if so, name
your views like *_view and select for those table names.

Of course, if you can query the metastore, there's probably an even better
way where you don't have to name your views anything special, and can just
search for a table type or some other descriptor.

I've not really poked around in the metastore much ... but that's probably
where Hive would have to look anyway.

Not sure if there are any built-in commands for selecting data from the
metastore directly like this...other than things like 'show tables'.
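
(A sketch of what that metastore query might look like, assuming a
MySQL-backed metastore with the standard schema, where the TBLS table
carries a TBL_TYPE column:

SELECT TBL_NAME FROM TBLS WHERE TBL_TYPE = 'VIRTUAL_VIEW';

Run that against the metastore database itself, not through Hive.)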


Tim

On Thu, Jul 5, 2012 at 10:15 AM, Stephen R. Scaffidi <
sscaff...@tripadvisor.com> wrote:

>
> Thank you, but what I need is to list only the views, or conversely, only
> the actual tables.
>
>
> On 07/05/2012 11:14 AM, Bejoy KS wrote:
>
>> Hi Stephen
>>
>> You can see the views as well, along with the tables, using the
>> "show tables;" command.
>> --Original Message--
>> From: Stephen R. Scaffidi
>> To: user@hive.apache.org
>> ReplyTo: user@hive.apache.org
>> ReplyTo: sscaff...@tripadvisor.com
>> Subject: How to list all views
>> Sent: Jul 5, 2012 20:38
>>
>> How can I list all the views in hive? I can't seem to figure out how to
>> do it with HQL.
>>
>>
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>>
>>
>
>




Re: Long running Join Query - Reduce task fails due to failing to report status

2012-08-24 Thread Tim Havens
Just curious if you've tried using Hive's EXPLAIN command to see what IT
thinks of your query.
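
For example (a sketch; the tables and join are stand-ins for your real query):

hive> EXPLAIN EXTENDED
    > SELECT a.id, b.val
    > FROM a LEFT OUTER JOIN b ON (a.id = b.id);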


On Fri, Aug 24, 2012 at 9:36 AM, Himanish Kushary wrote:

> Hi,
>
> We have a complex query that involves several left outer joins, resulting
> in 8 M/R jobs in Hive. During execution of one of the stages (after three
> M/R jobs have run), the M/R job fails due to a few reduce tasks failing due
> to inactivity.
>
> Most of the reduce tasks go through fine (within 3 mins), but the last one
> gets stuck for a long time (> 1 hour) and finally, after several attempts,
> gets killed due to "failed to report status for 600 seconds. Killing!"
>
> What may be causing this issue? Would hive.script.auto.progress help in
> this case? As we are not able to get much information from the log files,
> how may we approach resolving this? Will tweaking any specific M/R
> parameters help?
>
> The task attempt log shows several lines like this before exiting :
>
> 2012-08-23 19:17:23,848 INFO ExecReducer: ExecReducer: processing 21900 
> rows: used memory = 408582240
> 2012-08-23 19:17:30,189 INFO ExecReducer: ExecReducer: processing 22000 
> rows: used memory = 346110400
> 2012-08-23 19:17:37,510 INFO ExecReducer: ExecReducer: processing 22100 
> rows: used memory = 583913576
> 2012-08-23 19:17:44,829 INFO ExecReducer: ExecReducer: processing 22200 
> rows: used memory = 513071504
> 2012-08-23 19:17:47,923 INFO org.apache.hadoop.mapred.FileInputFormat: Total 
> input paths to process : 1
>
> Here are the reduce task counters:
>
> Map-Reduce Framework:
>   Combine input records                0
>   Combine output records               0
>   Reduce input groups                  222,480,335
>   Reduce shuffle bytes                 7,726,141,897
>   Reduce input records                 222,480,335
>   Reduce output records                0
>   Spilled Records                      355,827,191
>   CPU time spent (ms)                  2,152,160
>   Physical memory (bytes) snapshot     1,182,490,624
>   Virtual memory (bytes) snapshot      1,694,531,584
>   Total committed heap usage (bytes)   990,052,352
>
> The tasktracker log gives a thread dump at that time but no exception.
>
> 2012-08-23 20:05:49,319 INFO org.apache.hadoop.mapred.TaskTracker:
> Process Thread Dump: lost task
> 69 active threads
>
> ---
> Thanks & Regards
> Himanish
>





Re: Count bug in Hive 3.0.0.3.1

2020-04-28 Thread Tim Havens
Unsubscribe

On Tue, Apr 28, 2020, 1:23 AM Deepak Krishna  wrote:

> Hi team,
>
> We came across a bug related to count function. We are using hive
> 3.0.0.3.1 with Tez 0.9.0.3.1. PFA the queries to replicate the issue.
>
> Please register this as a bug and let us know if we can support in anyway
> to fix the issue. It would also be helpful to know if there are any other
> workarounds for this issue.
>
> Thanks and Regards,
> Deepak Krishna
>
>
> Deepak Krishna
> Big Data Engineer
> solute GmbH | Zeppelinstraße 15 | 76185 Karlsruhe | Germany
>