orc table with sorted field

2015-10-13 Thread Patcharee Thongtra
Hi, how can I create a partitioned ORC table with sorted field(s)? I tried to use the SORTED BY keyword, but it failed with a parse exception: CREATE TABLE peoplesort (name string, age int) partition by (bddate int) SORTED BY (age) stored as orc. Is it possible to have some sorted columns? From the Hive DDL
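Hive only accepts SORTED BY as part of a bucketing clause, and the partition keyword is PARTITIONED BY rather than "partition by". A minimal sketch of DDL that parses (the bucket count of 4 is illustrative, not from the thread):

    -- SORTED BY is only valid inside CLUSTERED BY ... INTO n BUCKETS
    CREATE TABLE peoplesort (name string, age int)
    PARTITIONED BY (bddate int)
    CLUSTERED BY (age) SORTED BY (age) INTO 4 BUCKETS
    STORED AS ORC;

On the Hive 1.x releases current at the time, inserts honour the declared order only if hive.enforce.sorting=true (and hive.enforce.bucketing=true) is set first.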

Hive Storage Handler to replace HDFS with custom store

2015-10-13 Thread Amey Barve
Hi all, Hive by default queries data on HDFS. If I implement a Hive storage handler for my store with 1. an input format, 2. an output format, 3. an AbstractSerDe, Hive queries will then store and retrieve data from my store instead of HDFS. Is this understanding correct, or are other changes needed in the Hive code? If
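That matches how storage handlers are normally used: the three pieces are packaged behind an implementation of org.apache.hadoop.hive.ql.metadata.HiveStorageHandler, and the table is declared with STORED BY. A hedged sketch of the DDL side, where the handler class and table property are hypothetical:

    -- 'com.example.MyStorageHandler' is a hypothetical class implementing
    -- HiveStorageHandler, which supplies the InputFormat, OutputFormat and SerDe.
    CREATE TABLE my_store_table (k string, v string)
    STORED BY 'com.example.MyStorageHandler'
    TBLPROPERTIES ('my.store.endpoint' = 'localhost:9090');  -- hypothetical property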

Container is running beyond physical memory limits

2015-10-13 Thread Mich Talebzadeh
Hi, I have been having some issues with loading data into Hive from one table to another for 1,767,886 rows. I was getting the following error. Task with the most failures(4): - Task ID: task_1444731612741_0001_r_00 URL:

Re: Container is running beyond physical memory limits

2015-10-13 Thread Gopal Vijayaraghavan
> is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB > physical memory used; 6.6 GB of 8 GB virtual memory used. Killing > container. You need to set yarn.nodemanager.vmem-check-enabled=false on *every* machine in your cluster & restart all NodeManagers. The VMEM
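Gopal's fix is a yarn-site.xml change that has to be rolled out to every NodeManager. A per-session alternative (an assumption, not something suggested in the thread) is to request containers large enough that real usage stays under the physical-memory cap:

    -- Assumption: the job is killed on the pmem check (2.0 GB of 2 GB used).
    SET mapreduce.map.memory.mb=4096;
    SET mapreduce.map.java.opts=-Xmx3276m;     -- heap ~80% of the container, a common rule of thumb
    SET mapreduce.reduce.memory.mb=4096;
    SET mapreduce.reduce.java.opts=-Xmx3276m;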

Re: Container is running beyond physical memory limits

2015-10-13 Thread hadoop hive
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/ On Wed, Oct 14, 2015 at 1:42 AM, Mich Talebzadeh wrote: > Thank you all. Hi Gopal, my understanding is that the parameter below specifies the max size of 4 GB > for each container. That

Re: Two Tables Join (One Big table and other 1gb size table)

2015-10-13 Thread Gopal Vijayaraghavan
> I tried doing stream table, but ran for long time like 3 hrs: Looks > like only 1 reducer is working on it ... > on (trim(p.pid)=trim(c.p_id) and p.source='XYZ'); In case that's devolving to a cross-product, it might be a miss in pushing down the trim() to the TableScan. Are you using
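One way to stop relying on the trim() pushdown, sketched here as an assumption rather than anything Gopal recommended, is to materialise the trimmed keys and the p.source filter in subqueries, so the join condition the optimizer sees is a plain equi-join:

    SELECT p1.*, c1.*
    FROM (SELECT p.*, trim(pid) AS jk FROM p WHERE source = 'XYZ') p1
    JOIN (SELECT c.*, trim(p_id) AS jk FROM c) c1
      ON (p1.jk = c1.jk);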

Re: Container is running beyond physical memory limits

2015-10-13 Thread Gopal Vijayaraghavan
> Now I am rather confused about the following parameters (for example > mapreduce.reduce versus > mapreduce.map) and their correlation to each other They have no relationship with each other. They are meant for two different task types in MapReduce. In general you run fewer reducers than
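Expressed as session settings, the two families are tuned independently; the values below are illustrative only:

    SET mapreduce.map.memory.mb=2048;     -- container size for each map task
    SET mapreduce.reduce.memory.mb=4096;  -- container size for each reduce task (typically fewer, larger)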

Re: Container is running beyond physical memory limits

2015-10-13 Thread Muni Chada
Reduce yarn.nodemanager.vmem-pmem-ratio to 2.1 or lower. On Tue, Oct 13, 2015 at 2:32 PM, hadoop hive wrote: > mapreduce.reduce.memory.mb is 4096; change this to 8 G. On Wed, Oct 14, 2015 at 12:52 AM, Ranjana Rajendran <ranjana.rajend...@gmail.com>

RE: Container is running beyond physical memory limits

2015-10-13 Thread Mich Talebzadeh
Thank you all. Hi Gopal, my understanding is that the parameter below specifies the max size of 4 GB for each container. That seems to work for me: mapreduce.map.memory.mb = 4096. Now I am rather confused about the following parameters (for example mapreduce.reduce versus

Re: Container is running beyond physical memory limits

2015-10-13 Thread Ranjana Rajendran
Here is Altiscale's documentation about the topic. Do let me know if you have any more questions. http://documentation.altiscale.com/heapsize-for-mappers-and-reducers On Tue, Oct 13, 2015 at 9:31 AM, Mich Talebzadeh wrote: > Hi, I have been having some issues with

Re: Container is running beyond physical memory limits

2015-10-13 Thread hadoop hive
mapreduce.reduce.memory.mb is 4096; change this to 8 G. On Wed, Oct 14, 2015 at 12:52 AM, Ranjana Rajendran <ranjana.rajend...@gmail.com> wrote: > Here is Altiscale's documentation about the topic. Do let me know if you have any more questions.
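As a session setting the suggested change is a one-liner; pairing it with a matching heap limit is an added assumption, since the JVM heap is configured separately from the container size YARN enforces:

    SET mapreduce.reduce.memory.mb=8192;
    SET mapreduce.reduce.java.opts=-Xmx6553m;  -- added assumption: keep heap ~80% of the container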

Two Tables Join (One Big table and other 1gb size table)

2015-10-13 Thread Kartik Eyan
Hi, I am trying to do an inner join on two tables, but it runs for a long time. Tab1: 100 GB; Tab2: 2 GB (a table partitioned on source). I tried doing stream table, but it ran for a long time, like 3 hrs; looks like only 1 reducer is working on it. I tried a map join by increasing the memory; it failed. Pls find
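With a dimension table this small relative to the fact table, a map-side join is the usual way to avoid the single-reducer bottleneck. A hedged sketch of the relevant knobs (values and key names are illustrative; the poster reports a map join already failed, so mapper memory must also be sized to hold the ~2 GB hash table):

    SET hive.auto.convert.join=true;
    SET hive.auto.convert.join.noconditionaltask=true;
    SET hive.auto.convert.join.noconditionaltask.size=2500000000;  -- bytes; must cover the small table
    SET hive.ignore.mapjoin.hint=false;  -- needed for the hint below to be honoured
    SELECT /*+ MAPJOIN(t2) */ t1.*
    FROM tab1 t1 JOIN tab2 t2 ON (t1.pkey = t2.pkey);  -- pkey is hypothetical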

GC Overhead

2015-10-13 Thread Gary Clark
Hello, I am seeing the below. I did have heap errors before, so I increased the Java heap memory. Not sure how to get rid of the below from the Hadoop logs; it's causing the job to hang and then fail: >>> Invoking Hive command line now >>> Heart beat 40164 [main] ERROR
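A GC-overhead error normally means the task heap is nearly full and the JVM spends most of its time collecting, so the heap itself has to grow, not just the container. A hedged sketch for the MapReduce engine, with illustrative values:

    -- Keep -Xmx below the container size, or YARN will kill the
    -- container for exceeding physical memory instead.
    SET mapreduce.map.memory.mb=4096;
    SET mapreduce.map.java.opts=-Xmx3276m;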

RE: Container is running beyond physical memory limits

2015-10-13 Thread Mich Talebzadeh
Thank you. Very helpful. Mich Talebzadeh

RE: Container is running beyond physical memory limits

2015-10-13 Thread Mich Talebzadeh
Many thanks Gopal. Mich Talebzadeh