Re: Hive Cli ORC table read error with limit option

2016-04-18 Thread Biswajit Nayak
Thanks, Prasanth, for the update. I will test it and report the outcome here.

Thanks
Biswa

On Tue, Apr 19, 2016 at 6:26 AM, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

> Hi Biswajit
>
> You might need patch from https://issues.apache.org/jira/browse/HIVE-11546
>
> Can you apply this patch to your hive build and see if it solves the
> issue? (recommended)
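For reference, a rough sketch of what applying such a patch usually looks
like, assuming a Maven source build of Hive 1.2 and a patch file downloaded
from the JIRA (the actual attachment name on HIVE-11546 will differ):

    cd hive-1.2.0-src
    patch -p1 < HIVE-11546.patch         # use -p0 if the diff is not git-style
    mvn clean package -DskipTests -Pdist
    # then roll the rebuilt hive jars out to the cluster and retry the query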
>
> Alternatively, you can use "hive.exec.orc.split.strategy"="BI" as a
> workaround.
> It is highly discouraged to use this config, as it will disable split
> elimination and may generate sub-optimal splits, resulting in less
> map-side parallelism.
> This config is provided only as a workaround and is suitable when all orc
> files are small (
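If the workaround route is taken, a minimal sketch of applying it only to the
affected query, so the setting does not leak into other workloads (table and
partition values are the ones from later in this thread):

    hive -e "
      set hive.exec.orc.split.strategy=BI;
      select h from testdb.table_orc where year = 2016 and month = 1 and day = 29 limit 10;
    "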
> Thanks
> Prasanth
>
>
> On Apr 18, 2016, at 7:44 PM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
> Hi All,
>
> I seriously need help on this aspect. Any reference or pointer to
> troubleshoot or fix this would be helpful.
>
> Regards
> Biswa
>
> On Fri, Mar 25, 2016 at 11:24 PM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
>> Prasanth,
>>
>> Apologies for the delay in response.
>>
>> Below is the orcfiledump of the empty orc file from a broken partition.
>>
>> *$ hive --orcfiledump /hive/*testdb*.db/*table_orc
>> */year=2016/month=1/day=29/00_0*
>> *Structure for  /hive/*testdb*.db/*table_orc
>> */year=2016/month=1/day=29/00_0*
>> *File Version: 0.12 with HIVE_8732*
>> *16/03/25 17:49:09 INFO orc.ReaderImpl: Reading ORC rows from  /hive/*
>> testdb*.db/*table_orc*/year=2016/month=1/day=29/00_0 with {include:
>> null, offset: 0, length: 9223372036854775807}*
>> *16/03/25 17:49:09 INFO orc.RecordReaderFactory: Schema is not specified
>> on read. Using file schema.*
>> *Rows: 0*
>> *Compression: SNAPPY*
>> *Compression size: 262144*
>> *Type: struct<>*
>>
>> *Stripe Statistics:*
>>
>> *File Statistics:*
>> *  Column 0: count: 0 hasNull: false*
>>
>> *Stripes:*
>>
>> *File length: 49 bytes*
>> *Padding length: 0 bytes*
>> *Padding ratio: 0%*
>> *$ *
>>
>>
>> I am still not able to figure out what's causing this odd behaviour.
>>
>>
>> Regards
>> Biswa
>>
>> On Thu, Mar 10, 2016 at 3:12 PM, Prasanth Jayachandran <
>> pjayachand...@hortonworks.com> wrote:
>>
>>> Alternatively, you can send the orcfiledump output for the empty orc file
>>> from a broken partition.
>>>
>>> Thanks
>>> Prasanth
>>>
>>> On Mar 10, 2016, at 5:11 PM, Prasanth Jayachandran <
>>> pjayachand...@hortonworks.com> wrote:
>>>
>>> Could you attach the empty orc files from one of the broken partitions
>>> somewhere? I can run some tests on them to see why it's happening.
>>>
>>> Thanks
>>> Prasanth
>>>
>>> On Mar 8, 2016, at 12:02 AM, Biswajit Nayak <biswa...@altiscale.com>
>>> wrote:
>>>
>>> Both the parameters are set to false by default.
>>>
>>> *hive> set hive.optimize.index.filter;*
>>> *hive.optimize.index.filter=false*
>>> *hive> set hive.orc.splits.include.file.footer;*
>>> *hive.orc.splits.include.file.footer=false*
>>> *hive> *
>>>
>>> >>> I suspect this might be related to having 0 row files in the buckets
>>> not having any recorded schema.
>>>
>>> Yes, there are a few files with 0 rows, but the query works with other
>>> partitions (which also have 0-row files). Out of 30 partitions (for a
>>> month), 3-4 partitions are having this issue. Even a reload of the data
>>> does not help. The query works fine in MR, but fails in Tez.
>>>
>>>
>>>
>>> On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan <gop...@apache.org>
>>> wrote:
>>>
>>>>
>>>> > cvarchar(2)
>>>> ...
>>>> > Num Buckets: 7
>>>>
>>>> I suspect this might be related to having 0 row files in the buckets not
>>>> having any recorded schema.
>>>>
>>>> You can also experiment with hive.optimize.index.filter=false, to see if
>>>> the zero row case is artificially produced via predicate push-down.
>>>>
>>>>
>>>> That shouldn't be a problem unless you've turned on
>>>> hive.orc.splits.include.file.footer=true (recommended to be false).
>>>>
>>>> Your row-locations don't actually match any Apache source jar in my
>>>> builds, are there any other patches to consider?
>>>>
>>>> Cheers,
>>>> Gopal
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>


Re: Hive Cli ORC table read error with limit option

2016-04-18 Thread Biswajit Nayak
Hi All,

I seriously need help on this aspect. Any reference or pointer to
troubleshoot or fix this would be helpful.

Regards
Biswa

On Fri, Mar 25, 2016 at 11:24 PM, Biswajit Nayak <biswa...@altiscale.com>
wrote:

> Prasanth,
>
> Apologies for the delay in response.
>
> Below is the orcfiledump of the empty orc file from a broken partition.
>
> *$ hive --orcfiledump /hive/*testdb*.db/*table_orc
> */year=2016/month=1/day=29/00_0*
>
> *Structure for  /hive/*testdb*.db/*table_orc
> */year=2016/month=1/day=29/00_0*
>
> *File Version: 0.12 with HIVE_8732*
>
> *16/03/25 17:49:09 INFO orc.ReaderImpl: Reading ORC rows from  /hive/*
> testdb*.db/*table_orc*/year=2016/month=1/day=29/00_0 with {include:
> null, offset: 0, length: 9223372036854775807}*
>
> *16/03/25 17:49:09 INFO orc.RecordReaderFactory: Schema is not specified
> on read. Using file schema.*
>
> *Rows: 0*
>
> *Compression: SNAPPY*
>
> *Compression size: 262144*
>
> *Type: struct<>*
>
>
> *Stripe Statistics:*
>
>
> *File Statistics:*
>
> *  Column 0: count: 0 hasNull: false*
>
>
> *Stripes:*
>
>
> *File length: 49 bytes*
>
> *Padding length: 0 bytes*
>
> *Padding ratio: 0%*
>
> *$ *
>
>
> I am still not able to figure out what's causing this odd behaviour.
>
>
> Regards
> Biswa
>
> On Thu, Mar 10, 2016 at 3:12 PM, Prasanth Jayachandran <
> pjayachand...@hortonworks.com> wrote:
>
>> Alternatively, you can send the orcfiledump output for the empty orc file
>> from a broken partition.
>>
>> Thanks
>> Prasanth
>>
>> On Mar 10, 2016, at 5:11 PM, Prasanth Jayachandran <
>> pjayachand...@hortonworks.com> wrote:
>>
>> Could you attach the empty orc files from one of the broken partitions
>> somewhere? I can run some tests on them to see why it's happening.
>>
>> Thanks
>> Prasanth
>>
>> On Mar 8, 2016, at 12:02 AM, Biswajit Nayak <biswa...@altiscale.com>
>> wrote:
>>
>> Both the parameters are set to false by default.
>>
>> *hive> set hive.optimize.index.filter;*
>> *hive.optimize.index.filter=false*
>> *hive> set hive.orc.splits.include.file.footer;*
>> *hive.orc.splits.include.file.footer=false*
>> *hive> *
>>
>> >>> I suspect this might be related to having 0 row files in the buckets
>> not having any recorded schema.
>>
>> Yes, there are a few files with 0 rows, but the query works with other
>> partitions (which also have 0-row files). Out of 30 partitions (for a
>> month), 3-4 partitions are having this issue. Even a reload of the data
>> does not help. The query works fine in MR, but fails in Tez.
>>
>>
>>
>> On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan <gop...@apache.org>
>> wrote:
>>
>>>
>>> > cvarchar(2)
>>> ...
>>> > Num Buckets: 7
>>>
>>> I suspect this might be related to having 0 row files in the buckets not
>>> having any recorded schema.
>>>
>>> You can also experiment with hive.optimize.index.filter=false, to see if
>>> the zero row case is artificially produced via predicate push-down.
>>>
>>>
>>> That shouldn't be a problem unless you've turned on
>>> hive.orc.splits.include.file.footer=true (recommended to be false).
>>>
>>> Your row-locations don't actually match any Apache source jar in my
>>> builds, are there any other patches to consider?
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>
>>
>>
>


Re: Hive Cli ORC table read error with limit option

2016-03-25 Thread Biswajit Nayak
Prasanth,

Apologies for the delay in response.

Below is the orcfiledump of the empty orc file from a broken partition.

*$ hive --orcfiledump /hive/*testdb*.db/*table_orc
*/year=2016/month=1/day=29/00_0*

*Structure for  /hive/*testdb*.db/*table_orc
*/year=2016/month=1/day=29/00_0*

*File Version: 0.12 with HIVE_8732*

*16/03/25 17:49:09 INFO orc.ReaderImpl: Reading ORC rows from  /hive/*testdb
*.db/*table_orc*/year=2016/month=1/day=29/00_0 with {include: null,
offset: 0, length: 9223372036854775807}*

*16/03/25 17:49:09 INFO orc.RecordReaderFactory: Schema is not specified on
read. Using file schema.*

*Rows: 0*

*Compression: SNAPPY*

*Compression size: 262144*

*Type: struct<>*


*Stripe Statistics:*


*File Statistics:*

*  Column 0: count: 0 hasNull: false*


*Stripes:*


*File length: 49 bytes*

*Padding length: 0 bytes*

*Padding ratio: 0%*

*$ *


I am still not able to figure out what's causing this odd behaviour.


Regards
Biswa

On Thu, Mar 10, 2016 at 3:12 PM, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

> Alternatively, you can send the orcfiledump output for the empty orc file
> from a broken partition.
>
> Thanks
> Prasanth
>
> On Mar 10, 2016, at 5:11 PM, Prasanth Jayachandran <
> pjayachand...@hortonworks.com> wrote:
>
> Could you attach the empty orc files from one of the broken partitions
> somewhere? I can run some tests on them to see why it's happening.
>
> Thanks
> Prasanth
>
> On Mar 8, 2016, at 12:02 AM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
> Both the parameters are set to false by default.
>
> *hive> set hive.optimize.index.filter;*
> *hive.optimize.index.filter=false*
> *hive> set hive.orc.splits.include.file.footer;*
> *hive.orc.splits.include.file.footer=false*
> *hive> *
>
> >>> I suspect this might be related to having 0 row files in the buckets
> not having any recorded schema.
>
> Yes, there are a few files with 0 rows, but the query works with other
> partitions (which also have 0-row files). Out of 30 partitions (for a
> month), 3-4 partitions are having this issue. Even a reload of the data
> does not help. The query works fine in MR, but fails in Tez.
>
>
>
> On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan <gop...@apache.org>
> wrote:
>
>>
>> > cvarchar(2)
>> ...
>> > Num Buckets: 7
>>
>> I suspect this might be related to having 0 row files in the buckets not
>> having any recorded schema.
>>
>> You can also experiment with hive.optimize.index.filter=false, to see if
>> the zero row case is artificially produced via predicate push-down.
>>
>>
>> That shouldn't be a problem unless you've turned on
>> hive.orc.splits.include.file.footer=true (recommended to be false).
>>
>> Your row-locations don't actually match any Apache source jar in my
>> builds, are there any other patches to consider?
>>
>> Cheers,
>> Gopal
>>
>>
>>
>
>
>


Re: Hive Cli ORC table read error with limit option

2016-03-07 Thread Biswajit Nayak
Both the parameters are set to false by default.

*hive> set hive.optimize.index.filter;*

*hive.optimize.index.filter=false*

*hive> set hive.orc.splits.include.file.footer;*

*hive.orc.splits.include.file.footer=false*

*hive> *

>>> I suspect this might be related to having 0 row files in the buckets not
>>> having any recorded schema.

Yes, there are a few files with 0 rows, but the query works with other
partitions (which also have 0-row files). Out of 30 partitions (for a month),
3-4 partitions are having this issue. Even a reload of the data does not
help. The query works fine in MR, but fails in Tez.
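Since MR works, a hedged interim measure while the Tez path is broken is to
pin the engine per session (or per problematic query) until a fixed build is
in place, for example:

    hive -e "
      set hive.execution.engine=mr;
      select h from testdb.table_orc where year = 2016 and month = 1 and day = 29 limit 10;
    "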



On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan 
wrote:

>
> > cvarchar(2)
> ...
> > Num Buckets: 7
>
> I suspect this might be related to having 0 row files in the buckets not
> having any recorded schema.
>
> You can also experiment with hive.optimize.index.filter=false, to see if
> the zero row case is artificially produced via predicate push-down.
>
>
> That shouldn't be a problem unless you've turned on
> hive.orc.splits.include.file.footer=true (recommended to be false).
>
> Your row-locations don't actually match any Apache source jar in my
> builds, are there any other patches to consider?
>
> Cheers,
> Gopal
>
>
>


Re: Hive Cli ORC table read error with limit option

2016-03-06 Thread Biswajit Nayak
Hi Gopal,


I have already pasted the table format in this thread; I will repeat it here.


*hive> desc formatted *testdb.table_orc*;*

*OK*

*# col_name            data_type            comment*


*row_id                bigint*

*a                     int*

*b                     int*

*c                     varchar(2)*

*d                     bigint*

*e                     int*

*f                     bigint*

*g                     float*

*h                     int*

*i                     int*



*# Partition Information*

*# col_name            data_type            comment*


*year                  int*

*month                 int*

*day                   int*



*# Detailed Table Information*

*Database:*testdb

*Owner:   **

*CreateTime:  Mon Jan 25 22:32:22 UTC 2016  *

*LastAccessTime:  UNKNOWN   *

*Protect Mode:None  *

*Retention:   0 *

*Location:hdfs://***:8020/hive/*testdb*.db/table_orc
 *

*Table Type:  MANAGED_TABLE *

*Table Parameters:*

* last_modified_by **  *

* last_modified_time   **  *

* orc.compress SNAPPY  *

* transient_lastDdlTime 1454104669  *



*# Storage Information*

*SerDe Library:   org.apache.hadoop.hive.ql.io.orc.OrcSerde  *

*InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  *

*OutputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat  *

*Compressed:  No*

*Num Buckets: 7 *

*Bucket Columns:  [f]*

*Sort Columns:[]*

*Storage Desc Params:*

* field.delim  \t  *

* serialization.format \t  *

*Time taken: 0.105 seconds, Fetched: 46 row(s)*

*hive> *


>>>Depends on whether any of those columns are partition columns or not &
>>>whether the table is marked transactional.

Yes, those columns are partition columns, and the table is not marked as
transactional.


>>>Usually that and a copy of --orcfiledump output to check the
offsets/types.

There are around 10 files, so copying all of the orcfiledump output would be
a mess here. Is there any way to find the defective file, so that I can
isolate it and paste only its orcfiledump here?
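One way to narrow it down without pasting everything (a sketch, assuming the
partition layout shown above; adjust the path as needed) is to loop
orcfiledump over the partition and print only the row count and type of each
file; files reporting "Rows: 0" together with "Type: struct<>" are the likely
culprits:

    for f in $(hadoop fs -ls /hive/testdb.db/table_orc/year=2016/month=1/day=29/ \
               | awk '{print $8}' | grep -v '^$'); do
      echo "== $f"
      hive --orcfiledump "$f" 2>/dev/null | egrep 'Rows:|Type:'
    done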

Thanks
Biswa


On Sat, Mar 5, 2016 at 12:21 AM, Gopal Vijayaraghavan 
wrote:

>
> > Any one has any idea about this.. Really stuck with this.
> ...
> > hive> select h from testdb.table_orc where year = 2016 and month =1 and
> >day >29 limit 10;
>
> Depends on whether any of those columns are partition columns or not &
> whether the table is marked transactional.
>
> > Caused by: java.lang.IndexOutOfBoundsException: Index: 0
> > at java.util.Collections$EmptyList.get(Collections.java:3212)
> > at
> >org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:1
> >2240)
>
> If you need answers to rare problems, these emails need at least the table
> format ("desc formatted").
>
>
> Usually that and a copy of --orcfiledump output to check the offsets/types.
>
> Cheers,
> Gopal
>
>
>


Re: Sqoop Hcat Int partition error

2016-03-04 Thread Biswajit Nayak
Has anyone seen this?

On Tue, Mar 1, 2016 at 11:07 AM, Biswajit Nayak <biswa...@altiscale.com>
wrote:

> The fix in https://issues.apache.org/jira/browse/HIVE-7164 does not
> work.
>
> On Tue, Mar 1, 2016 at 10:51 AM, Richa Sharma <mailtorichasha...@gmail.com
> > wrote:
>
>> Great!
>>
>> So what is the interim fix you are implementing?
>>
>> Richa
>> On Mar 1, 2016 4:06 PM, "Biswajit Nayak" <biswa...@altiscale.com> wrote:
>>
>>> Thanks Richa.
>>>
>>> The issue was supposed to be fixed in Hive 0.12 as per the jira
>>> https://issues.apache.org/jira/browse/HIVE-7164.
>>>
>>> I have also raised a ticket in the Sqoop jira [SQOOP-2840] for this.
>>>
>>> Thanks
>>> Biswa
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Mar 1, 2016 at 9:56 AM, Richa Sharma <
>>> mailtorichasha...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> The values should still persist if partition column data type in Hive
>>>> is a string.
>>>>
>>>> I am checking HCatalog documentation for support of int data type in
>>>> partition column.
>>>>
>>>> Cheers
>>>> Richa
>>>>
>>>> On Tue, Mar 1, 2016 at 3:06 PM, Biswajit Nayak <biswa...@altiscale.com>
>>>> wrote:
>>>>
>>>>> Hi Richa,
>>>>>
>>>>> That's a workaround, but how do I handle the columns with INT type?
>>>>> Changing the type would be the last option for me.
>>>>>
>>>>> Regards
>>>>> Biswa
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 1, 2016 at 9:31 AM, Richa Sharma <
>>>>> mailtorichasha...@gmail.com> wrote:
>>>>>
>>>>>> Hi Biswajit
>>>>>>
>>>>>> The answer is in the last line of the error message. Change the data
>>>>>> type of partition column to string in hive and try again.
>>>>>>
>>>>>> Hope it helps !
>>>>>>
>>>>>> Richa
>>>>>>
>>>>>> 16/02/12 08:04:12 ERROR tool.ExportTool: Encountered IOException running 
>>>>>> export job: java.io.IOException: The table provided default.emp_details1 
>>>>>> uses unsupported  partitioning key type  for column salary : int.  Only 
>>>>>> string fields are allowed in partition columns in Catalog
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 1, 2016 at 2:19 PM, Biswajit Nayak <
>>>>>> biswa...@altiscale.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I am trying to do a SQOOP export from hive( integer type partition)
>>>>>>> to mysql through HCAT and it fails with the following error.
>>>>>>>
>>>>>>> Versions:-
>>>>>>>
>>>>>>> Hadoop :-  2.7.1
>>>>>>> Hive  :-  1.2.0
>>>>>>> Sqoop   :-  1.4.5
>>>>>>>
>>>>>>> Table in Hive :-
>>>>>>>
>>>>>>>
>>>>>>> hive> use default;
>>>>>>> OK
>>>>>>> Time taken: 0.028 seconds
>>>>>>> hive> describe emp_details1;
>>>>>>> OK
>>>>>>> id  int
>>>>>>> namestring
>>>>>>> deg string
>>>>>>> deptstring
>>>>>>> salary  int
>>>>>>>
>>>>>>> # Partition Information
>>>>>>> # col_name  data_type   comment
>>>>>>>
>>>>>>> salary  int
>>>>>>> Time taken: 0.125 seconds, Fetched: 10 row(s)
>>>>>>> hive>
>>>>>>>
>>>>>>> hive> select * from emp_details1;
>>>>>>> OK
>>>>>>> 1201gopal   5
>>>>>>> 1202manisha 5
>>>>>>> 1203kalil   5
>>>>>>> 1204prasanth5
>>>>>>> 1205kranthi 5
>>>>>>> 

Re: Hive Cli ORC table read error with limit option

2016-03-04 Thread Biswajit Nayak
Does anyone have any idea about this? I am really stuck with it.

On Tue, Mar 1, 2016 at 4:09 PM, Biswajit Nayak <biswa...@altiscale.com>
wrote:

> Hi,
>
> It works for MR engine, while in TEZ it fails.
>
> *hive> set hive.execution.engine=tez;*
>
> *hive> set hive.fetch.task.conversion=none;*
>
> *hive> select h from test*db.table_orc* where year = 2016 and month =1
> and day >29 limit 10;*
>
> *Query ID = 26f9a510-c10c-475c-9988-081998b66b0c*
>
> *Total jobs = 1*
>
> *Launching Job 1 out of 1*
>
>
>
> *Status: Running (Executing on YARN cluster with App id
> application_1456379707708_1135)*
>
>
>
> **
>
> *VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
> KILLED*
>
>
> **
>
> *Map 1 FAILED -1  00   -1   0
>   0*
>
>
> **
>
> *VERTICES: 00/01  [>>--] 0%ELAPSED TIME: 0.37
> s *
>
>
> **
>
> *Status: Failed*
>
> *Vertex failed, vertexName=Map 1, vertexId=vertex_1456379707708_1135_1_00,
> diagnostics=[Vertex vertex_1456379707708_1135_1_00 [Map 1] killed/failed
> due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t*able_orc* initializer
> failed, vertex=vertex_1456379707708_1135_1_00 [Map 1],
> java.lang.RuntimeException: serious problem*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)*
>
> * at
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)*
>
> * at
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)*
>
> * at
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:131)*
>
> * at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)*
>
> * at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)*
>
> * at java.security.AccessController.doPrivileged(Native Method)*
>
> * at javax.security.auth.Subject.doAs(Subject.java:415)*
>
> * at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)*
>
> * at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)*
>
> * at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)*
>
> * at java.util.concurrent.FutureTask.run(FutureTask.java:262)*
>
> * at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)*
>
> * at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)*
>
> * at java.lang.Thread.run(Thread.java:744)*
>
> *Caused by: java.util.concurrent.ExecutionException:
> java.lang.IndexOutOfBoundsException: Index: 0*
>
> * at java.util.concurrent.FutureTask.report(FutureTask.java:122)*
>
> * at java.util.concurrent.FutureTask.get(FutureTask.java:188)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)*
>
> * ... 15 more*
>
> *Caused by: java.lang.IndexOutOfBoundsException: Index: 0*
>
> * at java.util.Collections$EmptyList.get(Collections.java:3212)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)*
>
> * at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)*
>
> * ... 4 more*
>
> *]*
>
> *DAG did not succeed due to VERTEX_FAILURE. failedVertices:1
> killedVertices:0*
>
> *FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map
> 1, vertexId=vertex_1456379707708_1135_1_00, 

Re: Hive Cli ORC table read error with limit option

2016-03-01 Thread Biswajit Nayak
izerCallable$1.run(RootInputInitializerManager.java:245)*

* at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)*

* at java.security.AccessController.doPrivileged(Native Method)*

* at javax.security.auth.Subject.doAs(Subject.java:415)*

* at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)*

* at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)*

* at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)*

* at java.util.concurrent.FutureTask.run(FutureTask.java:262)*

* at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)*

* at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)*

* at java.lang.Thread.run(Thread.java:744)*

*Caused by: java.util.concurrent.ExecutionException:
java.lang.IndexOutOfBoundsException: Index: 0*

* at java.util.concurrent.FutureTask.report(FutureTask.java:122)*

* at java.util.concurrent.FutureTask.get(FutureTask.java:188)*

* at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)*

* ... 15 more*

*Caused by: java.lang.IndexOutOfBoundsException: Index: 0*

* at java.util.Collections$EmptyList.get(Collections.java:3212)*

* at
org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)*

* at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)*

* at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)*

* at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)*

* at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)*

* at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)*

* ... 4 more*

*]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1
killedVertices:0*

*hive> *


On Tue, Mar 1, 2016 at 1:09 PM, Biswajit Nayak <biswa...@altiscale.com>
wrote:

> Gopal,
>
> Any plan to provide the fix for Hive 1.x versions or to backport it?
>
> Regards
> Biswa
>
> On Tue, Mar 1, 2016 at 11:44 AM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
>> Thanks, Gopal, for the details. Happy to know it has been accounted for
>> and fixed.
>>
>> Biswa
>>
>>
>> On Tue, Mar 1, 2016 at 11:37 AM, Gopal Vijayaraghavan <gop...@apache.org>
>> wrote:
>>
>>>
>>> > Yes it is kerberos cluster.
>>> ...
>>> > After disabling the optimization in hive cli, it works with limit
>>> >option.
>>>
>>> Alright, then it is fixed in -
>>> https://issues.apache.org/jira/browse/HIVE-13120
>>>
>>>
>>> Cheers,
>>> Gopal
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Hive Cli ORC table read error with limit option

2016-02-29 Thread Biswajit Nayak
Gopal,

Any plan to provide the fix for Hive 1.x versions or to backport it?

Regards
Biswa

On Tue, Mar 1, 2016 at 11:44 AM, Biswajit Nayak <biswa...@altiscale.com>
wrote:

> Thanks, Gopal, for the details. Happy to know it has been accounted for
> and fixed.
>
> Biswa
>
>
> On Tue, Mar 1, 2016 at 11:37 AM, Gopal Vijayaraghavan <gop...@apache.org>
> wrote:
>
>>
>> > Yes it is kerberos cluster.
>> ...
>> > After disabling the optimization in hive cli, it works with limit
>> >option.
>>
>> Alright, then it is fixed in -
>> https://issues.apache.org/jira/browse/HIVE-13120
>>
>>
>> Cheers,
>> Gopal
>>
>>
>>
>>
>>
>>
>>
>


Re: Hive Cli ORC table read error with limit option

2016-02-29 Thread Biswajit Nayak
Thanks, Gopal, for the details. Happy to know it has been accounted for
and fixed.

Biswa


On Tue, Mar 1, 2016 at 11:37 AM, Gopal Vijayaraghavan 
wrote:

>
> > Yes it is kerberos cluster.
> ...
> > After disabling the optimization in hive cli, it works with limit
> >option.
>
> Alright, then it is fixed in -
> https://issues.apache.org/jira/browse/HIVE-13120
>
>
> Cheers,
> Gopal
>
>
>
>
>
>
>


Re: Hive Cli ORC table read error with limit option

2016-02-29 Thread Biswajit Nayak
Thanks Gopal for the response.

Yes it is kerberos cluster.

After disabling the optimization in the hive cli, the query works with the
limit option. Below are the "desc formatted" details of the table you asked for.


*hive> desc formatted *testdb.table_orc*;*

*OK*

*# col_name            data_type            comment*


*row_id                bigint*

*a                     int*

*b                     int*

*c                     varchar(2)*

*d                     bigint*

*e                     int*

*f                     bigint*

*g                     float*

*h                     int*

*i                     int*



*# Partition Information  *

*# col_name            data_type            comment*


*year                  int*

*month                 int*

*day                   int*



*# Detailed Table Information  *

*Database:   *testdb

*Owner:  *   *

*CreateTime: Mon Jan 25 22:32:22 UTC 2016  *

*LastAccessTime: UNKNOWN  *

*Protect Mode:   None *

*Retention:  0*

*Location:   hdfs://***:8020/hive/*testdb*.db/table_orc
 *

*Table Type: MANAGED_TABLE*

*Table Parameters:  *

* last_modified_by**  *

* last_modified_time  **  *

* orc.compressSNAPPY  *

* transient_lastDdlTime 1454104669  *



*# Storage Information  *

*SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde  *

*InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  *

*OutputFormat:   org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat  *

*Compressed: No   *

*Num Buckets:7*

*Bucket Columns: [f]   *

*Sort Columns:   []   *

*Storage Desc Params:  *

* field.delim \t  *

* serialization.format \t  *

*Time taken: 0.105 seconds, Fetched: 46 row(s)*

*hive> *



On Tue, Mar 1, 2016 at 10:55 AM, Gopal Vijayaraghavan 
wrote:

>
> > Failed with exception java.io.IOException:java.lang.RuntimeException:
> >serious problem
> > Time taken: 0.32 seconds
> ...
> > Any one faced this issue.
>
> No, but that sounds like one of the codepaths I put in - is this a
> Kerberos secure cluster?
>
> Try disabling the optimization and see if it works.
>
> set hive.fetch.task.conversion=none;
>
> If it does, reply back with "desc formatted " & I can help you
> debug deeper.
>
> Cheers,
> Gopal
>
>
>


Re: Sqoop Hcat Int partition error

2016-02-29 Thread Biswajit Nayak
The fix in https://issues.apache.org/jira/browse/HIVE-7164 does not
work.

On Tue, Mar 1, 2016 at 10:51 AM, Richa Sharma <mailtorichasha...@gmail.com>
wrote:

> Great!
>
> So what is the interim fix you are implementing?
>
> Richa
> On Mar 1, 2016 4:06 PM, "Biswajit Nayak" <biswa...@altiscale.com> wrote:
>
>> Thanks Richa.
>>
>> The issue was supposed to be fixed in Hive 0.12 as per the jira
>> https://issues.apache.org/jira/browse/HIVE-7164.
>>
>> I have also raised a ticket in the Sqoop jira [SQOOP-2840] for this.
>>
>> Thanks
>> Biswa
>>
>>
>>
>>
>>
>> On Tue, Mar 1, 2016 at 9:56 AM, Richa Sharma <mailtorichasha...@gmail.com
>> > wrote:
>>
>>> Hi,
>>>
>>> The values should still persist if partition column data type in Hive is
>>> a string.
>>>
>>> I am checking HCatalog documentation for support of int data type in
>>> partition column.
>>>
>>> Cheers
>>> Richa
>>>
>>> On Tue, Mar 1, 2016 at 3:06 PM, Biswajit Nayak <biswa...@altiscale.com>
>>> wrote:
>>>
>>>> Hi Richa,
>>>>
>>>> That's a workaround, but how do I handle the columns with INT type?
>>>> Changing the type would be the last option for me.
>>>>
>>>> Regards
>>>> Biswa
>>>>
>>>>
>>>>
>>>> On Tue, Mar 1, 2016 at 9:31 AM, Richa Sharma <
>>>> mailtorichasha...@gmail.com> wrote:
>>>>
>>>>> Hi Biswajit
>>>>>
>>>>> The answer is in the last line of the error message. Change the data
>>>>> type of partition column to string in hive and try again.
>>>>>
>>>>> Hope it helps !
>>>>>
>>>>> Richa
>>>>>
>>>>> 16/02/12 08:04:12 ERROR tool.ExportTool: Encountered IOException running 
>>>>> export job: java.io.IOException: The table provided default.emp_details1 
>>>>> uses unsupported  partitioning key type  for column salary : int.  Only 
>>>>> string fields are allowed in partition columns in Catalog
>>>>>
>>>>>
>>>>> On Tue, Mar 1, 2016 at 2:19 PM, Biswajit Nayak <biswa...@altiscale.com
>>>>> > wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I am trying to do a SQOOP export from hive( integer type partition)
>>>>>> to mysql through HCAT and it fails with the following error.
>>>>>>
>>>>>> Versions:-
>>>>>>
>>>>>> Hadoop :-  2.7.1
>>>>>> Hive  :-  1.2.0
>>>>>> Sqoop   :-  1.4.5
>>>>>>
>>>>>> Table in Hive :-
>>>>>>
>>>>>>
>>>>>> hive> use default;
>>>>>> OK
>>>>>> Time taken: 0.028 seconds
>>>>>> hive> describe emp_details1;
>>>>>> OK
>>>>>> id  int
>>>>>> namestring
>>>>>> deg string
>>>>>> deptstring
>>>>>> salary  int
>>>>>>
>>>>>> # Partition Information
>>>>>> # col_name  data_type   comment
>>>>>>
>>>>>> salary  int
>>>>>> Time taken: 0.125 seconds, Fetched: 10 row(s)
>>>>>> hive>
>>>>>>
>>>>>> hive> select * from emp_details1;
>>>>>> OK
>>>>>> 1201gopal   5
>>>>>> 1202manisha 5
>>>>>> 1203kalil   5
>>>>>> 1204prasanth5
>>>>>> 1205kranthi 5
>>>>>> 1206satish  5
>>>>>> Time taken: 0.195 seconds, Fetched: 6 row(s)
>>>>>> hive>
>>>>>>
>>>>>>
>>>>>> Conf added to Hive metastore site.xml
>>>>>>
>>>>>>
>>>>>> [alti-test-01@hdpnightly271-ci-91-services ~]$ grep -A5 -B2 -i 
>>>>>> "hive.metastore.integral.jdo.pushdown" /etc/hive-metastore/hive-site.xml
>>>>>> 
>>>>>> 
>>>>>

Re: Sqoop Hcat Int partition error

2016-02-29 Thread Biswajit Nayak
Thanks Richa.

The issue was supposed to be fixed in Hive 0.12 as per the jira
https://issues.apache.org/jira/browse/HIVE-7164.

I have also raised a ticket in the Sqoop jira [SQOOP-2840] for this.

Thanks
Biswa





On Tue, Mar 1, 2016 at 9:56 AM, Richa Sharma <mailtorichasha...@gmail.com>
wrote:

> Hi,
>
> The values should still persist if partition column data type in Hive is a
> string.
>
> I am checking HCatalog documentation for support of int data type in
> partition column.
>
> Cheers
> Richa
>
> On Tue, Mar 1, 2016 at 3:06 PM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
>> Hi Richa,
>>
>> That's a workaround, but how do I handle the columns with INT type?
>> Changing the type would be the last option for me.
>>
>> Regards
>> Biswa
>>
>>
>>
>> On Tue, Mar 1, 2016 at 9:31 AM, Richa Sharma <mailtorichasha...@gmail.com
>> > wrote:
>>
>>> Hi Biswajit
>>>
>>> The answer is in the last line of the error message. Change the data
>>> type of partition column to string in hive and try again.
>>>
>>> Hope it helps !
>>>
>>> Richa
>>>
>>> 16/02/12 08:04:12 ERROR tool.ExportTool: Encountered IOException running 
>>> export job: java.io.IOException: The table provided default.emp_details1 
>>> uses unsupported  partitioning key type  for column salary : int.  Only 
>>> string fields are allowed in partition columns in Catalog
>>>
>>>
>>> On Tue, Mar 1, 2016 at 2:19 PM, Biswajit Nayak <biswa...@altiscale.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am trying to do a SQOOP export from hive( integer type partition) to
>>>> mysql through HCAT and it fails with the following error.
>>>>
>>>> Versions:-
>>>>
>>>> Hadoop :-  2.7.1
>>>> Hive  :-  1.2.0
>>>> Sqoop   :-  1.4.5
>>>>
>>>> Table in Hive :-
>>>>
>>>>
>>>> hive> use default;
>>>> OK
>>>> Time taken: 0.028 seconds
>>>> hive> describe emp_details1;
>>>> OK
>>>> id  int
>>>> namestring
>>>> deg string
>>>> deptstring
>>>> salary  int
>>>>
>>>> # Partition Information
>>>> # col_name  data_type   comment
>>>>
>>>> salary  int
>>>> Time taken: 0.125 seconds, Fetched: 10 row(s)
>>>> hive>
>>>>
>>>> hive> select * from emp_details1;
>>>> OK
>>>> 1201gopal   5
>>>> 1202manisha 5
>>>> 1203kalil   5
>>>> 1204prasanth5
>>>> 1205kranthi 5
>>>> 1206satish  5
>>>> Time taken: 0.195 seconds, Fetched: 6 row(s)
>>>> hive>
>>>>
>>>>
>>>> Conf added to Hive metastore site.xml
>>>>
>>>>
>>>> [alti-test-01@hdpnightly271-ci-91-services ~]$ grep -A5 -B2 -i 
>>>> "hive.metastore.integral.jdo.pushdown" /etc/hive-metastore/hive-site.xml
>>>> 
>>>> 
>>>> <property>
>>>>   <name>hive.metastore.integral.jdo.pushdown</name>
>>>>   <value>TRUE</value>
>>>> </property>
>>>> 
>>>>
>>>> 
>>>> [alti-test-01@hdpnightly271-ci-91-services ~]$
>>>>
>>>>
>>>> The issue remains same
>>>>
>>>>
>>>> [alti-test-01@hdpnightly271-ci-91-services ~]$ /opt/sqoop-1.4.5/bin/sqoop 
>>>> export --connect jdbc:mysql://localhost:3306/test --username hive 
>>>> --password * --table employee --hcatalog-database default 
>>>> --hcatalog-table emp_details1
>>>> Warning: /opt/sqoop-1.4.5/bin/../../hbase does not exist! HBase imports 
>>>> will fail.
>>>> Please set $HBASE_HOME to the root of your HBase installation.
>>>> Warning: /opt/sqoop-1.4.5/bin/../../accumulo does not exist! Accumulo 
>>>> imports will fail.
>>>> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
>>>> Warning: /opt/sqoop-1.4.5/bin/../../zookeeper does not exist! Accumulo 
>>>> imports will fail.
>>>> Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
>>>> 16/02/12 08:04:00 INFO

Hive Cli ORC table read error with limit option

2016-02-29 Thread Biswajit Nayak
Hi All,

I am trying to run a simple select query with the limit option, and it
fails. Below are the details.

Versions:-

Hadoop :-  2.7.1
Hive  :-  1.2.0
Sqoop   :-  1.4.5


Query:-

The table table_orc is partitioned by year, month and day, and is stored as
ORC.

hive> select date from testdb.table_orc where year = 2016 and month =1
and day =29 limit 10;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException:
serious problem
Time taken: 0.32 seconds
hive>


The same query without the limit option (select date from testdb.table_orc
where year = 2016 and month = 1 and day = 29;) works perfectly. Even
count(*) or select * works perfectly fine.
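A quick way to tell whether the fetch-task path is what breaks (this is the
check that surfaces elsewhere in this thread) is to disable the conversion
for one session and re-run the same statement:

    hive -e "
      set hive.fetch.task.conversion=none;
      select date from testdb.table_orc where year = 2016 and month = 1 and day = 29 limit 10;
    "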


Has anyone faced this issue?

Regards
Biswa


Re: Sqoop Hcat Int partition error

2016-02-29 Thread Biswajit Nayak
Hi Richa,

That's a workaround, but how do I handle the columns with INT type?
Changing the type would be the last option for me.

Regards
Biswa



On Tue, Mar 1, 2016 at 9:31 AM, Richa Sharma <mailtorichasha...@gmail.com>
wrote:

> Hi Biswajit
>
> The answer is in the last line of the error message. Change the data type
> of partition column to string in hive and try again.
>
> Hope it helps !
>
> Richa
>
> 16/02/12 08:04:12 ERROR tool.ExportTool: Encountered IOException running 
> export job: java.io.IOException: The table provided default.emp_details1 uses 
> unsupported  partitioning key type  for column salary : int.  Only string 
> fields are allowed in partition columns in Catalog
>
>
> On Tue, Mar 1, 2016 at 2:19 PM, Biswajit Nayak <biswa...@altiscale.com>
> wrote:
>
>> Hi All,
>>
>> I am trying to do a SQOOP export from hive( integer type partition) to
>> mysql through HCAT and it fails with the following error.
>>
>> Versions:-
>>
>> Hadoop :-  2.7.1
>> Hive  :-  1.2.0
>> Sqoop   :-  1.4.5
>>
>> Table in Hive :-
>>
>>
>> hive> use default;
>> OK
>> Time taken: 0.028 seconds
>> hive> describe emp_details1;
>> OK
>> id  int
>> namestring
>> deg string
>> deptstring
>> salary  int
>>
>> # Partition Information
>> # col_name  data_type   comment
>>
>> salary  int
>> Time taken: 0.125 seconds, Fetched: 10 row(s)
>> hive>
>>
>> hive> select * from emp_details1;
>> OK
>> 1201gopal   5
>> 1202manisha 5
>> 1203kalil   5
>> 1204prasanth5
>> 1205kranthi 5
>> 1206satish  5
>> Time taken: 0.195 seconds, Fetched: 6 row(s)
>> hive>
>>
>>
>> Conf added to Hive metastore site.xml
>>
>>
>> [alti-test-01@hdpnightly271-ci-91-services ~]$ grep -A5 -B2 -i 
>> "hive.metastore.integral.jdo.pushdown" /etc/hive-metastore/hive-site.xml
>> 
>> 
>> <property>
>>   <name>hive.metastore.integral.jdo.pushdown</name>
>>   <value>TRUE</value>
>> </property>
>> 
>>
>> 
>> [alti-test-01@hdpnightly271-ci-91-services ~]$
>>
>>
>> The issue remains same
>>
>>
>> [alti-test-01@hdpnightly271-ci-91-services ~]$ /opt/sqoop-1.4.5/bin/sqoop 
>> export --connect jdbc:mysql://localhost:3306/test --username hive --password 
>> * --table employee --hcatalog-database default --hcatalog-table 
>> emp_details1
>> Warning: /opt/sqoop-1.4.5/bin/../../hbase does not exist! HBase imports will 
>> fail.
>> Please set $HBASE_HOME to the root of your HBase installation.
>> Warning: /opt/sqoop-1.4.5/bin/../../accumulo does not exist! Accumulo 
>> imports will fail.
>> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
>> Warning: /opt/sqoop-1.4.5/bin/../../zookeeper does not exist! Accumulo 
>> imports will fail.
>> Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
>> 16/02/12 08:04:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
>> 16/02/12 08:04:00 WARN tool.BaseSqoopTool: Setting your password on the 
>> command-line is insecure. Consider using -P instead.
>> 16/02/12 08:04:00 INFO manager.MySQLManager: Preparing to use a MySQL 
>> streaming resultset.
>> 16/02/12 08:04:00 INFO tool.CodeGenTool: Beginning code generation
>> 16/02/12 08:04:01 INFO manager.SqlManager: Executing SQL statement: SELECT 
>> t.* FROM `employee` AS t LIMIT 1
>> 16/02/12 08:04:01 INFO manager.SqlManager: Executing SQL statement: SELECT 
>> t.* FROM `employee` AS t LIMIT 1
>> 16/02/12 08:04:01 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is 
>> /opt/hadoop
>> Note: 
>> /tmp/sqoop-alti-test-01/compile/1b0d4b1c30f167eb57ef488232ab49c8/employee.java
>>  uses or overrides a deprecated API.
>> Note: Recompile with -Xlint:deprecation for details.
>> 16/02/12 08:04:07 INFO orm.CompilationManager: Writing jar file: 
>> /tmp/sqoop-alti-test-01/compile/1b0d4b1c30f167eb57ef488232ab49c8/employee.jar
>> 16/02/12 08:04:07 INFO mapreduce.ExportJobBase: Beginning export of employee
>> 16/02/12 08:04:08 INFO mapreduce.ExportJobBase: Configuring HCatalog for 
>> export job
>> 16/02/12 08:04:08 INFO hcat.SqoopHCatUtilities: Configuring HCatalog 
>> specific details for job
>> 16/02/12 08:04:08 INFO manager.SqlManager: Executing SQL statement: SELECT 
>> t.* FROM `emp

Sqoop Hcat Int partition error

2016-02-29 Thread Biswajit Nayak
Hi All,

I am trying to do a SQOOP export from hive( integer type partition) to
mysql through HCAT and it fails with the following error.

Versions:-

Hadoop :-  2.7.1
Hive  :-  1.2.0
Sqoop   :-  1.4.5

Table in Hive :-


hive> use default;
OK
Time taken: 0.028 seconds
hive> describe emp_details1;
OK
id  int
namestring
deg string
deptstring
salary  int

# Partition Information
# col_name  data_type   comment

salary  int
Time taken: 0.125 seconds, Fetched: 10 row(s)
hive>

hive> select * from emp_details1;
OK
1201gopal   5
1202manisha 5
1203kalil   5
1204prasanth5
1205kranthi 5
1206satish  5
Time taken: 0.195 seconds, Fetched: 6 row(s)
hive>


Conf added to Hive metastore site.xml


[alti-test-01@hdpnightly271-ci-91-services ~]$ grep -A5 -B2 -i
"hive.metastore.integral.jdo.pushdown"
/etc/hive-metastore/hive-site.xml


<property>
  <name>hive.metastore.integral.jdo.pushdown</name>
  <value>TRUE</value>
</property>



[alti-test-01@hdpnightly271-ci-91-services ~]$


The issue remains same


[alti-test-01@hdpnightly271-ci-91-services ~]$
/opt/sqoop-1.4.5/bin/sqoop export --connect
jdbc:mysql://localhost:3306/test --username hive --password *
--table employee --hcatalog-database default --hcatalog-table
emp_details1
Warning: /opt/sqoop-1.4.5/bin/../../hbase does not exist! HBase
imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/sqoop-1.4.5/bin/../../accumulo does not exist! Accumulo
imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/sqoop-1.4.5/bin/../../zookeeper does not exist! Accumulo
imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/02/12 08:04:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5
16/02/12 08:04:00 WARN tool.BaseSqoopTool: Setting your password on
the command-line is insecure. Consider using -P instead.
16/02/12 08:04:00 INFO manager.MySQLManager: Preparing to use a MySQL
streaming resultset.
16/02/12 08:04:00 INFO tool.CodeGenTool: Beginning code generation
16/02/12 08:04:01 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM `employee` AS t LIMIT 1
16/02/12 08:04:01 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM `employee` AS t LIMIT 1
16/02/12 08:04:01 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/hadoop
Note: 
/tmp/sqoop-alti-test-01/compile/1b0d4b1c30f167eb57ef488232ab49c8/employee.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/02/12 08:04:07 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-alti-test-01/compile/1b0d4b1c30f167eb57ef488232ab49c8/employee.jar
16/02/12 08:04:07 INFO mapreduce.ExportJobBase: Beginning export of employee
16/02/12 08:04:08 INFO mapreduce.ExportJobBase: Configuring HCatalog
for export job
16/02/12 08:04:08 INFO hcat.SqoopHCatUtilities: Configuring HCatalog
specific details for job
16/02/12 08:04:08 INFO manager.SqlManager: Executing SQL statement:
SELECT t.* FROM `employee` AS t LIMIT 1
16/02/12 08:04:08 INFO hcat.SqoopHCatUtilities: Database column names
projected : [id, name, deg, salary, dept]
16/02/12 08:04:08 INFO hcat.SqoopHCatUtilities: Database column name -
info map :
id : [Type : 4,Precision : 11,Scale : 0]
name : [Type : 12,Precision : 20,Scale : 0]
deg : [Type : 12,Precision : 20,Scale : 0]
salary : [Type : 4,Precision : 11,Scale : 0]
dept : [Type : 12,Precision : 10,Scale : 0]

16/02/12 08:04:10 INFO hive.metastore: Trying to connect to metastore
with URI thrift://hive-hdpnightly271-ci-91.test.altiscale.com:9083
16/02/12 08:04:10 INFO hive.metastore: Connected to metastore.
16/02/12 08:04:11 INFO hcat.SqoopHCatUtilities: HCatalog full table
schema fields = [id, name, deg, dept, salary]
16/02/12 08:04:12 ERROR tool.ExportTool: Encountered IOException
running export job: java.io.IOException: The table provided
default.emp_details1 uses unsupported  partitioning key type  for
column salary : int.  Only string fields are allowed in partition
columns in Catalog


I am stuck with this issue. Has anyone conquered this before?
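One hedged workaround sketch, if changing the partition column type really is
off the table: stage the rows into a plain, non-partitioned copy of the table
and point the export at that instead. Database, table, and connection details
below are the ones from this message; the staging table name is made up:

    hive -e "
      create table default.emp_details1_flat as
      select id, name, deg, dept, salary from default.emp_details1;
    "
    /opt/sqoop-1.4.5/bin/sqoop export \
      --connect jdbc:mysql://localhost:3306/test \
      --username hive -P \
      --table employee \
      --hcatalog-database default \
      --hcatalog-table emp_details1_flat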

Regards
Biswa


Re: oozie not running

2015-02-23 Thread Biswajit Nayak
You have to add the hadoop/sharelib jars to the Oozie war.

~Biswa
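For the archive, a rough sketch of what that usually involves with an Oozie
4.1.0 build (option names vary by version, so check bin/oozie-setup.sh help;
the hadoop jar directory and namenode URI below are assumptions): point
prepare-war at the Hadoop client jars so they land in
webapps/oozie/WEB-INF/lib, then restart and install the sharelib.

    cd $OOZIE_HOME
    bin/oozie-setup.sh prepare-war -d /opt/hadoop/share/hadoop/common
    bin/oozied.sh start
    bin/oozie-setup.sh sharelib create -fs hdfs://namenode:8020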


On Mon, Feb 23, 2015 at 7:00 PM, Rahul Channe drah...@googlemail.com
wrote:

 Hi Mohammad,

 I checked the following path but did not find any hadoop jar

 /home/user/oozie-4.1.0/webapp/target/oozie-webapp-4.1.0/WEB-INF/lib

 user@ubuntuvm:~/oozie-4.1.0/webapp/target/oozie-webapp-4.1.0/WEB-INF/lib$
 ls -ltr hadoop*.jar
 ls: cannot access hadoop*.jar: No such file or directory
 user@ubuntuvm:~/oozie-4.1.0/webapp/target/oozie-webapp-4.1.0/WEB-INF/lib$


 On Mon, Feb 23, 2015 at 2:50 AM, Mohammad Islam misla...@yahoo.com
 wrote:

 Moving user@hive to bcc and adding user@oozie

 Hi Rahul,
 Did you include the correct hadoop jars in the oozie.war file? Most
 probably you didn't. Can you please check the webapps/oozie/WEB-INF/lib
 directory for the hadoop jars, and if possible include the output of
 ls hadoop*.jar from that directory.

 Regards,
 Mohammad


   On Sunday, February 22, 2015 2:45 PM, Rahul Channe 
 drah...@googlemail.com wrote:


 Hi Biswajit,

 The catalina.out displays following exception but unable to dig further

 Feb 21, 2015 12:46:56 PM org.apache.catalina.core.AprLifecycleListener
 init
 INFO: The APR based Apache Tomcat Native library which allows optimal
 performance in production environments was not found on the
 java.library.path:
 Feb 21, 2015 12:46:56 PM org.apache.coyote.http11.Http11Protocol init
 INFO: Initializing Coyote HTTP/1.1 on http-11000
 Feb 21, 2015 12:46:56 PM org.apache.catalina.startup.Catalina load
 INFO: Initialization processed in 623 ms
 Feb 21, 2015 12:46:56 PM org.apache.catalina.core.StandardService start
 INFO: Starting service Catalina
 Feb 21, 2015 12:46:56 PM org.apache.catalina.core.StandardEngine start
 INFO: Starting Servlet Engine: Apache Tomcat/6.0.41
 Feb 21, 2015 12:46:56 PM org.apache.catalina.startup.HostConfig
 deployDescriptor
 INFO: Deploying configuration descriptor oozie.xml

 ERROR: Oozie could not be started

 REASON: java.lang.NoClassDefFoundError:
 org/apache/hadoop/util/ReflectionUtils

 Stacktrace:
 -
 java.lang.NoClassDefFoundError: org/apache/hadoop/util/ReflectionUtils
 at
 org.apache.oozie.service.Services.setServiceInternal(Services.java:374)
 at org.apache.oozie.service.Services.init(Services.java:110)
 at
 org.apache.oozie.servlet.ServicesLoader.contextInitialized(ServicesLoader.java:44)
 at
 org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4210)
 at
 org.apache.catalina.core.StandardContext.start(StandardContext.java:4709)
 at
 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799)
 at
 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
 at
 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:583)
 at
 org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:675)
 at
 org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:601)
 at
 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502)
 at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1317)
 at
 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324)
 at
 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
 at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1065)
 at org.apache.catalina.core.StandardHost.start(StandardHost.java:822)
 at
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1057)
 at
 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
 at
 org.apache.catalina.core.StandardService.start(StandardService.java:525)
 at
 org.apache.catalina.core.StandardServer.start(StandardServer.java:754)
 at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:622)
 at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
 at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
 Caused by: java.lang.ClassNotFoundException:
 org.apache.hadoop.util.ReflectionUtils
 at
 org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1680)
 at
 org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1526)


 On Sun, Feb 22, 2015 at 8:00 AM, Biswajit Nayak 
 biswajit.na...@inmobi.com wrote:

 Could you check the oozie.log and catalina.out log files? They will give
 you an idea of what is wrong.

 ~Biswa


 On Sun, Feb 22, 2015 at 1:22 PM, Rahul Channe drah...@googlemail.com
 wrote:

 Hi All,

 I configured the oozie build successfully and prepared oozie

Re: HDFS file system size issue

2014-04-14 Thread Biswajit Nayak
What replication factor do you have? I believe it should be 3. hadoop dfs
-dus shows the disk usage without replication, while the name node UI page
shows it with replication.

38 GB * 3 = 114 GB
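A short sketch of the commands that make the two views comparable on Hadoop
1.x (the namenode page's "DFS Used" counts raw bytes across all replicas,
hadoop dfs -dus counts each file's bytes once, and non-DFS usage is reported
separately):

    hadoop dfs -dus /                                          # logical size, no replication
    hadoop fsck / | grep -i replication                        # default / average replication
    hadoop dfsadmin -report | egrep 'DFS Used|Non DFS Used'    # raw usage per datanode and total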

~Biswa
-oThe important thing is not to stop questioning o-


On Mon, Apr 14, 2014 at 9:38 AM, Saumitra saumitra.offic...@gmail.comwrote:

 Hi Biswajeet,

 Non-dfs usage is ~100GB over the cluster, but the numbers are still nowhere
 near 1TB.

 Basically I wanted to point out the discrepancy between the name node
 status page and hadoop dfs -dus: in my case, the former reports DFS usage
 as ~1TB and the latter reports it as ~35GB. What are the factors that can
 cause this difference? And why is just ~35GB of data causing DFS to hit its
 limits?
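Putting the numbers quoted in this thread side by side (all approximate)
shows the gap clearly:

    logical size (hadoop dfs -dus):         ~38 GB
    x average block replication (~3.04):   ~116 GB of expected raw DFS usage
    reported non-DFS usage:                ~100 GB
    reported "DFS Used" on the namenode:     ~1 TB

So even with replication and the known non-DFS usage, only a fraction of the
reported usage is accounted for; the arithmetic confirms the discrepancy is
real rather than just a replication artefact.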




 On 14-Apr-2014, at 8:31 am, Biswajit Nayak biswajit.na...@inmobi.com
 wrote:

 Hi Saumitra,

 Could you please check the non-dfs usage. They also contribute to filling
 up the disk space.



 ~Biswa
 -oThe important thing is not to stop questioning o-


 On Mon, Apr 14, 2014 at 1:24 AM, Saumitra saumitra.offic...@gmail.comwrote:

 Hello,

 We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We
 are using default HDFS block size.

 We have noticed that disks of slaves are almost full. From name node's
 status page (namenode:50070), we could see that disks of live nodes are 90%
 full and DFS Used% in cluster summary page  is ~1TB.

 However hadoop dfs -dus / shows that file system size is merely 38GB.
 38GB number looks to be correct because we keep only few Hive tables and
 hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
 is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
 that there is no internal fragmentation because the files in our Hive
 tables are well-chopped in ~50MB chunks. Here are last few lines of
 hadoop fsck / -files -blocks

 Status: HEALTHY
  Total size: 38086441332 B
  Total dirs: 232
  Total files: 802
  Total blocks (validated): 796 (avg. block size 47847288 B)
  Minimally replicated blocks: 796 (100.0 %)
  Over-replicated blocks: 0 (0.0 %)
  Under-replicated blocks: 6 (0.75376886 %)
  Mis-replicated blocks: 0 (0.0 %)
  Default replication factor: 2
  Average block replication: 3.0439699
  Corrupt blocks: 0
  Missing replicas: 6 (0.24762692 %)
  Number of data-nodes: 9
  Number of racks: 1
 FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds


 My question is that why disks of slaves are getting full even though
 there are only few files in DFS?









Re: [ANNOUNCE] New Hive Committers - Alan Gates, Daniel Dai, and Sushanth Sowmyan

2014-04-14 Thread Biswajit Nayak
Congrats...

~Biswa
-oThe important thing is not to stop questioning o-


On Mon, Apr 14, 2014 at 11:32 PM, Sergey Shelukhin
ser...@hortonworks.comwrote:

 Congrats!


 On Mon, Apr 14, 2014 at 10:55 AM, Prasanth Jayachandran 
 pjayachand...@hortonworks.com wrote:

 Congratulations everyone!!

 Thanks
 Prasanth Jayachandran

 On Apr 14, 2014, at 10:51 AM, Carl Steinbach c...@apache.org wrote:

  The Apache Hive PMC has voted to make Alan Gates, Daniel Dai, and
 Sushanth
  Sowmyan committers on the Apache Hive Project.
 
  Please join me in congratulating Alan, Daniel, and Sushanth!
 
  - Carl









Re: HDFS file system size issue

2014-04-13 Thread Biswajit Nayak
Hi Saumitra,

Could you please check the non-dfs usage. They also contribute to filling
up the disk space.



~Biswa
-oThe important thing is not to stop questioning o-


On Mon, Apr 14, 2014 at 1:24 AM, Saumitra saumitra.offic...@gmail.comwrote:

 Hello,

 We are running HDFS on 9-node hadoop cluster, hadoop version is 1.2.1. We
 are using default HDFS block size.

 We have noticed that disks of slaves are almost full. From name node's
 status page (namenode:50070), we could see that disks of live nodes are 90%
 full and DFS Used% in cluster summary page  is ~1TB.

 However hadoop dfs -dus / shows that file system size is merely 38GB.
 38GB number looks to be correct because we keep only few Hive tables and
 hadoop's /tmp (distributed cache and job outputs) in HDFS. All other data
 is cleaned up. I cross-checked this from hadoop dfs -ls. Also I think
 that there is no internal fragmentation because the files in our Hive
 tables are well-chopped in ~50MB chunks. Here are last few lines of hadoop 
 fsck
 / -files -blocks

 Status: HEALTHY
  Total size: 38086441332 B
  Total dirs: 232
  Total files: 802
  Total blocks (validated): 796 (avg. block size 47847288 B)
  Minimally replicated blocks: 796 (100.0 %)
  Over-replicated blocks: 0 (0.0 %)
  Under-replicated blocks: 6 (0.75376886 %)
  Mis-replicated blocks: 0 (0.0 %)
  Default replication factor: 2
  Average block replication: 3.0439699
  Corrupt blocks: 0
  Missing replicas: 6 (0.24762692 %)
  Number of data-nodes: 9
  Number of racks: 1
 FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds


 My question is: why are the disks of the slaves getting full even though
 there are only a few files in DFS?
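 As a rough sanity check (an estimate only, not an exact accounting), physical
 DFS usage should be about the logical size times the average block
 replication reported by fsck:

 # 38086441332 B * 3.0439699 ~= 108 GiB of raw DFS usage across the cluster
 echo '38086441332 * 3.0439699' | bc | awk '{printf "%.1f GiB\n", $1/1024/1024/1024}'
 # The rest is either non-DFS usage on the same partitions (logs, MapReduce
 # intermediate data) or block files the datanodes still hold that are no
 # longer referenced in the namespace (pending deletion, or left over from an
 # earlier namenode format).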




Re: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang

2014-02-28 Thread Biswajit Nayak
Congrats Xuefu..

With Best Regards
Biswajit

~Biswa
-oThe important thing is not to stop questioning o-


On Fri, Feb 28, 2014 at 2:50 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Xuefu Zhang has been elected to the Hive
 Project Management Committee. Please join me in congratulating Xuefu!

 Thanks.

 Carl





RE: Is there any monitoring tool available for hiveserver2

2014-02-21 Thread Biswajit Nayak
This one is for monitoring the metastore only. It would have to be changed
for HiveServer monitoring.

Biswa
On 21 Feb 2014 22:44, shouvanik.hal...@accenture.com wrote:

  Thanks Biswajit.



 I will try it and let you know.







 Thanks,

 Shouvanik



 *From:* Biswajit Nayak [mailto:biswajit.na...@inmobi.com]
 *Sent:* Friday, February 21, 2014 2:28 AM
 *To:* user@hive.apache.org
 *Subject:* Re: Is there any monitoring tool available for hiveserver2



 Below is the script I use to graph the heap (usage|allocated) for Hive.



 #!/bin/bash

 HIVES_PID=`jps -mlv | grep org.apache.hadoop.hive.metastore.HiveMetaStore | awk '{print $1}'`

 if [ -n "$HIVES_PID" ]; then

 jmap -heap $HIVES_PID | awk '
 BEGIN { gmetric="/usr/bin/gmetric"; sum=0 }
 {
   # jmap -heap prints lines such as "used = 20690848 (19.73MB)" and
   # "MaxHeapSize = 2147483648 (2048.0MB)". The split delimiter was lost in
   # the mail archive; "(" is assumed here, which leaves arr[1] equal to $1
   # for the lines we match on.
   split($1, arr, "(")
   if (arr[1] == "used") {
     sum = sum + $3;
   }
   if (arr[1] == "MaxHeapSize") {
     maxheap = $3;
   }
 } END {
   printf "%s -n hive.metastore.current_heap_capacity -v %d -t int8 -d 60 -g HiveHeapstats \n", gmetric, maxheap;
   printf "%s -n hive.metastore.current_heap_usage -v %d -t int8 -d 60 -g HiveHeapstats \n", gmetric, sum;
 }' | sh

 fi
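 If it helps, a cron entry along these lines keeps the samples flowing to
 Ganglia every minute; the script path below is only a placeholder for
 wherever you save it.

 # Hypothetical crontab line (path is an assumption):
 * * * * * /opt/scripts/hive_metastore_heap_gmetric.sh >/dev/null 2>&1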



 Please let me know if it works for you. Apologies for the delayed response.



 Thanks

 Biswa




 On Fri, Feb 21, 2014 at 10:41 AM, shouvanik.hal...@accenture.com wrote:

 That will be great! Thanks in advance.













 Thanks,

 Shouvanik



 *From:* Biswajit Nayak [mailto:biswajit.na...@inmobi.com]
 *Sent:* Thursday, February 20, 2014 8:37 PM
 *To:* user@hive.apache.org
 *Subject:* RE: Is there any monitoring tool available for hiveserver2



 I can share the script that does it. I will be able to do it by 12:30;
 I am stuck in a meeting till then.

 Regards
 Biswa

 On 21 Feb 2014 10:02, shouvanik.hal...@accenture.com wrote:

 Hi Biswajit,



 Could you give an idea of how to do it, please?













 Thanks,

 Shouvanik



 *From:* Biswajit Nayak [mailto:biswajit.na...@inmobi.com]
 *Sent:* Thursday, February 20, 2014 8:30 PM
 *To:* user@hive.apache.org
 *Subject:* Re: Is there any monitoring tool available for hiveserver2



 I have built a customized script for alerting and monitoring.
 I could not find any built-in way to do it.

 Thanks
 Biswajit

 On 21 Feb 2014 05:17, shouvanik.hal...@accenture.com wrote:

 Hi,



 It might happen that hiveserver2 memory gets exhausted. Similarly there
 would be many other things to  monitor for hiveserver2.



 Is there any monitoring tool available in the market?



 I am using EMR, FYI.













 Thanks,

 Shouvanik





Re: Is there any monitoring tool available for hiveserver2

2014-02-20 Thread Biswajit Nayak
I have built a customized script for alerting and monitoring.
I could not find any built-in way to do it.

Thanks
Biswajit
On 21 Feb 2014 05:17, shouvanik.hal...@accenture.com wrote:

  Hi,



 It might happen that hiveserver2 memory gets exhausted. Similarly there
 would be many other things to  monitor for hiveserver2.



 Is there any monitoring tool available in the market?



 I am using EMR, FYI.













 Thanks,

 Shouvanik





Re: [ANNOUNCE] New Hive PMC Member - Gunther Hagleitner

2013-12-27 Thread Biswajit Nayak
Congratulations Gunther...


On Fri, Dec 27, 2013 at 7:20 PM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 Congrats Gunther!!

 Sent from my iPhone

  On Dec 27, 2013, at 4:46 PM, Lefty Leverenz leftylever...@gmail.com
 wrote:
 
  Congratulations Gunther, well deserved!
 
  -- Lefty
 
 
  On Fri, Dec 27, 2013 at 12:00 AM, Jarek Jarcec Cecho jar...@apache.org
 wrote:
 
  Congratulations Gunther, good job!
 
  Jarcec
 
  On Thu, Dec 26, 2013 at 08:59:37PM -0800, Carl Steinbach wrote:
  I am pleased to announce that Gunther Hagleitner has been elected to
 the
  Hive Project Management Committee. Please join me in congratulating
  Gunther!
 
  Thanks.
 
  Carl
 



how to make hive

2013-12-20 Thread Biswajit Nayak
Hi All,

Does anyone have any idea how to make the Hive server emit metrics for Ganglia?

I tried adding some properties in the env file, but it does not work.

Thanks
Biswajit
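One sketch that may work (an assumption on my side, not something Hive ships
configured out of the box) is to expose the JVM over JMX from hive-env.sh and
poll it with gmetric, or simply poll it with jmap as in the heap script quoted
elsewhere in this archive.

# In hive-env.sh; port 8009 is arbitrary, and with authentication disabled
# this is only suitable on a trusted network.
export HADOOP_OPTS="$HADOOP_OPTS -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8009 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"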



hive monitoring

2013-12-13 Thread Biswajit Nayak
Hi All,

Could anyone help me identify the data points for monitoring the Hive
server and metastore, or suggest any tool that could help? I saw a tool named
HAWK on SlideShare, but could not find anywhere that its source code has been
shared.

Thanks
Biswajit
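A few JVM-level data points that are easy to collect with stock JDK tools, as
a starting sketch; the jps/grep pattern mirrors the gmetric heap script quoted
elsewhere in this archive.

PID=$(jps -mlv | grep org.apache.hadoop.hive.metastore.HiveMetaStore | awk '{print $1}')
if [ -n "$PID" ]; then
  jstat -gcutil "$PID" 1000 1                       # heap occupancy and GC time
  jstack "$PID" | grep -c 'java.lang.Thread.State'  # rough live thread count
fi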



Not able to start the hive metastore

2013-12-08 Thread Biswajit Nayak
Hi All,

I had set up Hive in my home directory, but today I moved it to /opt.
After that, starting it throws this error:

*Exception in thread main javax.jdo.JDOFatalDataStoreException: Unable to
open a test connection to the given database. JDBC url =
jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
Terminating connection pool. Original Exception: --*

*java.sql.SQLException: Failed to start database 'metastore_db', see the
next exception for details.*

* at
org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown
Source)*

* at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)*

* at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)*

* at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown
Source)*

* at org.apache.derby.impl.jdbc.EmbedConnection.init(Unknown Source)*

*Thanks*
*Biswajit*
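With the embedded Derby metastore this is usually a stale lock file or a
metastore_db directory that is relative to wherever Hive is started from. A
sketch of the usual checks (the paths here are assumptions):

cd /opt/hive                      # or wherever metastore_db now lives
ls -ld metastore_db               # confirm the hive user still owns it after the move
rm -f metastore_db/*.lck          # clear db.lck / dbex.lck left over from the old location
# Optionally pin an absolute path in hive-site.xml via javax.jdo.option.ConnectionURL,
# e.g. jdbc:derby:;databaseName=/opt/hive/metastore_db;create=true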



DROP command fails with error message

2013-12-04 Thread Biswajit Nayak
Hi All,

I was trying to drop a database named default, but every time it fails with
the error message below, while other commands work fine, like show
databases.


hive> DROP DATABASE IF EXISTS default;
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Can not drop
default database)
hive>


Thanks
Biswajit



Re: DROP command fails with error message

2013-12-04 Thread Biswajit Nayak
Thanks a lot. I am new to Hive.
On 4 Dec 2013 18:13, Nitin Pawar nitinpawar...@gmail.com wrote:

 The exception clearly says you are not allowed to drop the default database.
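 For comparison, a user-created database drops fine; only default is
 protected. A quick sketch with a throwaway database name (scratch_db is
 hypothetical):

 hive -e "CREATE DATABASE IF NOT EXISTS scratch_db"
 hive -e "DROP DATABASE IF EXISTS scratch_db"   # succeeds
 hive -e "DROP DATABASE IF EXISTS default"      # fails: Can not drop default database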


 On Wed, Dec 4, 2013 at 6:09 PM, Biswajit Nayak 
 biswajit.na...@inmobi.com wrote:

 Hi All,

 I was trying to drop a database name default but every time it fails
 with the below error message. While other commands works fine like  show
 database


 hive> DROP DATABASE IF EXISTS default;
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Can not drop
 default database)
 hive>


 Thanks
 Biswajit





 --
 Nitin Pawar




Re: DROP command fails with error message

2013-12-04 Thread Biswajit Nayak
Thanks a lot, Nitin. I was able to drop the database.

One question I have: is there any documentation on how to configure HCatalog
with Hive?

Thanks
Biswa


On Wed, Dec 4, 2013 at 6:26 PM, Nitin Pawar nitinpawar...@gmail.com wrote:

 Welcome to Hive :)

 Let us know if you faced any issues setting up Hive (using the pure Apache
 distribution), just in case something is missing in the documentation,
 and keep raising your doubts :)




 On Wed, Dec 4, 2013 at 6:23 PM, Biswajit Nayak 
 biswajit.na...@inmobi.com wrote:

 Thanks a lot.. I am a naive to hive..
 On 4 Dec 2013 18:13, Nitin Pawar nitinpawar...@gmail.com wrote:

 Exception clearly says You are not allowed to drop the default
 database


 On Wed, Dec 4, 2013 at 6:09 PM, Biswajit Nayak 
 biswajit.na...@inmobi.com wrote:

 Hi All,

 I was trying to drop a database name default but every time it fails
 with the below error message. While other commands works fine like  show
 database


 hive> DROP DATABASE IF EXISTS default;
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Can not drop
 default database)
 hive>


 Thanks
 Biswajit





 --
 Nitin Pawar






 --
 Nitin Pawar




Re: How to prevent user drop table in Hive metadata?

2013-11-22 Thread Biswajit Nayak
Hi Echo,

I don't think there is any way to prevent this. I had the same concern in HBase,
but found out that it is assumed that users of the system are very much
aware of it. I have been using Hive for the last 3 months and have been looking
for some way to do this here, but no luck so far.

Thanks
Biswa
On 23 Nov 2013 01:06, Echo Li echo...@gmail.com wrote:

 Good Friday!

 I was trying to apply a certain level of security in our Hive data
 warehouse by changing the access mode of directories and files on HDFS to 755.
 I think that is enough to keep a new user from removing the data; however, the
 user can still drop the table definition in the Hive CLI, and revoke doesn't
 seem to help much. Is there any way to prevent this?


 Thanks,
 Echo




RE: How to prevent user drop table in Hive metadata?

2013-11-22 Thread Biswajit Nayak
Don't think so.
On 23 Nov 2013 01:20, simon.2.thomp...@bt.com wrote:

 Has no one raised a Jira ticket ?

 
 Dr. Simon Thompson

 
 From: Biswajit Nayak [biswajit.na...@inmobi.com]
 Sent: 22 November 2013 19:45
 To: user@hive.apache.org
 Subject: Re: How to prevent user drop table in Hive metadata?

 Hi Echo,

 I dont think there is any to prevent this. I had the same concern in
 hbase, but found out that it is assumed that user using the system are very
 much aware of it.  I am into hive from last 3 months, was looking for some
 kind of way here, but no luck till now..

 Thanks
 Biswa

 On 23 Nov 2013 01:06, Echo Li echo...@gmail.com wrote:
 Good Friday!

 I was trying to apply certain level of security in our hive data
 warehouse, by modifying access mode of directories and files on hdfs to 755
 I think it's good enough for a new user to remove data, however the user
 still can drop the table definition in hive cli, seems the revoke doesn't
 help much, is there any way to prevent this?


 Thanks,
 Echo





Hive Web Interface Database connectivity issue

2013-09-11 Thread Biswajit Nayak
Hi All,

I was trying to start the Hive web server. The web server came up, but the
database connection is not happening; it throws the errors below. Any
help would be very helpful.


  File: NucleusJDOHelper.java Line:425 method:
getJDOExceptionForNucleusException
  class: org.datanucleus.jdo.NucleusJDOHelper

  File: JDOPersistenceManagerFactory.java Line:601 method:
freezeConfiguration
  class: org.datanucleus.jdo.JDOPersistenceManagerFactory

  File: JDOPersistenceManagerFactory.java Line:286 method:
createPersistenceManagerFactory
  class: org.datanucleus.jdo.JDOPersistenceManagerFactory

  File: JDOPersistenceManagerFactory.java Line:182 method:
getPersistenceManagerFactory
  class: org.datanucleus.jdo.JDOPersistenceManagerFactory

  File: NativeMethodAccessorImpl.java Line:-2 method: invoke0
  class: sun.reflect.NativeMethodAccessorImpl

  File: NativeMethodAccessorImpl.java Line:39 method: invoke
  class: sun.reflect.NativeMethodAccessorImpl

  File: DelegatingMethodAccessorImpl.java Line:25 method: invoke
  class: sun.reflect.DelegatingMethodAccessorImpl

  File: Method.java Line:597 method: invoke
  class: java.lang.reflect.Method

  File: JDOHelper.java Line:1958 method: run
  class: javax.jdo.JDOHelper$16

  File: AccessController.java Line:-2 method: doPrivileged
  class: java.security.AccessController

  File: JDOHelper.java Line:1953 method: invoke
  class: javax.jdo.JDOHelper

  File: JDOHelper.java Line:1159 method:
invokeGetPersistenceManagerFactoryOnImplementation
  class: javax.jdo.JDOHelper

  File: JDOHelper.java Line:803 method: getPersistenceManagerFactory
  class: javax.jdo.JDOHelper

  File: JDOHelper.java Line:698 method: getPersistenceManagerFactory
  class: javax.jdo.JDOHelper

  File: ObjectStore.java Line:263 method: getPMF
  class: org.apache.hadoop.hive.metastore.ObjectStore

  File: ObjectStore.java Line:292 method: getPersistenceManager
  class: org.apache.hadoop.hive.metastore.ObjectStore

  File: ObjectStore.java Line:225 method: initialize
  class: org.apache.hadoop.hive.metastore.ObjectStore

  File: ObjectStore.java Line:200 method: setConf
  class: org.apache.hadoop.hive.metastore.ObjectStore

  File: ReflectionUtils.java Line:62 method: setConf
  class: org.apache.hadoop.util.ReflectionUtils

  File: ReflectionUtils.java Line:117 method: newInstance
  class: org.apache.hadoop.util.ReflectionUtils

  File: RetryingRawStore.java Line:62 method:
  class: org.apache.hadoop.hive.metastore.RetryingRawStore

  File: RetryingRawStore.java Line:71 method: getProxy
  class: org.apache.hadoop.hive.metastore.RetryingRawStore

  File: HiveMetaStore.java Line:414 method: newRawStore
  class: org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler

  File: HiveMetaStore.java Line:402 method: getMS
  class: org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler

  File: HiveMetaStore.java Line:440 method: createDefaultDB
  class: org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler

  File: HiveMetaStore.java Line:326 method: init
  class: org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler

  File: HiveMetaStore.java Line:286 method:
  class: org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler

  File: RetryingHMSHandler.java Line:54 method:
  class: org.apache.hadoop.hive.metastore.RetryingHMSHandler

  File: RetryingHMSHandler.java Line:59 method: getProxy
  class: org.apache.hadoop.hive.metastore.RetryingHMSHandler

  File: HiveMetaStore.java Line:4183 method: newHMSHandler
  class: org.apache.hadoop.hive.metastore.HiveMetaStore

  File: HiveMetaStoreClient.java Line:121 method:
  class: org.apache.hadoop.hive.metastore.HiveMetaStoreClient

  File: HiveMetaStoreClient.java Line:104 method:
  class: org.apache.hadoop.hive.metastore.HiveMetaStoreClient

  File: org.apache.jsp.show_005fdatabases_jsp Line:54 method:
_jspService
  class: org.apache.jsp.show_005fdatabases_jsp

  File: HttpJspBase.java Line:97 method: service
  class: org.apache.jasper.runtime.HttpJspBase

  File: HttpServlet.java Line:820 method: service
  class: javax.servlet.http.HttpServlet

  File: JspServletWrapper.java Line:322 method: service
  class: org.apache.jasper.servlet.JspServletWrapper

  File: JspServlet.java Line:314 method: serviceJspFile
  class: org.apache.jasper.servlet.JspServlet

  File: JspServlet.java Line:264 method: service
  class: org.apache.jasper.servlet.JspServlet