Hive error with mongodb connector

2014-09-20 Thread mahesh kumar
Hi,
   I got the following error when trying to connect MongoDB to Hive
using the mongo-hadoop connector.

2014-09-16 17:32:24,279 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1410858694842_0013_01
2014-09-16 17:32:24,742 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2173)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2102)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:1068)
        at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1377)
2014-09-16 17:32:24,748 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2338)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2102)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:1068)
        at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1377)
Caused by: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2173)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
        ... 5 more
2014-09-16 17:32:24,750 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

My working environment is:

hadoop 2.4.1
hive   0.13.1
mongodb 2.6.3
mongodb connector 1.4.0

reference:
https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ

Please help me.

Thanks

Mahesh S


bug in hive

2014-09-20 Thread Shushant Arora
Hive versions 0.9 and later have a bug.



While inserting into a Hive table, Hive takes an exclusive lock. But if the
table is partitioned and the insert uses dynamic partitions, it takes only a
shared lock on the table; if all partitions are static, Hive takes an
exclusive lock on the partitions into which data is being inserted

and a shared lock on the table.

https://issues.apache.org/jira/browse/HIVE-3509


1. What if I want to take an exclusive lock on the table while inserting into
a dynamic partition?


I tried to take an explicit lock using:

LOCK TABLE tablename EXCLUSIVE;


But it left the table disabled.

I cannot even read from the table anymore, even in the same session, until I run

UNLOCK TABLE tablename; in another session.


2. Moreover, what is the level of locking in Hive? Any user can remove any
other user's lock; that too seems buggy.


Thanks

Shushant


Re: Split the output file and name them

2014-09-20 Thread Anusha
Thanks Karthik,

I am using MultipleOutputs and still my output file name remains the same.

Does it have any constraints when used with HCatRecord?

Kindest Regards,
Anusha

 On Sep 18, 2014, at 22:01, Karthiksrivasthava karthiksrivasth...@gmail.com 
 wrote:
 
 Anusha,
 
 I think you have to write a MapReduce job and use MultipleOutputs to split
 your output.
 
 Thanks
 Karthik
 On Sep 18, 2014, at 15:36, anusha Mangina anusha.mang...@gmail.com wrote:
 
 My output file part-r- at
 hive/warehouse/path/to/output_table_name/part-r- has the following content
 inside:
 
 
 emp1_id emp1_name emp1_salary emp1_address emp1_dob  emp1_joiningdate
 emp2_id emp2_name emp2_salary emp2_address emp2_dob  emp2_joiningdate
 emp3_id emp3_name emp3_salary emp3_address emp3_dob  emp3_joiningdate
 emp4_id emp4_name emp4_salary emp4_address emp4_dob  emp4_joiningdate
 emp5_id emp5_name emp5_salary emp5_address emp5_dob  emp5_joiningdate
 
 Basically the output table will have n distinct employee records.
 How can I split the output file and get n separate files inside
 output_table_name, named by their emp_id? (I don't want partitioning.)
 
 So my output table folder should have n separate files with emp_id as the name:
 
 
 C:\hadoop\bin> hadoop fs -ls hive/warehouse/path/to/output_table_name
 
 
 emp1_id
 emp2_id
 emp3_id
 emp4_id
 emp5_id
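 
As an illustration of Karthik's suggestion above, here is a minimal reducer
sketch using the MultipleOutputs API from org.apache.hadoop.mapreduce.lib.output.
It assumes the map phase emits emp_id as the key and the remaining columns as a
Text value; the class name and driver hook are placeholders, not code from the
thread.

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical reducer: receives emp_id as key and the rest of the employee
// record as the value, and writes each employee's records to its own file.
public class SplitByEmpIdReducer extends Reducer<Text, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<NullWritable, Text>(context);
    }

    @Override
    protected void reduce(Text empId, Iterable<Text> records, Context context)
            throws IOException, InterruptedException {
        for (Text record : records) {
            // The third argument is the base output path, so each emp_id gets
            // its own file under the job output directory.
            mos.write(NullWritable.get(), record, empId.toString());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        mos.close();  // flush and close all per-employee writers
    }

    // In the driver, use LazyOutputFormat so the empty default part-r-* files
    // are not created alongside the per-employee files.
    public static void configureOutput(Job job) {
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
    }
}

The framework appends the task suffix, so the files come out as emp1_id-r-00000
and so on rather than bare emp_id names. Note that this sketch writes plain text
through TextOutputFormat; whether the same trick works when writing HCatRecords
through HCatalog, as in Anusha's question, is a separate constraint that would
need checking.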
 


Re: bug in hive

2014-09-20 Thread Alan Gates
Up until Hive 0.13, locks in Hive were really advisory only, since as you 
note any user can remove any other user's lock.  In Hive 0.13 a new type 
of locking was introduced; see 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-LockManager
This new locking is automatic and ignores both LOCK and UNLOCK 
commands.  Note that it is off by default; you have to configure Hive to 
use the new DbTxnManager to turn on this locking.  In 0.13 it still 
has the bug you describe as far as acquiring the wrong lock for dynamic 
partitioning, but I believe I've fixed that in 0.14.


Alan.
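
For reference, per the Hive Transactions wiki page linked above, switching to
the new lock manager comes down to a couple of hive-site.xml properties; this
is only the locking-related subset of the configuration described there, not a
full transactions setup:

<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>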




Re: bug in hive

2014-09-20 Thread Shushant Arora
Hi Alan

I have Hive 0.10 deployed on my org's cluster, and I cannot upgrade it
because of org policy.
How can I achieve exclusive-lock functionality while inserting into a dynamic
partition on Hive 0.10?
Would calling Hive scripts via some sort of Java API, with a patched jar
included, help?
Moreover, Hive 0.10 does not release locks when a Hive session is killed;
the user has to explicitly unlock the table.
Can I specify any sort of maximum expiry time when taking a lock?

Thanks
Shushant



Re: bug in hive

2014-09-20 Thread Stephen Sprague
Great policy: install open-source software that's not even version 1.0 into
production and then not allow the ability to improve it (but of course reap
all the rewards of its benefits). So instead of actually fixing the
problem the right way, introduce a super-hack work-around cuz, you know,
that's much more stable.

Gotta luv it.   Good luck.
