Hive error with mongodb connector
Hi,

I get the following error when trying to connect MongoDB with Hive using the mongo-hadoop connector:

2014-09-16 17:32:24,279 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1410858694842_0013_01
2014-09-16 17:32:24,742 FATAL [main] org.apache.hadoop.conf.Configuration: error parsing conf job.xml
org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2173)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2102)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1068)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1377)
2014-09-16 17:32:24,748 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2338)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2195)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2102)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1068)
    at org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:50)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1377)
Caused by: org.xml.sax.SAXParseException; systemId: file:///tmp/hadoop-hadoop2/nm-local-dir/usercache/hadoop2/appcache/application_1410858694842_0013/container_1410858694842_0013_01_01/job.xml; lineNumber: 586; columnNumber: 51; Character reference #
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2173)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2242)
    ... 5 more
2014-09-16 17:32:24,750 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1

My working environment is:
hadoop 2.4.1
hive 0.13.1
mongodb 2.6.3
mongodb connector 1.4.0

Reference: https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ

Please help me.

Thanks,
Mahesh S
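A SAXParseException complaining about a character reference while parsing the generated job.xml usually means some configuration property value contains a control character that is illegal in XML 1.0 (Hive's default field delimiter \001 is a frequent culprit when it gets serialized into job configuration). One way to locate the offending property is to scan the localized job.xml for illegal numeric character references. A minimal sketch (the script and its names are mine, not part of mongo-hadoop or Hadoop):

```python
import re
import sys

# XML 1.0 permits only these code points below 0x20 (tab, LF, CR).
LEGAL_LOW = {0x09, 0x0A, 0x0D}

def find_bad_char_refs(text):
    """Yield (line, column, reference) for numeric character references
    that are illegal in XML 1.0, e.g. &#1; or &#8;."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in re.finditer(r'&#(x[0-9A-Fa-f]+|[0-9]+);', line):
            ref = m.group(1)
            code = int(ref[1:], 16) if ref.startswith('x') else int(ref)
            if code < 0x20 and code not in LEGAL_LOW:
                yield lineno, m.start() + 1, m.group(0)

if __name__ == '__main__' and len(sys.argv) > 1:
    with open(sys.argv[1], encoding='utf-8', errors='replace') as f:
        for lineno, col, ref in find_bad_char_refs(f.read()):
            print(f'line {lineno}, col {col}: illegal XML 1.0 reference {ref}')
```

Once the property is identified, the fix is typically to remove or re-encode the control character in the corresponding Hive/connector setting rather than to patch job.xml itself, which is regenerated per job.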
bug in hive
Hive versions 0.9 and later have a bug when inserting into a Hive table. On insert, Hive takes an exclusive lock. But if the table is partitioned and the insert is into a dynamic partition, it takes only a shared lock on the table; if all partitions are static, Hive takes an exclusive lock on the partitions receiving data and a shared lock on the table. https://issues.apache.org/jira/browse/HIVE-3509

1. What if I want to take an exclusive lock on the table while inserting into a dynamic partition? I tried taking an explicit lock with:

LOCK TABLE tablename EXCLUSIVE;

But that disabled the table: I cannot even read from it anymore, even in the same session, until I run UNLOCK TABLE tablename in another session.

2. Moreover, what is the lock level in Hive? I mean, any user can remove any other user's lock. That too seems buggy.

Thanks,
Shushant
Re: Split the output file and name them
Thanks Karthik. I am using MultipleOutputs and my output file name still remains the same. Does it have any constraints with HCatRecord?

Kindest Regards,
Anusha

On Sep 18, 2014, at 22:01, Karthiksrivasthava karthiksrivasth...@gmail.com wrote:

Anusha, I think you have to write a MapReduce job and use MultipleOutputs to split your output.

Thanks,
Karthik

On Sep 18, 2014, at 15:36, anusha Mangina anusha.mang...@gmail.com wrote:

My output file part-r- at hive/warehouse/path/to/output_table_name/part-r- has the following content:

emp1_id emp1_name emp1_salary emp1_address emp1_dob emp1_joiningdate
emp2_id emp2_name emp2_salary emp2_address emp2_dob emp2_joiningdate
emp3_id emp3_name emp3_salary emp3_address emp3_dob emp3_joiningdate
emp4_id emp4_name emp4_salary emp4_address emp4_dob emp4_joiningdate
emp5_id emp5_name emp5_salary emp5_address emp5_dob emp5_joiningdate

Basically, the output table will have n distinct employee records. How can I split the output file into n separate files inside output_table_name and name each file with its emp_id? (I don't want partitioning.) So my output table folder should have n separate files with emp_id as the name:

C:\hadoop\bin>hadoop fs -ls hive/warehouse/path/to/output_table_name
emp1_id
emp2_id
emp3_id
emp4_id
emp5_id
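One likely reason the output names stay the same: MultipleOutputs on its own still emits the default part-r-* files, and custom names only appear if you call MultipleOutputs.write(key, value, baseOutputPath) and suppress the empty default output with LazyOutputFormat (even then, Hadoop appends a -r-NNNNN suffix to baseOutputPath). Whether that path works through HCatRecord output I can't confirm. Purely to illustrate the target layout, here is a post-processing sketch that runs outside Hadoop; the function name and tab delimiter are my assumptions:

```python
import os

def split_records(part_file, out_dir, delim='\t'):
    """Write each record of a delimited part file to its own file,
    named after the first field (the employee id).
    Assumes one record per line and distinct ids, as in the question."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(part_file, encoding='utf-8') as f:
        for line in f:
            record = line.rstrip('\n')
            if not record:
                continue  # skip blank lines
            emp_id = record.split(delim, 1)[0]
            with open(os.path.join(out_dir, emp_id), 'w', encoding='utf-8') as out:
                out.write(record + '\n')
            written.append(emp_id)
    return written
```

Against a real warehouse directory you would fetch the part file with `hadoop fs -get` first, run the split, then `hadoop fs -put` the per-id files back.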
Re: bug in hive
Up until Hive 0.13, locks in Hive were really advisory only, since, as you note, any user can remove any other user's lock. In Hive 0.13 a new type of locking was introduced; see https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-LockManager

This new locking is automatic and ignores both LOCK and UNLOCK commands. Note that it is off by default; you have to configure Hive to use the new DbTxnManager to turn this locking on. In 0.13 it still has the bug you describe as far as acquiring the wrong lock for dynamic partitioning, but I believe I've fixed that in 0.14.

Alan.
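For reference, enabling the DbTxnManager Alan describes takes roughly the following in hive-site.xml. This is a sketch based on the Hive Transactions wiki page linked above; additional compactor settings apply for full ACID table support:

```xml
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
```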
Re: bug in hive
Hi Alan,

I have Hive 0.10 deployed in my org's cluster, and I cannot upgrade it because of org policy. How can I achieve exclusive-lock behavior while inserting into a dynamic partition on Hive 0.10? Would calling Hive scripts via some sort of Java API, with a patched jar included, help?

Moreover, Hive 0.10 does not release locks when the Hive session is killed; the user has to explicitly unlock a table. Can I specify any sort of maximum expiry time when taking a lock?

Thanks,
Shushant
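On the stale-lock question: I'm not aware of a lock expiry setting in 0.10's ZooKeeper-based lock manager, but stale locks can at least be inspected and cleared by hand. A sketch in HiveQL (the table name is hypothetical):

```
-- List lock holders on the table (requires hive.support.concurrency=true)
SHOW LOCKS employee_table;

-- Forcibly release a stale explicit lock; note that any user can do this
-- in 0.10, which is exactly the advisory-locking weakness discussed above
UNLOCK TABLE employee_table;
```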
Re: bug in hive
Great policy: install open source software that's not even version 1.0 into production, then disallow any ability to improve it (but of course reap all the rewards of its benefits). So instead of actually fixing the problem the right way, introduce a super-hack work-around, cuz, you know, that's much more stable. Gotta luv it. Good luck.