Re: Hive on Oracle

2013-05-18 Thread Sanjay Subramanian
Try installing Cloudera Manager 4.1.2. It bundles Hadoop, Hive and a few other components. I have this version in production. Cloudera has pretty good documentation. This way u don't have to spend time installing versions that work successfully with each other. Sent from my iPhone On May 17, 2

Re: Did any one used Hive on Oracle Metastore

2013-05-18 Thread Sanjay Subramanian
Raj It should be pretty much similar to setting it up in MySQL, except for any syntax differences. Read the Cloudera Hive installation notes. They have a separate section for using MySQL and Oracle. Also one of my favorite $0.02 about open source software is just dare and try it out...get error

Re: Unable to stop Thrift Server

2013-05-21 Thread Sanjay Subramanian
Raj Which version r u using ? I think from 0.9+ onwards it's best to use service to stop and start, and NOT hive: sudo service hive-metastore stop ; sudo service hive-server stop ; sudo service hive-metastore start ; sudo service hive-server start. Couple of general things that might help 1. Use linux s

Re: hive.metastore.warehouse.dir - Should it point to a physical directory

2013-05-21 Thread Sanjay Subramanian
Notes below.

Re: Viewing snappy compressed files

2013-05-21 Thread Sanjay Subramanian
+1 Thanks Rahul-da Or u can use hdfs dfs -text /path/to/dir/on/hdfs/part-r-0.snappy | less

Re: Project ideas

2013-05-21 Thread Sanjay Subramanian
+1 My $0.02 is look around and see problems u can solve…It's better to get a list of problems and see if u can model a solution using the map-reduce framework. An example is as follows PROBLEM Build a Cars Pricing Model based on advertisements on Craigslist OBJECTIVE Recommend a price to the C

Re: hive.metastore.warehouse.dir - Should it point to a physical directory

2013-05-21 Thread Sanjay Subramanian

Re: Where to get Oracle scripts for Hive Metastore

2013-05-21 Thread Sanjay Subramanian
Raj The correct location of the script is where u extracted the hive tar. For example /usr/lib/hive/scripts/metastore/upgrade/oracle You will find a file in this directory called hive-schema-0.9.0.oracle.sql Use this sanjay

LZO compression implementation in Hive

2013-05-21 Thread Sanjay Subramanian
Hi Programming Hive Book authors Maybe a lot of u have already successfully implemented this but only in these last two weeks we implemented our aggregations using LZO compression in Hive - MR jobs creating LZO files as input for Hive ---> Thereafter Hive aggregations creating more LZO files as o

Re: Unable to stop Thrift Server

2013-05-21 Thread Sanjay Subramanian
Not that I know of…..sorry sanjay

Re: Where to get Oracle scripts for Hive Metastore

2013-05-21 Thread Sanjay Subramanian
I think it should be this link because this refers to the /branches/branch-0.9 http://svn.apache.org/viewvc/hive/branches/branch-0.9/metastore/scripts/upgrade/oracle/ Can one of the Hive committers please verify…thanks sanjay

Re: ORA-01950: no privileges on tablespace

2013-05-21 Thread Sanjay Subramanian
See the CDH notes here…scroll down to where the Oracle section is http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_18_4.html

Re: Hive tmp logs

2013-05-22 Thread Sanjay Subramanian
hive.querylog.location = /path/to/hivetmp/dir/on/local/linux/disk ; hive.exec.scratchdir = /data01/workspace/hive (hive scratch dir)
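The two properties in the reply above go in hive-site.xml; a minimal fragment as a sketch (the values shown are the illustrative paths from the thread, not defaults):

```xml
<!-- hive-site.xml (fragment) -->
<property>
  <name>hive.querylog.location</name>
  <value>/path/to/hivetmp/dir/on/local/linux/disk</value>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/data01/workspace/hive</value>
</property>
```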

Re: Eclipse plugin

2013-05-22 Thread Sanjay Subramanian
Hi I don't need any special plugin to walk thru the code All my map reduce jobs have a JobMapper.java JobReducer.java JobProcessor.java (set any configs u like) I create a new maven project in eclipse (easier to manage dependencies) ….the elements are in the order as they should appear

Re: Eclipse plugin

2013-05-22 Thread Sanjay Subramanian
Forgot to add, if u run Windows and Eclipse and want to do Hadoop u have to set up Cygwin and add $CYGWIN_PATH/bin to PATH Good Luck Sanjay

Re: Hive tmp logs

2013-05-23 Thread Sanjay Subramanian
Clarification This property defines a path on HDFS: hive.exec.scratchdir = /data01/workspace/hive (hive scratch dir)

hive.log

2013-05-23 Thread Sanjay Subramanian
How do I set the property in hive-site.xml that defines the local linux directory for hive.log ? Thanks sanjay

Re: hive.log

2013-05-23 Thread Sanjay Subramanian
Ok figured it out - vi /etc/hive/conf/hive-log4j.properties - Modify this line #hive.log.dir=/tmp/${user.name} hive.log.dir=/data01/workspace/hive/log/${user.name}

Re: Where to begin from??

2013-05-23 Thread Sanjay Subramanian
I agree with Chris…don't worry about what the technology is called Hadoop, Bigtable, Lucene, Hive….Model the problem and see what the solution could be….that's very important And Lokesh please don't mind…we are writing to u perhaps stuff that u don't want to hear but it's an important real per

Re: Where to begin from??

2013-05-24 Thread Sanjay Subramanian
Hey guys Is there a way to dynamically change the input dir and output dir? I have the following CONSTANT directories in HDFS * /path/to/input/-99-99 (empty directory) * /path/to/output/-99-99 (empty directory) A new directory with yesterday's date like /path/to/input/2013-05-23 g
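The date-stamped directory switch the post asks about can be sketched in shell; the path prefixes and driver invocation are the hypothetical ones from the post:

```shell
# Build yesterday's dated input/output paths, then hand them to the job.
# Uses GNU date (-d "yesterday"); on BSD/macOS use: date -v-1d +%Y-%m-%d
YDAY=$(date -d "yesterday" +%Y-%m-%d)
INPUT_DIR="/path/to/input/${YDAY}"
OUTPUT_DIR="/path/to/output/${YDAY}"
echo "input:  ${INPUT_DIR}"
echo "output: ${OUTPUT_DIR}"
# A driver would then be invoked along these lines (illustrative jar/class):
#   hadoop jar myjob.jar com.example.Driver "${INPUT_DIR}" "${OUTPUT_DIR}"
```

Running this from cron just before the daily job gives each run its own dated directories without touching any constant paths.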

Re: Problem in uploading file in WebHDFS

2013-05-25 Thread Sanjay Subramanian
Can u try one of the following hdfs dfs -put localfile /path/to/dir/in/hdfs hdfs dfs -copyFromLocal localfile /path/to/dir/in/hdfs Thanks Sanjay. Sent from my iPhone On May 25, 2013, at 5:07 AM, "Mohammad Mustaqeem" <3m.mustaq...@gmail.com> wrote: I am using ps

Re: MapReduce on Local FileSystem

2013-05-31 Thread Sanjay Subramanian
Basic question. Why would u want to do that ? Also I think the MapR Hadoop distribution has an NFS mountable HDFS Sanjay Sent from my iPhone On May 30, 2013, at 11:37 PM, "Agarwal, Nikhil" wrote: Hi, Is it possible to run MapReduce on multiple nodes using L

Re: MapReduce on Local FileSystem

2013-05-31 Thread Sanjay Subramanian
ingesting the data in HDFS. Regards, Nikhil

Re: Now give .gz file as input to the MAP

2013-06-11 Thread Sanjay Subramanian
hadoopConf.set("mapreduce.job.inputformat.class", "com.wizecommerce.utils.mapred.TextInputFormat"); hadoopConf.set("mapreduce.job.outputformat.class", "com.wizecommerce.utils.mapred.TextOutputFormat"); No special settings required for reading Gzip except these above If u want to output Gzip h

Re: Now give .gz file as input to the MAP

2013-06-12 Thread Sanjay Subramanian
.gz files using MR. However, as Sanjay mentioned, verify the codecs configured in core-site and another thing to note is that these files are not splittable. You might want to use bz2, these are splittable. Thanks, Rahul On Wed, Jun 12, 2013 at 10:14 AM, Sanjay Subramani

Re: How to design the mapper and reducer for the following problem

2013-06-14 Thread Sanjay Subramanian
Hi My quick and dirty non-optimized solution would be as follows MAPPER === OUTPUT from Mapper REDUCER === Iterate over keys For a key = (say) {HASH1,HASH2,HASH3,HASH4} Format the collection of values into some StringBuilder kind of class Output KEY = {DOCID1
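The map/shuffle/reduce flow sketched above can be simulated locally, with sort standing in for the shuffle; the record format (doc id, tab, comma-joined hash key) is an assumption for illustration, not from the thread:

```shell
# Toy simulation of map | shuffle | reduce using awk and sort.
# Mapper flips DOCID<TAB>hashes to hashes<TAB>DOCID; sort groups by key;
# reducer concatenates all doc ids that share a hash-set key.
OUT=$(printf 'DOC1\tH1,H2\nDOC2\tH1,H2\nDOC3\tH3\n' \
  | awk -F'\t' '{print $2 "\t" $1}' \
  | sort \
  | awk -F'\t' '
      $1 == prev { docs = docs "," $2; next }   # same key: append doc id
      prev       { print prev "\t" docs }       # key changed: emit group
                 { prev = $1; docs = $2 }
      END        { if (prev) print prev "\t" docs }')
echo "$OUT"
```

The output has one line per distinct hash-set key (here DOC1 and DOC2 collapse into one group under H1,H2), which is exactly what the real reducer would receive per key after the shuffle.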

Many Errors at the last step of copying files from _temporary to Output Directory

2013-06-14 Thread Sanjay Subramanian
Hi My environment is like this INPUT FILES == 400 GZIP files , one from each server - average size gzipped 25MB REDUCER === Uses MultipleOutput OUTPUT (Snappy) === /path/to/output/dir1 /path/to/output/dir2 /path/to/output/dir3 /path/to/output/dir4 Number of output directories

Piping to HDFS (from Linux or HDFS)

2013-06-24 Thread Sanjay Subramanian
Hi guys While I was trying to get some test data and configurations done quickly I realized one can do this and I think it's super cool Processing an existing file on Linux/HDFS and piping it directly to hdfs source = Linux dest = HDFS == File = sanjay.conf.template We want to re
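The transform-and-stream trick described above can be sketched as follows; the file name and placeholder tokens are illustrative, and the hdfs step is commented out so the sketch runs without a cluster (`hdfs dfs -put -` reads from stdin):

```shell
# Rewrite a config template and stream the result, with no intermediate
# temp file needed on the HDFS side. Placeholders (__HOST__/__PORT__) are
# hypothetical, for illustration only.
printf 'namenode.host=__HOST__\nnamenode.port=__PORT__\n' > sanjay.conf.template
sed -e 's/__HOST__/node01/' -e 's/__PORT__/8020/' sanjay.conf.template > sanjay.conf
cat sanjay.conf
# On a real cluster, replace the redirect with a pipe straight into HDFS:
#   sed -e 's/__HOST__/node01/' sanjay.conf.template | hdfs dfs -put - /conf/sanjay.conf
```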

Re: Splitting input file - increasing number of mappers

2013-07-06 Thread Sanjay Subramanian
More mappers will make it faster. U can try this parameter mapreduce.input.fileinputformat.split.maxsize= This will control the input split size and force more mappers to run. Also ur use case seems a good candidate for defining a Combiner because u r grouping keys based on a criteria
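The parameter above can be set per job with -D on the command line or as a mapred-site.xml entry; a sketch, where the 64 MB cap (67108864 bytes) is an assumed example value, not from the thread:

```xml
<!-- mapred-site.xml (fragment); cap each input split at 64 MB so large
     files are split across more mappers -->
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>67108864</value>
</property>
```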

EBADF: Bad file descriptor

2013-07-10 Thread Sanjay Subramanian
2013-07-10 07:11:50,131 WARN [Readahead Thread #1] org.apache.hadoop.io.ReadaheadPool: Failed readahead on ifile EBADF: Bad file descriptor at org.apache.hadoop.io.nativeio.NativeIO.posix_fadvise(Native Method) at org.apache.hadoop.io.nativeio.NativeIO.posixFadviseIfPossible(NativeIO.java:145) at

Re: EBADF: Bad file descriptor

2013-07-11 Thread Sanjay Subramanian
r problem-- it's just a symptom. You will have to find out why the MR job failed. best, Colin On Wed, Jul 10, 2013 at 8:19 AM, Sanjay Subramanian wrote:

Re: Decompression using LZO

2013-08-16 Thread Sanjay Subramanian
What do u want to do ? View the .LZO file on HDFS ? From: Sandeep Nemuri Date: Tuesday, August 6, 2013 12:08 AM