Re: Inner Map key and value separators

2013-09-13 Thread Sanjay Subramanian
Ah…. While my BeeHive gently weeps! Thanks, sanjay From: Dean Wampler <deanwamp...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Friday, September 13, 2013 5:10 PM To: "user@hive.apache.org" mail

Re: Options for Loading Side Data / small files in UDF

2013-09-13 Thread Stephen Boesch
Hi Jagat, There is no call to load a file from HDFS in Edward's example (which I had, by the way, already seen). I am looking into using getRequiredFiles(). 2013/9/13 Jagat Singh > Sorry i missed that > > Just check this example for accessing from API > > https://github.com/edwardcapriolo/hive-geoip

Re: Options for Loading Side Data / small files in UDF

2013-09-13 Thread Jagat Singh
Sorry, I missed that. Just check this example for accessing from the API: https://github.com/edwardcapriolo/hive-geoip/ On Sat, Sep 14, 2013 at 10:12 AM, Stephen Boesch wrote: > I should have mentioned: we can not use the "add file" here because this > is running within a framework. we need to

Re: Inner Map key and value separators

2013-09-13 Thread Dean Wampler
Unfortunately, I believe there's no way to do this. Sent from my rotary phone. On Sep 13, 2013, at 6:42 PM, Sanjay Subramanian wrote: > Hi guys > > I have to load data into the following data type in hive > > map > > > Is there a way to define custom SEPARATORS (while creating the tabl

Re: Options for Loading Side Data / small files in UDF

2013-09-13 Thread Stephen Boesch
I should have mentioned: we cannot use "add file" here because this is running within a framework. We need to use the Java APIs. 2013/9/13 Jagat Singh > Hi > > You can use distributed cache and hive add file command > > See here for example syntax > > > http://stackoverflow.com/questions/15

Re: Options for Loading Side Data / small files in UDF

2013-09-13 Thread Jagat Singh
Hi, You can use the distributed cache and the Hive add file command. See here for example syntax: http://stackoverflow.com/questions/15429040/add-multiple-files-to-distributed-cache-in-hive Regards, Jagat On Sat, Sep 14, 2013 at 9:57 AM, Stephen Boesch wrote: > > We have a UDF that is configured via

Re: question about partition table in hive

2013-09-13 Thread Jagat Singh
Adding to Sanjay's reply: the only thing left after Flume has added partitions is to tell the Hive metastore to update its partition information, which you can do via the add partition command. Then you can read the data via Hive straight away. On Sat, Sep 14, 2013 at 10:00 AM, Sanjay Subramanian < sanjay.subr
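
The "add partition" step mentioned in this reply can be sketched as below. This is only a minimal illustration, assuming a hypothetical table named `logs` partitioned by a `dt` column and a hypothetical HDFS directory; none of these names come from the thread. The helper only builds the HiveQL text; how the statement is submitted to Hive (CLI, scheduled job, etc.) is left open.

```python
def add_partition_statement(table, partition_spec, location):
    """Build a HiveQL statement that registers an existing directory as a partition.

    table          -- table name (hypothetical in this sketch)
    partition_spec -- list of (column, value) pairs, in partition-column order
    location       -- HDFS directory that already holds the partition's data
    """
    spec = ", ".join("{}='{}'".format(col, val) for col, val in partition_spec)
    return (
        "ALTER TABLE {} ADD IF NOT EXISTS PARTITION ({}) LOCATION '{}'"
    ).format(table, spec, location)


# Hypothetical example: a directory written by an ingestion tool for one day.
stmt = add_partition_statement(
    "logs", [("dt", "2013-09-13")], "/flume/logs/dt=2013-09-13"
)
print(stmt)
```

Using IF NOT EXISTS keeps the statement safe to re-run, which matters if a periodic job issues it for directories it may already have registered.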

Re: question about partition table in hive

2013-09-13 Thread Sanjay Subramanian
A couple of days back, Eric Sammer, at the Hadoop Hands On Lab at the Cloudera Sessions, demonstrated how to achieve dynamic partitioning using Flume and created those partitioned directories on HDFS, which are then readily usable by Hive. Understanding what I can from the two lines of your mail be

Options for Loading Side Data / small files in UDF

2013-09-13 Thread Stephen Boesch
We have a UDF that is configured via a small properties file. What are the options for distributing the file to the task nodes? Also, we want to be able to update the file frequently. We are not running on AWS, so S3 is not an option - and we do not have access to NFS/other shared disk from the M

Inner Map key and value separators

2013-09-13 Thread Sanjay Subramanian
Hi guys I have to load data into the following data type in hive map > Is there a way to define custom SEPARATORS (while creating the table) for - Inner map collection item - Inner map key delimiters for 2nd-level maps are \004 and \005 per this http://mail-archives.apache.org/mod_mbox/hadoop-
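
To make the delimiter levels concrete, here is a small sketch of how one map-of-maps value could be serialized as delimited text. It assumes the column is a top-level map whose values are themselves maps of strings, that the outer map uses Hive's default \002 (entry) and \003 (key/value) separators, and that the inner map uses \004 and \005 as stated in the message. The function name and sample data are hypothetical.

```python
# Assumed delimiters: outer level follows Hive's text-format defaults,
# inner level follows the \004 / \005 values quoted in the message.
OUTER_ITEM = "\x02"  # separates entries of the outer map
OUTER_KV = "\x03"    # separates an outer key from its value
INNER_ITEM = "\x04"  # separates entries of an inner map
INNER_KV = "\x05"    # separates an inner key from its value


def encode_nested_map(nested):
    """Serialize a dict of dicts of strings into one delimited field value."""
    outer_entries = []
    for outer_key, inner in nested.items():
        inner_text = INNER_ITEM.join(k + INNER_KV + v for k, v in inner.items())
        outer_entries.append(outer_key + OUTER_KV + inner_text)
    return OUTER_ITEM.join(outer_entries)


# Hypothetical sample value.
encoded = encode_nested_map({"a": {"x": "1", "y": "2"}, "b": {"z": "3"}})
print(repr(encoded))
```

Writing the control characters explicitly like this is one way to produce load files that match the fixed nested-level delimiters when they cannot be overridden in the table definition.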

Re: question about partition table in hive

2013-09-13 Thread Dean Wampler
Flume might be able to invoke Hive to do this as the data is ingested, but I don't know anything about Flume. Brent has a nice blog post describing many of the details of partitioning. http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/ We also cover them in our book. The

Re: question about partition table in hive

2013-09-13 Thread Stephen Sprague
And have you done any analysis on this yet using the Hive documentation that's publicly available? If you show some initiative yourself, you're more likely to get others to join your cause. :) So what have you tried before asking us for help? On Thu, Sep 12, 2013 at 6:55 PM, ch huang wrote: >

Re: Hive + mongoDB

2013-09-13 Thread Sandeep Nemuri
Hi Nitin, Thanks for your help. I have used this query in Hive to retrieve the data from MongoDB: add jar /usr/lib/hadoop/lib/mongo-2.8.0.jar; add jar /usr/lib/hive/lib/hive-mongo-0.0.3-jar-with-dependencies.jar; select * from docs input format "org.yong3.hive.mongo.MongoStorageHandler" with serde

Re: Hive + mongoDB

2013-09-13 Thread Nitin Pawar
Can you share the CREATE TABLE DDL for the docs table? A SELECT statement does not need all those details. Those are part of the CREATE TABLE DDL only. On Fri, Sep 13, 2013 at 4:24 PM, Sandeep Nemuri wrote: > Hi nithin > > Thanks for your help > I have used this query in hive to retrieve the data f

Re: question about partition table in hive

2013-09-13 Thread Nitin Pawar
You will need to define a partition column, such as date or hour. Then configure Flume to roll over files/directories based on your partition column. You will need some kind of cron job that checks for new data becoming available in a directory or file and then adds it as partitio