Hi,
I need to load data into a Hive table using MapReduce, in Java.
Please suggest code related to Hive + MapReduce.
Thanks in advance
Ranjini R
You just need to run a MapReduce job to generate the data you want, and then
load the data into the Hive table (create the table first if it does not exist).
These two steps are completely separate.
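A minimal sketch of those two steps from the command line (the jar name, driver class, paths, and table schema below are made-up examples for illustration, not from this thread):

```shell
# Step 1: run the MapReduce job that produces the data
# (jar name, driver class, and HDFS paths are hypothetical).
hadoop jar myjob.jar com.example.MyDriver /user/ranjini/input /user/ranjini/output

# Step 2: create the Hive table if needed, then load the job output into it.
hive -e "
CREATE TABLE IF NOT EXISTS mytable (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA INPATH '/user/ranjini/output' INTO TABLE mytable;
"
```

Note that LOAD DATA INPATH moves the files into the table's warehouse directory rather than copying them.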
On Tue, Jan 21, 2014 at 4:21 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote:
The book Programming Hive covers what you want; see Chapter 4.
Hope that helps.
On Tue, Jan 21, 2014 at 1:51 PM, Ranjini Rathinam ranjinibe...@gmail.com wrote:
Hi all,
I have enabled log aggregation and want to track task logs on hdfs. I need
to start the historyserver via mr-jobhistory-daemon.sh start historyserver on
all nodes. Is there any way to run historyserver automatically when yarn
starts?
Hey,
this is my hdfs-site.xml - http://pastebin.com/qpELkwH8
this is my core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://blabla-hadoop</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/hadoop/tmp</value>
  </property>
</configuration>
Hey,
I use Hadoop with XtreemFS (with a corresponding FileSystem
implementation). The XtreemFS client uses several non-daemon threads, e.g.
for communication. Therefore the shutdown hooks do not run after a
mapper/reducer finishes, and the child processes do not terminate.
My question:
Is
Hi,
You do not need to run an MR HistoryServer on all nodes. If you want
start-yarn.sh to cover the history server startup, you can also
inject the command at its end.
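For example, one way to do that injection (assuming the standard sbin layout and that $HADOOP_PREFIX points at the install root; run once, on the node that should host the history server):

```shell
# Append the history server startup to the end of start-yarn.sh,
# so it comes up whenever YARN is started on this node.
# $HADOOP_PREFIX is assumed to point at the Hadoop install root.
echo '"$HADOOP_PREFIX"/sbin/mr-jobhistory-daemon.sh start historyserver' \
  >> "$HADOOP_PREFIX"/sbin/start-yarn.sh
```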
On Tue, Jan 21, 2014 at 2:43 PM, Saeed Adel Mehraban s.ade...@gmail.com wrote:
If you put the sentence
"Need to load the data into hive table using mapreduce, using java"
into your Google search box you will get tons of information.
On 1/21/2014 3:21 AM, Ranjini Rathinam wrote:
Hi,
I want to set the number of map tasks in the Wordcount example. Is it
possible to set this variable in MRv2?
Thanks,
The number of map tasks is determined by the number of input splits. You can
change the number of map tasks by changing the input split size.
But you can set the number of reduce tasks explicitly.
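As a back-of-the-envelope sketch of that arithmetic (plain Java with no Hadoop dependency; countSplits is an illustrative stand-in for FileInputFormat's split math, not a Hadoop API): shrinking the split size increases the map task count.

```java
// Stand-in for FileInputFormat's split arithmetic (illustrative only):
// Hadoop launches one map task per input split.
public class SplitMath {

    /** Splits for one file: ceil(fileSize / splitSize), ignoring block edges. */
    public static long countSplits(long fileSizeBytes, long splitSizeBytes) {
        return (fileSizeBytes + splitSizeBytes - 1) / splitSizeBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // Default 128 MB split size -> 8 map tasks for a 1 GiB input.
        System.out.println(countSplits(oneGiB, 128L << 20)); // prints 8
        // Halve the split size -> twice as many map tasks.
        System.out.println(countSplits(oneGiB, 64L << 20));  // prints 16
    }
}
```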
On 21 Jan 2014 20:25, xeon xeonmailingl...@gmail.com wrote:
What happens when you remove the shutdown hook? Is that supposed
to trigger an exception?
Hey, sorry for the previous answer; I thought it was about reducers. We can't set the number of
mappers for a job; it is determined by the number of input splits, as Shekhar said.
On Tue, Jan 21, 2014 at 9:44 PM, Shekhar Sharma shekhar2...@gmail.com wrote:
It means that the first process in the container is either crashing due to
some reason or explicitly killed by an external entity. You can look at the
logs for the container on the web-UI. Also look at ResourceManager logs to
trace what is happening with this container.
Which application is this?
I am running a job that takes no input from the mapper-input key/value
interface. Each job reads the same small file from the distributed cache and
processes it independently (to generate Monte Carlo sampling of the problem
space). I am using MR purely to parallelize the otherwise redundant
Could you not use Hadoop's NLineInputFormat?
If you generate a 100-line text file then, by default, one line will trigger one
mapper task.
As long as you have 100 task slots available, you will get 100 mappers running
concurrently.
You want perfect control over the mapper count? NLineInputFormat is designed for
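A minimal driver sketch using NLineInputFormat, under the assumption of a Hadoop 2.x classpath (the built-in identity Mapper is used so the snippet is self-contained; input and output paths come from the command line):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NLineDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "nline-demo");
        job.setJarByClass(NLineDriver.class);

        // Each split gets exactly one line, so a 100-line input file
        // produces 100 map tasks.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 1);
        NLineInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Identity mapper, map-only job: just demonstrates the split count.
        job.setMapperClass(Mapper.class);
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```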
It is my own custom application. But looking at the ResourceManager's logs,
the container completed normally with an exit code of 0. This is really weird to
me.
On Tuesday, January 21, 2014 1:17 PM, Vinod Kumar Vavilapalli
vino...@hortonworks.com wrote:
I'll look it up. Thanks.
On Jan 21, 2014, at 11:43, java8964 wrote:
Anand,
Instructions to build Hadoop 2.2 on Windows are at
https://wiki.apache.org/hadoop/Hadoop2OnWindows
Chuck Lam's book is great but out of date wrt Windows support. Windows XP
is not a supported platform. Windows Server 2008 or later is recommended
and Windows Vista is also likely to work.
Folks, please refer to the wiki page
https://wiki.apache.org/hadoop/Hadoop2OnWindows and also BUILDING.txt in
the source tree. We believe we captured all the prerequisites in
BUILDING.txt so let us know if anything is missing.
On Fri, Jan 17, 2014 at 8:16 AM, Steve Lewis lordjoe2...@gmail.com
Seems to work well. Thank you very much!
On Jan 21, 2014, at 12:42, Keith Wiley wrote:
No, all I do is have my own shutdown hook in main, which closes the
FSDataOutputStream. Before I did that, it would throw an ugly exception when
I hit Ctrl+C, telling me that the stream was already closed because of this
shutdown hook (bad design on the Hadoop part), so removing it keeps it open
I am writing temp files to HDFS with replication=1, so I expect the blocks to
be stored on the writing node. Are there any tips, in general, for optimizing
write performance to HDFS? I use 128K buffers in the write() calls. Are there
any parameters that can be set on the connection or in
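I'm not sure of all the knobs either, but one setting that does exist is the client I/O buffer size in core-site.xml; a sketch (the value is chosen to match the 128K write buffers mentioned above, not a recommendation):

```xml
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value> <!-- 128K, matching the write() buffer size -->
</property>
```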
I am not sure either, you have to ask Hadoop guys, but it was giving me a
hard time so I found a way around it.
On Tue, Jan 21, 2014 at 6:05 PM, Jay Vyas jayunit...@gmail.com wrote:
I guess I'm not sure what the ShutdownHook actually is there for. That's
the real question I'm asking.
On
I understand Kerberos is used on Hadoop to provide security in a multi-user
environment, and I can totally see its usage for a shared cluster within a
company, to make sure sensitive data for one department is safe from the prying
eyes of another department.
But for a Hadoop cluster that sits behind a
Hi Koert,
I'm wondering what is the end-to-end goal you want to achieve.
You can disable security in Hadoop, where the cluster does not perform
additional authentication. Obviously you can go without kerberos in this
case and protect your clusters with other measures you've mentioned.
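For reference, the core-site.xml settings for a cluster without Kerberos look something like this (these are the Hadoop defaults, so normally nothing needs to be set):

```xml
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value> <!-- no Kerberos -->
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>false</value>
</property>
```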
Hi all,
TestDistributedShell is a unit test for DistributedShell. I can run it
successfully in Maven, but when I run it in Eclipse, it fails. Do I need
any extra settings to make it run in Eclipse?
Here's the error message:
2014-01-22 09:38:20,733 INFO [AsyncDispatcher event handler]
hi folks,
I am new to Hadoop and am trying to learn it following
the book (Hadoop: The Definitive Guide, 2nd edition). I found that the sample code
targets 0.20; should I learn and exercise it under the Hadoop 1.0 version? I
have installed Hadoop 2.2, which is another branch.
thanks,
The book's contents should still be very relevant as the APIs haven't changed.
On Wed, Jan 22, 2014 at 11:23 AM, Cooleaf cool...@gmail.com wrote: