I will investigate…
Andy Kartashov
MPAC
Systems Developer – Information Sharing Services and Revenue Services
1340 Pickering Parkway, Pickering, L1V 0C4
• Phone : (905) 837 6200 ext.2006
• Mobile: (416) 722 1787
e-mail: andy.kartas...@mpac.ca
From: Wangda Tan
There are two ways to import data from sqoop.
Table dump (without select statements)
Example:
sqoop import --connect jdbc:mysql://host/database --username name \
    --password * --table <table name> -m 1 \
    --fields-terminated-by '<any delimiter of your choice>'
the -m 1 will give you a single mapper (one output file)
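Presumably the second way is a free-form query import. A rough sketch of that form, with made-up table/column names and target dir:
sqoop import --connect jdbc:mysql://host/database --username name --password * \
    --query 'SELECT a, b FROM mytable WHERE $CONDITIONS' \
    --split-by a --target-dir /user/name/mytable
With --query, the literal token $CONDITIONS must appear in the WHERE clause and --target-dir is required.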
: Kartashov, Andy [mailto:andy.kartas...@mpac.ca]
Sent: Friday, March 01, 2013 9:21 AM
To: user@hadoop.apache.org
Subject: RE: How to use sqoop import
There are two ways to import data from sqoop.
Table dump (without select statements)
Example:
sqoop import --connect jdbc:mysql://host/database
Tadas,
One time I remember disconnecting a bunch of DataNodes from my dev cluster instead of
using the required, more elegant exclude procedure.
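For reference, the more elegant exclude route mentioned above is roughly this; the exclude file name/path below is illustrative:
1. In hdfs-site.xml on the NameNode, point dfs.hosts.exclude at a file, e.g. /etc/hadoop/conf/dfs.exclude
2. Add the hostnames of the DataNodes to be retired to that file
3. Run: hadoop dfsadmin -refreshNodes
4. Wait until the nodes show up as Decommissioned in the NN web UI before shutting them down.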
The next thing I learned was that my FS was corrupted. I did not care about my data
(I could re-import it again) but my NN metadata was messed up, so what worked
for me was to
I thought it was supposed to be addressed to unsubscribe-user.@.
hehehe
From: Prabhat Pandey [mailto:ppan...@us.ibm.com]
Sent: Wednesday, December 19, 2012 12:14 PM
To: user@hadoop.apache.org
Subject: unsubscrib
Why does one need to build an app from source if one can download a gzip file,
gunzip it and use the app? Why git, why check out... What's considered building?
I have been exposed to bits and pieces here and there, and every time I come
across the above topics the lack of knowledge is driving me NUTS.
object the daemon is running with,
currently:
Visit http://DAEMONHOST:PORT/conf (http://namenode:50070/conf for example at
the NN)
For job set information, the info Serge has passed earlier here is correct.
On Thu, Oct 18, 2012 at 2:29 AM, Kartashov, Andy andy.kartas...@mpac.ca wrote
I also see some discrepancy Sqoop'ing data from MySQL. Both MySQL's select
count(*) from.. and sqoop eval --query 'select count(*)..' return an equal
number of rows. But after importing the data into HDFS, hadoop fs -du shows the
imported data at roughly 1/2 the size of the actual table size in the
Tony,
Can you please share with us the performance improvement (if any) after
using compression in map.output? I was about to start looking into it myself.
What compression codec did you use?
Rgds,
AK
-Original Message-
From: Tony Burton [mailto:tbur...@sportingindex.com]
Sent:
...@gmail.com wrote:
Yups I can see my class files there.
On Thu, Nov 29, 2012 at 2:13 PM, Kartashov, Andy
andy.kartas...@mpac.ca
wrote:
Can you try running jar -tvf word_cnt.jar and see if your static nested
classes WordCount2$Map.class and WordCount2
and configuration object
(getting it from org.apache.hadoop.mapreduce.Job instance).
Best,
Mahesh.B.
Calsoft Labs.
On Mon, Nov 26, 2012 at 8:18 PM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Harsh,
Thanks for the DistributedCache.addCacheFile(URI, job.getConfiguration
The common problem users have is adding values to the classpath AFTER the
daemons have been started. If you are getting a ClassNotFound exception and are
100% sure you have correctly specified the path to the jar files, and your jar files
actually contain the compiled class, then simply restart the
Guys,
I understand that if not specified, the default block size of HDFS is 64 MB. You can
control this value by altering the dfs.block.size property and increasing the value
to 64 MB x 2 or 64 MB x 4. Every time we make a change to this property we
must re-import the data for the changes to take effect
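For reference, a minimal sketch of that change in hdfs-site.xml (the value is given in bytes; 128 MB shown here):
<property>
  <name>dfs.block.size</name>
  <value>134217728</value> <!-- 64 MB x 2 = 128 MB -->
</property>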
at that javadoc. Job does have a getConfiguration() method. You
may have missed it the first time because it's inherited from a parent class,
JobContext.
On 27 November 2012 14:23, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Thanks man for the response. Much appreciated.
Why
-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Tuesday, November 27, 2012 12:57 PM
To: user@hadoop.apache.org
Subject: Re: block-size vs split-size
Hi,
Response inline.
On Tue, Nov 27, 2012 at 8:35 PM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
Guys,
I understand that if not specified
With the Error: Output directory hdfs://.com/user/krishna/input already
exists, simply delete the output directory prior to running the job:
$ hadoop fs -rm -r output
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Saturday, November 24, 2012 6:02 AM
To:
You could also run $sudo service --status-all
-Original Message-
From: a...@hsk.hk [mailto:a...@hsk.hk]
Sent: Monday, November 26, 2012 8:41 AM
To: user@hadoop.apache.org
Cc: a...@hsk.hk
Subject: Re: Datanode: Cannot start secure cluster without privileged
resources
Hi,
A question:
I
]
Sent: Saturday, November 24, 2012 2:22 AM
To: user@hadoop.apache.org
Subject: Re: MapReduce APIs
You could use the org.apache.hadoop.filecache.DistributedCache API as:
DistributedCache.addCacheFile(URI, job.getConfiguration());
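A slightly fuller sketch of that usage, in the driver and in the mapper's setup(); the file path and class names here are illustrative, not from this thread:

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheExample {

  public static class CacheMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context) throws IOException {
      // Inside the task: the cached file has been copied to the local disk of each node.
      Path[] localFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
      // ... open localFiles[0] here and load whatever lookup data you need ...
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "distributed-cache-sketch");
    job.setJarByClass(CacheExample.class);
    job.setMapperClass(CacheMapper.class);

    // In the driver: register an HDFS file with the cache before submitting the job.
    DistributedCache.addCacheFile(new URI("/user/andy/lookup.txt"),
        job.getConfiguration());

    // ... set input/output paths and key/value classes, then job.waitForCompletion(true) ...
  }
}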
On Sat, Nov 24, 2012 at 3:06 AM, Kartashov, Andy andy.kartas
Try running $sudo jps or jps as root #jps
You will get more info, i.e:
Jps
* SecondaryNameNode
* JobTracker
* NameNode
-Original Message-
From: a...@hsk.hk [mailto:a...@hsk.hk]
Sent: Monday, November 26, 2012 9:44 AM
To: Harsh J
Cc: a...@hsk.hk; user@hadoop.apache.org
Guys,
I know that there is old and new API for MapReduce. The old API is found under
org.apache.hadoop.mapred and the new is under org.apache.hadoop.mapreduce
I successfully used both (the old and the new API) writing my MapReduce
drivers.
The problem came up when I tried to use distributed
This is on port 50075, correct?
From: Dhaval Shah [mailto:prince_mithi...@yahoo.co.in]
Sent: Friday, November 23, 2012 3:28 PM
To: user@hadoop.apache.org
Subject: HDFS Web UI - Incorrect context
Hello everyone.. I have a very weird issue at hand.. I am using CDH4.0.0 on one
of my clusters and
Guys,
I've read that increasing the above (default 4 KB) number to, say, 128 KB might speed
things up.
My input is 40 million serialised records coming from an RDBMS, and I noticed that with
the increased IO setting my job actually runs a tiny bit slower. Is that possible?
p.s. got two questions:
1. During Sqoop import
Jamal,
This is what I am using...
After you start your job, visit jobtracker's WebUI ip-address:50030
And look for the Cluster Summary. Reduce Task Capacity shall hint at what to
optimally set your number to. I could be wrong but it works for me. :)
Cluster Summary (Heap Size is *** MB/966.69 MB)
Bejoy,
I've read somewhere about keeping the number of mapred.reduce.tasks below the
reduce task capacity. Here is what I just tested:
Output 25Gb. 8DN cluster with 16 Map and Reduce Task Capacity:
1 Reducer - 22mins
4 Reducers - 11.5mins
8 Reducers - 5mins
10 Reducers - 7mins
12 Reducers -
Guys,
After changing the block size property from 64 to 128 MB, will I need to
re-import data, or will running the hadoop balancer resize the blocks in HDFS?
Thanks,
AK
Cheers!
From: Kai Voigt [mailto:k...@123.org]
Sent: Tuesday, November 20, 2012 11:34 AM
To: user@hadoop.apache.org
Subject: Re: block size
Hi,
Am 20.11.2012 um 17:31 schrieb Kartashov, Andy
andy.kartas...@mpac.ca:
After changing property of block size from 64
I specify mine inside mapred-site.xml
<property>
  <name>mapred.reduce.tasks</name>
  <value>20</value>
</property>
Rgds,
AK47
From: Bejoy KS [mailto:bejoy.had...@gmail.com]
Sent: Tuesday, November 20, 2012 3:10 PM
To: user@hadoop.apache.org
Subject: Re: number of reducers
Hi Sasha
By default the
Guys,
I am learning that the NN doesn't persistently store block locations: only file
names and their permissions as well as file blocks. It is said that locations
come from the DataNodes when the NN starts.
So, how does it work?
Say we only have one file A.txt in our HDFS that is split into 4 blocks
Guys,
Sometimes when I run my MR job I see that Reduce tasks kick in as early as when
the Map tasks have reached only about 20%. How can MR possibly be so sure and start
running Reduce at this point? What if a Mapper produces more keys that the Reduce
function has already finished with?
Andy Kartashov
MPAC
actually signifies other intermediate processes like shuffle
and sort. Don't get confused with it like I did initially :)
Regards,
Mohammad Tariq
On Mon, Nov 19, 2012 at 8:07 PM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Guys,
Sometimes when I run my MR job I
(another one of the replicated blocks) when and
only when the initially running task (say on DN1) failed
Thanks,
From: Kai Voigt [mailto:k...@123.org]
Sent: Monday, November 19, 2012 10:01 AM
To: user@hadoop.apache.org
Subject: Re: a question on NameNode
Am 19.11.2012 um 15:43 schrieb Kartashov
Agreed here. Whenever you have an id disagreement between NN and DN, simply
delete all the entries in your dfs/data directory and restart the DN. No need to
reformat the NN.
Rgds,
AK47
From: shashwat shriparv [mailto:dwivedishash...@gmail.com]
Sent: Friday, November 16, 2012 2:53 AM
To:
Vinay,
Two questions.
1. "Configure the another namenode's configuration."
What exactly needs to be configured?
2. What is zkfs?
From: Vinayakumar B [mailto:vinayakuma...@huawei.com]
Sent: Friday, November 16, 2012 3:31 AM
To: user@hadoop.apache.org
Subject: RE: High
haven't already done
that.
/* Joey */
On Fri, Nov 16, 2012 at 3:13 PM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Guys,
The notorious error. Used all possible clues to resolve this.
Ran sqoop import at the command line without problems. But whenever I am
Guys,
Have struggled for the last four days with this and still cannot find an answer
even after hours of searching the web.
I tried oozie workflow to execute my consecutive sqoop jobs in parallel. I use
forking that executes 9 sqoop-action-nodes.
I had no problem executing the job on a
=launcherpoolname
Your target pool for launchers can still carry limitations, but it
should no longer deadlock your actual MR execution (after which the
launcher dies away anyway).
Unqte
Please help.
Thanks,
Ak-47
From: Kartashov, Andy
Sent: Thursday, November 15, 2012 9:45 AM
To: user
Yinghau,
Last week you mentioned that you are running your cluster on EC2 and that you shut
down the instance(s) over the weekend.
QTE
Since I shutdown the EC2 instance every night, I thought that using
'master','slave1','slave2' will save typing after the full host name change
with reboot.
Mark,
The way I understand it...
The # of mappers is calculated by dividing your input by the file split size (64 MB
default).
So, say your input is 64 GB in size, you will end up with 1,024 mappers.
Slots are a different story. It is the number of map tasks processed by a single
node in parallel. It is
Guys,
Came across this error like many others who tried to run the Oozie examples.
Searched and read a bunch of posts on this topic. Even came across Harsh's
response stipulating that the oozie user must be added to the user group on the
name node, but it wasn't explained how. Any insight please?
Guys,
A few questions please.
1. When I tried to run the Oozie examples I was told to copy the /examples
folder into HDFS. However when I tried to run the oozie job I was told that the
source file was not found. Well, until I cd'ed into the local directory on
Linux and re-ran the job
Yinghua,
What mode are you running your hadoop in: Local/Pseudo/Fully-distributed...?
Your hostname is not recognised.
Your configuration setting seems to be wrong.
Hi, all
Could someone help look at this problem? I am setting up a four-node cluster on
EC2 and it seems that the cluster is set up fine
, NodeManager and JobHistoryServer. Each
node can ssh to all the nodes without problem.
But a problem appears when trying to run any job.
From: Kartashov, Andy
Sent: Friday, November 09, 2012 12:37 PM
To: user@hadoop.apache.org
Subject: Erro running pi programm
Yinghua,
What mode are you running
Hadoopers,
“Hadoop ships the code to the data instead of sending the data to the code.”
Say you added two DNs/TTs to the cluster. They have no data at this point, i.e.
you have not run the balancer.
In view of the above quoted statement, will these two nodes not participate in
the MapReduce job
, one would expect that your Data nodes would
naturally take part in this portion of the task if the num.reducers parameter
was specified.
On Thu, Nov 8, 2012 at 9:35 AM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Hadoopers,
Hadoop ships the code to the data instead
Try adding user [raw] to the hadoop group. For now, user [raw] belongs to
"others", where no read|write|execute permissions are set.
From: rongshen.long [mailto:rongshen.l...@baifendian.com]
Sent: Thursday, November 08, 2012 3:00 AM
To: user
Subject: feel puzzled at HDFS group
Guys,
Please help me out here. I have an issue with versions: I have hsqldb.jar in
both 2.2.9 and 1.8.0.10. By the way, which project do they come with and why do I
have two? I want to know in case I need to reinstall/upgrade...
The problem is, my Sqoop jobs were initially complaining whenever I
Guys,
When running the examples, you bring them into HDFS. Say you need to make some
correction to a file: you make it on the local FS and run $ hadoop fs -put
... again. You cannot just make changes to files inside HDFS, except for
touchz'ing a file, correct?
Just making sure.
Thnx,
AK
Guys,
Sometimes I get an occasional e-mail saying at the top:
This might be a phishing e-mail and is potentially unsafe. Links and other
functionality have been disabled
Is this because of the posted links?
Rgds,
AK
Sadak,
Sorry, could not answer your original e-mail as it was blocked.
Are you running SNN on a separate node?
If so, it needs to communicate with NN.
Add this property to your hdfs-site.xml
<property>
  <name>dfs.namenode.http-address</name>
  <value>host-name:50070</value> <!-- change it to
The way I understand it...
Hadoop is a distributed file system that allows you to create folders in its
own NameSpace and copy files to and from your local Linux FS. You set up the Hadoop
configuration for a local|pseudo-distributed|fully-distributed cluster.
You write your jobs using MapReduce API and
Have you tried hadoop fs -chmod a+rwx /tmp
From: Arun C Murthy [mailto:a...@hortonworks.com]
Sent: Wednesday, November 07, 2012 3:11 PM
To: user@hadoop.apache.org
Subject: Re: Sticky Bit Problem (CDH4.1)
Pls ask Cloudera lists...
On Nov 7, 2012, at 9:57 AM, Brian Derickson wrote:
Hey all,
ownership. DFS needs hdfs:hadoop / MapReduce needs
mapred:hadoop
Feel free to correct me if I am wrong.
Rgds,
AK47
-Original Message-
From: Kartashov, Andy
Sent: Friday, October 26, 2012 12:40 PM
To: user@hadoop.apache.org
Subject: cluster set-up / a few quick questions
Gents,
1.
- do
Hadoopers,
How does one start Daemons remotely when scripts normally require root user to
start them? Do you modify scripts?
Thanks,
/6/12 3:07 PM, Kartashov, Andy wrote:
Harsh/Ravi,
I wrote my own scripts to [start|stop|restart] the [hdfs|mapred] daemons. The
content of the script is
$ sudo service hadoop-[hdfs-*|mapred-*] [start|stop|restart]
I have no problem starting Daemons locally on each node. I have sudo access
granted. I run
Your error takes place during the reduce task, when temporary files are written to
memory/disk. You are clearly running low on resources. Check your memory ($ free -m)
and disk space ($ df -H), as well as $ hadoop fs -df
I remember it took me a couple of days to figure out why I was getting heap
size
... another thought... do you happen to have some ungracefully terminated
jobs still running in the background?
Try hadoop job -list
Sometimes, when I hard-stop a job and restart a new one, I notice a slowdown
until I kill those jobs gracefully by running hadoop job -kill job-id . The
-
From: Kartashov, Andy
Sent: Friday, October 26, 2012 3:56 PM
To: user@hadoop.apache.org
Subject: RE: cluster set-up / a few quick questions - SOLVED
Hadoopers,
The problem was in EC2 security. While I could passwordlessly ssh into another
node and back I could not telnet to it due to EC2
I checked my MR admin page running on host:50030/jobtracker.jsp; this is what
I am learning.
I run a 2-core processor, so the admin page told me that my max map/reduce slot
capacity was 14 or so; I assume 7 nodes x 2 slots.
I did not touch the .map.tasks property. It seemed that MR set it
Yulia,
I just ran hadoop version on my NN, separate SNN, and three separate DNs, and
all point to exactly the same build version. I know that you must install
the exact same version on every machine in your cluster to run fully-distributed
Hadoop.
From: Yulia Stolin [mailto:yu...@amobee.com]
Sent:
Guys,
Do you set the above property once on the master NN in mapred-site.xml, or
on each DN/TT?
Similarly, what about the property mapred.child.java.opts: once on the NN, or on each and
every DN/TaskTracker?
-hdfs-datanode
but I ran only
sudo yum install hadoop-0.20-mapreduce-tasktracker
After installing datanode and reformatting the namespace, datanode started like
a new engine.
Silly me. Oh well. :) Calm seas do not make good sailors.
AK47
From: Kartashov, Andy
Sent: Thursday, October 25, 2012
Gents,
1.
- do you put the Master node's hostname under fs.default.name in core-site.xml on
the slave machines, or the slaves' hostnames?
- do you need to run sudo -u hdfs hadoop namenode -format and create the /tmp and
/var folders on the HDFS of the slave machines that will be running only DN and
TT, or not?
(or root)
2) If the answer to question 1 is yes, how did you start NN, JT, DN and TT?
3) If you started them one by one, there is no reason running a command on one
node will execute it on the other.
On Sat, Oct 27, 2012 at 12:17 AM, Kartashov, Andy andy.kartas...@mpac.ca
wrote:
Andy, many thanks.
I am
Yogesh,
Have you figured it out? I had the same issue (needed passwordless ssh) and
managed. Let me know if you are still stuck.
AK47
From: yogesh.kuma...@wipro.com [mailto:yogesh.kuma...@wipro.com]
Sent: Thursday, October 25, 2012 4:28 AM
To: user@hadoop.apache.org
Subject: ERROR:: SSH
Yogesh,
One needs to understand how passwordless ssh works.
Say there is a user "yosh".
He types ssh localhost and is prompted for a password. This is how to resolve
this.
1.
Type: ssh-keygen -t rsa
-t stands for type, and rsa is the encryption - another type would be dsa.
Well, after you run the above
Yogesh,
If you are asked for a password, your passwordless SSH is not working.
Shoot, forgot one detail. Please remember to set the authorized_keys file to 600
permissions. :)
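Putting the whole recipe together, a rough sketch (run as the user who will be doing the ssh'ing):
ssh-keygen -t rsa                                   # accept the defaults, empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys     # or append it to the remote host's ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                       # should no longer prompt for a password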
From: yogesh dhari [mailto:yogeshdh...@live.com]
Sent: Thursday, October 25, 2012 1:14 PM
To: hadoop helpforoum
Subject: RE:
Guys,
I finally solved ALL the Errors in ...datanode*.log after trying to start
the node with service datanode start.
The errors were:
- conflicting NN/DN ids - solved through reformatting the NN.
- could not connect to 127.0.0.1:8020 - Connection refused - solved through
correcting a typo
Guys, tried for hours to resolve this error.
I am trying to import a table to Hadoop using Sqoop.
ERROR is:
Error:
org.hsqldb.DatabaseURL.parseURL(Ljava/lang/String;ZZ)Lorg/hsqldb/persist/HsqlProperties
I realise that there is an issue with the versions of hsqldb.jar files
At first, Sqoop was
Subash,
I have been experiencing this type of error at some point and no matter how
much I played with the heap size it didn't work. What I found out in the end is I
was running out of physical memory. My output file was about 4 GB with only
2.5 GB of free space available. Check your space with
://www.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/
On Fri, Oct 19, 2012 at 3:30 AM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
They are not comparable.
YARN, also known as MRv2, is the newer version of MapReduce (also known as MRv1).
-Original Message-
From: Tom Brown
Gentlemen,
Can you please explain which property is responsible for telling the JobTracker where
the TaskTrackers are running?
This is how far I have got so far:
The NN knows where the SNN runs by specifying its FQDN in conf/masters on the NN.
The NN knows where the DNs run by specifying the FQDNs of the DNs in conf/slaves on the
NN.
that successfully register with it at
runtime.
Thanks,
+Vinod
On Oct 22, 2012, at 7:55 AM, Kartashov, Andy wrote:
Gentlemen,
Can you please explain which property is responsible for telling the JobTracker where
the TaskTrackers are running?
This is how far I have got so far:
The NN knows where the SNN runs by specifying its FQDN
They are not comparable.
YARN, also known as MRv2, is the newer version of MapReduce (also known as MRv1).
-Original Message-
From: Tom Brown [mailto:tombrow...@gmail.com]
Sent: Thursday, October 18, 2012 4:33 PM
To: user@hadoop.apache.org
Subject: Differences between YARN and Hadoop
To
Is there a command line in hadoop, or a Java method, to display all (if not
individual) of hadoop's current properties and what they are set to?
Rgds,
AK
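On the Java side, one way to dump everything the client-side Configuration resolves (a minimal sketch, complementing the http://DAEMONHOST:PORT/conf trick mentioned earlier in this list):

import org.apache.hadoop.conf.Configuration;

public class DumpConf {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Prints the merged *-default.xml / *-site.xml properties as XML.
    conf.writeXml(System.out);
  }
}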
Gents,
Let’s not forget about fun. This is an awesome parody clip on Hadoop. Funny,
yet quite informative:
http://www.youtube.com/watch?v=hEqQMLSXQlY
Rgds,
AK
: Friday, October 12, 2012 6:24 PM
To: user@hadoop.apache.org
Subject: Re: Issue when clicking on BrowseFileSystem
On Fri, Oct 12, 2012 at 2:09 PM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
It displays:
/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=localhost.localdomain:8020
OK
, xorg, Gnome, etc.
-andy
On Mon, Oct 15, 2012 at 12:24 PM, Kartashov, Andy andy.kartas...@mpac.ca
wrote:
Andy,
My /etc/hosts does say: 127.0.0.1 localhost.localdomain localhost
Shall I delete this entry?
The only reference to localhost is in:
Core-site:
property
I too can open the domain:50070/dfshealth.jsp page and click the NameNode Logs
link; however when I click on the Browse the filesystem link I get the following:
Network Error (dns_unresolved_hostname)
Your requested host localhost.localdomain could not be resolved by DNS.
For assistance, contact your
cluster?
If none of the above seems diagnostic, what do the
NameSystem.registerDatanode: node registration from
DatanodeRegistration(192.168.122.87 log messages in your namenode log look
like?
-andy
On Fri, Oct 12, 2012 at 10:20 AM, Kartashov, Andy andy.kartas...@mpac.ca
wrote:
hadoop version
Isaacson [mailto:a...@cloudera.com]
Sent: Friday, October 12, 2012 4:31 PM
To: user@hadoop.apache.org
Subject: Re: Issue when clicking on BrowseFileSystem
On Fri, Oct 12, 2012 at 11:42 AM, Kartashov, Andy andy.kartas...@mpac.ca
wrote:
You are absolutely right. It was indeed localhost
Guys,
Is there another way to set the output from within the Mapper?
My Mapper reads from various serialised files and generates different types of
objects for values depending on the returned value of instanceof. I wanted to
change the class name within the Mapper as opposed to the driver
with the Mapper? Maybe you could use
the same approach.
Regards
Bertrand
On Thu, Oct 11, 2012 at 7:49 PM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Guys,
Is there another way to set the output from within the Mapper?
My Mapper reads from various serialised files
Have you created a sub-dir under /user/ as /user/robing for user robing?
Depending on your version of hadoop it is important to set up your directory
structure users/groups properly.
Here is just an example:
drwxrwxrwt - hdfs supergroup 0 2012-04-19 15:14 /tmp
drwxr-xr-x - hdfs
Guys,
I have trouble using a sequence file in Map-Reduce. The output I get is only the
very last record.
I am creating the sequence file while importing a MySQL table into Hadoop using:
$ sqoop import .. --as-sequencefile
I am then trying to read from this file into the mapper and create keys
from
(). *blash*
Andy Kartashov
MPAC
Architecture RD, Co-op
1340 Pickering Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787
andy.kartas...@mpac.ca
From: Kartashov, Andy
Sent: Tuesday, October 09, 2012 1:19 PM
To: user@hadoop.apache.org
Subject
Guys,
Has anyone successfully executed commands like
sqoop job --list
sqoop job --create .. etc.
Do I need to set up my sqoop-core.xml beforehand?
Example..
sqoop job --list
12/10/05 09:44:29 WARN hsqldb.HsqldbJobStorage: Could not interpret as a
number: null
12/10/05 09:44:29 ERROR
Parkway, Pickering, L1V 0C4
* Phone : (905) 837 6269
* Mobile: (416) 722 1787
andy.kartas...@mpac.ca
From: Marcos Ortiz [mailto:mlor...@uci.cu]
Sent: Friday, October 05, 2012 10:52 AM
To: user@hadoop.apache.org
Cc: Kartashov, Andy
Subject: Re: sqoop jobs
Which version
Guys, I have been scratching my head for the past couple of days. Why are my
tags duplicated while the content they wrap around, i.e. my StringBuilder sb, is
not?
My Reduce code is:
while (values.hasNext()) {
    sb.append(values.next().toString());
}
output.collect(key, new
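For reference, a self-contained old-API reducer along the lines of the fragment above; the class name and the Text key/value types are assumptions, not necessarily the poster's:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Concatenates all values for a key into one output record.
public class ConcatReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    StringBuilder sb = new StringBuilder();
    while (values.hasNext()) {
      sb.append(values.next().toString());
    }
    output.collect(key, new Text(sb.toString()));
  }
}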
at the output
Hi,
Could you clarify your post to show what you expect your code to have actually
printed and what it has printed?
On Tue, Oct 2, 2012 at 7:01 PM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
Guys, have been scratching my head for the past couple of days. Why
are my tags
a Mapper? Is there any chance that logic in the Mapper wrapped
the values with the tags too, so that the records were already wrapped when
they entered the reducer logic?
Thank you,
--Chris
On Tue, Oct 2, 2012 at 9:01 AM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
I
]
Sent: Monday, October 01, 2012 2:04 PM
To: user@hadoop.apache.org
Subject: Re: Nested class
it should work. Make sure top level class is public
On Oct 1, 2012 1:32 PM, Kartashov, Andy
andy.kartas...@mpac.ca wrote:
Hello all,
Is it possible to have a Reducer