Hi,
I want to use SolrCloud. I downloaded the code from trunk, and
successfully executed the examples as shown in the wiki. But when I try the same
with multicore, I cannot access:
http://localhost:8983/solr/collection1/admin/zookeeper.jsp
It says page not found.
Following is my
Hi,
While running a mapreduce task, I get a weird exception: task process
exit with a nonzero status of 126. I tried hunting for the same online,
couldn't find any lead. Any pointers?
Regards,
Raakhi
Hi,
I am running a job which has a lot of preprocessing involved. So when I
run my class from a jar file, somehow it terminates after some time without
giving any exception.
I have tried running the same program several times, and every time it
terminates at a different location in the code (during
Hi,
Has anyone tried creating a custom InputFormat which reads from
a Solr index for processing using mapreduce? Is it possible to do that, and
how?
Regards,
Raakhi
Hi,
I have been trying to implement custom input and output formats. I
was successful in creating a custom OutputFormat, but when I call a
mapreduce method which takes in a file, using the custom InputFormat, I get an
exception:
java.lang.NullPointerException
at
Hi,
I am writing a map reduce program which reads a file from HDFS and
stores the contents in a static map (declared and initialized before executing
map reduce). However, after executing the map-reduce program, my map
returns 0 elements. Is there any way I can make the data persistent in
Hi,
Suppose I have an HDFS file with 10,000 entries, and I want my job to
process 100 records at one time (to minimize loss of data during job
crashes/network errors etc.). So if a job can read a subset of records from
a file in HDFS, I can combine that with chaining to achieve my objective. For
Hi,
I am running a map reduce program which reads data from a file,
processes it and writes the output into another file.
I run 4 maps and 4 reduces, and my output is as follows:
09/08/27 17:34:37 INFO mapred.JobClient: Running job: job_200908271142_0026
09/08/27 17:34:38 INFO
Hi J-D,
I tried it. I defined
private static enum Counters { ROWS }
at the start of my program,
and after executing the rowcounter mapreduce, calling
c.getCounter(Counters.ROWS) returns 0.
I have been through the source code of RowCounter; they have defined enum
Counters to be private, so I am not
Hi,
I am not very clear on how the memcache works.
1. When you set memcache to say 1MB, does hbase write all the table
information into some cache memory, and when the size reaches 1MB, it writes
into hadoop and after that the replication takes place?
2. Is there any minimum
Hi Bharath,
You are trying to extend an interface in your class. This is where
you are going wrong.
You could alternatively define a class that implements the TableMap interface,
or use IdentityTableMap, depending on your requirement.
Hope that helps,
Regards,
Raakhi
On Wed, Jul 22, 2009 at
Hi,
I am interested in querying my hbase tables using Pig Latin. I have
come across the org.apache.pig.backend.hadoop.hbase API... but there is no
documentation on usage of the API.
Has anyone tried an example using this?
Thanks,
Raakhi
-0.19.1-index.jar -inputPaths input-path -outputPath
output-path-for-log -indexPath path-to-store-indexes -conf
src/contrib/index/conf/index-config.xml
I hope this helps.
Regards,
- Bhushan
-Original Message-
From: Rakhi Khatwani [mailto:rakhi.khatw...@gmail.com]
Sent: Monday
for analysis?
Regards
Raakhi
On Fri, Jun 19, 2009 at 4:19 PM, Harish Mallipeddi
harish.mallipe...@gmail.com wrote:
On Fri, Jun 19, 2009 at 4:06 PM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
we want hadoop cluster 1 for collecting data and storing it in HDFS
we want hadoop cluster 2
Hi,
You could also use Apache Commons Logging to write logs in your
map/reduce functions, which will be visible in the jobtracker UI.
That's how we did debugging :)
Hope it helps,
Regards,
Raakhi
On Tue, Jun 16, 2009 at 7:29 PM, jason hadoop jason.had...@gmail.com wrote:
When you are running
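As a rough sketch of the logging pattern described above (names are illustrative; in a real mapper of this era you would use org.apache.commons.logging's LogFactory.getLog(...) so the output lands in the task logs visible from the jobtracker UI — java.util.logging is used here only so the snippet is self-contained):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Illustrative stand-in for a map function that logs each record it sees.
// In a real Hadoop mapper you would declare:
//   private static final Log LOG = LogFactory.getLog(MyMapper.class);
// (org.apache.commons.logging) and call LOG.info(...) the same way.
public class MapperLogDemo {
    private static final Logger LOG = Logger.getLogger(MapperLogDemo.class.getName());

    // A toy "map" that uppercases records, logging as it goes.
    public static List<String> map(List<String> records) {
        List<String> out = new ArrayList<String>();
        for (String record : records) {
            LOG.info("processing record: " + record); // shows up in task logs on a real cluster
            out.add(record.toUpperCase());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(map(java.util.Arrays.asList("a", "b")));
    }
}
```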
Hi,
Can we specify which subset of machines to use for different jobs? E.g. we
set machine A as namenode, and B, C, D as datanodes. Then for job 1, we have
a mapreduce that runs on B, C, and for job 2, the map-reduce runs on C, D.
Regards,
Raakhi
region server? You sure you insert values into
entry:hostname and entry:msg columns?
On Fri, May 29, 2009 at 1:24 AM, Rakhi Khatwani
rakhi.khatw...@gmail.com wrote:
Hi,
I was trying to create secondary indexes on an hbase table. I used the below
reference to create the table, and I got
,
St.Ack
On Mon, Jun 1, 2009 at 4:12 AM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi,
Yeah, I actually did that... but I did it from the hbase shell. When I do it
programmatically, it works fine.
Regards,
Raakhi
On Sat, May 30, 2009 at 12:32 AM, Xinan Wu wuxi...@gmail.com wrote
Hi,
I was trying to create secondary indexes on an hbase table. I used the below
reference to create the table, and it got created successfully.
reference:
http://blog.eventexchange.net/2009/05/creating-hbase-indexes.html
For example, I create a table called TestIndexTable using the above
-node's
directories
are damaged.
In regular case you start name-node with
./hadoop-daemon.sh start namenode
Thanks,
--Konstantin
Rakhi Khatwani wrote:
Hi,
I followed the instructions suggested by you all, but I still
come across this exception when I use the following command
On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
rakhi.khatw...@gmail.com
wrote:
Hi,
I want to set up a cluster of 5 nodes in such a way that
node1 - master
node2 - secondary namenode
node3 - slave
node4 - slave
node5 - slave
How do we go about that?
There is no property
Hi,
How do I get the job progress and task progress information
programmatically at any point of time using the APIs?
There are JobInProgress and TaskInProgress classes... but both of them are
private.
Any suggestions?
Thanks,
Raakhi
.
Thanks,
Raakhi
On Mon, May 18, 2009 at 7:46 PM, Jothi Padmanabhan joth...@yahoo-inc.com wrote:
Could you let us know what information you are looking to extract from
these classes? You possibly could get them from other classes.
Jothi
On 5/18/09 6:23 PM, Rakhi Khatwani rakhi.khatw
folder
misleading
Billy
Rakhi Khatwani rakhi.khatw...@gmail.com wrote in message
news:384813770905140603g4d552834gcef2db3028a00...@mail.gmail.com...
Hi,
I want to set up a cluster of 5 nodes in such a way that
node1 - master
node2 - secondary
Hi,
I want to set up a cluster of 5 nodes in such a way that
node1 - master
node2 - secondary namenode
node3 - slave
node4 - slave
node5 - slave
How do we go about that?
There is no property in hadoop-env where I can set the IP address for the
secondary namenode.
If I set node-1 and node-2 in
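For reference: in Hadoop of this era (0.19/0.20) there is indeed no hadoop-env property for the secondary namenode's address. The secondary namenode host goes in the conf/masters file (which, despite its name, lists secondary namenode hosts), while conf/slaves lists the datanode/tasktracker machines. A sketch for the layout above, hostnames being placeholders:

```
# conf/masters -- host(s) that run the secondary namenode
node2

# conf/slaves -- hosts that run datanode + tasktracker daemons
node3
node4
node5
```

The namenode itself runs on whichever machine you invoke bin/start-dfs.sh from (node1 here), with fs.default.name in hadoop-site.xml pointing at it.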
Hi, I have a couple of small issues regarding hadoop/hbase:
1. I want to scan a table, but the table is really huge, so I want to send the
result of the scan to some file so that I can analyze it. How do we go about it?
2. How do you dynamically add and remove nodes in the cluster without
disturbing the
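On the second question, the usual mechanism in Hadoop of this vintage is decommissioning via an exclude file rather than restarting anything. A sketch (the file path is a placeholder):

```xml
<!-- hadoop-site.xml: point the namenode at a file listing nodes to retire -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/path/to/conf/excludes</value>
</property>
```

After adding a hostname to that file, `bin/hadoop dfsadmin -refreshNodes` tells the namenode to start draining the node's blocks; adding a node is just starting the datanode/tasktracker daemons on it with the same configuration. On the first question, redirecting the shell can work in a pinch: `echo "scan 'mytable'" | bin/hbase shell > scan-output.txt` (table name is a placeholder).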
Hi Jason,
When will the full version of your book be available?
On Thu, Apr 30, 2009 at 8:51 AM, jason hadoop jason.had...@gmail.com wrote:
You need to make sure that the shared library is available on the
tasktracker nodes, either by installing it, or by pushing it around via the
Hi,
In one of the map tasks, i get the following exception:
java.io.IOException: Task process exit with nonzero status of 255.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
java.io.IOException: Task process exit with nonzero status of 255.
at
Thanks Jason,
Is there any way we can avoid this exception?
Thanks,
Raakhi
On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop jason.had...@gmail.com wrote:
The jvm had a hard failure and crashed
On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani
rakhi.khatw...@gmail.com wrote:
Hi
Hi,
I have faced a somewhat similar issue...
I have a couple of map reduce jobs running on EC2... after a week or so,
I get a no space on device exception while performing any linux command...
so I end up shutting down hadoop and hbase, clearing the logs and then
restarting them.
Is there a cleaner
redirect your logs to
some place under /mnt (/dev/sdb1); that's 160 GB.
- Aaron
On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi,
I have faced a somewhat similar issue...
I have a couple of map reduce jobs running on EC2... after a week or
so,
I get
Hi,
I have a table with N records.
Now I want to run a map reduce job with 4 maps and 0 reduces.
Is there a way I can create my own custom input split so that I can
send 'n' records to each map?
If there is a way, can I have a sample code snippet to gain better
understanding?
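Setting the Hadoop plumbing aside (a custom InputFormat whose getSplits() returns your own InputSplit subclass), the core of sending 'n' records to each map is just partition arithmetic: cut N records into chunks of n and hand each chunk's offset and length to one map. A plain-Java sketch of that arithmetic, kept free of Hadoop classes so it stands alone:

```java
import java.util.ArrayList;
import java.util.List;

// Computes (startRecord, length) pairs, one per split, covering all N records.
// A custom InputFormat's getSplits() would wrap each pair in an InputSplit
// and the RecordReader would then iterate records [start, start + length).
public class SplitMath {
    public static List<long[]> splits(long totalRecords, long recordsPerSplit) {
        List<long[]> result = new ArrayList<long[]>();
        for (long start = 0; start < totalRecords; start += recordsPerSplit) {
            // Last chunk may be shorter than recordsPerSplit.
            long len = Math.min(recordsPerSplit, totalRecords - start);
            result.add(new long[] { start, len });
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] s : splits(10, 4)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With exactly 4 maps over N records you would pick recordsPerSplit = ceil(N/4); 0 reduces is just conf.setNumReduceTasks(0) on the job.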
of the above class and it should be obvious - I hope.
Lars
Rakhi Khatwani wrote:
Hi,
I have a table with N records.
Now I want to run a map reduce job with 4 maps and 0 reduces.
Is there a way I can create my own custom input split so that I can
send 'n' records to each map
).
St.Ack
On Wed, Apr 22, 2009 at 9:06 AM, Stack saint@gmail.com wrote:
If you run
./bin/hadoop jar hbase.jar rowcounter
It will emit usage. You are a smart fellow. I think you can take it from
there.
Stack
On Apr 22, 2009, at 5:48, Rakhi Khatwani rakhi.khatw...@gmail.com
, Apr 22, 2009 at 10:13 PM, stack st...@duboce.net wrote:
Sorry. I'm having trouble following your question below. Want to have
another go at it?
Thanks,
St.Ack
On Tue, Apr 21, 2009 at 3:19 AM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi,
I have a scenario:
I have
Thanks Stack
will try that tomorrow.
Regards,
Raakhi
On Wed, Apr 22, 2009 at 10:33 PM, stack st...@duboce.net wrote:
On Wed, Apr 22, 2009 at 9:53 AM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi Stack,
In the traditional scenario, an InputSplit is given to the
map
was.
St.Ack
On Wed, Apr 22, 2009 at 9:50 AM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi St.Ack,
Well, I did go through the usage... where we were supposed to
mention 3 parameters: OutputDir, TableName and Columns.
What I actually wanted is an int value count, which
Hi,
I have a scenario:
I have a table... which has to be read into say 'n' maps.
So now in each map... I need to access say 'm' records at once... so
that I can spawn them using threads... to increase parallel processing.
Is it feasible? I am using hadoop 0.19.0 and hbase
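Spawning threads inside a map task is feasible (each map task runs in its own JVM), though the extra threads compete with the node's other task slots for CPU. The usual shape is a fixed-size pool draining the batch of 'm' records — a self-contained sketch of that pattern, with the uppercasing standing in for real per-row work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Processes a batch of records on a fixed-size thread pool, as a map task
// might do with the 'm' rows it pulled from a table scan.
public class BatchProcessor {
    public static List<String> processBatch(List<String> records, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<Future<String>>();
            for (final String record : records) {
                futures.add(pool.submit(new Callable<String>() {
                    public String call() {
                        return record.toUpperCase(); // stand-in for real per-row work
                    }
                }));
            }
            List<String> results = new ArrayList<String>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks; preserves submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processBatch(java.util.Arrays.asList("a", "b", "c"), 2));
    }
}
```

In a real mapper you would also make sure every Future is collected (or its exception rethrown) before the map() call returns, so failures still fail the task.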
node problem
with 10.
- Andy
From: Rakhi Khatwani
Subject: Re: Ec2 instability
To: hbase-u...@hadoop.apache.org, core-user@hadoop.apache.org
Date: Friday, April 17, 2009, 9:44 AM
Hi,
This is the exception I have been getting at the mapreduce:
java.io.IOException: Cannot run
Hi,
It's been several days since we have been trying to stabilize
hadoop/hbase on an ec2 cluster, but we have failed to do so.
We still come across frequent region server failures, scanner timeout
exceptions, OS level deadlocks etc...
And today while doing a list of tables on hbase I get the following
)
at java.lang.ProcessImpl.start(ProcessImpl.java:65)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
... 10 more
On Fri, Apr 17, 2009 at 10:09 PM, Rakhi Khatwani
rakhi.khatw...@gmail.com wrote:
Hi,
It's been several days since we have been trying to stabilize
hadoop
persists
after this, please let us know.
- Andy
From: Rakhi Khatwani
Subject: EOF Exception while performing a list command on hbase shell
To: hbase-user@hadoop.apache.org
Date: Thursday, April 16, 2009, 3:14 AM
Hi,
I tried to list all the tables on hbase and I get the
following
Hi,
I am running a map-reduce program on a 6-node ec2 cluster, and after a
couple of hours all my tasks get hung.
So I started digging into the logs:
there were no logs for the regionserver,
no logs for the tasktracker.
However for the jobtracker I get the following:
2009-04-16 03:00:29,691 INFO
of memory available, but I
still get the exception :(
Thanks
Raakhi
On Thu, Apr 16, 2009 at 1:18 PM, Desai, Milind B milind.de...@hp.com wrote:
From the exception it appears that there is no space left on the machine. You
can check using 'df'.
Thanks
Milind
-Original Message-
From: Rakhi
you don't.
I would check on the file system as your jobs run and see if indeed
they are filling up.
Miles
2009/4/16 Rakhi Khatwani rakhi.khatw...@gmail.com:
Hi,
Following is the output of the df command:
[r...@domu-12-31-39-00-e5-d2 conf]# df -h
Filesystem            Size  Used Avail
Hi,
In case we migrate from hadoop 0.19.0 and hbase 0.19.0 to hadoop 0.20.0
and hbase 0.20.0 respectively, how would it affect the existing data on
hadoop dfs and hbase tables? Can we migrate the data using distcp only?
Regards,
Raakhi
Hi,
I tried to list all the tables on hbase and I get the following
exception:
hbase(main):001:0> list
NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
Trying to contact region server 10.254.74.127:60020 for region .META.,,1,
row '', but failed after 5 attempts.
Thanks Erik :)
On Thu, Apr 16, 2009 at 9:06 PM, Erik Holstad erikhols...@gmail.com wrote:
Hi Rakhi!
Not exactly sure how the migration tool is going to look for 0.20 but the
whole disk storage format
is going to change so I don't think that you will be able to just use
distcp.
Regards
Hi,
My hbase suddenly went down.
When I check the logs, I get the following exception at the master node's
region server:
2009-04-15 08:37:09,158 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception.
Aborting...
java.lang.NullPointerException
at
Hi,
I would like to know if it is feasible to change the blocksize of Hadoop
while map reduce jobs are executing? And if not, would the following work?
1. stop map-reduce
2. stop hbase
3. stop hadoop
4. change hadoop-site.xml to reduce the blocksize
5. restart all
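One thing worth knowing before trying step 4: dfs.block.size only applies to files written after the change — a file's block size is fixed when it is written, so existing data keeps its old blocks. The edit itself would look like this (value in bytes; 32 MB shown purely as an example):

```xml
<!-- hadoop-site.xml -->
<property>
  <name>dfs.block.size</name>
  <value>33554432</value> <!-- 32 * 1024 * 1024 bytes -->
</property>
```

To change the block size of existing data you would have to rewrite the files after the restart, e.g. by copying them to a new location with distcp and swapping directories.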
whether the data in the
at 11:59 PM, Rakhi Khatwani rakhi.khatw...@gmail.com
wrote:
Hi Andy,
I want to back up my HBase and move to a more powerful machine. I am
trying distcp but it does not back up the hbase folder properly. When I try
restoring the hbase folder I don't get all the records. Some tables
Hi Lars,
Just wanted to follow up: did you try out the column value
filter? Did it work?
I really need it to improve the performance of my map-reduce programs.
Thanks a ton,
Raakhi
On Wed, Apr 8, 2009 at 12:49 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote:
Hi Lars,
Well
...@gmail.com wrote:
When you say column foo: it basically picks up all the columns under the
family foo:. You don't have to give individual column names.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
On Thu, Apr 9, 2009 at 12:25 AM, Rakhi Khatwani
Hi J-D,
No, I didn't shut down hbase before performing distcp. Probably this is where
I am going wrong, right?
Raakhi
On Thu, Apr 9, 2009 at 2:09 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
Did you shut down hbase before distcp'ing?
J-D
On Thu, Apr 9, 2009 at 2:59 AM, Rakhi Khatwani
where that is called
- the others I can see in the StoreScanner class in use) to filter rows out
that do not have a column match - which is what you want. Of course you
still need to invert the check as mentioned in the previous email.
Lars
Rakhi Khatwani wrote:
Hi Lars,
Hmm
use).
If you have root privilege over the cluster, then increase the file limit
to 32k (see the hbase FAQ for details).
Try this out and see how it goes.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
On Tue, Apr 7, 2009 at 2:45 AM, Rakhi Khatwani
,
Raakhi
On Wed, Apr 8, 2009 at 11:59 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote:
Hi Amandeep,
I have 1GB memory on each node of the ec2 cluster (C1 Medium).
I am using hadoop-0.19.0 and hbase-0.19.0.
Well, we were starting with 10,000 rows, but later it will go up to 100,000
, that's
not a big deal.
Thirdly, try upping the xceivers and ulimit and see if it works with
the existing RAM... That's the only way out.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
On Wed, Apr 8, 2009 at 12:02 AM, Rakhi Khatwani rakhi.khatw
PM, Amandeep Khurana ama...@gmail.com wrote:
I'm not sure if I can answer that correctly or not. But my guess is no, it
won't hamper the performance.
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
On Wed, Apr 8, 2009 at 12:13 AM, Rakhi Khatwani
Hi,
I am using hbase-0.19 on a 20 node ec2 cluster.
I have a map-reduce program which performs some analysis on each row.
When I process about 17k rows on the ec2 cluster, after completing 65%, my job
fails.
After going through the logs, in the UI we found out that the job failed
because of a
a small sample table dump so that we can
test this?
Lars
Rakhi Khatwani wrote:
Hi,
I did try the filter... but using ColumnValueFilter. I declared
a ColumnValueFilter as follows:
public class TableInputFilter extends TableInputFormat
implements JobConfigurable
]
How do I avoid such a situation?
Thanks,
Raakhi
On Wed, Apr 8, 2009 at 2:03 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote:
Hi,
I am using hbase-0.19 on a 20 node ec2 cluster.
I have a map-reduce program which performs some analysis on each row.
When I process about 17k rows
,
Just to be sure, when you changed the RS lease timeout did you restart
hbase?
The datanode logs seems to imply that some channels are left open for
too long. Please set dfs.datanode.socket.write.timeout to 0 in
hadoop-site.
J-D
On Wed, Apr 8, 2009 at 7:57 AM, Rakhi Khatwani rakhi.khatw
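The setting J-D mentions, as it would appear in the config file (a value of 0 disables the datanode's write-side socket timeout entirely, so long-held channels are no longer closed out from under the client):

```xml
<!-- hadoop-site.xml -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>0</value>
</property>
```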
down to some problem with hdfs, but I am still not
able to figure out what the issue could be.
Thanks,
Raakhi
On Wed, Apr 8, 2009 at 3:26 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote:
Hi,
I am pasting the region server logs:
2009-04-08 00:06:26,378 INFO
Hi,
I have a 20 node cluster on ec2 (small instance). I have a set of
tables which store a huge amount of data (tried with 10,000 rows... more to be
added), but during my map reduce jobs some of the region servers shut
down, thereby causing data loss, a stop in my program execution and
Hi,
I have a 20 node cluster on ec2 (small instance). I have a set of
tables which store a huge amount of data (10,000+ rows and much more).
I am planning to add the following configuration along with the default ec2
configuration for tasks to run on ec2:
hadoop-site.xml
property
Hi,
I have a map reduce program with which I read from an hbase table.
In my map program I check if the value of a column is xxx; if yes then I
continue with processing, else skip it.
However, if my table is really big, most of my time in the map gets wasted
processing unwanted rows.
Is there
Hi,
What's the difference between Scanner Lease Period Expired and Scanner
Timeout Exception?
a default identity + filter map and all the work
done in the Reduce phase) or the other way around. But the principles and
filtering are the same.
HTH,
Lars
Rakhi Khatwani wrote:
Thanks Ryan, I will try that.
On Tue, Apr 7, 2009 at 3:05 PM, Ryan Rawson ryano...@gmail.com wrote
set. Hadoop
and HBase daemons are sensitive to thread starvation problems.
Hope this helps,
- Andy
From: Rakhi Khatwani
Subject: Region Servers going down frequently
Date: Tuesday, April 7, 2009, 2:45 AM
Hi,
I have a 20 node cluster on ec2(small instance) i
have a set
Hi,
I am executing a job on ec2 (set up on a cluster with 18 nodes... my job
has 7 map tasks). However my tasks get killed without reporting an error.
I even tried going through the logs, which look fine.
On the UI the tasks fail and the status shows as KILLED (error column being
this is happening?
Is it a problem because I am performing a split on the table inside my map?
Thanks,
Raakhi
On Sun, Apr 5, 2009 at 12:18 PM, Rakhi Khatwani
rakhi.khatw...@gmail.com wrote:
Hi,
I am executing a job on ec2 (set up on a cluster with 18 nodes... my
job has 7 map tasks). However